Signals: Plugging in the data from Quandl

Quandl example model:

Google Colab notebook: Signals_Quandl_EOD_baseline.ipynb

Quandl is a marketplace for financial, economic, and alternative data, offering both premium and free datasets.

One such data source is End of Day US Stock Prices by QuoteMedia (premium, so you need to set your API_KEY in the notebook).

Updated daily, this data feed offers end-of-day prices, dividends, adjustments, and splits for US publicly traded stocks, with history back to 1996. Prices are provided both adjusted and unadjusted.

This example downloads the whole time-series data as a .zip file and loads the Adj_Open and Adj_Close columns from it. Alternatively, specific tickers over a specific time span can be loaded iteratively via the API (much slower); see Getting started with the API.
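For reference, a minimal sketch of the bulk-download step. The endpoint URL assumes Quandl's v3 database bulk-download API for the EOD database, and the `QUANDL_API_KEY` environment variable and `download_eod_zip` helper are my own placeholders; verify the exact URL against your subscription's docs:

```python
import os

# Placeholder: read the API key from the environment
API_KEY = os.environ.get("QUANDL_API_KEY", "YOUR_API_KEY")

# Assumed bulk-download endpoint for the full EOD database;
# check the Quandl/Nasdaq Data Link docs for the exact form.
BULK_URL = (
    "https://www.quandl.com/api/v3/databases/EOD/data"
    f"?api_key={API_KEY}&download_type=complete"
)

def download_eod_zip(dest="EOD.zip"):
    """Stream the full EOD zip to disk (it can be several GB)."""
    import requests  # third-party: pip install requests
    with requests.get(BULK_URL, stream=True) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)
    return dest
```

Streaming in chunks avoids holding the whole archive in memory.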

While the feature extraction and modeling parts are very similar to the main example, the focus here is to make the data loading flexible so that different data sources can easily be 'plugged' in.

Steps to re-arrange the data as in Signals' main example

  1. Find the common tickers between the EOD data source's ticker list and the Numerai Signals universe's Yahoo tickers.
  2. Specify the columns in download_full_and_load with the common tickers, and rename the columns as required by the feature extraction setup.
    # column names in the csv file (the file has no header row)
    cols = [
        "ticker", "date", "Open", "High", "Low", "Close", "Volume", "Dividend",
        "Split", "Adj_Open", "Adj_High", "Adj_Low", "Adj_Close", "Adj_Volume",
    ]

    # usecols refers to column indices in the csv:
    # using only [ticker, date, Adj_Open, Adj_Close],
    # loading only the needed columns as FP32
    print("loading from csv...")
    full_data = pd.read_csv(
        csv_path,  # path to the extracted EOD csv
        header=None,
        usecols=[0, 1, 9, 12],
        dtype={0: str, 1: str, 9: np.float32, 12: np.float32},
    )

    # renaming the columns
    filter_columns = ["ticker", "date", "Adj_Open", "Adj_Close"]
    full_data.columns = filter_columns
    full_data.set_index("date", inplace=True)
    full_data.index = pd.to_datetime(full_data.index)
  3. Map ticker names to Bloomberg tickers using Numerai’s Bloomberg ticker map.
    full_data = full_data[full_data.ticker.isin(common_tickers)]
    full_data["bloomberg_ticker"] = full_data.ticker.map(
        dict(zip(ticker_map["yahoo"], ticker_map["bloomberg_ticker"]))
    )
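Put together, the steps above can be sketched end-to-end on toy data. The CSV content, `common_tickers`, and `ticker_map` below are made-up stand-ins for the real EOD file and Numerai's ticker map:

```python
import io
import numpy as np
import pandas as pd

# Toy stand-in for the headerless EOD csv; only columns 0, 1, 9, 12
# (ticker, date, Adj_Open, Adj_Close) carry meaningful values here
csv_text = (
    "AAPL,2021-01-04,1,1,1,1,1,0,1,128.0,1,1,126.5,1\n"
    "MSFT,2021-01-04,1,1,1,1,1,0,1,217.0,1,1,215.3,1\n"
    "ZZZZ,2021-01-04,1,1,1,1,1,0,1,10.0,1,1,10.1,1\n"
)

# Stand-ins for the universe intersection and the Bloomberg ticker map
common_tickers = {"AAPL", "MSFT"}
ticker_map = pd.DataFrame(
    {"yahoo": ["AAPL", "MSFT"], "bloomberg_ticker": ["AAPL US", "MSFT US"]}
)

# Load only the needed columns as FP32 (no header row in the file)
full_data = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    usecols=[0, 1, 9, 12],
    dtype={0: str, 1: str, 9: np.float32, 12: np.float32},
)
full_data.columns = ["ticker", "date", "Adj_Open", "Adj_Close"]
full_data.set_index("date", inplace=True)
full_data.index = pd.to_datetime(full_data.index)

# Keep only tickers in the common universe, then map to Bloomberg names
full_data = full_data[full_data.ticker.isin(common_tickers)]
full_data["bloomberg_ticker"] = full_data.ticker.map(
    dict(zip(ticker_map["yahoo"], ticker_map["bloomberg_ticker"]))
)
print(full_data)
```

The out-of-universe ticker (ZZZZ) is dropped and the remaining rows gain a `bloomberg_ticker` column.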

After creating a day_chg column and applying RSI and SMA to it, the features are quintiled and lags are calculated as in the main example.

Validation results:

Thanks @_liamhz for the feedback :slight_smile:


nice! not a bad result for such simple data.

you say this is premium, how much does this data cost you per month?

(also did you train on validation for this great cumsum graph?)

This one costs me USD $49/mo, but I believe organizational licensing options also exist.

This wasn’t trained on validation data. However, it has some extra features compared to the default yfinance setup:

  • a day_change column
  • RSI: (14, 21)
  • SMA: (14, 21)
  • binning ('quintiling') factor increased from 5 to 100

We get 72 features after computing lags.
