Free or cheap data and tools for Numerai Signals

Please feel free to add any others that you know of. I’m not associated with any of these, and I’m not vouching for the quality of any of them. :slight_smile:


FMP - Free


Alternative data (sentiment)

  • Borsdata - Free/Cheap. Focus on Nordic markets. API access requires Pro membership.

I also found a tutorial on how to analyze financial data using the FMP API.


For R users, the tidyquant package has quite a few handy functions to download data from different sources.


Sharadar has prices and fundamentals.


Anyone have any experience with these?



ISIN codes
Ticker mapping for Signals has taken up a good part of the Zoom and Twitch sessions recently. I just came across this package, another arrow in our quiver:

# https://pypi.org/project/investpy/
# Financial Data Extraction from Investing.com with Python
!pip install investpy

import investpy
import pandas as pd

pd.DataFrame(
    investpy.stocks.get_stocks_dict(
        country=None,
        columns=["symbol", "country", "name", "full_name", "isin", "currency"],
        as_json=False,
    )
)

Voilà, you have the name, country, and ISIN code for 40,000 stocks:

   symbol  country    name                      full_name                        isin          currency
0  TS      argentina  Tenaris                   Tenaris                          LU0156801721  ARS
1  APBR    argentina  PETROBRAS ON              Petroleo Brasileiro - Petrobras  BRPETRACNOR9  ARS
2  GGAL    argentina  Grupo Financiero Galicia  Grupo Financiero Galicia B       ARP495251018  ARS
3  TXAR    argentina  Ternium Argentina         Ternium Argentina SA             ARSIDE010029  ARS
4  PAMP    argentina  Pampa Energia             Pampa Energia SA                 ARP432631215  ARS

Hi Degerhan,

That is very helpful, as I’m just now approaching Signals and figuring out how to use the Bloomberg tickers. Did you go one step further and match them to the investpy list? If so, how did you match them?


@sirbradflies, I did not pursue this further.

In real life I only trade commodity futures spreads, where the symbols are exact and data is extremely clean. I was trying to gather a similar quality dataset for Signals until I realized I was trying to boil the data ocean instead of building useful things.

@arbitrage recently mentioned in RocketChat that he is working only with US stocks for now (and he is getting solid results). If you remove the US suffix from bloomberg_ticker, what remains is the Yahoo ticker for all except about 10 symbols, and historical data coverage is quite good. I’ve decided to focus on the US stock universe until I get my act together on the modeling side, and to expand the dataset at a later phase.

(Actually, to speed up iterations, I am only working with the 900 largest-cap US stocks for now, list here: SP900 )
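
The suffix-stripping idea above can be sketched roughly as follows. This is only a minimal illustration, assuming Bloomberg tickers are formatted as "SYMBOL US"; the function name `bloomberg_to_yahoo` and the sample tickers are made up for the example, and the handful of symbols that do not map cleanly would still need manual handling.

```python
import pandas as pd


def bloomberg_to_yahoo(bloomberg_ticker):
    """Return the Yahoo-style ticker for a US Bloomberg ticker, else None.

    Assumes the Bloomberg format "SYMBOL US" (symbol, one space, country code).
    """
    parts = bloomberg_ticker.strip().rsplit(" ", 1)
    if len(parts) == 2 and parts[1] == "US":
        return parts[0]
    return None  # non-US listing, or a format this sketch does not handle


# Illustrative sample only; in practice this column comes from the Signals universe file.
universe = pd.DataFrame({"bloomberg_ticker": ["AAPL US", "MSFT US", "SAP GR"]})
universe["yahoo_ticker"] = universe["bloomberg_ticker"].map(bloomberg_to_yahoo)

# Keep only the rows that mapped, i.e. the US universe.
us_only = universe.dropna(subset=["yahoo_ticker"])
print(us_only)
```

Returning None for anything that is not a US listing makes it easy to drop the rest of the universe for now and revisit it later.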

hi @degerhan,

Thanks for the tips. I had actually already started focusing on US stocks when I realized most of them were easily downloadable (except for Berkshire and a few others :slight_smile: ).

I’ll share an update if I come up with a robust way to match all the Bloomberg tickers to Yahoo Finance.

For those who want to get data directly from FINVIZ, here is some code that pulls specific metrics for each company you are interested in.

Attached code:

import requests
from bs4 import BeautifulSoup

# Data that we want to extract from the Finviz snapshot table
METRICS = ['Price', 'EPS next 5Y', 'Beta', 'Shs Outstand']

# Multipliers for the magnitude suffixes used by Finviz
SUFFIXES = {'B': 1e9, 'M': 1e6}

def fundamental_metric(soup, metric):
    # the table cell which stores the value in Finviz has the html class 'snapshot-td2'
    return soup.find(string=metric).find_next(class_='snapshot-td2').text

def parse_value(value):
    # turn '12.5%', '3.2B' or '450M' into floats; leave anything else as text
    try:
        if value.endswith('%'):
            return float(value[:-1])
        if value[-1:] in SUFFIXES:
            return float(value[:-1]) * SUFFIXES[value[-1]]
        return float(value)
    except ValueError:
        return value

def get_finviz_data(ticker):
    dict_finviz = {}  # created before the try block so the return never fails
    try:
        url = "http://finviz.com/quote.ashx?t=" + ticker.lower()
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0'}
        soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')
        for m in METRICS:
            dict_finviz[m] = parse_value(fundamental_metric(soup, m))
    except Exception as e:
        print(e)
        print('Not successful parsing ' + ticker + ' data.')
    return dict_finviz

ticker = 'AAPL'  # example ticker; the original snippet left this undefined
finviz_data = get_finviz_data(ticker)

finviz_data

Any thoughts on Alpaca API v2?

curl -X GET -H "APCA-API-KEY-ID: …" -H "APCA-API-SECRET-KEY: …" "https://data.alpaca.markets/v2/stocks/AAPL/trades?start=2021-02-24T00:57:47.317087Z&end=2021-02-25T00:57:47.317087Z"

On a free account, the maximum end time seems to be utcnow - 15min.
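
To stay inside that window, the request URL can be built with the end time pushed back by 15 minutes. The sketch below only constructs the URL from the curl example above and makes no network call; the 15-minute delay is just the observation from this post, not a documented limit, and `build_trades_url` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumption: free accounts cannot query the most recent 15 minutes (observed, not documented here).
FREE_TIER_DELAY = timedelta(minutes=15)
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def build_trades_url(symbol, window, now=None):
    """Build a trades URL whose end time is `now` minus the free-tier delay.

    `now` must be timezone-aware; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    end = now - FREE_TIER_DELAY
    start = end - window
    query = urlencode({
        "start": start.strftime(TIMESTAMP_FORMAT),
        "end": end.strftime(TIMESTAMP_FORMAT),
    })
    return f"https://data.alpaca.markets/v2/stocks/{symbol}/trades?{query}"


# One day of AAPL trades ending 15 minutes ago.
print(build_trades_url("AAPL", timedelta(days=1)))
```

The resulting URL can be passed to curl or requests together with the two APCA headers from the example above.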


Has anybody found a free data source for order book data?

Thanks

Does anyone have experience with https://eodhistoricaldata.com?


This might be an important detail from their disclaimer -

Hi, I want to fetch the last 48 days of close prices for every ticker. Can anyone suggest a free API for this?

I’ve just started using them, along with Altman’s UndocumentedMatLab software. So far, so good, we’ll see how it goes. EOD seems to have a lot of info available, which should take a lot of the more tedious programming tasks off the table.

EOD provides support for other languages, and the data prices aren’t outrageous. It would be great if Numerai could work out a deal through them (though I can understand why they might not want to do so); perhaps interested individuals could figure out a way to access the data via some sort of group account. IDK, I’m just throwing spaghetti at the walls now…