I have been experimenting with various types of daily/weekly open satellite data from NASA as a source of signals, and I think it might make a good example. I would need my models to see more live performance before I share anything; for now I am just putting the idea out there for comment. It could be a good example notebook to show how creative one can get in the search for data.
The notebook/article would cover:
- getting the signals universe
- mapping tickers to street addresses and sectors using Yahoo Finance (rain/temperature data might only apply to agriculture tickers, etc.)
- finding the coordinates applicable to the satellite data format (usually HDF) by using a geocoding API (with local caching to keep request volume as low as possible) or another source, plus some simple coordinate transforms
- requesting and reading the data from NASA with an API key, using an HTTP library and a Python HDF library
- possible sampling, models, and filters (what land area to sample: 1 km^2, 3 km^2, …; differencing, rolling averages, linear modeling, maybe more complex ML models)
- automating everything above to run weekly and submit (featuring the new data frame submissions with NumerAPI)
- possibly getting validation data for the signal if download times allow (for now I have no validation data; it takes quite a while to compute signals for 5000 tickers for one week)
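To make the geocoding and sampling steps a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration under assumptions: `geocode_fn` stands in for whatever geocoding API ends up being used, the cache file name is made up, and the grid is assumed to be a plain regular lat/lon grid, which may not match the projection of a given NASA product. All function names are hypothetical.

```python
import json
import math
from pathlib import Path


def cached_geocode(address, geocode_fn, cache_path="geocode_cache.json"):
    """Return (lat, lon) for an address, calling geocode_fn only on a cache miss.

    geocode_fn is a placeholder: any callable that takes an address string
    and returns a (lat, lon) pair. The cache is a simple local JSON file.
    """
    path = Path(cache_path)
    cache = json.loads(path.read_text()) if path.exists() else {}
    if address not in cache:
        lat, lon = geocode_fn(address)
        cache[address] = [lat, lon]
        path.write_text(json.dumps(cache))
    lat, lon = cache[address]
    return lat, lon


def latlon_to_cell(lat, lon, lat_top=90.0, lon_left=-180.0, cell_deg=0.01):
    """Map a coordinate to (row, col) on a regular lat/lon grid.

    Assumes row 0 sits at the northern edge and column 0 at the western
    edge, with square cells of cell_deg degrees. Products that use another
    projection would need a different transform here.
    """
    row = math.floor((lat_top - lat) / cell_deg)
    col = math.floor((lon - lon_left) / cell_deg)
    return row, col


def window_mean(grid, row, col, half_width=1, fill_value=None):
    """Average the valid cells in a square window centred on (row, col).

    half_width=0 samples a single cell, half_width=1 a 3x3 block, and so on.
    Cells equal to fill_value (the product's "no data" marker) are skipped.
    Returns None if the window holds no valid cells.
    """
    values = []
    for r in range(max(0, row - half_width), min(len(grid), row + half_width + 1)):
        for c in range(max(0, col - half_width), min(len(grid[r]), col + half_width + 1)):
            if grid[r][c] != fill_value:
                values.append(grid[r][c])
    return sum(values) / len(values) if values else None
```

The idea would be to call `cached_geocode` once per ticker address, convert the result with `latlon_to_cell`, and then try a few `half_width` values in `window_mean` to compare sampling areas. Keeping the cache on disk means a weekly rerun only hits the geocoding API for tickers it has not seen before.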
My goal would be to produce heavily commented code, since 3-4 APIs/modules would be used besides pandas, along with an accompanying Medium or forum post for deeper explanation. Any feedback (including telling me this just isn’t a good idea) is appreciated.