Crypto Price Dynamics

The report below outlines a multi-factor model developed to predict cryptocurrency prices by integrating market data, sentiment analysis, and macroeconomic indicators. The work is part of the Ocean Protocol competition hosted on Desights. Data was collected from sources including Binance, Yahoo Finance, CoinMarketCap, Google Trends, and macroeconomic databases like the World Bank and FRED.

Data Collection and Preparation:
The dataset includes over 1 million rows of OHLCV data for 1,439 unique cryptocurrency symbols. Supplementary data was gathered on market sentiment through the Fear & Greed Index, coin fundamentals, and Google search trends. Macroeconomic factors such as inflation, GDP, and interest rates were also incorporated.

Feature Engineering:
Features were created from the collected data, including moving averages, liquidity factors, sentiment buckets, and macroeconomic trends. These features were used to train LGBMRegressor models, with the best model achieving an R-squared of 0.645, indicating a reasonable level of predictive accuracy.
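For readers less familiar with the metric, here is a minimal sketch of how R-squared is usually computed. The function name and the toy numbers are illustrative only and are not taken from the report. Under this definition, a score of 0.645 would mean the model accounts for roughly 64.5% of the variance in the target on whatever data it was scored on.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values, purely for illustration.
actual = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(actual, actual)
baseline = r_squared(actual, [2.5, 2.5, 2.5, 2.5])
```

A perfect fit scores 1, while always predicting the mean scores 0, so values in between show how much of the gap the model closes.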

Key Findings:

  1. Sentiment Indicators: The Fear & Greed Index was strongly correlated with price trends, with correlation values exceeding 0.94.
  2. Macroeconomic Influence: While valuable, macroeconomic features had less impact than sentiment and market data.
  3. Trading Volume Correlation: High trading volumes were consistently aligned with price increases, indicating strong market interest.

Conclusion:
This work is ongoing, with further model development and data preparation planned. Suggestions from the Numerai community are highly valued to improve the model’s accuracy and robustness.

For a detailed methodology and results, the full report is available here.
Additional details and the code can be found in the GitHub repository.

great work Ana as always :100: :boom:

Hi @whiterider
That’s great! :smiley:

Would you mind sharing the process and data that support your claim?

Thank you for your interest! I’m happy to share more details on the process and the data behind it. Here’s a brief overview:

Data Collection:

For the 1,439 unique cryptocurrency symbols from the train_targets.parquet dataset, the following data was downloaded and processed:

  • OHLCV Data: Collected from Binance and Yahoo Finance. The data captures daily open, high, low, close, and volume values for each symbol.
  • Coin Information: Collected from CoinMarketCap. Fields: circulating_supply, total_supply, market_cap, is_active, source_code (relevant for open source projects), name, keywords.
  • Sentiment Data: The Fear & Greed Index was gathered from Alternative.me, providing market sentiment insights, ranging from extreme fear to extreme greed.
  • Google Trends: Search trend data was pulled from Google Trends to capture public interest in specific cryptocurrencies. Note: This is not yet integrated in the features and in the prediction model.
  • Macroeconomic Data: Inflation, GDP, and interest rate data were sourced from the World Bank, FRED (Federal Reserve Economic Data), and Trading Economics, focusing on key global economies.

Feature Engineering:

Several features are engineered to build the model:

  • Moving Averages (MA): Calculated across different time periods (1, 7, 30 days) to capture price trends.

  • Rate of Change (RoC): Measures the percentage change in the close price over different intervals to assess price momentum.

  • Exponential Moving Averages (EMA): Provides a weighted average of close prices, giving more emphasis to recent data for quick trend detection.

  • Price Lag Features: Includes features like close_lag_1, close_lag_7, and close_lag_30, which capture the lagged close prices over different timeframes.

  • Percent Change (Pct Chg): Represents the percentage change in the close price over a 30-day window (pct_chg_30).

  • Volatility Measures: Derived from the variance in the close price over certain periods to assess market risk.

  • Liquidity and Size Factors: Derived from trading volumes and market capitalization data.

  • Sentiment Interaction Features: Combined the Fear & Greed Index with price trend data, which showed strong correlations.


  • Macroeconomic Factors: Integrated overall inflation, interest rates, and GDP as weighted global metrics to capture external economic influences.
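To make the price-based features above more concrete, here is a minimal pandas sketch. It is not the code from the repository: the column names (`symbol`, `date`, `close`), the function name, and the output names other than close_lag_1, close_lag_7, close_lag_30, and pct_chg_30 are assumptions made for illustration. Everything is computed per symbol so that one coin's history never leaks into another's.

```python
import numpy as np
import pandas as pd


def add_price_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add per-symbol price features to a daily frame.

    Assumes columns `symbol`, `date`, and `close` (illustrative names).
    """
    df = df.sort_values(["symbol", "date"]).copy()
    close = df.groupby("symbol")["close"]

    for window in (7, 30):
        # Simple moving average over the trailing window.
        df[f"ma_{window}"] = close.transform(lambda s: s.rolling(window).mean())
        # Exponential moving average: recent closes get more weight.
        df[f"ema_{window}"] = close.transform(
            lambda s: s.ewm(span=window, adjust=False).mean()
        )
        # Rolling variance of the close as a simple volatility measure.
        df[f"var_{window}"] = close.transform(lambda s: s.rolling(window).var())

    for lag in (1, 7, 30):
        # Lagged close, shifted within each symbol only.
        df[f"close_lag_{lag}"] = close.shift(lag)

    for interval in (1, 7):
        # Rate of change: fractional change in close over the interval.
        df[f"roc_{interval}"] = df["close"] / close.shift(interval) - 1.0

    # Change in close over a 30-day window, as a fraction.
    df["pct_chg_30"] = df["close"] / close.shift(30) - 1.0
    return df


# Toy data: two made-up symbols with 40 daily closes each.
dates = pd.date_range("2024-01-01", periods=40, freq="D")
demo = pd.concat(
    [
        pd.DataFrame({"symbol": "AAA", "date": dates, "close": np.arange(1.0, 41.0)}),
        pd.DataFrame({"symbol": "BBB", "date": dates, "close": np.arange(100.0, 140.0)}),
    ],
    ignore_index=True,
)
features = add_price_features(demo)
```

The first rows of each symbol are NaN for the longer windows and lags, so they would need to be dropped or imputed before training.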

Access to Data and Code:

The code is available at the GitHub repository I shared in the initial post. Running each notebook will give you access to the data itself. PS: some environment variables are needed if you want to run all the notebooks. Specifically, you will need the following:

  • COINMARKETCAP_API_KEY
  • WORLD_BANK_API_KEY
  • STLOUISFED_API_KEY

You can add these keys to the .env file, and everything should run smoothly. These API keys are free to obtain.
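If you want to confirm the keys are visible before running the notebooks, a small check like the one below can help. This is only a sketch using the standard library: the helper name is made up, and it assumes the variables have already been loaded into the process environment (how the notebooks read the .env file is not shown here).

```python
import os

# Variable names as listed above.
REQUIRED_KEYS = (
    "COINMARKETCAP_API_KEY",
    "WORLD_BANK_API_KEY",
    "STLOUISFED_API_KEY",
)


def missing_keys(env=os.environ):
    """Return the required keys that are unset or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required API keys are set.")
```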

I hope my response contains the information you requested. If not, please let me know. Also, if you notice something off, feel free to point it out.

Amazing work, thank you for sharing!

Very interesting, thanks for sharing.