Factor Analysis of Cryptocurrency Returns Using Momentum Indicators

accountnumber1 · August 24, 2024, 11:28am

This is my entry for the Numerai/Ocean competition.
Hope you enjoy.
Let me know if there are any questions!

accountnumber1 · August 24, 2024, 11:31am

Ok, I think that hasn’t worked? It’s set to private? Here is the text version:

Factor Analysis of Cryptocurrency Returns Using Momentum Indicators

Abstract

This report investigates the effectiveness of momentum-based indicators in predicting cryptocurrency price movements, using Numerai’s discretized return data. Key methodologies include the application of Simple Moving Average (SMA) and Moving Average Convergence Divergence (MACD) indicators, with a focus on identifying both inductive and anti-inductive price patterns. Through a series of experiments, optimal window sizes for momentum indicators were determined, and the performance of these indicators was evaluated across different subsets of cryptocurrencies. Simulations of future returns suggest a high probability of profitability using the SMA-based MACD (SMACD) strategy. However, potential limitations such as data discretization and coding errors are acknowledged. The results indicate that momentum-based strategies, particularly those employing SMACD, offer promising predictive power in the cryptocurrency market, with significant opportunities for further research.

Introduction

In technical analysis, various indicators have been developed to predict asset price movements, with momentum-based indicators being among the most popular. Indicators like the Relative Strength Index (RSI), Moving Average Convergence Divergence (MACD), Simple Moving Average (SMA), and Exponential Moving Average (EMA) are all variations of the fundamental idea that an asset’s expected return is, in some way, influenced by its past returns.

Numerai’s crypto contest provides historical return data for various cryptocurrency assets, categorized into five discrete bins that approximate a bell curve. This raises an intriguing question: can momentum indicators, typically based on raw price data, still function effectively when applied to these binned returns?

In this study, I will explore this question by comparing different momentum-based indicators and their respective parameters to determine whether any of them show a significant correlation with future returns. Additionally, I will consider refinements such as focusing on different subgroups of cryptocurrencies, and I will evaluate the confidence intervals of these models to assess the robustness of their future performance.

Data

Numerai’s signals data consists of discretized returns for the top cryptocurrencies, recorded each weekday to maintain consistency with their other competitions. These returns are calculated over a 30-day time horizon. Notably, the dataset reflects the top cryptocurrencies at each individual timestep, rather than the top cryptocurrencies as of the current date. This approach helps mitigate the risk of survivorship bias, as it includes assets that may no longer be in the top ranks at present. One major advantage of using Numerai’s data is its simplicity, as the required data has already been collated and preprocessed. Additionally, it can be accessed freely and easily via their API, as detailed on Numerai’s website.

The dataset spans from June 1, 2020, to May 22, 2024 (at the time of writing), covering periods of both bull and bear markets. This broad timeframe provides a diverse range of market conditions, which is essential for testing the robustness of momentum-based indicators across different market environments.

Iteration 1: Initial Indicator Performance Assessment

In this analysis, I tested various momentum indicators by examining their correlation with Numerai’s targets. Initially, I used a 20-day window for most indicators, as suggested by prior research, with the MACD (Moving Average Convergence Divergence) calculated as the difference between 20-day and 100-day moving averages. Additionally, I experimented with a longer 100-day Simple Moving Average (SMA).
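As a rough illustration of the indicators described above, the sketch below computes a trailing SMA, an EMA, and a MACD line as the fast average minus the slow average. It is plain Python to stay dependency-free. The function names, the `None` padding during warm-up, and the EMA smoothing factor 2 / (span + 1) are assumptions of this sketch, not taken from the original code.

```python
def sma(values, window):
    """Trailing simple moving average; None until `window` values exist."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def ema(values, span):
    """Exponential moving average, seeded with the first value."""
    alpha = 2.0 / (span + 1)  # assumed smoothing convention
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def macd_line(values, fast=20, slow=100, average=sma):
    """Fast average minus slow average; None while either is warming up."""
    fast_avg = average(values, fast)
    slow_avg = average(values, slow)
    return [
        None if f is None or s is None else f - s
        for f, s in zip(fast_avg, slow_avg)
    ]


# Hypothetical binned history for one coin (five bins coded from 0 to 1).
history = [0.5, 0.75, 1.0, 0.25, 0.0, 0.5, 0.75, 0.5]
print(macd_line(history, fast=2, slow=4))
```

Passing `average=ema` instead of the default gives the EMA-based variant with the same windows.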

Since Bollinger Bands are typically represented as a discrete step function, which isn’t suitable for direct correlation analysis, I modified the Bollinger Bands to use a ratio of the average price divided by the standard deviation. This adjustment allowed for a continuous measure that could be better correlated with the targets.
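One possible reading of this continuous Bollinger-style measure is a rolling mean divided by a rolling standard deviation. The sketch below assumes a population standard deviation and a 20-day window, and returns `None` where the window is incomplete or has zero spread; those choices and the function name are assumptions of this sketch.

```python
import statistics


def mean_over_std(values, window=20):
    """Rolling mean divided by rolling (population) standard deviation."""
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(None)  # not enough history yet
            continue
        chunk = values[i - window + 1 : i + 1]
        sd = statistics.pstdev(chunk)
        # A flat window has no spread, so the ratio is undefined.
        out.append(statistics.mean(chunk) / sd if sd > 0 else None)
    return out
```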

Momentum indicators can correlate with future returns in two primary ways:

Inductive Momentum: Assets that have performed well in the past continue to do well in the future (positive correlation).
Anti-Inductive Momentum: Assets become ‘overbought’ or ‘oversold,’ leading to a reversal towards the mean (negative correlation).

In this case, the latter, anti-inductive momentum, appears to be the dominant effect. Therefore, I have inverted the following graph to present anti-correlations as upward trends, which I believe makes the visualization more intuitive.
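To make the sign convention concrete, here is a minimal sketch of an inverted cumulative correlation curve. It assumes one cross-sectional Pearson correlation per date between indicator values and targets, and a cumulative sum of the negated values; the original graph may have been built differently, and both function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def inverted_cumulative_correlation(indicator_by_date, target_by_date):
    """Running sum of -corr(indicator, target), one correlation per date."""
    total = 0.0
    curve = []
    for indicator, target in zip(indicator_by_date, target_by_date):
        total += -pearson(indicator, target)
        curve.append(total)
    return curve
```

With this convention, a curve that rises over time means the indicator is anti-correlated with the targets, which is the anti-inductive case.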

The results show that the MACD, which is composed of the ‘MACD line,’ the ‘Signal line,’ and the ‘MACD histogram,’ performs comparably to the SMA. However, its performance varies across different periods, highlighting the importance of considering multiple indicators over different time horizons.

Iteration 2: Refining Indicator Accuracy by Subgroup Analysis

To refine the analysis, I focused on the two most promising indicators identified in the first approach: the Simple Moving Average (SMA) and the Signal Line from the MACD. I applied these indicators across various subsets of cryptocurrencies to investigate whether different types of coins exhibit distinct momentum behaviors.

The subsets of coins were derived based on themes, which were generated using ChatGPT. These themes categorize coins into groups, such as “utility tokens” and “governance tokens.” Each subset contains approximately 10 coins, with some overlap between the lists. Correlations were calculated within each subset to assess the performance of the SMA and Signal Line indicators.

To avoid double counting due to the 20-day prediction horizon, hypothesis tests were conducted on every 20th data point. The specific coin subsets are listed in the appendix for reference.
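The subsampling step can be sketched as follows. The report does not say which test was used, so this assumes a one-sample t-test of the mean daily correlation against zero; `daily_corrs` and the function name are hypothetical.

```python
import math
import statistics


def t_stat_non_overlapping(daily_corrs, step=20):
    """t-statistic of the mean, using only every `step`-th observation."""
    sample = daily_corrs[::step]  # keeps indices 0, step, 2*step, ...
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two non-overlapping observations")
    sd = statistics.stdev(sample)  # sample standard deviation
    if sd == 0:
        raise ValueError("sample has zero variance")
    t = statistics.mean(sample) / (sd / math.sqrt(n))
    return t, n
```

The returned `t` would be compared against a t-distribution with `n - 1` degrees of freedom. With roughly a thousand weekdays of history, keeping every 20th point leaves only about 50 observations, so the test is considerably less powerful than the raw daily series would suggest.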

MACD Signal Line Graph:

In the case of the MACD Signal Line, the “utility tokens” and “all tokens” subsets showed stronger correlations with future returns compared to the entire universe of coins. These correlations were significant at the standard 95% confidence level, and even at the 99.5% confidence level for the “all tokens” subset, without corrections for multiple hypothesis testing.

SMA Graph:

For the SMA indicator, the “governance tokens,” “utility tokens,” and “all tokens” subsets demonstrated better correlations than the entire universe. These results were significant at the 95% confidence level, with the “all tokens” subset reaching significance at the 99.8% confidence level, again without correcting for multiple hypothesis testing.

These findings suggest that certain subgroups of cryptocurrencies may be more responsive to momentum-based indicators. The results indicate potential opportunities for emphasizing specific subgroups, such as utility and governance tokens, in momentum-based analysis to achieve better predictive performance.

Iteration 3: Search for Optimal Window Size

In this iteration, I conducted a grid search to identify the optimal window size for momentum indicators. Since we are only exploring a single dimension of parameters (the window size), a simple grid search was sufficient, eliminating the need for more complex optimization algorithms.

The results indicate that short windows of 5-20 days are optimal for capturing anti-inductive momentum (where assets revert to the mean), while longer horizons of around 200 days are more effective for detecting inductive momentum (where past performance continues into the future).
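A one-dimensional grid search of this kind might look like the sketch below, which scores each window by the correlation between the trailing SMA and the target. It assumes a single series for simplicity, and that the caller has already aligned `history` and `target` so that no future information leaks into the average; all names are hypothetical.

```python
import math


def sma(values, window):
    """Trailing simple moving average; None until `window` values exist."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)


def grid_search(history, target, windows):
    """Map each window size to corr(SMA(history, window), target)."""
    scores = {}
    for w in windows:
        avg = sma(history, w)
        # Skip the warm-up period where the average is undefined.
        pairs = [(a, t) for a, t in zip(avg, target) if a is not None]
        xs, ys = zip(*pairs)
        scores[w] = pearson(xs, ys)
    return scores
```

Under this scoring, `min(scores, key=scores.get)` picks the most anti-inductive window and `max(scores, key=scores.get)` the most inductive one.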

These findings suggest that the difference between the 10-day and 200-day windows could yield an effective MACD (Moving Average Convergence Divergence) indicator. To test this, I calculated the MACD using both the traditional Exponential Moving Average (EMA) and a Simple Moving Average (SMA) for a more direct comparison. I delayed the start of the analysis by 250 days to ensure that both the 10-day and 200-day averages were fully established.

Interestingly, the SMA-based MACD outperformed the traditional EMA-based version, suggesting that the simpler approach may be more effective in this context.

Simulating Future Returns

To assess the robustness of the SMA-based MACD (or SMACD), I simulated future returns by randomly sampling 20-day chunks of data over 10,000 iterations. The resulting graph of projected returns indicates a 95% probability of achieving a profit within just over 100 days.
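The simulation can be approximated with a block bootstrap, sketched below. It assumes a hypothetical series `daily_pnl` of per-day strategy results, draws contiguous 20-day chunks with replacement, and reports the share of simulated paths whose cumulative sum is positive on each day; the original procedure may differ in its details.

```python
import random


def profit_probability(daily_pnl, chunk=20, n_chunks=6, n_paths=10_000, seed=0):
    """Share of bootstrapped paths with positive cumulative P&L, per day."""
    if len(daily_pnl) < chunk:
        raise ValueError("history is shorter than one chunk")
    rng = random.Random(seed)  # fixed seed for reproducibility
    starts = range(len(daily_pnl) - chunk + 1)
    profitable = [0] * (chunk * n_chunks)
    for _ in range(n_paths):
        total = 0.0
        day = 0
        for _ in range(n_chunks):
            start = rng.choice(starts)  # random contiguous chunk
            for value in daily_pnl[start : start + chunk]:
                total += value
                if total > 0:
                    profitable[day] += 1
                day += 1
    return [count / n_paths for count in profitable]
```

The first day on which the returned probability reaches 0.95 corresponds to the kind of break-even horizon quoted above. Sampling whole chunks rather than single days keeps short-range autocorrelation within each chunk intact.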

Conclusion

While the findings from the SMA-based MACD (SMACD) simulation are promising, it’s important to consider several caveats:

Potential Coding Errors: Despite careful efforts to avoid mistakes, there’s always a chance of errors in the code. The most significant concern—data leakage from future values—would likely cause an inductive rather than anti-inductive relationship, which is not observed here.
Discretization and Data Bias: The discretization of returns in Numerai’s data might introduce biases. This process could oversimplify the true behavior of the assets. For example, an asset with consistent small losses (e.g., -5%) and occasional large gains (e.g., +20%) might be incorrectly categorized if the losses fall within a neutral or positive bin, distorting the overall picture. This kind of data handling could affect the validity of the results, making it crucial to consider how discretization might be influencing the observed patterns.

Assuming these results hold, the SMA-based MACD seems to be an effective momentum indicator with a high probability of generating positive returns. Further research should explore the behavior of this indicator across different cryptocurrency subgroups, as this could provide additional insights and opportunities for model refinement.

Appendix: coin subsets

Privacy Coins

Monero (XMR) - Known for its strong privacy features, Monero is a leading privacy coin 1 2.
Zcash (ZEC) - Offers optional privacy features through its “shielded” transactions 3 4.
Dash (DASH) - Includes a feature called PrivateSend for enhanced privacy 5 6.
Secret (SCRT) - Focuses on privacy-preserving smart contracts 7.
Oasis Network (ROSE) - Aims to provide privacy and scalability for decentralized applications 8.

Meme Coins

Dogecoin (DOGE) - Originally created as a joke, Dogecoin has gained a large following 9 10.
Shiba Inu (SHIB) - Often referred to as the “Dogecoin killer,” it has a strong community 11.
Bonk (BONK) - A newer meme coin gaining popularity 12.
Pepe (PEPE) - Inspired by the popular internet meme, Pepe the Frog 13.
Myro (MYRO) - Another meme coin with a growing community 14.
FLOKI - Named after Elon Musk’s dog, it has a dedicated fanbase 15.
Dogwifhat - A lesser-known but emerging meme coin 16.

Payment Tokens

Ethereum (ETH) - Widely used for transactions and smart contracts 17 18.
Bitcoin Cash (BCH) - Designed for faster and cheaper transactions compared to Bitcoin 19.
Ripple (XRP) - Known for its quick and low-cost international payments 20.
Dash (DASH) - Also used for everyday transactions due to its speed and low fees 5.
Stellar (XLM) - Focuses on cross-border payments and remittances 21.
Binance Coin (BNB) - Used for transactions within the Binance ecosystem 22.
Monero (XMR) - Also used for private transactions 1.
Zcash (ZEC) - Offers private transactions as well 3.
Tether (USDT) - A stablecoin often used for transactions 23.
Cardano (ADA) - Known for its secure and scalable transactions 24.

Utility Tokens

Ethereum (ETH) - Used for gas fees on the Ethereum network.
Binance Coin (BNB) - Used for transaction fees on Binance.
Chainlink (LINK) - Used to pay for services on the Chainlink network.
Uniswap (UNI) - Used for governance and transaction fees on Uniswap.
Filecoin (FIL) - Used to pay for storage on the Filecoin network.
Basic Attention Token (BAT) - Used within the Brave browser ecosystem.
VeChain (VET) - Used for supply chain management.
Theta (THETA) - Used for decentralized video streaming.
Golem (GLM) - Used to pay for computing power on the Golem network.
Synthetix (SNX) - Used for creating synthetic assets on the Synthetix platform.

Governance Tokens

Uniswap (UNI) - Allows holders to vote on protocol changes.
Compound (COMP) - Used for governance in the Compound protocol.
Maker (MKR) - Used for governance in the MakerDAO system.
Aave (AAVE) - Used for governance in the Aave protocol.
Curve DAO Token (CRV) - Used for governance in the Curve Finance protocol.
SushiSwap (SUSHI) - Used for governance in the SushiSwap protocol.
Yearn Finance (YFI) - Used for governance in the Yearn Finance protocol.
Balancer (BAL) - Used for governance in the Balancer protocol.
1inch (1INCH) - Used for governance in the 1inch network.
Kyber Network (KNC) - Used for governance in the Kyber Network.

Security Tokens

tZERO (TZROP) - A security token for the tZERO platform.
Polymath (POLY) - Used for creating and managing security tokens.
Securitize (DS) - Used for digital securities on the Securitize platform.
Harbor (HBR) - Used for compliance and issuance of security tokens.
Swarm (SWM) - Used for tokenizing real-world assets.
Tokeny (T-REX) - Used for issuing and managing security tokens.
Blockstack (STX) - Used for decentralized applications and security tokens.
Neufund (NEU) - Used for equity tokens on the Neufund platform.
Science Blockchain (SCI) - A security token for the Science Blockchain fund.
SPiCE VC (SPICE) - A tokenized venture capital fund.

Other

Bitcoin (BTC) - The original and most well-known cryptocurrency 25.
Ethereum (ETH) - A leading platform for decentralized applications 17 18.
Binance Coin (BNB) - Used within the Binance ecosystem 22.
Cardano (ADA) - Known for its secure and scalable transactions 24.
Polkadot (DOT) - Aims to enable different blockchains to interoperate.
Solana (SOL) - Known for its high-speed transactions.
Avalanche (AVAX) - Focuses on high throughput and low latency.
Chainlink (LINK) - Provides real-world data to smart contracts.
Litecoin (LTC) - Often referred to as the silver to Bitcoin’s gold.
Stellar (XLM) - Focuses on cross-border payments and remittances 21.

accountnumber1 · August 26, 2024, 2:57pm

P.S. I am also known as ‘duckmatter’ on Discord/the desight submission site.

Source code is available upon request.

yunusgumussoy · August 26, 2024, 5:12pm

I enjoyed reading your report. Analysis of momentum-based indicators always interests me. I agree that SMA and MACD have impressive predictive power for cryptocurrency price variance. I also like the way you approach determining optimal window sizes and evaluating indicator performance across different cryptocurrency subsets. Your recognition of potential coding errors and the impact of data discretization reflects a balanced and realistic perspective.

nishimoto · August 27, 2024, 4:58am

Thank you for your comments on my post and good report. I also enjoyed reading your report.

The idea of SMA-based MACD and the idea of categorizing the tokens by token type is interesting. Is the price movement similar for each type of token? I thought it would be more useful to classify them by price movement rather than by type.

accountnumber1 · August 27, 2024, 10:47am

Thanks!

Categorizing the tokens by price movement is an alternative worth considering. Price movement can differ between types of coin; for example, meme coins tend to have dramatic bubbles! How would you suggest identifying coins by price movement? Binning them by volatility, perhaps?

datahunter · August 27, 2024, 12:46pm

Not sure if I’m right (my finance skills are weak), but maybe one of these: use RSI to categorize tokens into overbought, oversold, or neutral conditions, which can help identify potential price reversals or sustained trends; or, as you said, use volatility: calculate the standard deviation of daily or weekly price changes for each token over a specified period (e.g., 30 days or 90 days).

accountnumber1 · August 27, 2024, 1:44pm

Well, the idea is to categorise coins by type and then look at indicators like RSI to order the coins within each category. This means there is no point in using RSI or SMA to categorise coins, only to use the same indicator to order the coins within the categories! However, the idea of using one indicator to categorise and then another to order - for example, categorise by 90-day SMA and then order by 20-day SMA is quite intriguing. They will have to organise another contest so we can try new ideas!

mlh_alavi · August 30, 2024, 5:51am

Wow!!! You have done really great work.
The analysis and insights are well studied.
Good luck!
