With the staking cap close to being reached, I believe a lot of people will want to convert from a constantly compounding stake to an income-generating stake (i.e. withdrawing a certain percentage each period). The goal here is to come up with an optimal percentage to unstake every week such that you maximize the geometric mean of the time-discounted net present value of all future NMR paid out by unstaking. The geometric mean is used instead of the regular mean since rewards are multiplicative rather than additive, and since it is the metric used by the Kelly criterion. Hopefully a good strategy for this type of stake will prevent people from just going all in or all out on their stake based on the payout reduction. I anticipate that as the payout reduction increases, the optimal withdrawal rate will increase, which will create an equilibrium that prevents the total amount staked from getting too high while still leaving participants with a satisfactory payout. To that end I am going to show my progress on this publicly to get feedback from everyone.
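One way to write the objective down, offered as a reading of the description above rather than a settled definition. All symbols are introduced here for illustration: w is the weekly withdrawal rate, r_t the weekly return on the stake, S_t the stake left after the week-t withdrawal, P_t the payment withdrawn in week t, δ a weekly discount factor, and T the horizon.

```latex
S_t = (1 - w)\,(1 + r_t)\,S_{t-1}, \qquad
P_t = w\,(1 + r_t)\,S_{t-1}, \qquad
\mathrm{NPV}(w) = \sum_{t=1}^{T} \delta^{t} P_t, \qquad
w^{*} = \arg\max_{w}\; \mathbb{E}\left[\log \mathrm{NPV}(w)\right]
```

Maximizing the expected log is the same as maximizing the geometric mean across possible futures, because the geometric mean is the exponential of the mean log.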
Here is my current plan for developing the optimal stake withdrawal strategy:
1. Create a method for simulating expected future returns
2. Create a dataset of thousands of simulated return timeseries spanning several years, using the method from step 1
3. Test different withdrawal rates on the timeseries dataset, thereby creating a timeseries of ‘payments’ for each
4. Calculate the time-discounted net present value of each payment timeseries
5. Determine the optimal withdrawal rate by seeing which rate has the highest geometric mean net present value
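A minimal sketch of how the five steps could fit together. It rests on several assumptions that are not part of the plan itself: weekly returns are drawn as independent Gaussians, the mean, standard deviation and discount factor are placeholder values, a return is treated as a direct fractional change in the stake, and any stake still left at the end of the horizon is ignored. The function names are hypothetical.

```python
import numpy as np

def simulate_returns(n_paths, n_weeks, mean, std, rng):
    """Steps 1-2: simulated weekly returns, one row per path (i.i.d. Gaussian is an assumption)."""
    return rng.normal(mean, std, size=(n_paths, n_weeks))

def payment_npv(returns, withdraw_rate, weekly_discount, initial_stake=1.0):
    """Steps 3-4: withdraw a fixed fraction each week and discount the payments."""
    n_paths, n_weeks = returns.shape
    stake = np.full(n_paths, initial_stake, dtype=float)
    npv = np.zeros(n_paths)
    for t in range(n_weeks):
        # The stake changes multiplicatively and cannot go below zero.
        stake = np.maximum(stake * (1.0 + returns[:, t]), 0.0)
        payment = withdraw_rate * stake
        stake = stake - payment
        npv += payment * weekly_discount ** (t + 1)
    return npv

def geometric_mean(values):
    """exp(mean(log x)); only meaningful for strictly positive values."""
    return float(np.exp(np.mean(np.log(values))))

def best_withdrawal_rate(returns, rates, weekly_discount):
    """Step 5: pick the rate with the highest geometric-mean NPV across paths."""
    scores = [geometric_mean(payment_npv(returns, w, weekly_discount)) for w in rates]
    return rates[int(np.argmax(scores))], scores

# Placeholder inputs, not estimates from real data.
rng = np.random.default_rng(0)
returns = simulate_returns(n_paths=1000, n_weeks=156, mean=0.02, std=0.03, rng=rng)
rates = np.linspace(0.01, 0.20, 20)  # strictly positive, so every NPV is > 0
best_rate, scores = best_withdrawal_rate(returns, rates, weekly_discount=0.998)
print(f"best weekly withdrawal rate on this grid: {best_rate:.2%}")
```

Because the leftover stake is not counted, a short horizon will push this sketch toward higher withdrawal rates; adding a terminal value for the remaining stake or lengthening the horizon would be natural refinements.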
For step one I want to figure out what type of probability distribution to sample from to simulate expected future returns. I can pull several participants' scores from the API, pool them together, and compare them to different types of probability distributions. Here is my current testing regarding what type of distribution to use:
import numerapi
import pandas as pd
import numpy as np
from scipy import stats
#returns_distribution_test
api = numerapi.NumerAPI()
models = ['integration_test_7','no_formal_training','sorios','themicon','arbitrage','uuazed','niam','cryptoquant','no_formal_agreement','leverage','uuazed2','bookofillusions','era__mix__2000']
all_corrs = []
for model in models:
    # Keep the last daily score of each round for this model
    user_df = pd.DataFrame(api.daily_submissions_performances(model)).groupby("roundNumber").last()
    # Drop rounds without a correlation score and pool the rest
    user_corrs = user_df.loc[:, "correlation"].dropna().values
    all_corrs = np.concatenate([all_corrs, user_corrs])
mean = np.mean(all_corrs)
std = np.std(all_corrs)
random_gauss = np.random.normal(mean,std,len(all_corrs))
random_lognorm = np.random.lognormal(np.log((mean**2)/(np.sqrt(mean**2+std**2))),np.sqrt(np.log(1+(std**2)/(mean**2))),len(all_corrs))
random_laplace = np.random.laplace(mean,2**-0.5*std,len(all_corrs))
print('Gaussian distribution similarity scores')
print(stats.ks_2samp(all_corrs, random_gauss))
print('log-normal distribution similarity scores')
print(stats.ks_2samp(all_corrs, random_lognorm))
print('laplacian distribution similarity scores')
print(stats.ks_2samp(all_corrs, random_laplace))
Here I use the two-sample Kolmogorov-Smirnov (KS) statistic to compare how similar the pooled returns from several models are to random samples from different distributions that all have the same mean and standard deviation. The output from this code is:
Gaussian distribution similarity scores
Ks_2sampResult(statistic=0.03429602888086647, pvalue=0.5255367523651229)
log-normal distribution similarity scores
Ks_2sampResult(statistic=0.2635379061371841, pvalue=3.355471473673669e-34)
laplacian distribution similarity scores
Ks_2sampResult(statistic=0.07490974729241878, pvalue=0.0037335231882827612)
The output shows that the Gaussian distribution is by far the most similar to the distribution of participants' round correlations (used here as a proxy for payouts): it has the smallest KS statistic (0.034) and is the only candidate the test does not reject (p ≈ 0.53). This is a little surprising, as I would have expected the ‘heavy-tailed’ Laplacian distribution to be closer. Before I continue I would appreciate some feedback, both on my strategy and on my current testing.
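Two quick follow-up checks that might help with the Gaussian versus Laplacian question, sketched on synthetic data since the real scores come from the API. The first is excess kurtosis, which is 0 for a Gaussian and 3 for a Laplace distribution, so it measures tail weight directly. The second runs the KS test against the analytic CDFs instead of against a random sample, which removes the sampling noise of the comparison draw. The helper name `tail_and_fit_checks` is hypothetical, and because the parameters are estimated from the same data, the p-values will tend to be optimistic.

```python
import numpy as np
from scipy import stats

def tail_and_fit_checks(sample):
    """Excess kurtosis plus one-sample KS tests against moment-matched CDFs."""
    sample = np.asarray(sample, dtype=float)
    mean, std = sample.mean(), sample.std()
    return {
        # 0 for a Gaussian, 3 for a Laplace distribution
        "excess_kurtosis": float(stats.kurtosis(sample, fisher=True, bias=False)),
        "ks_gaussian": stats.kstest(sample, "norm", args=(mean, std)),
        # Laplace variance is 2 * scale**2, so scale = std / sqrt(2)
        "ks_laplace": stats.kstest(sample, "laplace", args=(mean, std / np.sqrt(2))),
    }

# Synthetic stand-in for all_corrs; the 0.02 and 0.03 are placeholder values.
rng = np.random.default_rng(0)
fake_corrs = rng.normal(0.02, 0.03, size=500)
checks = tail_and_fit_checks(fake_corrs)
print("excess kurtosis:", checks["excess_kurtosis"])
print("KS vs Gaussian:", checks["ks_gaussian"].statistic, checks["ks_gaussian"].pvalue)
print("KS vs Laplace:", checks["ks_laplace"].statistic, checks["ks_laplace"].pvalue)
```

Passing the pooled `all_corrs` array in place of `fake_corrs` would run the same checks on the real scores.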
Thanks in advance.
**edit fixed formatting