Modern Portfolio Theory for my models

Hi,

I’ve built a good number of models in recent months and some of them are promising.
So I was wondering how to stake these models to get good results with low volatility.
What weights should I give them?

Apparently modern portfolio theory is exactly for that.
Check out the math here: Modern portfolio theory - Wikipedia
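To summarize the math in a line (my paraphrase of that article, not something from the notebook itself): for model weights $w$, with $\mu$ the expected per-round returns and $\Sigma$ their covariance matrix, the portfolio is chosen by minimizing

$$ w^\top \Sigma\, w \;-\; q\,\mu^\top w $$

where $q$ is a risk-tolerance parameter: $q = 0$ gives the minimum-variance mix, and a larger $q$ tilts the weights toward higher expected return at the cost of more volatility. This is the same objective the code further down this thread optimizes.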

I also implemented it to check how I should split stake (or weight in an ensemble) between two models.
You can run the code on your own models too: numerai/Portfolio optimalization.ipynb at main · nemethpeti/numerai · GitHub

Just replace the model names and multipliers with your own.

Have fun!
Feedback is welcome!


Scary how I was thinking earlier this week about how to optimize my model returns, and then you create this post. I’ll definitely give it a try if I find some time on the weekend.

So I couldn’t wait: because of my research earlier this week I realised I already had most of what I needed, and curiosity became too great.

Your code only works for two models, which I didn’t like, so I read the wiki article you posted and quickly cobbled together some code that does the same as yours but for any number of models. I don’t have shareable code for collecting the necessary model performance data, but I wanted to share the optimization part. Feel free to update your GitHub repo with this, as I don’t have a GitHub account yet.

So suppose you have a representation of your own model performances in a pandas dataframe that looks like the result generated by this code:

import numpy as np
import pandas as pd

models = ["modelA", "modelB", "modelC"]

#fake per-round CORR and TC scores for 20 rounds
fake_tcs = np.random.normal(0., 0.08, (20, len(models)))
fake_corrs = np.random.normal(0.02, 0.02, (20, len(models)))
corr_columns = ["corr_" + m for m in models]
tc_columns = ["tc_" + m for m in models]

fake_df = pd.DataFrame(
    np.concatenate(
        [fake_corrs,fake_tcs],
        axis=1
    ),
    columns = corr_columns + tc_columns
)

fake_df["round"] = np.arange(20) + 312

df = fake_df

Dummy dataframe (screenshot omitted; see the snippet below for the column layout):
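If you want to see the layout without a screenshot, just print the frame; the columns are corr_<model> and tc_<model> for every model, plus a round column:

print(df.columns.tolist())
print(df.head())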

import pandas as pd
from scipy.optimize import minimize
import numpy as np

#Your desired multipliers for CORR and TC
corr_multiplier = 1.
tc_multiplier = 0.5

#calculate the per-round return of each model:
#corr * corr_multiplier + tc * tc_multiplier
mpo_data = pd.DataFrame({
    "return_" + m: df["corr_" + m] * corr_multiplier + df["tc_" + m] * tc_multiplier
    for m in models
})

def mpo_function(x, Rcov, q, R):
    #MPT objective: portfolio variance minus q times expected return
    return x @ Rcov @ x - q * (R @ x)

#covariance matrix of the returns
Rcov = mpo_data.cov().values

#risk tolerance factor, if too high optimization will fail
q = 0.01

#total return of each model over all rounds
R = mpo_data.sum(axis=0).values

result = minimize(
    mpo_function,
    args=(Rcov, q, R),
    x0=np.ones(len(models)),
    method='Nelder-Mead'
)
print("Minimization successful:", res.success)

for w, m in zip(result.x, models):
    print(m, w)
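One caveat: as written the weights aren’t constrained, so the optimizer is free to return negative weights or weights that don’t sum to 1. If you want to read the result directly as stake fractions, a constrained variant is one option (just a sketch, reusing mpo_function, Rcov, q and R from above):

#optional: force nonnegative weights that sum to 1, so they map to stake fractions
constraints = {"type": "eq", "fun": lambda x: np.sum(x) - 1.0}
bounds = [(0.0, 1.0)] * len(models)

constrained = minimize(
    mpo_function,
    args=(Rcov, q, R),
    x0=np.ones(len(models)) / len(models),
    method='SLSQP',  #SLSQP handles bounds and equality constraints
    bounds=bounds,
    constraints=constraints
)

for w, m in zip(constrained.x, models):
    print(m, w)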

Have fun with it, feel free to share / modify / improve, but don’t hold me liable if you lose money because I made a mistake :grin:

Awesome, thanks!

So I made some tweaks and merged them into my code, so it works for many models now.
Here you can find the generic version:


Super interesting approach @nyuton! I’m wondering what time window (from starting_round to end_round) would be needed for the mean and std to be reliable.


3 months at least, I guess. I don’t have any calculations though.
But you need those 3 months anyway to validate the model in production.
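If you want more than a gut feeling, one way to check (a sketch, assuming the mpo_data frame from the code above) is to recompute the statistics over windows of increasing length and see where they stop drifting:

#how stable are the per-model mean and std as the window grows?
for window in range(5, len(mpo_data) + 1, 5):
    recent = mpo_data.tail(window)
    print(window, "rounds")
    print("  mean:", recent.mean().round(4).to_dict())
    print("  std: ", recent.std().round(4).to_dict())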

2 Likes