CWMM lower than numerai computation

eleven_sigma · April 13, 2024, 11:57am

I’m using this function to compute CWMM:

library(data.table)

# Per-era correlation between predictions and the meta model (mm).
cwmm <- function(mm, preds, era) {
  pred_dt <- data.table('era' = era, 'pred' = preds, 'mm' = mm)
  # Rank within each era, scale ranks into (0, 1), then gaussianize.
  pred_dt[, preds_ranked_gauss := qnorm((rank(pred, na.last = 'keep') - 0.5) / .N), by = .(era)]
  # Signed power of 1.5.
  pred_dt[, preds_ranked_gauss_pot := sign(preds_ranked_gauss) * abs(preds_ranked_gauss)^1.5]
  # Same two transforms applied to the meta model.
  pred_dt[, mm_ranked_gauss := qnorm((rank(mm, na.last = 'keep') - 0.5) / .N), by = .(era)]
  pred_dt[, mm_ranked_gauss_pot := sign(mm_ranked_gauss) * abs(mm_ranked_gauss)^1.5]
  # Pearson correlation per era, without and with the power transform.
  corr_dt <- pred_dt[, .(CWMM = cor(mm_ranked_gauss, preds_ranked_gauss, method = 'pearson'),
                         CWMM_pot = cor(mm_ranked_gauss_pot, preds_ranked_gauss_pot, method = 'pearson')), by = .(era)]
  return(corr_dt)
}
> cwmm(mm, preds, era)
     era      CWMM  CWMM_pot
   <int>     <num>     <num>
1:  1100 0.8589433 0.8400368
2:  1101 0.8624765 0.8466103
3:  1102 0.8651279 0.8510147
4:  1103 0.8685777 0.8562474
5:  1104 0.8814365 0.8703542

The value shown for this model in Numerai's CWMM column is always greater than 0.93.

Do you know where the problem is? Can you reproduce CWMM?
The numerai-tools repository on GitHub doesn't include a script for computing CWMM.

eleven_sigma · April 13, 2024, 8:08pm

I’ve tried:
ranked pred vs ranked mm
ranked gauss pred vs ranked gauss mm
ranked gauss pot 1.5 pred vs ranked gauss pot 1.5 mm
All of these combinations give me CWMM < 0.90, but Numerai reports values in the 0.92-0.93 range.
Has anyone managed to reproduce the calculation?

ark · April 22, 2024, 9:53pm

This is the code we use for it:

    predictions = tie_kept_rank__gaussianize__pow_1_5(predictions)
    scores = predictions.apply(lambda sub: pearson_correlation(sub, meta_model))

eleven_sigma · June 13, 2024, 7:59am

It wasn’t trivial to find the correct combination: predictions are ranked, gaussianized, and raised to the power 1.5, but the meta model is only ranked.
I'm curious why different postprocessing is applied to the meta model and the predictions before computing CWMM. I would at least expect the meta model to be raised to the power 1.5 as well, as we do with the target in CORRV2.
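
For anyone else trying to reproduce this, here is a minimal single-era sketch in plain Python of the combination described above. It is not Numerai's code: the function names are my own, and I am assuming that ties get their average rank, that ranks are scaled as (rank - 0.5) / n, and that the meta model is only ranked (as found by trial above). Only the standard library is used.

```python
from math import copysign, sqrt
from statistics import NormalDist, mean


def tie_kept_rank(values):
    """Average-rank ties, then scale ranks into (0, 1) as (rank - 0.5) / n.
    Assumption: this mirrors the 'tie kept' ranking named in the thread."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [(r - 0.5) / n for r in ranks]


def gaussianize(probs):
    """Map values in (0, 1) through the standard normal inverse CDF."""
    nd = NormalDist()
    return [nd.inv_cdf(p) for p in probs]


def pow_1_5(zs):
    """Signed power: sign(z) * |z| ** 1.5."""
    return [copysign(abs(z) ** 1.5, z) for z in zs]


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cwmm_one_era(preds, meta_model):
    """Predictions: rank -> gaussianize -> pow 1.5. Meta model: rank only (assumed)."""
    p = pow_1_5(gaussianize(tie_kept_rank(preds)))
    m = tie_kept_rank(meta_model)
    return pearson(p, m)
```

Call `cwmm_one_era` once per era with that era's predictions and meta model values, then compare the per-era scores against the CWMM column.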
