This post marks the end of my first experiment period on numer.ai. As the title suggests, I’m aiming to do something like this approximately quarterly. This time was more exploration than experiment. I’ve talked on Daily Scores and Chill about my approach to modeling and thought it might be helpful to put it in writing, given that I have had some success with MMC. After that, I will discuss the experiment itself. Spoiler alert: I made some mistakes, so interpret at your own peril! Feel free to reach out to me on rocketchat, where my username is ‘aelizzybeth’.
I believe that taking creative approaches to modeling will help generate that sweet, sweet MMC. Creativity certainly isn’t all that is required, but I think there is a lot of MMC opportunity in it.
When I started competing on numerai about 8 months ago, after surveying some forum posts and rocketchat, I felt not much attention was being paid to feature engineering by modelers in the classic tournament. Of course, many standard models do some feature engineering on their own. Still, I felt there was ample room to find signal here, so I tapped into my generative art roots and designed a generative approach. I tested random forest, gbm, and different kinds of transformations, as well as including or excluding the original variables. Unfortunately, I didn’t take good notes at the time. The resulting process is summarized below (I implemented this in R, fwiw).
- set a seed to make the randomized portions of the process reproducible
- transform the features of the dataset so there are no zeroes (I didn’t bother transforming the target)
- generate 1000 unique pairs of indices from 1 to 310, corresponding to the features
- for each generated pair (A, B), derive a new column equal to the logarithm of feature A in base feature B
- train a gbm on the new dataset (original features + 1000 engineered features)
Then, when predicting, apply the same transformation and the same pairs to the new data and predict; that’s it. A hedged sketch of the whole pipeline follows.
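This is a minimal sketch of the pipeline, not my exact code: the shift value (+2), the seed, and the gbm parameters are assumptions, and `train`, `feature_cols`, and `target` are hypothetical placeholder names.

```r
# Minimal sketch of the generative feature pipeline (assumptions noted above)
library(gbm)

set.seed(1)  # reproducible pair generation

# 1. shift features so every value is a valid log argument and log base
#    (no zeroes, and no base of exactly 1); the +2 shift is an assumption
shifted <- as.matrix(train[, feature_cols]) + 2

# 2. sample 1000 unique (A, B) index pairs over the 310 features
all_pairs <- expand.grid(A = 1:310, B = 1:310)
all_pairs <- all_pairs[all_pairs$A != all_pairs$B, ]
pairs <- all_pairs[sample(nrow(all_pairs), 1000), ]

# 3. one engineered column per pair: log base B of A
engineered <- mapply(function(a, b) log(shifted[, a]) / log(shifted[, b]),
                     pairs$A, pairs$B)
colnames(engineered) <- paste0("gen_", seq_len(1000))

# 4. train a gbm on original + engineered features
model_df <- data.frame(train[, feature_cols], engineered,
                       target = train$target)
fit <- gbm(target ~ ., data = model_df, distribution = "gaussian",
           n.trees = 500)

# at prediction time: apply the same shift and the same pairs to new data,
# then call predict(fit, newdata, n.trees = 500)
```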
Feature Neutralization Experiment
From R244 to R256 I carried out a 13-round experiment. I wanted to see how various degrees of feature neutralization, applied to the model described above, would affect performance. I used a method adapted from a post by wigglemuse; a hedged sketch of the idea follows below.
(Text about misapplying FN deleted. Upon another code review, I realized I did it right; sorry for the confusion.) For reference, I stake Corr + 2x MMC on all models.
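For readers unfamiliar with the technique, here is one common formulation of per-era feature neutralization, sketched in R. This is not necessarily wigglemuse’s exact method, and `df`, `pred`, `era`, and `feature_cols` are hypothetical names: subtract some proportion of the predictions’ linear projection onto the features.

```r
# Sketch of feature neutralization; `proportion` sets the degree
# (0 = no neutralization, 1 = fully neutral to linear feature exposure)
neutralize <- function(preds, features, proportion = 1.0) {
  X <- as.matrix(features)
  # linear exposure: projection of preds onto the feature columns (OLS fit)
  exposure <- lm.fit(x = X, y = preds)$fitted.values
  preds - proportion * exposure
}

# applied era by era (df, pred, era, feature_cols are placeholder names)
df$neutral_pred <- NA_real_
for (idx in split(seq_len(nrow(df)), df$era)) {
  df$neutral_pred[idx] <-
    neutralize(df$pred[idx], df[idx, feature_cols], proportion = 0.5)
}
```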
Below are box-and-whisker plots of Corr, MMC, and FNC.
Please note that the y-axis differs for each plot.
Generally, Corr and MMC suffered the more FN was applied during this 13-round period. I found the FNC results most interesting. Though I don’t present them here, it was interesting to see that in the rounds where Urza wasn’t the top performer, the order of Corr performance tended to be flipped. I’ll paste the data into a comment if anyone is interested (though it can also be found at the links provided). I would have uploaded the Excel spreadsheet I used to write this, but now I am tired and not sure how. I’m glad to take questions other than “what are your model parameters?”.
Thoughts for next experiment period
I might get this going right away, or it might take a few weeks, but I want to try a few things:
- less overfitting
- xgboost (gonna skip this next experiment cycle; slots are mapped out already)
- more generated features
- different sets of generated features
- automated feature selection (forgot to add at first)
- FN relative to the original data only (forgot to add at first) [EDIT: I actually did it right the first time, but I am now interested, so I will experiment with full-FN vs. original-FN as part of my next suite of experimental models]