What do the top models do that lesser models don’t?

Serious question; I don’t need detail. I’m doing a write-up and have to compare my model, idling along in the top 30, to those above and around me. In terms of correlation, the top models are obviously miles ahead. What’s the secret? Without actually giving the secret away…

1 Like

What’s the secret for getting to around top 30? Without giving the secret? :upside_down_face: Serious questions though, I’m still in the hundreds.

8 Likes

What’s the secret for getting to the hundreds? Without giving the secret? :upside_down_face: Serious questions though, I’m still in the thousands.

10 Likes

What’s the secret for staying in the top 100? :slight_smile: I managed that before but not for very long lol…

5 Likes

Luck… and crossing fingers that other models perform less well than mine… :smiley:

3 Likes

I suspect the top models have found unique ways to incorporate eras/time into their models. The stock market is a non-stationary system, so any model that doesn’t find a way to include time as a factor is unlikely to generalize well. It’s hard to do with Numerai data because of the encryption, but I’ve found a few innovative ways to use eras. Those have been my best-performing models.
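
The specific era tricks aren’t shared here, so the following is only a generic illustration of treating eras as time: a walk-forward split where every validation era comes after the training eras. It is a minimal sketch that assumes eras are integers that sort chronologically; `walk_forward_era_splits`, `n_splits` and `gap` are hypothetical names, not anything from Numerai’s tooling.

```python
import numpy as np

def walk_forward_era_splits(eras, n_splits=3, gap=0):
    """Yield (train_idx, valid_idx) row indices with validation eras after training eras.

    eras: 1-D array with one (sortable) era label per row.
    gap:  number of most recent training eras to drop before each validation block.
    """
    eras = np.asarray(eras)
    unique_eras = np.unique(eras)                      # sorted ascending
    folds = np.array_split(unique_eras, n_splits + 1)  # consecutive era blocks
    for i in range(1, n_splits + 1):
        train_eras = np.concatenate(folds[:i])
        train_eras = train_eras[: len(train_eras) - gap]
        valid_eras = folds[i]
        train_idx = np.flatnonzero(np.isin(eras, train_eras))
        valid_idx = np.flatnonzero(np.isin(eras, valid_eras))
        yield train_idx, valid_idx

# Toy data: 8 eras with 5 rows each (made up for illustration).
eras = np.repeat(np.arange(1, 9), 5)
splits = list(walk_forward_era_splits(eras, n_splits=3, gap=1))
for train_idx, valid_idx in splits:
    print(sorted(set(eras[train_idx])), "->", sorted(set(eras[valid_idx])))
```

Scoring a model only on eras it has never seen, and that come later in time, gives a rough feel for how much a non-stationary market hurts it.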

6 Likes

Care to share? Without giving too much away…

1 Like

Now 1st for corr, 3rd for mmc

I’m curious why you continue to post how amazing your model is doing, both in this thread and this one too.

There’s a leaderboard for that information; we don’t need it replicated in the forum multiple times. It muddies interesting discussions about the value of diagnostics and what approaches the best models are using. I keep coming back to these threads when new posts appear, hoping there might be some content. Alas, it’s mostly just you flexing.

Now that you’ve hit the very top of the leaderboard, care to answer your own question?

2 Likes

Sure. I’ll have a crack at that. The model diagnostics were crap, as I’ve posted previously. My model is not particularly sophisticated and I’m puzzled as to how it got where it did. The posts chart my puzzlement. Will that do you? Actually, given your tone, I don’t care if it does you or not.

As for flexing, well, it’s not that, though I am pleased. This is one of several models I’d knocked up for an A level project. My theory is it’s not a good model, it just doesn’t perform as badly as some others.

Own question answered.

1 Like

haha, is this yours?

https://numer.ai/model3_tres_optimism/submissions

Congrats. If only you had actually staked something on it, that would have been pretty sweet the last few weeks

1 Like

Yeah, penniless student so not likely… though I have someone else staking it.

You can borrow, no? Your return is much better than a credit card’s interest even

I’m sure you’ll make a ton of money soon after school if you are messing around with boosted NNs here

Boosting all features or sets of features, or boosting one or two features at a time like a GAM?

Not at my age sadly. Paying for uni is the dream that ain’t gonna happen.

Feature sets by era.

2 Likes

Sometimes it works, sometimes it doesn’t. At the mo I think the conditions are spot on for the model.

Hope the good results continue! If not, then there are improvements to find :slight_smile:

1 Like

Thanks :blush: I’m gonna be giving this up in the near future. Exams coming up… ugh… and I got to write this whole thing up. It’s proving to be harder than I thought.

Sure, I think everyone here already knows that you can’t convert Numerai’s dataset to stationary data due to the encryption. However, that doesn’t mean you should ignore the fact that all stock market data is time series data.

I’ve found a few ways to use sample weights to find the features most applicable to eras “similar” to the live era. Obviously, since it’s time series data usually the eras with the highest weight end up being the most recent eras. But I have found some interesting results where past eras end up with higher weights than the most recent eras.

I’m not saying to just blindly assign weights to eras, but it can be an effective way to identify the most useful features/targets (since the new data is multi-target).
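
To make the era-weighting idea a bit more concrete, here is a minimal sketch under loud assumptions. The actual similarity measure isn’t stated above, so this one is made up for illustration: each era is summarised by its feature-feature correlations (no target needed, so it also works for the live era), and training eras are weighted by a softmax over cosine similarity to the live era. `era_signature`, `era_weights` and `temperature` are hypothetical names, and the data is synthetic.

```python
import numpy as np

def era_signature(X):
    """Summarise one era by the upper triangle of its feature correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    rows, cols = np.triu_indices_from(corr, k=1)
    return corr[rows, cols]

def era_weights(train_eras, live_X, temperature=0.1):
    """Return {era: weight}; weights sum to 1 and favour eras that look like live_X.

    Assumption: "similar" means high cosine similarity between era signatures.
    A lower temperature concentrates the weight on the closest eras.
    """
    live_sig = era_signature(live_X)
    names, sims = [], []
    for name, X in train_eras.items():
        sig = era_signature(X)
        cos = sig @ live_sig / (np.linalg.norm(sig) * np.linalg.norm(live_sig))
        names.append(name)
        sims.append(cos)
    scaled = np.array(sims) / temperature
    w = np.exp(scaled - scaled.max())   # numerically stable softmax
    w /= w.sum()
    return dict(zip(names, w))

# Synthetic eras: three features, with features 0 and 1 correlated by rho.
rng = np.random.default_rng(0)

def make_era(rho, n=500):
    cov = [[1.0, rho, 0.0], [rho, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return rng.multivariate_normal(np.zeros(3), cov, size=n)

train_eras = {"era_a": make_era(0.8), "era_b": make_era(-0.8)}
live_X = make_era(0.8)
weights = era_weights(train_eras, live_X)
print(weights)  # era_a should get most of the weight
```

The per-era weights can then be broadcast to rows and passed as `sample_weight` to estimators whose `fit` accepts it, which makes it easy to check whether a recent era or an older one ends up dominating.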

2 Likes