Rapid Evaluation of Models from Resolving Scores (REMRS)

nmrdragon · August 6, 2023, 10:31pm

Model evaluation is important as we want to know whether stake on a model or whether to further improve it. Sadly, daily scores don’t give us much of an idea about the model quality. Diagnostics only diagnose with past data and sometimes even that is extremely computationally intensive.

Here I am proposing a method to use multiple daily scores to estimate future average final scores.
First I downloaded daily resolving scores of many models. For every day from 1 to 19 I calculated standard deviation. Than I gave the algorithm my resolving scores. Knowing the standard deviations and my scores I calculated a million possible resolving scores for each day. Then I averaged them to calculate the most likely future average score. Also averaging the values across days I calculated a standard deviation of million possible average scores. A chose a confidence interval to calculate interval of future scores from the most likely score and a margin of error.

I am not as bright as many of you so it’s possible that this approach is very wrong. Also bugs in my code are possible. Thanks jrAI for sharing his smart code for fetching daily resolving scores from past rounds.

You can try the app here:
REMRS

Topic		Replies	Views
Interpreting Model Diagnostics Data Science	0	762	March 30, 2021
1 Year Reputation Tournament	0	825	March 3, 2023
Does Good Model Diagnostics Correlate with Tournament Performance? Data Science	13	3005	February 7, 2021
Time ammount to solve tournament Tournament	5	816	June 12, 2021
Feature request: (virtual) return on NMR stake today Feedback	9	1190	May 16, 2021

Rapid Evaluation of Models from Resolving Scores (REMRS)

Related topics