The real problem with TC is Optimizer Misalignment

Regarding the question of whether to normalize the meta-model, there is another point I forgot to make: normalization destroys the component of the gradient vector that lies along the direction of the meta-model. Because the meta-model is normalized, it doesn't matter whether your model agrees with the meta-model when the meta-model is correct and disagrees when it is wrong, or vice versa; only the component of your prediction orthogonal to the meta-model can affect your TC. This likely accounts for some of the noisiness we see in TC, as well as some of the cases where models that outperform the meta-model in corr end up with negative TC. It really would be best to drop normalization entirely, despite what a big change that would entail.
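A minimal sketch of why this happens (plain NumPy, not the actual TC pipeline, and the arrays `m`, `p`, and `g` are random stand-ins rather than real Numerai data): the Jacobian of L2 normalization at the meta-model `m` is the projector `(I - m_hat m_hat^T) / ||m||`, so any gradient flowing back through the normalized meta-model has its component along `m` removed, and only the meta-model-orthogonal part of a prediction can move the score.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
m = rng.normal(size=n)   # stand-in for the stake-weighted meta-model
p = rng.normal(size=n)   # stand-in for one model's predictions
g = rng.normal(size=n)   # stand-in for the upstream gradient (e.g. d(returns)/d(meta-model))

def normalize(x):
    return x / np.linalg.norm(x)

# Jacobian-vector product of normalize() at m, applied to p,
# estimated by central finite differences:
eps = 1e-6
jvp = (normalize(m + eps * p) - normalize(m - eps * p)) / (2 * eps)

# Analytically, the Jacobian is (I - m_hat m_hat^T) / ||m||: a projector
# that strips out the component of p parallel to the meta-model.
m_hat = normalize(m)
p_orth = p - (p @ m_hat) * m_hat
analytic = p_orth / np.linalg.norm(m)

print(np.allclose(jvp, analytic, atol=1e-8))   # True

# Consequence: the sensitivity of any downstream score g @ normalize(.)
# to mixing in p depends only on p's meta-model-orthogonal component.
g_orth = g - (g @ m_hat) * m_hat
print(np.isclose(g @ jvp, g_orth @ p / np.linalg.norm(m)))   # True
```

In other words, whatever your model contributes along the meta-model direction is projected away before it can influence the gradient-based score, which is consistent with the noisiness and the negative-TC-despite-good-corr cases described above.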

Fixing optimizer misalignment allows us to drop meta-model normalization. Dropping meta-model normalization gives us more consistent TC scores. More consistent TC scores train the meta-model more efficiently.
