Hi,

I had a thought experiment about the TC metric. How is it possible to have every other metric above the 79th percentile and still have TC below the 2nd percentile? Could it be that TC penalizes uniqueness? I would like to discuss whether my thought process is correct. I think a metric like TC is needed; however, I don't think penalizing uniqueness is good.

*Disclaimer: I don’t know if I understand correctly how TC works.*

##### Real-world example: Round 323 | model: MINMAX2 | Jul 16

```
metric     | CORR   | MMC    | FNC    | FNCV3  | TC
val        | 0.0241 | 0.0135 | 0.0257 | 0.0146 | -0.07594
percentile | 84.8   | 86.7   | 92.3   | 79.9   | 1.8
```

### Does TC penalize uniqueness?

#### Setup

Let’s have:

Y = [ 1 , 2, 3, 4, 5, 6, 7, 8, 9, 10] ← Ground truth rank

A = [**10**, **9**, 3, 4, 5, 6, 7, 8, **1**, **2** ] ← Average model rank

U = [ 1 , **9**, **8**, 4, 5, 6, 7, **3**, **2**, 10] ← Unique model rank

U has a much better correlation with the ground truth than A:

spearman(A,Y) =~ **-0.56**

spearman(U,Y) =~ **0.1**
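These two correlations are easy to check with a short script (using `scipy.stats.spearmanr`; the rounding here is to three decimals instead of two):

```python
from scipy.stats import spearmanr

Y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # ground truth rank
A = [10, 9, 3, 4, 5, 6, 7, 8, 1, 2]  # average model rank
U = [1, 9, 8, 4, 5, 6, 7, 3, 2, 10]  # unique model rank

rA, _ = spearmanr(A, Y)  # correlation of the average model
rU, _ = spearmanr(U, Y)  # correlation of the unique model
print(round(rA, 3), round(rU, 3))  # -0.564 0.103
```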

#### Metamodels

*Now, if I understand it correctly, TC is essentially a gradient: the change in the metamodel's score when the measured model is added to (or removed from) the metamodel.*

So, let’s assume metamodels Mu and M, where M is Mu without U:

M = [35.0, 31.5, 10.5, 14.0, 17.5, 21.0, 24.5, 28.0, 3.5, 7.0] = 3.5 * A

Mu = [36.0, 40.5, 18.5, 18.0, 22.5, 27.0, 31.5, 31.0, 5.5, 17.0] = 3.5 * A + U

Mr = [10, 9, 3, 4, 5, 6, 7, 8, 1, 2] ← M rank (identical to A's ranking, i.e. unchanged)

Mur = [**9**, **10**, **4**, **3**, 5, 6, **8**, **7**, 1, 2] ← Mu rank

sMr = spearman(Mr,Y) =~ **-0.564**

sMur = spearman(Mur,Y) =~ **-0.576**

**gradient = sMur - sMr =~ -0.012 =? TC**
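The whole gradient computation can be reproduced with a short sketch (using `scipy.stats.rankdata` to stand in for the ranking step; the numbers match the example above):

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

Y = np.arange(1, 11)                           # ground truth rank
A = np.array([10, 9, 3, 4, 5, 6, 7, 8, 1, 2])  # average model rank
U = np.array([1, 9, 8, 4, 5, 6, 7, 3, 2, 10])  # unique model rank

M = 3.5 * A   # metamodel without U
Mu = M + U    # metamodel with U added

Mr = rankdata(M)    # same ordering as A
Mur = rankdata(Mu)  # [9, 10, 4, 3, 5, 6, 8, 7, 1, 2]

sMr, _ = spearmanr(Mr, Y)
sMur, _ = spearmanr(Mur, Y)
print(round(sMr, 3), round(sMur, 3), round(sMur - sMr, 3))  # -0.564 -0.576 -0.012
```

Note that Spearman correlation is invariant under ranking, so `spearmanr(Mu, Y)` would give the same result as `spearmanr(Mur, Y)`; the explicit ranking is kept only to mirror the example.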

#### Conclusion

*Adding U's rank to M makes the metamodel M worse. Why? Because U had enough strength to flip 4 and 3, and 8 and 7, which worsens the ranking, while fixing the order of 10 and 9 is not enough to compensate. Could this situation happen with the real TC?*