Target orthogonality

steelyglint · July 21, 2023, 11:14pm

I find this mathematically quite beautiful and wanted to share it.

The plot shows the almost complete orthogonality of two large clusters of targets. The actual ‘target’ is in one of the clusters; I haven’t looked yet, but I’m guessing it’s in the larger of the two.

Why the decay at the North and South corners, and not the East and West corners? That’s quite informative.

Look carefully and you can see striping at 0.25, 0.5, and 0.75, both SW to NE and NW to SE. Colour depth indicates an anomaly score (light = low).

Look at the target vectors; in each orthogonal direction there are two groups: dominant (longer) and less dominant (shorter) vectors.

I’m not going to say much about how this is derived, except to say that it’s a low-rank but accurate representation of the full-rank data.

I think there are lots of opportunities here for segmenting and ensembling.

steelyglint · July 23, 2023, 11:07pm

Follow-up with the same analysis for features. (Every 4th era, 412789 rows, 1586 cols)

Labelling only the dominant (longer) feature vectors, there looks to be some interesting grouping for feature selection. Again, the colour depth is an anomaly score, but it all looks very normally distributed (nice-looking Tukey box plots, not shown).

steelyglint · July 23, 2023, 11:59pm

A more detailed, side-by-side display showing the associated scree plots and box plots; features on the left, targets on the right.

steelyglint · July 25, 2023, 1:33pm

You can see the relationships between the targets nicely in this more conventional correlation cluster map; they’re a bit obscured in the biplot.

Doing separate decompositions for the target variants shows the dominant targets (arthur, alan, janet) and their orthogonal relation to the rest.
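
For anyone wanting to reproduce a correlation cluster map like the one mentioned above, here is a minimal sketch. It is not the code behind the plot in this thread: it assumes plain Pearson correlation with average-linkage clustering on the distance 1 - correlation, and the column names (`t20_a`, `t60_a`, ...) and synthetic data are hypothetical stand-ins for real target columns.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform


def clustered_corr(X, names):
    """Reorder a correlation matrix so that similar columns sit next to each other."""
    corr = np.corrcoef(X, rowvar=False)
    # Turn correlation into a distance; highly correlated columns end up close.
    dist = 1.0 - corr
    dist = (dist + dist.T) / 2.0          # remove tiny floating-point asymmetries
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    order = leaves_list(Z)                # leaf order of the dendrogram
    return corr[np.ix_(order, order)], [names[i] for i in order]


# Synthetic stand-in: two independent latent factors, two noisy columns each.
# Columns are deliberately interleaved so the reordering has something to do.
rng = np.random.default_rng(0)
factors = rng.standard_normal((500, 2))
W = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
X = factors @ W + 0.5 * rng.standard_normal((500, 4))
names = ["t20_a", "t60_a", "t20_b", "t60_b"]  # hypothetical names

reordered, ordered_names = clustered_corr(X, names)
print(ordered_names)
print(np.round(reordered, 2))
```

The reordered matrix can be passed to any heatmap routine; the block structure along the diagonal is what shows the target groups.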

wigglemuse · July 25, 2023, 8:56pm

It makes sense that the 20-day targets and the 60-day targets would each be clustered together, although I wouldn’t really expect them to be super orthogonal to each other, since they are the same targets just farther out.

steelyglint · July 25, 2023, 9:44pm

Yes, it’s important to realise that the extreme orthogonality is in the low-rank (rank-2) approximation, but also that this is the overwhelmingly dominant subspace. There is more beyond rank 2; we can either compute the relations in ever more inclusive dimensions (all the way up to exact full rank), or visualise the more subtle relations as they come out in subsets, as in the last plot.

What we see in the biplot is an extraction of the most dominant relationships.
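
To make "computing the relations in ever more inclusive dimensions" concrete, here is a rough sketch. It assumes the biplot vectors come from an SVD of the column-centred data, which this thread does not confirm, and the function name `loading_cosines` and the synthetic data are hypothetical. It computes the cosine between variable vectors using only the first k components; with all components kept, these cosines equal the ordinary Pearson correlations.

```python
import numpy as np


def loading_cosines(X, k):
    """Cosines between column (variable) vectors in the rank-k SVD subspace."""
    Xc = X - X.mean(axis=0)                       # centre each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # One row per variable: its coordinates on the first k components,
    # scaled by the singular values (the arrows drawn in a biplot, up to scaling).
    L = Vt[:k].T * s[:k]
    L = L / np.linalg.norm(L, axis=1, keepdims=True)
    return L @ L.T


# Synthetic stand-in for two groups of targets (not the tournament data).
rng = np.random.default_rng(0)
factors = rng.standard_normal((500, 2))
W = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
X = factors @ W + 0.5 * rng.standard_normal((500, 4))

for k in (2, 3, 4):
    print(f"rank {k}")
    print(np.round(loading_cosines(X, k), 2))
```

Comparing the rank-2 output against the full-rank output shows how much of the apparent angle between two vectors is a property of the truncation rather than of the data.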

wigglemuse · July 25, 2023, 10:14pm

Aha. Then the orthogonality makes sense also when looking at LR2.
