As everyone knows, there are 6 groups of features in the dataset, and I've always thought there must be some reason behind that.
The following are the avenues I've explored so far:
Train models on a single feature group (e.g. Dexterity only, Strength only, etc.) or on a combination of feature groups (e.g. Intelligence & Strength, Dexterity & Charisma & Constitution, etc.). There are many possible variations: take subsets from each group and combine them, ensemble the predictions, and so on.
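A minimal sketch of that group-combination idea, assuming the columns carry group prefixes like `dexterity_1` (the prefixes, the target column, and the Ridge base model are all illustrative assumptions, not the actual setup):

```python
from itertools import combinations

import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

# Assumed group labels; swap in whatever the real prefixes are.
GROUPS = ["dexterity", "strength", "intelligence",
          "charisma", "constitution", "wisdom"]

def group_cols(df, group):
    """Columns belonging to one feature group, identified by prefix."""
    return [c for c in df.columns if c.startswith(group + "_")]

def ensemble_over_group_combos(df, target, max_groups=2):
    """Train one model per group combination (up to max_groups groups),
    then average the out-of-fold predictions as a simple ensemble."""
    y = df[target]
    preds = []
    for r in range(1, max_groups + 1):
        for combo in combinations(GROUPS, r):
            cols = sum((group_cols(df, g) for g in combo), [])
            if not cols:  # skip combos with no matching columns
                continue
            preds.append(cross_val_predict(Ridge(), df[cols], y, cv=5))
    return np.mean(preds, axis=0)
```

With 6 groups the number of combinations grows quickly (63 non-empty subsets), which is why subsetting and ensembling can get expensive fast.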
Generate representative features from each feature group (e.g. PCA components, correlations, standard deviations…) and use them for predictions.
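A sketch of what those representative features could look like: each group is compressed into a first principal component plus simple per-row statistics. The group prefixes and the exact summary choices are assumptions for illustration.

```python
import pandas as pd
from sklearn.decomposition import PCA

def group_summary_features(df, groups):
    """Build a small set of representative features per group:
    first PCA component, per-row mean, and per-row std."""
    out = pd.DataFrame(index=df.index)
    for g in groups:
        cols = [c for c in df.columns if c.startswith(g + "_")]
        if not cols:
            continue
        X = df[cols].to_numpy()
        out[f"{g}_pca1"] = PCA(n_components=1).fit_transform(X).ravel()
        out[f"{g}_mean"] = X.mean(axis=1)
        out[f"{g}_std"] = X.std(axis=1)
    return out
```

The resulting frame can be used on its own or concatenated with the raw features before training.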
I tried XGBoost's feature interaction constraints and noticed no discernible difference between using them and not. I have since abandoned the project. I do believe that the next frontier in model development will include some type of feature restriction, but so far, the frontier lies ahead…
My understanding is that the interaction constraints implemented by XGBoost only allow interactions within groups, not across groups. This is the opposite of what we want.
Yes, you are right about the way XGBoost implemented it. So I defined 310 lists and passed them as constraints: [feature_dexterity1, all features EXCEPT the other dexterity features], [feature_dexterity2, all features EXCEPT the other dexterity features], etc.
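The 310-list construction above can be sketched as follows: for each feature, the allowed set is itself plus every feature outside its own group. XGBoost's `interaction_constraints` parameter accepts a nested list of feature indices, so the helper returns indices; the feature and group names here are illustrative.

```python
def cross_group_constraints(feature_names, group_of):
    """One constraint list per feature: the feature itself plus all
    features from OTHER groups (so within-group interactions are
    excluded). group_of maps feature name -> group label."""
    idx = {f: i for i, f in enumerate(feature_names)}
    constraints = []
    for f in feature_names:
        allowed = [idx[g] for g in feature_names
                   if g == f or group_of[g] != group_of[f]]
        constraints.append(allowed)
    return constraints

# Usage (sketch): pass the result to XGBoost, e.g.
# model = xgb.XGBRegressor(interaction_constraints=constraints)
```

Note that XGBoost treats each list as a set of features that may co-occur along a tree path, so a path can still mix features drawn from different lists; the constraint limits splits within a single branch rather than globally forbidding within-group pairs.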
That wasn’t a problem, but the results just weren’t great, unfortunately… I still don’t know how to make good use of the feature groups. The way I train my model at the moment ignores the groups completely, so it would make no difference if they were simply labeled feature1-feature310.