# Ensemble of the neutralized (riskiest 50) prediction columns only
training_data["ensemble_neutral_riskiest_50"] = sum(
    training_data[pred_col]
    for pred_col in pred_cols
    if pred_col.endswith("neutral_riskiest_50")
).rank(pct=True)
# Ensemble of the prediction columns that were not neutralized
training_data["ensemble_not_neutral"] = sum(
    training_data[pred_col]
    for pred_col in pred_cols
    if "neutral" not in pred_col
).rank(pct=True)
# Ensemble of every prediction column
training_data["ensemble_all"] = sum(
    training_data[pred_col] for pred_col in pred_cols
).rank(pct=True)
training_data["preds_model_target_neutral_riskiest_50"] = sum(
    training_data[pred_col] for pred_col in pred_cols
).rank(pct=True)
ensemble_cols.add("ensemble_neutral_riskiest_50")
ensemble_cols.add("ensemble_not_neutral")
ensemble_cols.add("ensemble_all")
ensemble_cols.add("preds_model_target_neutral_riskiest_50")
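If it helps to see the pattern in isolation, here is a minimal sketch of the same sum-then-rank ensembling on a toy frame. The frame `toy`, its column names and values, and the helper `rank_ensemble` are all made up for illustration; they are not part of the example script.

```python
import pandas as pd

# Toy stand-in for training_data; column names and values are hypothetical.
toy = pd.DataFrame({
    "preds_a": [0.10, 0.40, 0.35, 0.80],
    "preds_a_neutral_riskiest_50": [0.20, 0.30, 0.60, 0.70],
    "preds_b": [0.30, 0.20, 0.90, 0.50],
    "preds_b_neutral_riskiest_50": [0.25, 0.10, 0.55, 0.95],
})
pred_cols = list(toy.columns)
ensemble_cols = set()


def rank_ensemble(df, cols):
    """Sum the given columns row-wise, then convert the sums to percentile ranks."""
    # skipna=False keeps NaN rows as NaN, matching what the built-in sum()
    # over a list of Series would do.
    return df[cols].sum(axis=1, skipna=False).rank(pct=True)


neutral_cols = [c for c in pred_cols if c.endswith("neutral_riskiest_50")]
plain_cols = [c for c in pred_cols if "neutral" not in c]

toy["ensemble_neutral_riskiest_50"] = rank_ensemble(toy, neutral_cols)
toy["ensemble_not_neutral"] = rank_ensemble(toy, plain_cols)
toy["ensemble_all"] = rank_ensemble(toy, pred_cols)
ensemble_cols.update(
    ["ensemble_neutral_riskiest_50", "ensemble_not_neutral", "ensemble_all"]
)
```

Because the final step is a percentile rank, summing and averaging the columns give the same ensemble; only the ordering of the combined scores matters.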
I’m just getting caught up on all of this, and I’ll try to get everything updated in the next week or so. The holidays have me way behind. @mesomachukwu12, I appreciate your help here. Thank you!
I construct my models differently from how the example model is built, but I am using all available information on feature groupings. (I really miss feature groups!)
I’ve been giving this a lot of thought since yesterday’s TC announcement: I’m not sure I’m going to be able to reproduce the Boruta output when the new classic data drops. The initial run took a little over a week, and when the data triples, that could obviously stretch out much further. This is further complicated by my new signals pipeline, which is turning into a compute black hole.
I just wanted to give you all a heads up and time to prepare for the next data drop. GL!
Hey, can you explain why exactly you need to fill int8 with 2 instead of 0.5 in this case? I’ve been trying to find the answer myself and can’t seem to sort it out.
Almost sent you a message the other day, haven’t seen you on the forum in a minute. Hope all is well.
Things on this end have gotten a little dicey since the first of the year, and I’ve largely had to pull away from the comp. I was having a really hard time staying on top of the things that needed to be addressed with the repo, and with TC and the new dataset coming, I thought it was best to pull it so people didn’t waste time trying to use something that is out of date.
I still have some of the output from the run, though, and can paste it here if it will help.