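I'm trying to use XGBoost for feature selection, but the feature importance plot just ranks the features in the order they appear. The feature in the first column of xtrain is by far the most important, followed by the second column, and so on.
This suggests the model isn't working properly, since it doesn't seem to be learning anything... Any suggestions on what might be going wrong?
Update: correlation matrix https://ibb.co/3shDJjD
Model code: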
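import xgboost as xgb

# train and test must be xgb.DMatrix objects (their construction is not shown here)
params = {
    'subsample': 0.5,
    'learning_rate': 0.3,
    'max_depth': 8,
    'num_parallel_trees': 20,  # as run; the documented XGBoost name is num_parallel_tree
    'objective': 'reg:squarederror',
    'verbosity': 0,
}
# Evaluate on both sets each round; the last entry ('val') drives early stopping
watchlist = [(train, 'train'), (test, 'val')]
reg = xgb.train(params, train, num_boost_round=5, early_stopping_rounds=5, evals=watchlist)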
Output:
[0] train-rmse:0.274535 val-rmse:0.27431
Multiple eval metrics have been passed: 'val-rmse' will be used for early stopping.
Will train until val-rmse hasn't improved in 5 rounds.
[1] train-rmse:0.273472 val-rmse:0.273653
[2] train-rmse:0.272796 val-rmse:0.27341
[3] train-rmse:0.272318 val-rmse:0.27334
[4] train-rmse:0.271943 val-rmse:0.273346
[5] train-rmse:0.271604 val-rmse:0.273374
[6] train-rmse:0.271218 val-rmse:0.273442
[7] train-rmse:0.270927 val-rmse:0.273529
[8] train-rmse:0.270641 val-rmse:0.273561
Stopping. Best iteration:
[3] train-rmse:0.272318 val-rmse:0.27334
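One way to judge whether an RMSE like the one in this log reflects any learning is to compare it with the RMSE of a constant predictor that always outputs the mean target. The sketch below uses made-up target values (y_val is hypothetical, not the data from this question) and only the standard library; the helper name rmse is also just for illustration.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical validation targets, for illustration only.
y_val = [0.1, 0.4, 0.35, 0.8, 0.05, 0.6]

# Constant baseline: predict the mean target for every row.
mean_pred = statistics.fmean(y_val)
baseline = rmse(y_val, [mean_pred] * len(y_val))

print(f"baseline RMSE (predict the mean): {baseline:.6f}")
```

If the baseline on the real validation targets comes out close to the val-rmse the model reports, the trees are adding very little beyond the mean, and the importance ranking is unlikely to be meaningful.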
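Feature importance (note that 0 and 1 come first). If I change the order of the columns in xtrain, the feature importance changes too: the first two columns are always the two most important features. https://ibb.co/QcHwbNg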