I am using GridSearchCV to tune my model's hyperparameters, together with a pipeline and cross-validation. When I run the search to tune the parameters of XGBoost, every score comes back as nan. However, when I use the same code with other classifiers such as random forest, it works and returns full results.
from sklearn.model_selection import StratifiedKFold, GridSearchCV
from xgboost import XGBClassifier
from imblearn.under_sampling import RepeatedEditedNearestNeighbours
from imblearn.pipeline import Pipeline  # imblearn's Pipeline, so the sampler step is allowed

kf = StratifiedKFold(n_splits=10, shuffle=False)
SCORING = ['accuracy', 'precision', 'recall', 'f1']

# define parameters for hyperparameter tuning
params = {
    'Classifier__n_estimators': [5, 10, 20, 50, 100, 200]
}

XGB = XGBClassifier()
UnSam = RepeatedEditedNearestNeighbours()
pipe = Pipeline(steps=[('UnderSampling', UnSam), ('Classifier', XGB)])
# ___________________________________________
mod = GridSearchCV(pipe, params, cv=kf, scoring=SCORING, refit='f1', return_train_score=True)
mod.fit(X_train, y_train)
This is my code; when I run it, I get the following results:
{'Classifier__n_estimators': 5}
__________________________________________________
F1 : [nan nan nan nan nan nan]
Recall : [nan nan nan nan nan nan]
Accuracy : [nan nan nan nan nan nan]
Precision : [nan nan nan nan nan nan]
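From what I have read, a nan score usually means the fold raised an exception that GridSearchCV swallowed (its default is error_score=np.nan). A minimal, self-contained sketch of how the hidden error can be surfaced with error_score='raise', using synthetic data and a plain LogisticRegression grid for illustration (no XGBoost or sampler involved; the names here are my own):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=200, random_state=0)

# 'l1' is incompatible with the default lbfgs solver, so those folds fail;
# error_score='raise' re-raises the per-fold exception instead of logging nan.
mod = GridSearchCV(
    LogisticRegression(random_state=0),
    {'penalty': ['l1', 'l2']},
    cv=StratifiedKFold(n_splits=5),
    scoring='f1',
    error_score='raise',
)

caught = None
try:
    mod.fit(X, y)
except ValueError as exc:  # the real fit error, no longer masked as nan
    caught = exc
print(caught)
```

With the default error_score the same search would complete silently and report nan for the failing parameter combinations.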
Another strange thing is that when I apply the same code to LogisticRegression to tune the penalty, it returns nan for l1 and elasticnet.
from sklearn.linear_model import LogisticRegression

kf = StratifiedKFold(n_splits=10, shuffle=False)
SCORING = ['accuracy', 'precision', 'recall', 'f1']

# define parameters for hyperparameter tuning
params = {
    'Classifier__penalty': ['l1', 'l2', 'elasticnet']
}

LR = LogisticRegression(random_state=0)
UnSam = RepeatedEditedNearestNeighbours()
pipe = Pipeline(steps=[('UnderSampling', UnSam), ('Classifier', LR)])
# ___________________________________________
mod = GridSearchCV(pipe, params, cv=kf, scoring=SCORING, refit='f1', return_train_score=True)
mod.fit(X_train, y_train)
The results are as follows:
{'Classifier__penalty': 'l2'}
__________________________________________________
F1 : [ nan 0.363 nan]
Recall : [ nan 0.4188 nan]
Accuracy : [ nan 0.7809 nan]
Precision : [ nan 0.3215 nan]
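I suspect this pattern (only 'l2' scoring) comes from LogisticRegression's default lbfgs solver, which supports only the l2 penalty, so the l1 and elasticnet candidates fail in every fold and get recorded as nan. A standalone sketch (synthetic data, my own parameter choices) that pairs each penalty with the saga solver, which supports all three; elasticnet additionally requires l1_ratio:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=300, random_state=0)

# A list of grids keeps incompatible combinations out of the search:
# saga handles l1/l2/elasticnet, and elasticnet also needs l1_ratio.
params = [
    {'penalty': ['l1', 'l2'], 'solver': ['saga']},
    {'penalty': ['elasticnet'], 'solver': ['saga'], 'l1_ratio': [0.2, 0.5, 0.8]},
]

mod = GridSearchCV(
    LogisticRegression(max_iter=5000, random_state=0),
    params,
    cv=StratifiedKFold(n_splits=5),
    scoring='f1',
)
mod.fit(X, y)
print(mod.best_params_)
```

In my real code the parameter names would be prefixed with 'Classifier__' because of the pipeline.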