I am running five different regression models to find the best model for predicting one variable. I am using the Leave-One-Out method together with RFE to find the most predictive features.
Four of the five models run fine, but I am running into a problem with SVR. Here is my code:
from numpy import absolute, mean, std
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.model_selection import cross_val_score, LeaveOneOut
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.feature_selection import RFECV
from sklearn.pipeline import Pipeline
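# dataset is a pandas DataFrame loaded earlier (not shown here)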
# encode Gender as a binary 0/1 variable
dataset.Gender.replace(to_replace=['M','F'],value=[1,0],inplace=True)
# select predictors and dependent
X = dataset.iloc[:,12:]
y = dataset.iloc[:,2]
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)
First, I run LOOCV with all of the features, and it works fine:
## LOOCV with all features
# find number of samples
n = X.shape[0]
# create loocv procedure
cv = LeaveOneOut()
# create model
from sklearn.svm import SVR
regressor = SVR(kernel = 'rbf')
# evaluate model
scores = cross_val_score(regressor, X, y, scoring='neg_mean_squared_error', cv=cv)
# force positive (scores are negative MSE)
scores = absolute(scores)
# report performance
print('MSE: %.3f (%.3f)' % (mean(scores), std(scores)))
Next, I want to include RFECV to find the most predictive features for the model; this works fine for my other regression models.
Here is the part of the code where I get the error:
# automatically select the number of features with RFE
# create pipeline
rfe = RFECV(estimator=SVR(kernel = 'rbf'))
model = SVR(kernel = 'rbf')
pipeline = Pipeline(steps=[('s',rfe),('m',model)])
# find number of samples
n = X.shape[0]
# create loocv procedure
cv = LeaveOneOut()
# evaluate model
scores = cross_val_score(pipeline, X, y, scoring='neg_mean_squared_error', cv=cv)
# report performance
print('MSE: %.3f (%.3f)' % (mean(scores), std(scores)))
The error I get is:
ValueError: when `importance_getter=='auto'`, the underlying estimator SVR should have `coef_` or `feature_importances_` attribute. Either pass a fitted estimator to feature selector or call fit before calling transform.
I am not sure what this error means.
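For reference, a minimal check on toy data (illustrative only, not my real dataset; X_toy and y_toy are just placeholder names) seems to confirm that a fitted RBF-kernel SVR exposes neither a coef_ nor a feature_importances_ attribute, which I think is what the message refers to:

import numpy as np
from sklearn.svm import SVR
# toy data, purely to illustrate the attribute check
rng = np.random.RandomState(0)
X_toy = rng.rand(20, 5)
y_toy = rng.rand(20)
rbf_svr = SVR(kernel='rbf').fit(X_toy, y_toy)
print(hasattr(rbf_svr, 'coef_'))                 # False with the RBF kernel
print(hasattr(rbf_svr, 'feature_importances_'))  # False; SVR never exposes this
lin_svr = SVR(kernel='linear').fit(X_toy, y_toy)
print(hasattr(lin_svr, 'coef_'))                 # True with the linear kernel

If that is indeed the problem, I still do not understand why it only shows up for SVR, or how I am supposed to use RFECV with an RBF-kernel SVR.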