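I am having trouble using k-fold cross-validation with the random forest method. One of the outputs is the error "Error in randomForest.default(x, y, mtry = param$mtry, ...) : Need at least two classes to do classification." However, I already have two classes for the classification: "Normal" and "Failure". When I posted this question at https://stackoverflow.com/questions/60643415/error-in-randomforest-defaultx-y-mtry-parammtry-need-at-least-two?noredirect=1#comment107290940_60643415, I was referred here because I am asking for "advice on statistical models based on my data and my forecasting/estimation/modelling needs".
Can someone help me?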
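R script:
library(caret)
library(randomForest)
# Read the data; string columns are read as factors
data_failures <- read.csv('OUTPUT.csv', header = TRUE, sep = ",", stringsAsFactors = TRUE)
# 10-fold cross-validation
train.control <- trainControl(method = "cv", number = 10)
# Random forest with Period_1 as the outcome and all other columns as predictors
model <- train(Period_1 ~ ., data = data_failures, method = "rf",
               trControl = train.control)
print(model)
# str() prints the structure and returns NULL, so this line also prints [1] "NULL"
print(class(str(data_failures)))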
Output:
Random Forest
112 samples
11 predictor
2 classes: 'Failure', 'Normal'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 101, 101, 101, 101, 101, 101, ...
Resampling results across tuning parameters:
  mtry  Accuracy  Kappa
   2    1         NaN
   6    1         NaN
  11    1         NaN
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
'data.frame': 112 obs. of 12 variables:
$ Period_1 : Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_2 : Factor w/ 2 levels "Failure","Normal": 2 2 2 1 2 2 2 2 2 1 ...
$ Period_3 : Factor w/ 2 levels "Failure","Normal": 2 2 1 2 2 2 2 2 2 2 ...
$ Period_4 : Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_5 : Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_6 : Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_7 : Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_8 : Factor w/ 2 levels "Failure","Normal": 2 2 2 1 2 2 2 2 2 2 ...
$ Period_9 : Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_10: Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_11: Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
$ Period_12: Factor w/ 2 levels "Failure","Normal": 2 2 2 2 2 2 2 2 2 2 ...
[1] "NULL"
Warning messages:
1: model fit failed for Fold08: mtry= 2 Error in randomForest.default(x, y, mtry = param$mtry, ...) :
Need at least two classes to do classification.
2: model fit failed for Fold08: mtry= 6 Error in randomForest.default(x, y, mtry = param$mtry, ...) :
Need at least two classes to do classification.
3: model fit failed for Fold08: mtry=11 Error in randomForest.default(x, y, mtry = param$mtry, ...) :
Need at least two classes to do classification.
4: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.
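One reading of these warnings, which the output is consistent with but does not confirm, is that Period_1 contains very few "Failure" rows, so that the training part of Fold08 held only "Normal" rows. The sketch below illustrates that situation in base R. The outcome vector `y` with a single "Failure" is a hypothetical stand-in, not the real data, and the random fold assignment is a simplification of what caret does internally.

```r
# Hypothetical stand-in for data_failures$Period_1: 112 rows, one "Failure".
# The real class counts are not shown in the question.
y <- factor(c(rep("Normal", 111), "Failure"), levels = c("Failure", "Normal"))

# Overall class counts; a class with very few rows is the thing to look for.
class_counts <- table(y)
print(class_counts)

# Assign each row to one of 10 folds at random
# (a simplified stand-in for caret's own fold creation).
set.seed(1)
k <- 10
fold_id <- sample(rep(seq_len(k), length.out = length(y)))

# For every fold, count how many distinct classes remain in the
# training part, i.e. in all rows outside that fold.
classes_in_training <- sapply(seq_len(k), function(i) {
  length(unique(y[fold_id != i]))
})
names(classes_in_training) <- sprintf("Fold%02d", seq_len(k))
print(classes_in_training)

# A fold whose training part holds a single class cannot be used
# to fit a classifier.
bad_folds <- names(classes_in_training)[classes_in_training < 2]
print(bad_folds)
```

On the real data, `table(data_failures$Period_1)` would show the actual class counts.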
Data sample:
   Period_1 Period_2 Period_3 Period_4 Period_5 Period_6 Period_7 Period_8
1    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
2    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
3    Normal   Normal  Failure   Normal   Normal   Normal   Normal   Normal
4    Normal  Failure   Normal   Normal   Normal   Normal   Normal  Failure
5    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
6    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
7    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
8    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
9    Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
10   Normal  Failure   Normal   Normal   Normal   Normal   Normal   Normal
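The ten rows shown above can be tabulated to see how rare "Failure" is in each column. This is a minimal sketch that retypes only those ten rows and eight columns. The names `sample_rows` and `failure_counts` are illustrative, and the counts say nothing about the remaining 102 rows.

```r
# First 10 rows and 8 columns, copied from the data sample in the question.
n <- "Normal"
f <- "Failure"
sample_rows <- data.frame(
  Period_1 = rep(n, 10),
  Period_2 = c(n, n, n, f, n, n, n, n, n, f),
  Period_3 = c(n, n, f, n, n, n, n, n, n, n),
  Period_4 = rep(n, 10),
  Period_5 = rep(n, 10),
  Period_6 = rep(n, 10),
  Period_7 = rep(n, 10),
  Period_8 = c(n, n, n, f, n, n, n, n, n, n),
  stringsAsFactors = TRUE
)

# Number of "Failure" rows in each column of the sample.
failure_counts <- sapply(sample_rows, function(col) sum(col == "Failure"))
print(failure_counts)
```

Running the same `sapply` call on the full `data_failures` data frame would give the per-column counts for all 112 rows.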