我正在使用sklearn.model_selection.learning_curve5 倍的数据训练。代码如下所示。
train_sizes = [1, 100, 500, 1000, 2000, 3000, 3879]
train_sizes,train_scores, validation_scores = learning_curve(estimator = ensemble.RandomForestClassifier(),
X = X_res,
y = y_res,
train_sizes = train_sizes,
cv = 5,
scoring = 'accuracy')
train_scores_mean = train_scores.mean(axis = 1)
validation_scores_mean = validation_scores.mean(axis =1)
plt.style.use('seaborn')
plt.plot(train_sizes, train_scores_mean, label = 'Training acc')
plt.plot(train_sizes, validation_scores_mean, label = 'Validation acc')
plt.ylabel('Accuracy', fontsize = 14)
plt.xlabel('Training set size', fontsize = 14)
plt.title('Learning curves for a linear RF model', fontsize = 18, y = 1.03)
plt.legend()
另外,我正在使用随机森林分类器模型来解决多类分类问题。当我绘制精度图时,它如下所示。

所有训练规模的平均训练准确率怎么可能是 100%?
任何帮助表示赞赏。