First, as Tim Biegeleisen said, you should convert your Gewonnen variable to a factor (in both the training and test sets), if it isn't one already:
    training$Gewonnen <- as.factor(training$Gewonnen)
    testing$Gewonnen <- as.factor(testing$Gewonnen)
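One thing to watch when converting the two sets separately: if the test set happens to lack one of the classes, as.factor will give it fewer levels than the training set. A minimal sketch of one way around this, assuming hypothetical "ja"/"nein" values for Gewonnen (the real coding of the variable is not known here) and a hypothetical helper variable lvls:

```r
# Hypothetical stand-ins for the real data; only the Gewonnen column matters here
training <- data.frame(Gewonnen = c("nein", "ja", "ja", "nein", "ja"))
testing  <- data.frame(Gewonnen = c("ja", "ja", "ja"))  # contains a single class

# Collect the level set from both data frames, then apply it to each
lvls <- sort(unique(c(training$Gewonnen, testing$Gewonnen)))
training$Gewonnen <- factor(training$Gewonnen, levels = lvls)
testing$Gewonnen  <- factor(testing$Gewonnen,  levels = lvls)

levels(testing$Gewonnen)  # both levels are kept, even though only one occurs
```

With identical levels on both sides, functions that compare predictions against a reference factor do not trip over a level mismatch.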
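After that, the type option of caret's predict function determines what kind of response you get for a binary classification problem, i.e. class labels or probabilities. Here is a reproducible example from the caret documentation, using the Sonar dataset from the mlbench package: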
    library(caret)
    library(mlbench)
    data(Sonar)
    str(Sonar$Class)
    # Factor w/ 2 levels "M","R": 2 2 2 2 2 2 2 2 2 2 ...

    set.seed(998)
    inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
    training <- Sonar[ inTraining,]
    testing <- Sonar[-inTraining,]

    modFit <- train(Class ~ ., data=training, method="rf", prox=TRUE)

    pred <- predict(modFit, testing, type="prob") # for class probabilities
    head(pred)
    #        M     R
    # 5  0.442 0.558
    # 10 0.276 0.724
    # 11 0.096 0.904
    # 12 0.360 0.640
    # 20 0.654 0.346
    # 21 0.522 0.478

    pred2 <- predict(modFit, testing, type="raw") # for class labels
    head(pred2)
    # [1] R R R R M M
    # Levels: M R
For the confusion matrix, you will need the class labels (i.e. pred2 above):
    confusionMatrix(pred2, testing$Class)
    # Confusion Matrix and Statistics
    #           Reference
    # Prediction  M  R
    #          M 25  6
    #          R  2 18
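If you only have the probabilities from type="prob", or you want a cutoff other than 0.5, the labels can also be derived by hand. A minimal sketch, assuming a small hypothetical data frame that copies the head(pred) values shown above and an assumed cutoff stored in threshold:

```r
# Hypothetical probabilities, copied from the head(pred) output above
pred <- data.frame(M = c(0.442, 0.276, 0.096, 0.360, 0.654, 0.522),
                   R = c(0.558, 0.724, 0.904, 0.640, 0.346, 0.478))

threshold <- 0.5  # assumed cutoff; move it to trade one error type for the other

# Label "M" when P(M) exceeds the cutoff, otherwise "R";
# fix the levels so they match the reference factor
pred_labels <- factor(ifelse(pred$M > threshold, "M", "R"),
                      levels = c("M", "R"))
pred_labels
# [1] R R R R M M
# Levels: M R
```

The resulting factor can be passed to confusionMatrix in place of pred2, provided its levels match those of the reference.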