rfcv2
rfcv2 creates a random forest model that has both mtry and ntree as tuning parameters for cross-validation. This function is an extension to the random forest models currently supported by the train function of the caret package, all of which tune only mtry.
Usage

rfcv2(type)
Arguments

type    the type of the prediction problem; one of "Classification" or "Regression".
Value

A custom model specification for use with the train function of the caret package.
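Internally, such a specification plausibly follows caret's documented custom model interface: a list with parameters, grid, fit, predict, prob, sort, and levels components. The sketch below is illustrative only, assuming the randomForest package as the underlying engine; rf_custom is a hypothetical name and this is not the package's actual source.

## Illustrative sketch, not package source: a caret custom model in the
## shape that rfcv2() plausibly returns.
library(randomForest)

rf_custom = function(type = c("Classification", "Regression")) {
  type = match.arg(type)
  list(
    type    = type,
    library = "randomForest",
    loop    = NULL,
    ## Declare both tuning parameters so train() can grid over them
    parameters = data.frame(
      parameter = c("mtry", "ntree"),
      class     = c("numeric", "numeric"),
      label     = c("#Randomly Selected Predictors", "#Trees")),
    ## Default grid, used only when no tuneGrid is supplied
    grid = function(x, y, len = NULL, search = "grid") {
      expand.grid(mtry  = seq_len(min(len, ncol(x))),
                  ntree = round(seq(100, 500, length.out = len)))
    },
    ## Fit one forest per (mtry, ntree) combination
    fit = function(x, y, wts, param, lev, last, weights, classProbs, ...) {
      randomForest(x, y, mtry = param$mtry, ntree = param$ntree, ...)
    },
    predict = function(modelFit, newdata, preProc = NULL, submodels = NULL) {
      predict(modelFit, newdata)
    },
    prob = function(modelFit, newdata, preProc = NULL, submodels = NULL) {
      predict(modelFit, newdata, type = "prob")
    },
    sort   = function(x) x[order(x$mtry, x$ntree), ],
    levels = function(x) x$classes
  )
}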
Examples

library(caret)
library(mlbench)

#######################################
## Classification Example

data(iris)

set.seed(0)
rf_class_fit = train(Species ~ ., data = iris,
                     method = rfcv2("Classification"),
                     tuneGrid = expand.grid(
                       .mtry = seq(1, ncol(iris) - 1, 1),
                       .ntree = seq(100, 500, 100)),
                     trControl = trainControl(method = "cv"))

print(rf_class_fit)
#> 150 samples
#>   4 predictor
#>   3 classes: 'setosa', 'versicolor', 'virginica'
#>
#> No pre-processing
#> Resampling: Cross-Validated (10 fold)
#> Summary of sample sizes: 135, 135, 135, 135, 135, 135, ...
#> Resampling results across tuning parameters:
#>
#>   mtry  ntree  Accuracy   Kappa
#>   1     100    0.9466667  0.92
#>   1     200    0.9466667  0.92
#>   1     300    0.9466667  0.92
#>   1     400    0.9400000  0.91
#>   1     500    0.9466667  0.92
#>   2     100    0.9533333  0.93
#>   2     200    0.9466667  0.92
#>   2     300    0.9533333  0.93
#>   2     400    0.9533333  0.93
#>   2     500    0.9533333  0.93
#>   3     100    0.9533333  0.93
#>   3     200    0.9533333  0.93
#>   3     300    0.9533333  0.93
#>   3     400    0.9533333  0.93
#>   3     500    0.9466667  0.92
#>   4     100    0.9533333  0.93
#>   4     200    0.9533333  0.93
#>   4     300    0.9533333  0.93
#>   4     400    0.9466667  0.92
#>   4     500    0.9466667  0.92
#>
#> Accuracy was used to select the optimal model using the largest value.
#> The final values used for the model were mtry = 2 and ntree = 100.

#######################################
## Regression Example

data(BostonHousing)

set.seed(0)
rf_reg_fit = train(medv ~ ., data = BostonHousing,
                   method = rfcv2("Regression"),
                   tuneGrid = expand.grid(
                     .mtry = seq(1, sqrt(ncol(BostonHousing) - 1), 1),
                     .ntree = seq(100, 500, 100)),
                   trControl = trainControl(method = "cv"))

print(rf_reg_fit)
#> 506 samples
#>  13 predictor
#>
#> No pre-processing
#> Resampling: Cross-Validated (10 fold)
#> Summary of sample sizes: 454, 455, 457, 454, 456, 455, ...
#> Resampling results across tuning parameters:
#>
#>   mtry  ntree  RMSE      Rsquared   MAE
#>   1     100    4.307455  0.8231317  2.927802
#>   1     200    4.297767  0.8260105  2.897532
#>   1     300    4.321638  0.8229013  2.913916
#>   1     400    4.328720  0.8235357  2.914079
#>   1     500    4.306999  0.8263237  2.908991
#>   2     100    3.489209  0.8741194  2.326531
#>   2     200    3.416644  0.8819662  2.324176
#>   2     300    3.420260  0.8800818  2.305374
#>   2     400    3.428410  0.8805542  2.314938
#>   2     500    3.452766  0.8776414  2.324047
#>   3     100    3.189066  0.8935496  2.173963
#>   3     200    3.223522  0.8902936  2.183458
#>   3     300    3.172203  0.8941432  2.152793
#>   3     400    3.203657  0.8908199  2.180217
#>   3     500    3.208938  0.8911333  2.177033
#>
#> RMSE was used to select the optimal model using the smallest value.
#> The final values used for the model were mtry = 3 and ntree = 300.
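The fitted objects behave like any other train() result, so caret's standard downstream tools apply to the models fit above:

## Predict with the tuned models via caret's standard predict method
head(predict(rf_class_fit, newdata = iris))
head(predict(rf_reg_fit, newdata = BostonHousing))

## The selected randomForest fit itself is available as finalModel
rf_reg_fit$finalModel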