Details
Hyperparameters:
ntree (number of trees)
mtry (number of variables randomly sampled at each split)
Important method arguments (passed to m)
The function provides a wrapper for randomForest::randomForest. See ?randomForest for more details.
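For orientation, the two hyperparameters map directly onto arguments of randomForest::randomForest. The sketch below calls that function directly rather than through m(); the built-in mtcars data and the values 200 and 3 are assumptions chosen purely for illustration.

```r
# Illustrative sketch: mtcars and the values below are assumptions,
# not part of the tidyfit documentation.
library(randomForest)

set.seed(1)  # forests are randomized; fix the seed for reproducibility

# ntree: number of trees grown
# mtry:  number of predictors sampled as split candidates at each node
rf <- randomForest(mpg ~ ., data = mtcars, ntree = 200, mtry = 3)

rf$ntree  # number of trees grown
rf$mtry   # number of candidate predictors per split
```

When left unset, randomForest grows 500 trees and, for regression, samples max(floor(p/3), 1) candidate variables per split, where p is the number of predictors.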
Implementation
The random forest is always fit with importance = TRUE. The feature importance values are extracted using coef().
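As a rough illustration of what fitting with importance = TRUE makes available, the sketch below calls randomForest::randomForest directly on the built-in mtcars data. Both the data set and the direct call are assumptions for illustration; this is not necessarily how the wrapper extracts the values internally.

```r
# Illustrative sketch: mtcars is an assumption, not tidyfit's own example data.
library(randomForest)

set.seed(1)  # fix the seed so the randomized fit is reproducible

# importance = TRUE additionally computes permutation-based importance
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

# Matrix with one row per predictor; for a regression forest the columns
# are "%IncMSE" (permutation importance) and "IncNodePurity"
imp <- importance(rf)
head(imp)

# Standard errors of the permutation-based importance, one per predictor
rf$importanceSD
```

These quantities are presumably the source of the IncNodePurity and importanceSD columns in the example output further down.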
References
Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News 2(3), 18–22.
See also
.fit.svm, .fit.boost and m methods
Examples
# Load the package and data
library(tidyfit)
data <- tidyfit::Factor_Industry_Returns
data <- dplyr::filter(data, Industry == "HiTec")
data <- dplyr::select(data, -Date, -Industry)
# Stand-alone function
fit <- m("rf", Return ~ ., data)
fit
#> # A tibble: 1 × 5
#>   estimator_fct              `size (MB)` grid_id  model_object settings
#>   <chr>                            <dbl> <chr>    <list>       <list>
#> 1 randomForest::randomForest        8.56 #0010000 <tidyFit>    <tibble>
# Within 'regress' function
fit <- regress(data, Return ~ ., m("rf"))
explain(fit)
#> Warning: using explain package 'randomForest'
#> # A tibble: 7 × 5
#> # Groups:   model [1]
#>   model term        importance IncNodePurity importanceSD
#>   <chr> <chr>            <dbl>         <dbl>        <dbl>
#> 1 rf    (Intercept)      0                 0       0
#> 2 rf    Mkt-RF          37.3          14342.       0.411
#> 3 rf    SMB              1.16          2111.       0.110
#> 4 rf    HML              4.27          3346.       0.180
#> 5 rf    RMW              2.08          2542.       0.127
#> 6 rf    CMA              5.18          4170.       0.214
#> 7 rf    RF               0.725         1173.       0.0812