Random Forest regression or classification for tidyfit

Fits a random forest on a 'tidyFit' R6 class. The function can be used with both regress and classify.

# S3 method for class 'rf'
.fit(self, data = NULL)

Arguments

self: a 'tidyFit' R6 class.
data: a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

Value

A fitted 'tidyFit' class model.

Details

Hyperparameters:

ntree (number of trees)
mtry (number of variables randomly sampled as candidates at each split)
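
As a minimal sketch, assuming the hyperparameters are passed through the additional arguments of m() like other method arguments, the two settings above might be fixed as follows. The values 100 and 2 are arbitrary illustrations, not recommendations.

```r
library(tidyfit)

# Same data preparation as in the Examples section
data <- tidyfit::Factor_Industry_Returns
data <- dplyr::filter(data, Industry == "HiTec")
data <- dplyr::select(data, -Date, -Industry)

# Fit with explicit hyperparameters (illustrative values only)
fit <- m("rf", Return ~ ., data, ntree = 100, mtry = 2)
fit
```

Fewer trees reduce fitting time and model size at the cost of noisier predictions and importance estimates.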

Important method arguments (passed to m)

The function provides a wrapper for randomForest::randomForest. See ?randomForest for more details.

Implementation

The random forest is always fit with importance = TRUE. The feature importance values are extracted using coef().
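
To illustrate what importance = TRUE provides, the sketch below calls randomForest::randomForest directly rather than through tidyfit. The built-in mtcars data and the ntree value are assumptions chosen for illustration; this is not the package's internal code.

```r
library(randomForest)

set.seed(42)  # forests are stochastic; fix the seed for reproducibility

# Regression forest with permutation importance enabled
rf <- randomForest(mpg ~ ., data = mtcars, ntree = 200, importance = TRUE)

# One row per predictor; for regression the columns are
# %IncMSE (permutation importance) and IncNodePurity
imp <- importance(rf)
head(imp)
```

Without importance = TRUE, only the node-purity measure is computed, so the permutation-based column would be missing.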

References

Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News 2(3), 18–22.

Author

Johann Pfitzinger

Examples

# Load data
data <- tidyfit::Factor_Industry_Returns
data <- dplyr::filter(data, Industry == "HiTec")
data <- dplyr::select(data, -Date, -Industry)

# Stand-alone function
fit <- m("rf", Return ~ ., data)
fit
#> # A tibble: 1 × 5
#>   estimator_fct              `size (MB)` grid_id  model_object settings        
#>   <chr>                            <dbl> <chr>    <list>       <list>          
#> 1 randomForest::randomForest        8.50 #0010000 <tidyFit>    <tibble [1 × 0]>

# Within 'regress' function
fit <- regress(data, Return ~ ., m("rf"))
explain(fit)
#> Warning: using explain package 'randomForest'
#> # A tibble: 7 × 5
#> # Groups:   model [1]
#>   model term        importance IncNodePurity importanceSD
#>   <chr> <chr>            <dbl>         <dbl>        <dbl>
#> 1 rf    (Intercept)      0                0        0     
#> 2 rf    Mkt-RF          37.9          14515.       0.451 
#> 3 rf    SMB              0.936         2102.       0.103 
#> 4 rf    HML              4.14          3194.       0.185 
#> 5 rf    RMW              2.20          2576.       0.134 
#> 6 rf    CMA              5.44          4039.       0.243 
#> 7 rf    RF               0.723         1197.       0.0889
