Fits a best subset regression or classification on a 'tidyFit' R6 class. The function can be used with regress and classify.
Details
Hyperparameters:
None. Cross validation not applicable.
Important method arguments (passed to m()):
- method (e.g. 'forward', 'backward')
- IC (information criterion, e.g. 'AIC')
The best subset regression is estimated using bestglm::bestglm. For the regression case, this is a wrapper around leaps::regsubsets; for the classification case, it performs an exhaustive search. See ?bestglm for more details.
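To make the relationship to the underlying package concrete, the following is a minimal sketch of a direct bestglm::bestglm call that bypasses tidyfit entirely. It assumes the bestglm package is installed and uses the built-in mtcars data purely for illustration; how tidyfit prepares its input internally is not shown here and may differ.

```r
library(bestglm)

# bestglm expects a data frame 'Xy' with the predictors first
# and the response in the last column (illustrative column choice).
Xy <- mtcars[, c("wt", "hp", "qsec", "drat", "mpg")]

# Forward selection, with the final subset chosen by AIC.
res <- bestglm(Xy, IC = "AIC", method = "forward")

# The selected model is returned as a fitted 'lm' object.
res$BestModel
```

The method and IC arguments used here are the same two that the list above describes as being passed through m().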
Implementation
Forward or backward selection can be performed by passing method = "forward" or method = "backward" to m().
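As a rough sketch of how the two method arguments combine, the call below requests forward selection and sets the information criterion at the same time. It assumes the tidyfit package is attached and that IC is forwarded to bestglm::bestglm in the same way as method; the choice of "AIC" is only illustrative.

```r
library(tidyfit)

data <- tidyfit::Factor_Industry_Returns

# Forward selection with AIC as the selection criterion.
# Both 'method' and 'IC' are handed on to bestglm::bestglm by m().
fit <- regress(data, Return ~ .,
               m("subset", method = "forward", IC = "AIC"),
               .mask = c("Date", "Industry"))

# Coefficients of the selected subset
coef(fit)
```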
References
A.I. McLeod, Changjiang Xu and Yuanhao Lai (2020).
bestglm: Best Subset GLM and Regression Utilities.
R package version 0.37.3. URL https://CRAN.R-project.org/package=bestglm.
Examples
# Load the package and data
library(tidyfit)
data <- tidyfit::Factor_Industry_Returns
# Stand-alone function
fit <- m("subset", Return ~ ., data, method = c("forward", "backward"))
tidyr::unnest(fit, settings)
#> # A tibble: 2 × 6
#> estimator_fct `size (MB)` grid_id model_object method warnings
#> <chr> <dbl> <chr> <list> <chr> <chr>
#> 1 bestglm::bestglm 2.74 #0010000 <tidyFit> forward NA
#> 2 bestglm::bestglm 2.74 #0020000 <tidyFit> backward model with …
# Within 'regress' function
fit <- regress(data, Return ~ ., m("subset", method = "forward"),
               .mask = c("Date", "Industry"))
coef(fit)
#> # A tibble: 6 × 4
#> # Groups: model [1]
#> model term estimate model_info
#> <chr> <chr> <dbl> <list>
#> 1 subset (Intercept) -0.0243 <tibble [1 × 3]>
#> 2 subset Mkt-RF 0.979 <tibble [1 × 3]>
#> 3 subset HML 0.0628 <tibble [1 × 3]>
#> 4 subset RMW 0.156 <tibble [1 × 3]>
#> 5 subset CMA 0.114 <tibble [1 × 3]>
#> 6 subset RF 0.997 <tibble [1 × 3]>