The function can fit various regression or classification models and returns the results as a tibble. m() can be used in conjunction with regress and classify, or as a stand-alone function.
Arguments
- model_method
The name of the method to fit. See Details.
- formula
an object of class "formula": a symbolic description of the model to be fitted.
- data
a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).
- ...
Additional arguments passed to the underlying method function (e.g. lm or glm).
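As a minimal sketch of how extra arguments travel through ..., the call below forwards a penalty value to the underlying fitting function. It assumes the glmnet package is installed and that lambda is accepted as a pass-through argument for the "lasso" method; the value 0.5 is arbitrary and chosen only for illustration.

```r
library(tidyfit)

# Example data shipped with tidyfit
data <- tidyfit::Factor_Industry_Returns

# 'lambda' is not a named argument of m(); it is forwarded through '...'
# to the underlying fitting function (assumed here: glmnet::glmnet)
fit <- m("lasso", Return ~ ., data, lambda = 0.5)
fit
```

Which arguments are accepted depends on the chosen method, so the corresponding .fit.* help page is the place to check before passing anything through.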
Details
model_method specifies the model to fit to the data and can take one of several options:
Linear (generalized) regression or classification
"lm" performs an OLS regression using stats::lm. See .fit.lm for details.
"glm" performs a generalized regression or classification using stats::glm. See .fit.glm for details.
"anova" performs an analysis of variance using stats::anova. See .fit.anova for details.
"robust" performs a robust regression using MASS::rlm. See .fit.robust for details.
"quantile" performs a quantile regression using quantreg::rq. See .fit.quantile for details.
Regression and classification with L1 and L2 penalties
"lasso" performs a linear regression or classification with L1 penalty using glmnet::glmnet. See .fit.lasso for details.
"ridge" performs a linear regression or classification with L2 penalty using glmnet::glmnet. See .fit.ridge for details.
"adalasso" performs an Adaptive Lasso regression or classification using glmnet::glmnet. See .fit.adalasso for details.
"enet" performs a linear regression or classification with L1 and L2 penalties using glmnet::glmnet. See .fit.enet for details.
"group_lasso" performs a linear regression or classification with grouped L1 penalty using gglasso::gglasso. See .fit.group_lasso for details.
Other Machine Learning
"boost" performs gradient boosting regression or classification using mboost::glmboost. See .fit.boost for details.
"rf" performs a random forest regression or classification using randomForest::randomForest. See .fit.rf for details.
"quantile_rf" performs a quantile random forest regression or classification using quantregForest::quantregForest. See .fit.quantile_rf for details.
"svm" performs a support vector regression or classification using e1071::svm. See .fit.svm for details.
"nnet" performs a neural network regression or classification using nnet::nnet. See .fit.nnet for details.
Factor regressions
"pcr" performs a principal components regression using pls::pcr. See .fit.pcr for details.
"plsr" performs a partial least squares regression using pls::plsr. See .fit.plsr for details.
"hfr" performs a hierarchical feature regression using hfr::hfr. See .fit.hfr for details.
Best subset selection
"subset" performs a best subset regression or classification using bestglm::bestglm (wrapper for leaps). See .fit.subset for details.
"gets" performs a general-to-specific regression using gets::gets. See .fit.gets for details.
Bayesian methods
"bayes" performs a Bayesian generalized regression or classification using arm::bayesglm. See .fit.bayes for details.
"bridge" performs a Bayesian ridge regression using monomvn::bridge. See .fit.bridge for details.
"blasso" performs a Bayesian Lasso regression using monomvn::blasso. See .fit.blasso for details.
"spikeslab" performs a Bayesian Spike and Slab regression using BoomSpikeSlab::lm.spike. See .fit.spikeslab for details.
"bma" performs a Bayesian model averaging regression using BMS::bms. See .fit.bma for details.
"tvp" performs a Bayesian time-varying parameter regression using shrinkTVP::shrinkTVP. See .fit.tvp for details.
Mixed-effects modeling
"glmm" performs a mixed-effects GLM using lme4::glmer. See .fit.glmm for details.
Specialized time series methods
"mslm" performs a Markov-switching regression using MSwM::msmFit. See .fit.mslm for details.
Feature selection
"cor" calculates Pearson's correlation coefficient using stats::cor.test. See .fit.cor for details.
"chisq" calculates Pearson's Chi-squared test using stats::chisq.test. See .fit.chisq for details.
"mrmr" performs a minimum redundancy, maximum relevance feature selection routine using mRMRe::mRMR.ensemble. See .fit.mrmr for details.
"relief" performs a ReliefF feature selection routine using CORElearn::attrEval. See .fit.relief for details.
"genetic" performs a linear regression with feature selection using the genetic algorithm implemented in gaselect::genAlg. See .fit.genetic for details.
When called without formula and data arguments, the function returns a 'tidyfit.models' data frame with unfitted models.
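The unfitted form can be sketched as follows: the model specification is created first and is only fitted once it is handed to regress. The variable name spec is illustrative, and the sketch assumes that a stored specification behaves the same as one written inline in the regress call.

```r
library(tidyfit)

# Example data shipped with tidyfit
data <- tidyfit::Factor_Industry_Returns

# No formula or data supplied: this only records the model specification
spec <- m("lm")
class(spec)

# The stored specification is fitted when it is passed to regress()
fit <- regress(data, Return ~ ., spec, .mask = "Date")
fit
```

Separating specification from fitting in this way makes it possible to define a model once and reuse it across several data sets or formulas.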
Examples
# Load the package and data
library(tidyfit)
data <- tidyfit::Factor_Industry_Returns
# Stand-alone function
fit <- m("lm", Return ~ ., data)
fit
#> # A tibble: 1 × 5
#> estimator_fct `size (MB)` grid_id model_object settings
#> <chr> <dbl> <chr> <list> <list>
#> 1 stats::lm 3.81 #0010000 <tidyFit> <tibble [1 × 0]>
# Within 'regress' function
fit <- regress(data, Return ~ ., m("lm"), .mask = "Date")
fit
#> # A tibble: 1 × 6
#> model estimator_fct `size (MB)` grid_id model_object settings
#> <chr> <chr> <dbl> <chr> <list> <list>
#> 1 lm stats::lm 3.64 #0010000 <tidyFit> <tibble [1 × 1]>
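Several model specifications can also be passed to regress in a single call. The following sketch compares OLS with a robust regression; it assumes the MASS package is available and that coef() collects the estimates of all fitted models in one tidy data frame. Masking Industry in addition to Date is a choice made for this illustration only.

```r
library(tidyfit)

# Example data shipped with tidyfit
data <- tidyfit::Factor_Industry_Returns

# Fit two methods on the same data; 'Date' and 'Industry' are
# excluded from the regressors via .mask
fit <- regress(
  data, Return ~ .,
  m("lm"), m("robust"),
  .mask = c("Date", "Industry")
)

# One row per fitted model
fit

# Coefficient estimates of both models
coef(fit)
```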