
m() fits various regression and classification models and returns the results as a tibble. It can be used in conjunction with regress and classify, or as a stand-alone function.

Usage

m(model_method, formula = NULL, data = NULL, ...)

Arguments

model_method

The name of the method to fit. See Details.

formula

an object of class "formula": a symbolic description of the model to be fitted.

data

a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

Additional arguments passed to the underlying method function (e.g. lm or glm).
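Arguments supplied via '...' are forwarded unchanged to the underlying method. A minimal, hedged sketch, assuming MASS::rlm accepts its usual 'method' argument through this route:

# Forward the 'method' argument of MASS::rlm via '...' (illustrative sketch)
data <- tidyfit::Factor_Industry_Returns
fit <- m("robust", Return ~ ., data, method = "MM")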

Value

A 'tidyfit.models' data frame.

Details

model_method specifies the model to fit to the data and can take one of several options:

Linear (generalized) regression or classification

"lm" performs an OLS regression using stats::lm. See .fit.lm for details.

"glm" performs a generalized regression or classification using stats::glm. See .fit.glm for details.

"anova" performs analysis of variance using stats::anova. See .fit.anova for details.

"robust" performs a robust regression using MASS::rlm. See .fit.robust for details.

"quantile" performs a quantile regression using quantreg::rq. See .fit.quantile for details.

Regression and classification with L1 and L2 penalties

"lasso" performs a linear regression or classification with L1 penalty using glmnet::glmnet. See .fit.lasso for details.

"ridge" performs a linear regression or classification with L2 penalty using glmnet::glmnet. See .fit.ridge for details.

"adalasso" performs an Adaptive Lasso regression or classification using glmnet::glmnet. See .fit.adalasso for details.

"enet" performs a linear regression or classification with L1 and L2 penalties using glmnet::glmnet. See .fit.enet for details.

"group_lasso" performs a linear regression or classification with grouped L1 penalty using gglasso::gglasso. See .fit.group_lasso for details.

Other machine learning

"boost" performs gradient boosting regression or classification using mboost::glmboost. See .fit.boost for details.

"rf" performs a random forest regression or classification using randomForest::randomForest. See .fit.rf for details.

"quantile_rf" performs a quantile random forest regression or classification using quantregForest::quantregForest. See .fit.quantile_rf for details.

"svm" performs a support vector regression or classification using e1071::svm. See .fit.svm for details.

"nnet" performs a neural network regression or classification using nnet::nnet. See .fit.nnet for details.

Factor regressions

"pcr" performs a principal components regression using pls::pcr. See .fit.pcr for details.

"plsr" performs a partial least squares regression using pls::plsr. See .fit.plsr for details.

"hfr" performs a hierarchical feature regression using hfr::hfr. See .fit.hfr for details.

Best subset selection

"subset" performs a best subset regression or classification using bestglm::bestglm (wrapper for leaps). See .fit.subset for details.

"gets" performs a general-to-specific regression using gets::gets. See .fit.gets for details.

Bayesian methods

"bayes" performs a Bayesian generalized regression or classification using arm::bayesglm. See .fit.bayes for details.

"bridge" performs a Bayesian ridge regression using monomvn::bridge. See .fit.bridge for details.

"blasso" performs a Bayesian Lasso regression using monomvn::blasso. See .fit.blasso for details.

"spikeslab" performs a Bayesian Spike and Slab regression using BoomSpikeSlab::lm.spike. See .fit.spikeslab for details.

"bma" performs a Bayesian model averaging regression using BMS::bms. See .fit.bma for details.

"tvp" performs a Bayesian time-varying parameter regression using shrinkTVP::shrinkTVP. See .fit.tvp for details.

Mixed-effects modeling

"glmm" performs a mixed-effects GLM using lme4::glmer. See .fit.glmm for details.

Specialized time series methods

"mslm" performs a Markov-switching regression using MSwM::msmFit. See .fit.mslm for details.

Feature selection

"cor" calculates Pearson's correlation coefficient using stats::cor.test. See .fit.cor for details.

"chisq" calculates Pearson's Chi-squared test using stats::chisq.test. See .fit.chisq for details.

"mrmr" performs a minimum redundancy, maximum relevance features selection routine using mRMRe::mRMR.ensemble. See .fit.mrmr for details.

"relief" performs a ReliefF feature selection routine using CORElearn::attrEval. See .fit.relief for details.

"genetic" performs a linear regression with feature selection using the genetic algorithm implemented in gaselect::genAlg. See .fit.genetic for details.

When called without formula and data arguments, the function returns a 'tidyfit.models' data frame with unfitted models.
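For example, the following returns an unfitted model specification that can subsequently be passed to regress or classify:

# No formula or data: an unfitted 'tidyfit.models' data frame
spec <- m("lasso", lambda = 0.01)
spec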

See also

regress and classify methods

Author

Johann Pfitzinger

Examples

# Load data
data <- tidyfit::Factor_Industry_Returns

# Stand-alone function
fit <- m("lm", Return ~ ., data)
fit
#> # A tibble: 1 × 5
#>   estimator_fct `size (MB)` grid_id  model_object settings        
#>   <chr>               <dbl> <chr>    <list>       <list>          
#> 1 stats::lm            3.81 #0010000 <tidyFit>    <tibble [1 × 0]>

# Within 'regress' function
fit <- regress(data, Return ~ ., m("lm"), .mask = "Date")
fit
#> # A tibble: 1 × 6
#>   model estimator_fct `size (MB)` grid_id  model_object settings        
#>   <chr> <chr>               <dbl> <chr>    <list>       <list>          
#> 1 lm    stats::lm            3.64 #0010000 <tidyFit>    <tibble [1 × 1]>