Fits a linear regression or classification model with an L1 penalty on a 'tidyFit' R6 class. The function can be used with regress and classify.

Usage

# S3 method for lasso
.fit(self, data = NULL)

Arguments

self

a 'tidyFit' R6 class.

data

a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

Value

A fitted 'tidyFit' class model.

Details

Hyperparameters:

  • lambda (L1 penalty)

Important method arguments (passed to m)

The Lasso regression is estimated using glmnet::glmnet with alpha = 1. See ?glmnet for more details. For classification, pass family = "binomial" to ... in m, or use classify.
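
As a rough illustration of the underlying estimator, the sketch below calls glmnet::glmnet directly with alpha = 1, once for a continuous response and once with family = "binomial". The simulated data, the seed, and the lambda values are assumptions made for this example; it is not tidyfit's internal code.

```r
library(glmnet)

# Simulated data (hypothetical): 200 observations, 5 features,
# only the first two features affect the response.
set.seed(42)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
colnames(x) <- paste0("x", 1:5)
y <- 2 * x[, 1] - 1 * x[, 2] + rnorm(200)

# alpha = 1 selects the pure L1 (lasso) penalty
fit_gaussian <- glmnet(x, y, alpha = 1, lambda = 0.5)
coef(fit_gaussian)

# Binary response: family = "binomial" gives an L1-penalized logistic regression
y_bin <- factor(y > 0)
fit_binomial <- glmnet(x, y_bin, alpha = 1, lambda = 0.05, family = "binomial")
coef(fit_binomial)
```

Within tidyfit, the same choices are made through m("lasso", ...), regress and classify rather than by calling glmnet::glmnet by hand.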

Implementation

If the response variable contains more than 2 classes, a multinomial response is used automatically.

Features are standardized by default with coefficients transformed to the original scale.
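
One way to see what this means in practice is sketched below with glmnet::glmnet directly, which standardizes features by default (standardize = TRUE). Rescaling a feature changes its reported coefficient by the inverse factor while the other coefficients stay the same. The simulated data and the lambda value are assumptions for illustration only.

```r
library(glmnet)

set.seed(1)
x <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)
y <- 1.5 * x[, 1] + 0.5 * x[, 2] + rnorm(200)

# Same data, but the first feature is expressed in units 10 times larger
x_scaled <- x
x_scaled[, 1] <- x[, 1] * 10

# standardize = TRUE is the glmnet default for both fits
fit_a <- glmnet(x, y, alpha = 1, lambda = 0.1)
fit_b <- glmnet(x_scaled, y, alpha = 1, lambda = 0.1)

b_a <- as.numeric(coef(fit_a))
b_b <- as.numeric(coef(fit_b))

# Coefficients are reported on the scale of the supplied features, so the
# first slope shrinks by the rescaling factor while the others are unchanged.
cbind(original = b_a, rescaled = b_b)
```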

If no hyperparameter grid is passed (is.null(control$lambda)), dials::grid_regular() is used to determine a sensible default grid of size 100. Note that the grid selection tools provided by glmnet::glmnet (e.g. dfmax) cannot be used. This guarantees identical grids across groups in the tibble.
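
A default grid of this kind can be sketched as follows. Using dials::penalty() as the parameter object is an assumption made here for illustration; the text above only states that dials::grid_regular() is used and that the grid has 100 values.

```r
library(dials)

# Assumption: the penalty parameter object is dials::penalty(), whose default
# range is defined on the log10 scale.
lambda_grid <- grid_regular(penalty(), levels = 100)

nrow(lambda_grid)           # number of candidate lambda values
range(lambda_grid$penalty)  # smallest and largest lambda in the grid
```

Because a grid like this is fixed up front rather than derived from each data subset, every group is evaluated at the same lambda values.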

References

Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL https://www.jstatsoft.org/v33/i01/.

Author

Johann Pfitzinger

Examples

# Load the package and the example data
library(tidyfit)
data <- tidyfit::Factor_Industry_Returns

# Stand-alone function
fit <- m("lasso", Return ~ ., data, lambda = 0.5)
fit
#> # A tibble: 1 × 5
#>   estimator_fct  `size (MB)` grid_id  model_object settings        
#>   <chr>                <dbl> <chr>    <list>       <list>          
#> 1 glmnet::glmnet        3.05 #001|001 <tidyFit>    <tibble [1 × 3]>

# Within 'regress' function
fit <- regress(data, Return ~ ., m("lasso", lambda = c(0.1, 0.5)),
               .mask = c("Date", "Industry"))
coef(fit)
#> # A tibble: 8 × 5
#> # Groups:   model [1]
#>   model term        estimate grid_id  model_info      
#>   <chr> <chr>          <dbl> <chr>    <list>          
#> 1 lasso (Intercept)   0.522  #001|001 <tibble [1 × 2]>
#> 2 lasso Mkt-RF        0.819  #001|001 <tibble [1 × 2]>
#> 3 lasso (Intercept)   0.191  #001|002 <tibble [1 × 2]>
#> 4 lasso Mkt-RF        0.934  #001|002 <tibble [1 × 2]>
#> 5 lasso HML           0.0596 #001|002 <tibble [1 × 2]>
#> 6 lasso RMW           0.0910 #001|002 <tibble [1 × 2]>
#> 7 lasso CMA           0.0316 #001|002 <tibble [1 × 2]>
#> 8 lasso RF            0.592  #001|002 <tibble [1 × 2]>