Adaptive Lasso regression or classification for tidyfit

Fits an adaptive Lasso regression or classification on a 'tidyFit' R6 class. The function can be used with regress and classify.

# S3 method for class 'adalasso'
.fit(self, data = NULL)

Arguments

self: a 'tidyFit' R6 class.
data: a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

Value

A fitted 'tidyFit' class model.

Details

Hyperparameters:

lambda (L1 penalty)
lambda_ridge (L2 penalty used in the first step to determine the penalty factor; default = 0.01)

Important method arguments (passed to m)

The adaptive Lasso is a weighted implementation of the Lasso algorithm, in which each covariate receives its own penalty weight. The weights are obtained from an initial regression fit, in this case a ridge regression with lambda = lambda_ridge, where lambda_ridge can be passed as an argument. The adaptive Lasso is computed using the glmnet::glmnet function. See ?glmnet for more details. For classification, pass family = "binomial" to ... in m or use classify.
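The two-step procedure described above can be sketched directly with glmnet. This is a minimal illustration on simulated data, not the package's internal code: the simulated variables, the value lambda = 0.1, and the choice of weights as the inverse absolute ridge coefficients (exponent 1) are assumptions made for the example.

```r
library(glmnet)

# Simulated data (hypothetical): only the first two features matter
set.seed(1)
n <- 200
p <- 6
x <- matrix(rnorm(n * p), n, p)
colnames(x) <- paste0("x", seq_len(p))
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)

lambda_ridge <- 0.01  # L2 penalty for the first step (the documented default)
lambda <- 0.1         # L1 penalty for the second step (assumed value)

# Step 1: initial ridge fit (alpha = 0 selects the pure L2 penalty)
ridge_fit <- glmnet(x, y, alpha = 0, lambda = lambda_ridge)
ridge_coef <- as.matrix(coef(ridge_fit))[-1, 1]  # drop the intercept

# Step 2: covariate-specific weights; small ridge coefficients are
# penalized more heavily (assumed form: 1 / |coefficient|)
penalty_factor <- 1 / abs(ridge_coef)

# Step 3: Lasso fit (alpha = 1) using the weights as penalty factors
ada_fit <- glmnet(x, y, alpha = 1, lambda = lambda,
                  penalty.factor = penalty_factor)
ada_coef <- as.matrix(coef(ada_fit))[, 1]
ada_coef
```

Note that glmnet rescales penalty.factor internally so that the factors sum to the number of variables, so only the relative sizes of the weights matter.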

Implementation

Features are standardized by default with coefficients transformed to the original scale.
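The back-transformation to the original scale can be illustrated in base R. The sketch below uses ordinary least squares as a stand-in for the penalized fit so that no extra package is needed; the data and variable names are hypothetical, and the arithmetic shown is the generic rescaling rule rather than code taken from glmnet or tidyfit.

```r
# Simulated features on very different scales (hypothetical data)
set.seed(42)
n <- 100
x <- cbind(a = rnorm(n, mean = 5, sd = 10),
           b = rnorm(n, mean = -2, sd = 0.5))
y <- 1 + 0.3 * x[, "a"] - 2 * x[, "b"] + rnorm(n)

# Standardize the features to mean 0 and standard deviation 1
x_mean <- colMeans(x)
x_sd <- apply(x, 2, sd)
x_std <- scale(x, center = x_mean, scale = x_sd)

# Fit on the standardized scale
fit_std <- lm(y ~ x_std)
beta_std <- coef(fit_std)[-1]

# Transform the coefficients back to the original scale
beta_orig <- beta_std / x_sd
intercept_orig <- coef(fit_std)[1] - sum(beta_std * x_mean / x_sd)

# Reference fit on the raw features for comparison
fit_raw <- lm(y ~ x)
same <- all.equal(unname(c(intercept_orig, beta_orig)),
                  unname(coef(fit_raw)))
```

For an unpenalized fit the two sets of coefficients coincide. With a penalty, standardizing changes the solution itself, which is why it matters that the reported coefficients are mapped back to the original units.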

If no hyperparameter grid is passed (is.null(control$lambda)), dials::grid_regular() is used to determine a sensible default grid. The grid size is 100. Note that the grid selection tools provided by glmnet::glmnet (e.g. dfmax) cannot be used, because the grid must be identical across all groups in the tibble.
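A regular grid of 100 penalty values can be produced with dials as sketched below. The use of dials::penalty() with its default range is an assumption made for illustration; the parameter object and range that tidyfit actually uses internally are not stated here.

```r
library(dials)

# Regular grid with 100 levels over the default range of penalty()
# (assumed parameter object; tidyfit's internal choice may differ)
default_grid <- grid_regular(penalty(), levels = 100)
lambda_grid <- default_grid$penalty

length(lambda_grid)
range(lambda_grid)
```

Because such a grid is constructed independently of the data, the same values can be reused for every group, which a data-dependent path from glmnet would not guarantee. A custom grid can instead be supplied through the lambda argument of m, as in the examples.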

References

Zou, H. (2006). The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 101(476), 1418-1429.

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. https://www.jstatsoft.org/v33/i01/.

Author

Johann Pfitzinger

Examples

# Load data
data <- tidyfit::Factor_Industry_Returns

# Stand-alone function
fit <- m("adalasso", Return ~ ., data, lambda = 0.5)
fit
#> # A tibble: 1 × 5
#>   estimator_fct  `size (MB)` grid_id  model_object settings        
#>   <chr>                <dbl> <chr>    <list>       <list>          
#> 1 glmnet::glmnet        3.06 #001|001 <tidyFit>    <tibble [1 × 3]>

# Within 'regress' function
fit <- regress(data, Return ~ ., m("adalasso", lambda = c(0.1, 0.5)),
               .mask = c("Date", "Industry"))
coef(fit)
#> # A tibble: 10 × 5
#> # Groups:   model [1]
#>    model    term        estimate grid_id  model_info      
#>    <chr>    <chr>          <dbl> <chr>    <list>          
#>  1 adalasso (Intercept) 0.125    #001|001 <tibble [1 × 2]>
#>  2 adalasso Mkt-RF      0.932    #001|001 <tibble [1 × 2]>
#>  3 adalasso RMW         0.0464   #001|001 <tibble [1 × 2]>
#>  4 adalasso RF          0.886    #001|001 <tibble [1 × 2]>
#>  5 adalasso (Intercept) 0.000585 #001|002 <tibble [1 × 2]>
#>  6 adalasso Mkt-RF      0.970    #001|002 <tibble [1 × 2]>
#>  7 adalasso HML         0.0214   #001|002 <tibble [1 × 2]>
#>  8 adalasso RMW         0.139    #001|002 <tibble [1 × 2]>
#>  9 adalasso CMA         0.118    #001|002 <tibble [1 × 2]>
#> 10 adalasso RF          0.987    #001|002 <tibble [1 × 2]>
