Fits a linear regression or classification model with an L2 penalty on a 'tidyFit' R6 class. The function can be used with regress and classify.

Usage

# S3 method for class 'ridge'
.fit(self, data = NULL)

Arguments

self

a tidyFit R6 class.

data

a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

Value

A fitted tidyFit class model.

Details

Hyperparameters:

  • lambda (L2 penalty)

Important method arguments (passed to m)

The ridge regression is estimated using glmnet::glmnet with alpha = 0. See ?glmnet for more details. For classification, pass family = "binomial" via ... in m, or use classify.
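As a rough illustration of the underlying estimator, the sketch below calls glmnet::glmnet directly with alpha = 0, once for regression and once with family = "binomial". The mtcars data, the chosen columns and the lambda value are assumptions made for illustration; this is not the package's internal code.

```r
library(glmnet)

# Illustrative predictors (an assumption, not taken from tidyfit)
x <- as.matrix(mtcars[, c("mpg", "wt", "hp")])

# Ridge regression: alpha = 0 selects the pure L2 penalty
fit_reg <- glmnet(x, mtcars$qsec, alpha = 0, lambda = 0.5)

# Ridge classification: two-level factor response with family = "binomial"
fit_cls <- glmnet(x, factor(mtcars$am), alpha = 0, lambda = 0.5,
                  family = "binomial")

# Coefficients (intercept plus one row per feature)
coef(fit_reg)
coef(fit_cls)
```

Because the L2 penalty shrinks coefficients without setting them exactly to zero, every feature normally keeps a nonzero coefficient in both fits.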

Implementation

If the response variable contains more than 2 classes, a multinomial response is used automatically.
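The switch to a multinomial response can be pictured with a direct glmnet call. In the sketch below, the rule that picks the family from the number of classes is a hypothetical stand-in for whatever check the package performs internally, and the iris data and lambda value are assumptions for illustration.

```r
library(glmnet)

x <- as.matrix(iris[, 1:4])
y <- iris$Species  # factor with three classes

# Hypothetical selection rule: more than 2 classes -> multinomial
family <- if (nlevels(y) > 2) "multinomial" else "binomial"

fit_multi <- glmnet(x, y, alpha = 0, lambda = 0.1, family = family)

# For a multinomial fit, coef() returns one coefficient matrix per class
names(coef(fit_multi))
```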

Features are standardized by default, and the coefficients are transformed back to the original scale.
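One way to see the effect of standardization is to change the units of a feature and refit. The sketch below uses glmnet::glmnet directly with standardize = TRUE (its default); the mtcars data, the factor of 1000 and the tightened convergence threshold (thresh) are assumptions chosen to make the comparison clean.

```r
library(glmnet)

x <- as.matrix(mtcars[, c("wt", "hp", "disp")])
y <- mtcars$mpg

# Same data, with the first feature expressed in different units
x_scaled <- x
x_scaled[, "wt"] <- x_scaled[, "wt"] * 1000

fit_a <- glmnet(x, y, alpha = 0, lambda = 0.5,
                standardize = TRUE, thresh = 1e-12)
fit_b <- glmnet(x_scaled, y, alpha = 0, lambda = 0.5,
                standardize = TRUE, thresh = 1e-12)

beta_a <- as.vector(coef(fit_a))
beta_b <- as.vector(coef(fit_b))

# The penalty acts on the standardized scale, so the reported coefficient
# of the rescaled feature simply changes by the inverse factor
all.equal(beta_a[2], beta_b[2] * 1000, tolerance = 1e-6)
```

The remaining coefficients and the intercept should agree between the two fits up to numerical tolerance, which is what reporting coefficients on the original scale implies.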

If no hyperparameter grid is passed (is.null(control$lambda)), dials::grid_regular() is used to determine a sensible default grid of size 100. Note that the grid selection tools provided by glmnet::glmnet (e.g. dfmax) cannot be used, because the grid must be identical across groups in the tibble.
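A default grid of this kind might be built as in the sketch below. It assumes the penalty is described by the dials::penalty() parameter object with its default range; the parameter object and range that the package actually uses are not stated here, so treat both as assumptions.

```r
library(dials)

# Assumption: dials::penalty() with its default range, which spans
# 1e-10 to 1 and is spaced evenly on the log10 scale
lambda_grid <- grid_regular(penalty(), levels = 100)$penalty

length(lambda_grid)
range(lambda_grid)
```

A fixed vector like this can also be supplied explicitly, for instance as m("ridge", lambda = lambda_grid), so that every group is fitted on the same penalty values.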

References

Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL https://www.jstatsoft.org/v33/i01/.

Author

Johann Pfitzinger

Examples

# Load data
data <- tidyfit::Factor_Industry_Returns

# Stand-alone function
fit <- m("ridge", Return ~ ., data, lambda = 0.5)
fit
#> # A tibble: 1 × 5
#>   estimator_fct  `size (MB)` grid_id  model_object settings        
#>   <chr>                <dbl> <chr>    <list>       <list>          
#> 1 glmnet::glmnet        3.05 #001|001 <tidyFit>    <tibble [1 × 3]>

# Within 'regress' function
fit <- regress(data, Return ~ ., m("ridge", lambda = c(0.1, 0.5)),
               .mask = c("Date", "Industry"))
coef(fit)
#> # A tibble: 14 × 5
#> # Groups:   model [1]
#>    model term        estimate grid_id  model_info      
#>    <chr> <chr>          <dbl> <chr>    <list>          
#>  1 ridge (Intercept)  0.129   #001|001 <tibble [1 × 2]>
#>  2 ridge Mkt-RF       0.870   #001|001 <tibble [1 × 2]>
#>  3 ridge SMB          0.0402  #001|001 <tibble [1 × 2]>
#>  4 ridge HML          0.0683  #001|001 <tibble [1 × 2]>
#>  5 ridge RMW          0.118   #001|001 <tibble [1 × 2]>
#>  6 ridge CMA          0.0223  #001|001 <tibble [1 × 2]>
#>  7 ridge RF           0.814   #001|001 <tibble [1 × 2]>
#>  8 ridge (Intercept)  0.00554 #001|002 <tibble [1 × 2]>
#>  9 ridge Mkt-RF       0.953   #001|002 <tibble [1 × 2]>
#> 10 ridge SMB          0.0232  #001|002 <tibble [1 × 2]>
#> 11 ridge HML          0.0640  #001|002 <tibble [1 × 2]>
#> 12 ridge RMW          0.153   #001|002 <tibble [1 × 2]>
#> 13 ridge CMA          0.0927  #001|002 <tibble [1 × 2]>
#> 14 ridge RF           0.959   #001|002 <tibble [1 × 2]>