Fits a principal components regression on a 'tidyFit' R6 class. The function can be used with regress.

Usage

.model.pcr(self, data = NULL)

Arguments

self

a 'tidyFit' R6 class.

data

a data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

Value

A fitted 'tidyFit' class model.

Details

Hyperparameters:

  • ncomp (number of components)

  • ncomp_pct (number of components, percentage of features)

Important method arguments (passed to m)

The principal components regression is fitted using the pls package. See ?pcr for more details.

Implementation

Covariates are standardized, with coefficients back-transformed to the original scale. An intercept is always included.
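The standardization and back-transformation described above can be illustrated with a minimal base R sketch. This is not tidyfit's or pls's code: the helper name pcr_sketch and the use of mtcars are assumptions made purely for illustration.

```r
# Hypothetical helper (not part of tidyfit or pls): PCR on standardized
# covariates with coefficients mapped back to the original scale.
pcr_sketch <- function(X, y, ncomp) {
  # Standardize the covariates
  x_mean <- colMeans(X)
  x_sd <- apply(X, 2, sd)
  Z <- scale(X, center = x_mean, scale = x_sd)

  # Principal components of the standardized covariates
  pca <- prcomp(Z, center = FALSE, scale. = FALSE)
  scores <- pca$x[, seq_len(ncomp), drop = FALSE]

  # Regress the centered response on the first `ncomp` component scores
  gamma <- lm.fit(scores, y - mean(y))$coefficients

  # Map component coefficients back to the standardized covariates,
  # then undo the scaling and recover the intercept
  beta_std <- pca$rotation[, seq_len(ncomp), drop = FALSE] %*% gamma
  beta <- drop(beta_std) / x_sd
  intercept <- mean(y) - sum(beta * x_mean)

  c("(Intercept)" = intercept, beta)
}

X <- as.matrix(mtcars[, c("disp", "hp", "wt")])
y <- mtcars$mpg
pcr_sketch(X, y, ncomp = 2)
```

When all components are kept (here ncomp = 3), the back-transformed coefficients coincide with ordinary least squares on the original covariates, which is a convenient sanity check.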

If no hyperparameter grid is passed (is.null(control$ncomp) & is.null(control$ncomp_pct)), the default is ncomp_pct = seq(0, 1, length.out = 20), where 0 results in one component and 1 results in the number of features.
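As a rough illustration of how ncomp_pct could translate into a number of components, the sketch below assumes a simple round-and-clamp rule. The function name pct_to_ncomp and the rounding rule are assumptions; the rule tidyfit applies internally may differ.

```r
# Hypothetical mapping from ncomp_pct to a component count (assumed rule):
# scale by the number of features, round, and use at least one component.
pct_to_ncomp <- function(ncomp_pct, n_features) {
  pmax(1, round(ncomp_pct * n_features))
}

# Default grid from the text, applied to a data set with 6 features
default_grid <- seq(0, 1, length.out = 20)
pct_to_ncomp(default_grid, n_features = 6)
```

Under this assumed rule, ncomp_pct = 0.5 with six features gives three components, which agrees with the ncomp column in the regress example at the end of this page.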

When 'jackknife = TRUE' is passed (and a 'validation' method is chosen), coef also returns the jack-knife standard errors, t-statistics and p-values.
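For orientation, jackknife statistics of this kind can be obtained by calling pls directly, as in the sketch below. It uses mtcars as stand-in data and assumes the pls package is installed; how tidyfit wires these values into coef is not shown here.

```r
library(pls)

# Fit PCR with leave-one-out cross-validation, keeping the jackknife replicates
fit_pls <- pcr(mpg ~ disp + hp + wt, data = mtcars, ncomp = 2,
               scale = TRUE, validation = "LOO", jackknife = TRUE)

# Approximate jackknife t-tests for the 2-component coefficients
jt <- jack.test(fit_pls, ncomp = 2)
jt$sd       # jackknife standard errors
jt$tvalues  # t-statistics
jt$pvalues  # p-values
```

The jackknife replicates only exist when a cross-validation method is requested, which is why a 'validation' method has to be chosen alongside 'jackknife = TRUE'.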

Note that pls does not currently offer weighted implementations or support non-Gaussian responses. The method can therefore only be used with regress.

References

Liland K, Mevik B, Wehrens R (2022). pls: Partial Least Squares and Principal Component Regression. R package version 2.8-1, https://CRAN.R-project.org/package=pls.

See also

.model.plsr and m methods

Author

Johann Pfitzinger

Examples

# Attach tidyfit so that m(), regress() and coef() methods are available
library(tidyfit)

# Load data
data <- tidyfit::Factor_Industry_Returns
data <- dplyr::filter(data, Industry == "HiTec")
data <- dplyr::select(data, -Industry)

# Stand-alone function
fit <- m("pcr", Return ~ ., data, ncomp = 1:3)
fit
#> # A tibble: 3 × 5
#>   estimator_fct `size (MB)` grid_id  model_object settings        
#>   <chr>               <dbl> <chr>    <list>       <list>          
#> 1 pls::pcr            0.258 #001|001 <tidyFit>    <tibble [1 × 1]>
#> 2 pls::pcr            0.258 #001|002 <tidyFit>    <tibble [1 × 1]>
#> 3 pls::pcr            0.258 #001|003 <tidyFit>    <tibble [1 × 1]>

# Within 'regress' function
fit <- regress(data, Return ~ .,
               m("pcr", jackknife = TRUE, validation = "LOO", ncomp_pct = 0.5),
               .mask = c("Date"))
tidyr::unnest(coef(fit), model_info)
#> # A tibble: 7 × 7
#> # Groups:   model [1]
#>   model term        estimate ncomp std.error statistic   p.value
#>   <chr> <chr>          <dbl> <dbl>     <dbl>     <dbl>     <dbl>
#> 1 pcr   (Intercept)    1.69      3    NA        NA     NA       
#> 2 pcr   Mkt-RF         0.412     3     0.376     4.90   1.20e- 6
#> 3 pcr   SMB            0.509     3     0.282     5.46   6.48e- 8
#> 4 pcr   HML           -0.545     3     0.225    -7.19   1.64e-12
#> 5 pcr   RMW           -0.610     3     0.229    -5.90   5.70e- 9
#> 6 pcr   CMA           -0.869     3     0.133   -13.2    6.67e-36
#> 7 pcr   RF            -1.08      3     0.379    -0.768  4.43e- 1