An interface for variable importance measures for fitted tidyfit.models frames
Source: R/explain.R (explain.Rd)
A generic method for computing explainable AI (XAI) and variable importance measures for tidyfit.models frames.
Arguments
Details
WARNING: This function is currently experimental.
The function uses the 'model_object' column in a tidyfit.models
frame to return variable importance measures for each model.
Possible packages and methods include:
sensitivity package:
The package provides methods to assess variable importance in linear regression ('lm') and classification ('glm') models.
Usage: use_package="sensitivity"
Methods:
"lmg" (Shapley regression),
"pmvd" (Proportional marginal variance decomposition),
"src" (standardized regression coefficients),
"pcc" (partial correlation coefficients),
"johnson" (Johnson indices)
See ?sensitivity::lmg for more information and additional arguments.
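To make the "src" measure more concrete, the sketch below reproduces the textbook definition of a standardized regression coefficient in base R: the raw slope multiplied by sd(x) / sd(y). It uses the built-in mtcars data with an arbitrary choice of predictors, and is an illustration of the idea only, not the code path that tidyfit or sensitivity runs internally.

```r
# Illustrative sketch: standardized regression coefficients (SRC) by hand.
# The data set (mtcars) and the predictors are arbitrary choices.
predictors <- c("wt", "hp", "disp")
fit_lm <- lm(mpg ~ wt + hp + disp, data = mtcars)

# SRC_j = beta_j * sd(x_j) / sd(y)
beta <- coef(fit_lm)[predictors]
sd_x <- vapply(mtcars[predictors], sd, numeric(1))
src_manual <- beta * sd_x / sd(mtcars$mpg)

# Equivalent route: refit on z-scored data and read off the slopes
scaled <- as.data.frame(scale(mtcars[c("mpg", predictors)]))
src_scaled <- coef(lm(mpg ~ wt + hp + disp, data = scaled))[predictors]

print(round(src_manual, 4))
```

Both routes should agree up to floating-point error, which is a quick sanity check when comparing importance values across predictors measured in different units.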
iml package:
Integration with iml is currently in progress. The methods can be used for 'nnet', 'rf', 'lasso', 'enet', 'ridge', 'adalasso', 'glm' and 'lm'.
Usage: use_package="iml"
Methods:
"Shapley" (SHAP values),
"LocalModel" (LIME),
"FeatureImp" (Permutation-based feature importance)
The argument 'which_rows' (vector of integer indexes) can be used to explain specific rows in the data set for Shapley and LocalModel methods.
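A minimal sketch of how 'which_rows' might be combined with the iml methods follows. It assumes the tidyfit and iml packages are installed and reuses the data preparation from the Examples section; the row indexes are arbitrary, and the output is not shown because it depends on the installed package versions.

```r
# Sketch only: assumes the tidyfit and iml packages are installed.
library(tidyfit)

# Single-industry subset, as in the documented example
data <- dplyr::filter(Factor_Industry_Returns, Industry == Industry[1])
fit <- regress(data, Return ~ ., m("lm"), .mask = c("Date", "Industry"))

# Rows to explain; these indexes are arbitrary
rows_to_explain <- c(1L, 2L, 3L)

# Local method: explain only the selected rows
shap <- tidyfit::explain(fit, use_package = "iml", use_method = "Shapley",
                         which_rows = rows_to_explain)

# Permutation-based importance is a global measure, so 'which_rows' is omitted
perm <- tidyfit::explain(fit, use_package = "iml", use_method = "FeatureImp")
```

The explicit tidyfit:: prefix on explain() guards against masking by other attached packages that export a function of the same name, such as dplyr.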
References
Molnar C, Bischl B, Casalicchio G (2018). “iml: An R package for Interpretable Machine Learning.” JOSS, 3(26), 786. doi:10.21105/joss.00786.
Iooss B, Veiga SD, Janon A, Pujol G, Broto B, Boumhaout K, Clouvel L, Delage T, Amri RE, Fruth J, Gilquin L, Guillaume J, Herin M, Idrissi MI, Le Gratiet L, Lemaitre P, Marrel A, Meynaoui A, Nelson BL, Monari F, Oomen R, Rakovec O, Ramos B, Rochet P, Roustant O, Sarazin G, Song E, Staum J, Sueur R, Touati T, Verges V, Weber F (2024). sensitivity: Global Sensitivity Analysis of Model Outputs and Importance Measures. R package version 1.30.0, https://CRAN.R-project.org/package=sensitivity.
Liaw A, Wiener M (2002). “Classification and Regression by randomForest.” R News, 2(3), 18–22.
Examples
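library(tidyfit)
# Fit one linear regression per industry
data <- dplyr::group_by(tidyfit::Factor_Industry_Returns, Industry)
fit <- regress(data, Return ~ ., m("lm"), .mask = "Date")
# Standardized regression coefficients via the sensitivity package
tidyfit::explain(fit, use_package = "sensitivity", use_method = "src")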
#> Registered S3 method overwritten by 'sensitivity':
#>   method    from
#>   print.src dplyr
#> # A tibble: 60 × 4
#> # Groups:   Industry, model [10]
#>    Industry model term   importance
#>    <chr>    <chr> <chr>       <dbl>
#>  1 Durbl    lm    Mkt-RF    0.830
#>  2 Durbl    lm    SMB       0.0831
#>  3 Durbl    lm    HML       0.119
#>  4 Durbl    lm    RMW       0.0679
#>  5 Durbl    lm    CMA       0.0665
#>  6 Durbl    lm    RF       -0.00154
#>  7 Enrgy    lm    Mkt-RF    0.739
#>  8 Enrgy    lm    SMB      -0.0278
#>  9 Enrgy    lm    HML       0.162
#> 10 Enrgy    lm    RMW       0.0572
#> # ℹ 50 more rows
# Fit a single model on the first industry only
data <- dplyr::filter(tidyfit::Factor_Industry_Returns, Industry == Industry[1])
fit <- regress(data, Return ~ ., m("lm"), .mask = c("Date", "Industry"))
# SHAP values for the first row of the data
tidyfit::explain(fit, use_package = "iml", use_method = "Shapley", which_rows = c(1))
#> # A tibble: 6 × 5
#> # Groups:   model [1]
#>   model term   importance phi.var feature.value
#>   <chr> <chr>       <dbl>   <dbl> <chr>
#> 1 lm    Mkt-RF   -1.19    10.6    Mkt.RF=-0.39
#> 2 lm    SMB       0.00593  0.0324 SMB=-0.44
#> 3 lm    HML       0.0662   0.0140 HML=-0.89
#> 4 lm    RMW       0.120    0.890  RMW=0.68
#> 5 lm    CMA      -0.551    0.336  CMA=-1.23
#> 6 lm    RF       -0.144    0.154  RF=0.27