causalexplain.explainability package#
Submodules#
Hierarchy of links
Can I use the information above to decide whether to connect groups of variables that are linked together?
- class Hierarchies(method='spearman', mic_alpha=0.6, mic_c=15, linkage_method='complete', correlation_th=None, prog_bar=False, verbose=False, silent=False)[source]#
Bases:
object
Class representing the hierarchy of links between variables.
- Parameters:
method (str or Callable, optional) – Method to use to compute the correlation. Default is ‘spearman’, but can also be ‘pearson’, ‘kendall’ or ‘mic’.
mic_alpha (float, optional) – Threshold for the correlation. Default is 0.6.
mic_c (int, optional) – Number of clusters to be formed. Default is 15. Only valid with MIC.
linkage_method (str, optional) – Method to use to compute the linkage. Default is ‘complete’.
correlation_th (float, optional) – Deprecated; retained for backward compatibility. Not used for pruning.
prog_bar (bool, optional) – Whether to show a progress bar during computation. Default is False.
verbose (bool, optional) – Whether to print additional information. Default is False.
silent (bool, optional) – Whether to suppress all output. Default is False.
- correlations = None#
- __init__(method='spearman', mic_alpha=0.6, mic_c=15, linkage_method='complete', correlation_th=None, prog_bar=False, verbose=False, silent=False)[source]#
Initialize the Hierarchies object.
- Parameters:
method (str or Callable, optional) – Method to use to compute the correlation. Default is ‘spearman’, but can also be ‘pearson’, ‘kendall’ or ‘mic’.
mic_alpha (float, optional) – Threshold for the correlation. Default is 0.6.
mic_c (int, optional) – Number of clusters to be formed. Default is 15. Only valid with MIC.
linkage_method (str, optional) – Method to use to compute the linkage. Default is ‘complete’.
correlation_th (float, optional) – Deprecated; retained for backward compatibility.
prog_bar (bool, optional) – Whether to show a progress bar during computation. Default is False.
verbose (bool, optional) – Whether to print additional information. Default is False.
silent (bool, optional) – Whether to suppress all output. Default is False.
- fit(X)[source]#
Compute the hierarchy of links between variables using the correlation method specified in corr_method.
- Parameters:
X (pd.DataFrame) – The input data.
- Returns:
self – The fitted Hierarchies object.
- Return type:
Hierarchies
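As a rough illustration of what fit computes (a sketch, not the package's actual code, assuming the defaults above map onto pandas correlations plus SciPy complete linkage):

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def fit_hierarchy(X, method="spearman", linkage_method="complete"):
    """Correlation matrix plus hierarchical linkage over 1 - |corr| distances."""
    correlations = X.corr(method=method)
    dist = 1.0 - correlations.abs()      # strongly correlated pairs -> small distance
    np.fill_diagonal(dist.values, 0.0)   # guard against floating-point noise
    condensed = squareform(dist.values, checks=False)
    return correlations, linkage(condensed, method=linkage_method)

rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = pd.DataFrame({"a": a, "b": a + 0.1 * rng.normal(size=200),
                  "c": rng.normal(size=200)})
corr, Z = fit_hierarchy(X)  # "a" and "b" merge first in the linkage matrix
```

Z here plays the role of the linkage_mat consumed by connect_isolated_nodes below.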
- static compute_correlation_matrix(data, method='spearman', mic_alpha=0.6, mic_c=15, prog_bar=False)[source]#
Compute the correlation matrix.
- Parameters:
data (pd.DataFrame) – The input data.
method (str or Callable, optional) – Method to use to compute the correlation. Default is ‘spearman’, but can also be ‘pearson’, ‘kendall’ or ‘mic’.
mic_alpha (float, optional) – Threshold for the correlation. Default is 0.6.
mic_c (int, optional) – Number of clusters to be formed. Default is 15. Only valid with MIC.
prog_bar (bool, optional) – Whether to show a progress bar during computation. Default is False.
- Returns:
correlations – The correlation matrix.
- Return type:
pd.DataFrame
Return an empty mapping (correlation pruning is deprecated).
- Parameters:
- Returns:
correlated_features – Empty mapping kept for backward compatibility.
- Return type:
defaultdict(list)
- expand_clusters_perm_importance(pi, ground_truth=None)[source]#
Expand the clusters of the linkage matrix to include the features that fall in the same cluster of the permutation importance matrix. For each cluster, it adds the metrics related to correlation, deltas, backward PI, etc., used to determine whether selection criteria can be extracted.
- Parameters:
pi (pd.DataFrame) – Permutation importance matrix.
ground_truth (pd.DataFrame, optional) – Ground truth matrix.
- Return type:
None
- connect_isolated_nodes(G, linkage_mat, feature_names, verbose=False)[source]#
Connect isolated nodes in the graph, based on their relationship in the hierarchical clustering provided through the linkage_mat.
- plot_dendogram_correlations(correlations, feature_names, **kwargs)[source]#
Plot the dendrogram of the correlation matrix.
- Parameters:
correlations (pd.DataFrame) – Correlation matrix.
feature_names (List[str]) – List of feature names.
**kwargs – Keyword arguments passed to the plot_dendogram function.
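A minimal sketch of the kind of figure this renders, using SciPy's dendrogram directly (variable names are illustrative); no_plot=True returns the tree layout without drawing, while dropping it and passing an axes object draws the actual plot:

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Three 1-D observations: "a" and "b" are close, "c" is far away.
Z = linkage(np.array([[0.0], [0.1], [5.0]]), method="complete")
tree = dendrogram(Z, labels=["a", "b", "c"], no_plot=True)
# tree["ivl"] holds the leaf labels in plotting order; tree["icoord"] and
# tree["dcoord"] hold the coordinates of the two merge brackets.
```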
Permutation Importance for feature selection. Wrapper over SKLearn’s PermutationImportance and a custom implementation of the vanilla version of the algorithm to run over models trained with PyTorch.
Renero 2022, 2023
Parameters:#
- models: dict
A dictionary of models, where the keys are the target variables and the values are the models trained to predict the target variables.
- n_repeats: int
The number of times to repeat the permutation importance algorithm.
- mean_pi_percentile: float
The percentile of the mean permutation importance to use as a threshold for feature selection.
- random_state: int
The random state to use for the permutation importance algorithm.
- prog_bar: bool
Whether to display a progress bar or not.
- verbose: bool
Whether to display explanations on the process or not.
- silent: bool
Whether to display anything or not.
- class PermutationImportance(models, discrepancies=None, correlation_th=None, n_repeats=10, mean_pi_percentile=0.8, exhaustive=False, threshold=None, random_state=42, prog_bar=True, verbose=False, silent=False)[source]#
Bases:
BaseEstimator
Permutation Importance for feature selection. Wrapper over SKLearn’s PermutationImportance and a custom implementation of the vanilla version of the algorithm to run over models trained with PyTorch.
- device = 'cpu'#
- __init__(models, discrepancies=None, correlation_th=None, n_repeats=10, mean_pi_percentile=0.8, exhaustive=False, threshold=None, random_state=42, prog_bar=True, verbose=False, silent=False)[source]#
- fit(X)[source]#
Implementation of the fit method for the PermutationImportance class. If the model is a PyTorch model, the fit method will compute the base loss for each feature. If the model is a SKLearn model, the fit method will compute the permutation importance for each feature.
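For the scikit-learn path, this is a thin wrapper over sklearn.inspection.permutation_importance; a minimal sketch of that path (the exact wiring inside fit is an assumption):

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=300)  # only feature 0 matters

model = LinearRegression().fit(X, y)
# n_repeats and random_state mirror the constructor arguments documented above.
result = permutation_importance(model, X, y, n_repeats=10, random_state=42)
mean_pi = result.importances_mean  # feature 0 dominates the other two
```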
- class RegQuality[source]#
Bases:
BaseEstimator
- static predict(scores, gamma_shape=1, gamma_scale=1, threshold=0.9, verbose=False)[source]#
Returns the indices of features that satisfy both the gamma and the outlier criteria. Both criteria are applied to the given scores to determine whether the MSE obtained from the regression is poor compared with the regressions for the other features in the dataset, in which case the feature should be considered a parent node.
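The exact rule is not spelled out above, but one plausible sketch of a gamma-tail test on the regression scores (the parameter names mirror the signature; the moment-matched fallback for the scale is an assumption):

```python
import numpy as np
from scipy import stats

def predict_bad_regressions(scores, gamma_shape=1.0, gamma_scale=None, threshold=0.9):
    """Flag scores in the upper tail of a gamma distribution over the MSEs."""
    scores = np.asarray(scores, dtype=float)
    if gamma_scale is None:
        gamma_scale = scores.mean() / gamma_shape  # moment-matched scale
    cdf = stats.gamma.cdf(scores, a=gamma_shape, scale=gamma_scale)
    return np.where(cdf > threshold)[0]  # indices of candidate parent nodes

scores = [0.10, 0.12, 0.09, 0.11, 2.5]  # the last regression is much worse
bad = predict_bad_regressions(scores)   # -> index 4 flagged
```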
This module builds the causal graph based on the information derived from the SHAP values. The main idea is to use the SHAP values to compute the discrepancy between the SHAP values and the target values; this discrepancy is then used to build the graph.
- class ShapDiscrepancy(target, parent, shap_heteroskedasticity, parent_heteroskedasticity, shap_p_value, parent_p_value, shap_model, parent_model, shap_discrepancy, shap_correlation, shap_gof, ks_pvalue, ks_result)[source]#
Bases:
object
A class representing the discrepancy between the SHAP value and the parent value for a given feature.
- - target
The name of the target feature.
- Type:
str
- - parent
The name of the parent feature.
- Type:
str
- - shap_heteroskedasticity
Whether the SHAP value exhibits heteroskedasticity.
- Type:
bool
- - parent_heteroskedasticity
Whether the parent value exhibits heteroskedasticity.
- Type:
bool
- - shap_p_value
The p-value for the SHAP value.
- Type:
float
- - parent_p_value
The p-value for the parent value.
- Type:
float
- - shap_model
The regression model for the SHAP value.
- Type:
sm.regression.linear_model.RegressionResultsWrapper
- - parent_model
The regression model for the parent value.
- Type:
sm.regression.linear_model.RegressionResultsWrapper
- - shap_discrepancy
The discrepancy between the SHAP value and the parent value.
- Type:
float
- - shap_correlation
The correlation between the SHAP value and the parent value.
- Type:
float
- - shap_gof
The goodness of fit for the SHAP value.
- Type:
float
- - ks_pvalue
The p-value for the Kolmogorov-Smirnov test.
- Type:
float
- - ks_result
The result of the Kolmogorov-Smirnov test.
- Type:
- shap_model: RegressionResultsWrapper#
- parent_model: RegressionResultsWrapper#
- __init__(target, parent, shap_heteroskedasticity, parent_heteroskedasticity, shap_p_value, parent_p_value, shap_model, parent_model, shap_discrepancy, shap_correlation, shap_gof, ks_pvalue, ks_result)#
- class ShapRunDiagnostics(backend: str, mode: Literal['no_sampling', 'single_sample', 'multi_sample'], m: int, n_background: int, K: int, seeds: List[int], n_explain: int, stability: Dict[str, Any], warnings: List[str])[source]#
Bases:
object
- __init__(backend, mode, m, n_background, K, seeds, n_explain, stability, warnings)#
- sample_rows(X, n, stratify=None, seed=None, return_indices=False)[source]#
Sample rows from X with optional stratification.
- Parameters:
X (Any) – Input data to sample from.
n (int) – Number of rows to draw.
stratify (Any | None) – Optional stratification labels.
seed (int | None) – Random seed for reproducible sampling.
return_indices (bool) – Whether to also return the sampled indices.
- Returns:
Sampled data, and optionally the sampled indices.
- Return type:
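A sketch of the stratified path using pandas' groupby sampling (the helper name mirrors the signature above; the proportional per-stratum allocation is an assumption):

```python
import pandas as pd

def sample_rows(X, n, stratify=None, seed=None, return_indices=False):
    """Draw n rows, proportionally per stratum when labels are given."""
    if stratify is None:
        out = X.sample(n=min(n, len(X)), random_state=seed)
    else:
        frac = min(n, len(X)) / len(X)
        groups = pd.Series(stratify, index=X.index)
        out = X.groupby(groups).sample(frac=frac, random_state=seed)
    return (out, out.index) if return_indices else out

X = pd.DataFrame({"v": range(100)})
labels = [i % 2 for i in range(100)]   # two balanced strata
sampled, idx = sample_rows(X, 10, stratify=labels, seed=0, return_indices=True)
```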
- build_kernel_explainer(model, X_bg)[source]#
Create a KernelExplainer with a resolved model callable.
- compute_kernel_shap(explainer, X_explain, n_features, nsamples=None, class_index=None)[source]#
Compute Kernel SHAP values with optional class selection.
- Parameters:
explainer (Any) – KernelExplainer instance to query.
X_explain (Any) – Rows to explain.
n_features (int) – Number of features in the data.
nsamples (int | None) – Optional nsamples override for the explainer.
class_index (int | None) – Optional class index for multi-class outputs.
- Returns:
SHAP values in backend-specific format.
- Return type:
- compute_gradient_shap(explainer, X_explain, batch_size=128)[source]#
Compute Gradient SHAP values in batches.
- build_generic_explainer(model, X_bg)[source]#
Create a generic shap.Explainer with a tabular masker.
- compute_shap_adaptive(X, model, backend, y=None, max_shap_samples=512, K_max=5, max_explain_samples=None, random_state=None, stratify=None, warn_threshold_cv=0.1, warn_threshold_rankcorr=0.9, topN_important=20, verbose=False, kernel_nsamples=None, batch_size=128, class_index=None, adaptive_shap_sampling=True)[source]#
Compute SHAP values with adaptive background sampling controls.
- Parameters:
X (Any) – Input data to explain.
model (Any) – Trained model to explain.
backend (Literal['kernel', 'gradient', 'explainer']) – SHAP backend name (“kernel”, “gradient”, “explainer”).
y (Any | None) – Optional target values for stratification or diagnostics.
max_shap_samples (int) – Background cap for adaptive sampling.
K_max (int) – Maximum number of repeated sampling runs.
max_explain_samples (int | None) – Optional cap for explained rows.
random_state (int | None) – Random seed for deterministic sampling.
stratify (Any | None) – Optional stratification labels for sampling.
warn_threshold_cv (float) – CV threshold for stability warnings.
warn_threshold_rankcorr (float) – Rank correlation threshold for warnings.
topN_important (int) – Number of top features to track.
verbose (bool) – Whether to print diagnostic warnings.
kernel_nsamples (int | None) – Optional KernelExplainer nsamples override.
batch_size (int) – Batch size for gradient explainer runs.
class_index (int | None) – Optional class index for multi-class outputs.
adaptive_shap_sampling (bool) – Enable adaptive sampling and stability checks.
- Returns:
A tuple of (shap_result, diagnostics) with backend-specific outputs.
- Return type:
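The stability checks behind warn_threshold_cv and warn_threshold_rankcorr can be sketched as follows (the aggregation details are assumptions): repeat the SHAP run K times over different background samples, then compare the per-feature magnitudes and feature rankings across runs.

```python
import numpy as np
from scipy.stats import spearmanr

def stability_diagnostics(runs, warn_threshold_cv=0.1, warn_threshold_rankcorr=0.9):
    """runs: (K, n_features) mean |SHAP| per feature from K repeated samplings."""
    runs = np.asarray(runs, dtype=float)
    mean = runs.mean(axis=0)
    cv = runs.std(axis=0) / np.maximum(mean, 1e-12)  # coefficient of variation
    # Rank agreement of feature orderings between consecutive runs.
    rank_corrs = [spearmanr(runs[k], runs[k + 1])[0] for k in range(len(runs) - 1)]
    warnings = []
    if cv.max() > warn_threshold_cv:
        warnings.append("unstable SHAP magnitudes across samplings")
    if min(rank_corrs) < warn_threshold_rankcorr:
        warnings.append("feature ranking varies across samplings")
    return {"cv": cv, "rank_corrs": rank_corrs, "warnings": warnings}

stable = [[1.0, 0.5, 0.1], [1.02, 0.49, 0.11], [0.98, 0.51, 0.09]]
unstable = [[1.0, 0.5, 0.1], [0.1, 0.5, 1.0]]
diag = stability_diagnostics(stable)      # no warnings
diag_bad = stability_diagnostics(unstable)  # both warnings fire
```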
- compute_shap(X, model, backend, y=None, adaptive_shap_sampling=True, **kwargs)[source]#
High-level wrapper for SHAP computation with optional adaptive sampling.
- Parameters:
X (Any) – Input data to explain.
model (Any) – Trained model to explain.
backend (Literal['kernel', 'gradient', 'explainer']) – SHAP backend name (“kernel”, “gradient”, “explainer”).
y (Any | None) – Optional target values for stratification or diagnostics.
adaptive_shap_sampling (bool) – Enable adaptive sampling behavior.
**kwargs (Any) – Forwarded keyword arguments to compute_shap_adaptive.
- Returns:
A tuple of (shap_result, diagnostics).
- Return type:
- class ShapEstimator(explainer='explainer', models=None, correlation_th=None, mean_shap_percentile=0.8, iters=20, reciprocity=False, min_impact=1e-06, exhaustive=False, background_size=200, background_method='sample', background_seed=None, parallel_jobs=0, on_gpu=False, verbose=False, prog_bar=True, silent=False)[source]#
Bases:
BaseEstimator
A class for computing SHAP values and building a causal graph from them.
- Parameters:
explainer (str, default="explainer") – The SHAP explainer to use. Possible values are “kernel”, “gradient”, “explainer”, and “tree”.
models (BaseEstimator, default=None) – The models to use for computing SHAP values. If None, a linear regression model is used for each feature.
correlation_th (float, default=None) – Deprecated; retained for backward compatibility. No features are dropped.
mean_shap_percentile (float, default=0.8) – The percentile threshold for selecting features based on their mean SHAP value.
iters (int, default=20) – The number of iterations to use for the feature selection method.
reciprocity (bool, default=False) – Whether to enforce reciprocity in the causal graph.
min_impact (float, default=1e-06) – The minimum impact threshold for selecting features.
exhaustive (bool, default=False) – Whether to use the exhaustive (recursive) method for selecting features. If this is True, the threshold parameter must be provided, and the clustering is performed until remaining values to be clustered are below the given threshold.
threshold (float, default=None) – The threshold to use when exhaustive is True. If None, an exception is raised.
on_gpu (bool, default=False) – Whether to use the GPU for computing SHAP values.
verbose (bool, default=False) – Whether to print verbose output.
prog_bar (bool, default=True) – Whether to show a progress bar.
silent (bool, default=False) – Whether to suppress all output.
- device = 'cpu'#
- shap_discrepancies = None#
- __init__(explainer='explainer', models=None, correlation_th=None, mean_shap_percentile=0.8, iters=20, reciprocity=False, min_impact=1e-06, exhaustive=False, background_size=200, background_method='sample', background_seed=None, parallel_jobs=0, on_gpu=False, verbose=False, prog_bar=True, silent=False)[source]#
Initialize the ShapEstimator object.
- Parameters:
explainer (str, default="explainer") – The SHAP explainer to use. Possible values are “kernel”, “gradient”, “explainer”, and “tree”.
models (BaseEstimator, default=None) – The models to use for computing SHAP values. If None, a linear regression model is used for each feature.
correlation_th (float, default=None) – Deprecated; retained for backward compatibility. No features are dropped.
mean_shap_percentile (float, default=0.8) – The percentile threshold for selecting features based on their mean SHAP value.
iters (int, default=20) – The number of iterations to use for the feature selection method.
reciprocity (bool, default=False) – Whether to enforce reciprocity in the causal graph.
min_impact (float, default=1e-06) – The minimum impact threshold for selecting features.
exhaustive (bool, default=False) – Whether to use the exhaustive (recursive) method for selecting features. If this is True, the threshold parameter must be provided, and the clustering is performed until remaining values to be clustered are below the given threshold.
background_size (int, optional) – Maximum number of background samples used by SHAP explainers. If None or larger than the available samples, all rows are used.
background_method (str, default="sample") – Background selection strategy: “sample” or “kmeans”.
background_seed (int, optional) – Random seed for background sampling when using “sample”.
threshold (float, default=None) – The threshold to use when exhaustive is True. If None, an exception is raised.
on_gpu (bool, default=False) – Whether to use the GPU for computing SHAP values.
verbose (bool, default=False) – Whether to print verbose output.
prog_bar (bool, default=True) – Whether to show a progress bar.
silent (bool, default=False) – Whether to suppress all output.
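The mean_shap_percentile and min_impact parameters suggest a thresholding rule for feature selection; a sketch of that selection (the exact rule used by the estimator is an assumption):

```python
import numpy as np

def select_parents(mean_abs_shap, mean_shap_percentile=0.8, min_impact=1e-06):
    """Keep features whose mean |SHAP| clears both the percentile and impact floors."""
    mean_abs_shap = np.asarray(mean_abs_shap, dtype=float)
    threshold = np.quantile(mean_abs_shap, mean_shap_percentile)
    keep = (mean_abs_shap >= threshold) & (mean_abs_shap >= min_impact)
    return np.where(keep)[0]

# Only the strongest candidate clears the 80th-percentile threshold.
parents = select_parents([0.40, 0.02, 0.00, 0.35, 0.01])
```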
- explainer = 'explainer'#
- models = None#
- correlation_th = None#
- mean_shap_percentile = 0.8#
- iters = 20#
- reciprocity = False#
- min_impact = 1e-06#
- exhaustive = False#
- background_size = 200#
- background_method = 'sample'#
- background_seed = None#
- parallel_jobs = 0#
- on_gpu = False#
- verbose = False#
- prog_bar = True#
- silent = False#
- __str__()[source]#
Return a compact string representation for logging.
- Returns:
String representation of the estimator.
- Return type:
str
- fit(X)[source]#
Fit the ShapleyExplainer model to the given dataset.
- Parameters:
X (DataFrame) – Input dataset used to compute SHAP values.
- Returns:
The fitted estimator instance.
- Return type:
ShapEstimator
- predict(X, root_causes=None, prior=None)[source]#
Builds a causal graph from the SHAP values using a selection mechanism based on the clustering, knee, or abrupt methods.
- Parameters:
X (pd.DataFrame) – The input data. Consists of all the features in a pandas DataFrame.
root_causes (List[str], optional) – The root causes of the graph. If None, all features are considered as root causes, by default None.
prior (List[List[str]], optional) – The prior knowledge about the connections between the features. If None, all features are considered as valid candidates for the connections, by default None.
- Returns:
nx.DiGraph – The causal graph.
- Return type:
DiGraph
- adjust(graph, increase_tolerance=0.0, sd_upper=0.1)[source]#
Adjust graph edges based on SHAP discrepancy thresholds.
- compute_error_contribution()[source]#
Computes the error contribution of each feature for each target. If this value is positive, it means that, on average, the presence of the feature in the model leads to a higher error; without that feature, the prediction would generally have been better. In other words, the feature does more harm than good! Conversely, the more negative this value, the more beneficial the feature is for the predictions, since its presence leads to smaller errors.
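One way to realize the sign convention described above, as a sketch (the exact formula used by the package is an assumption): remove a feature's SHAP term from the prediction and compare mean absolute errors.

```python
import numpy as np

def error_contribution(y_true, y_pred, shap_values):
    """Per-feature change in mean absolute error when the feature's SHAP term
    is removed from the prediction; positive means the feature hurts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    shap_values = np.asarray(shap_values, dtype=float)
    base_err = np.abs(y_true - y_pred).mean()
    contrib = []
    for j in range(shap_values.shape[1]):
        err_without = np.abs(y_true - (y_pred - shap_values[:, j])).mean()
        contrib.append(base_err - err_without)  # > 0: error is higher with the feature
    return np.array(contrib)

y = np.array([1.0, 2.0, 3.0])
pred = np.array([1.5, 2.5, 3.5])  # consistently 0.5 too high
shap = np.array([[0.5, -0.1], [0.5, -0.1], [0.5, -0.1]])
contrib = error_contribution(y, pred, shap)  # feature 0 harmful, feature 1 helpful
```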
Module contents#
Explainability techniques used for causal discovery.
This module contains various techniques and tools for explaining and interpreting causal discovery results:
shapley: Implements Shapley value-based methods for attributing importance to features in causal models.
regression_quality: Provides metrics and tools for assessing the quality of regression models used in causal discovery.
perm_importance: Implements permutation importance methods for feature importance in causal models.
hierarchies: Contains tools for analyzing and visualizing hierarchical structures in causal relationships.
These submodules offer a range of approaches to enhance the interpretability and understanding of causal discovery results, aiding in the validation and refinement of causal models.