causalexplain.explainability package#

Submodules#

Hierarchy of links

Can I use the information above to decide whether to connect groups of variables linked together?

class Hierarchies(method='spearman', mic_alpha=0.6, mic_c=15, linkage_method='complete', correlation_th=None, prog_bar=False, verbose=False, silent=False)[source]#

Bases: object

Class representing the hierarchy of links between variables.

Parameters:
  • method (str or Callable, optional) – Method to use to compute the correlation. Default is ‘spearman’, but can also be ‘pearson’, ‘kendall’ or ‘mic’.

  • mic_alpha (float, optional) – Threshold for the correlation. Default is 0.6.

  • mic_c (int, optional) – Number of clusters to be formed. Default is 15. Only valid with MIC.

  • linkage_method (str, optional) – Method to use to compute the linkage. Default is ‘complete’.

  • correlation_th (float, optional) – Deprecated; retained for backward compatibility. Not used for pruning.

  • prog_bar (bool, optional) – Whether to show a progress bar during computation. Default is False.

  • verbose (bool, optional) – Whether to print additional information. Default is False.

  • silent (bool, optional) – Whether to suppress all output. Default is False.

correlations = None#
__init__(method='spearman', mic_alpha=0.6, mic_c=15, linkage_method='complete', correlation_th=None, prog_bar=False, verbose=False, silent=False)[source]#

Initialize the Hierarchies object.

Parameters:
  • method (str or Callable, optional) – Method to use to compute the correlation. Default is ‘spearman’, but can also be ‘pearson’, ‘kendall’ or ‘mic’.

  • mic_alpha (float, optional) – Threshold for the correlation. Default is 0.6.

  • mic_c (int, optional) – Number of clusters to be formed. Default is 15. Only valid with MIC.

  • linkage_method (str, optional) – Method to use to compute the linkage. Default is ‘complete’.

  • correlation_th (float, optional) – Deprecated; retained for backward compatibility.

  • prog_bar (bool, optional) – Whether to show a progress bar during computation. Default is False.

  • verbose (bool, optional) – Whether to print additional information. Default is False.

  • silent (bool, optional) – Whether to suppress all output. Default is False.

linkage_mat: ndarray = None#
fit(X)[source]#

Compute the hierarchy of links between variables using the correlation method given by the method parameter.

Parameters:
  • X (pd.DataFrame) – The input data.

  • y (None) – Ignored.

Returns:

self – The fitted Hierarchies object.

Return type:

Hierarchies
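As an illustration of what a hierarchy built with linkage_method='complete' looks like, here is a minimal plain-Python sketch. It does not call the library: the helper complete_linkage, the toy correlations, and the choice of 1 - |correlation| as the distance are all assumptions made for this example, and the real fit method may differ.

```python
def complete_linkage(names, dist):
    """Agglomerative clustering with complete linkage.

    names: list of labels; dist: dict mapping frozenset({a, b}) -> distance.
    Returns the merges as (cluster_a, cluster_b, distance) tuples, in order.
    """
    clusters = [(name,) for name in names]
    merges = []

    def cluster_distance(c1, c2):
        # Complete linkage: the distance between two clusters is the
        # largest pairwise distance between their members.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Toy pairwise correlations (hypothetical values).
corr = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): -0.3}
# Assumed conversion: strongly correlated variables are "close".
dist = {frozenset(pair): 1.0 - abs(r) for pair, r in corr.items()}
merges = complete_linkage(["a", "b", "c"], dist)
```

In this toy case a and b merge first because they are the most correlated pair, and c joins them at the larger of its two distances to the pair.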

static compute_correlation_matrix(data, method='spearman', mic_alpha=0.6, mic_c=15, prog_bar=False)[source]#

Compute the correlation matrix.

Parameters:
  • data (pd.DataFrame) – The input data.

  • method (str or Callable, optional) – Method to use to compute the correlation. Default is ‘spearman’, but can also be ‘pearson’, ‘kendall’ or ‘mic’.

  • prog_bar (bool, optional) – Whether to show a progress bar during computation. Default is False.

Returns:

correlations – The correlation matrix.

Return type:

pd.DataFrame
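For readers unfamiliar with the default 'spearman' option, the sketch below shows how a Spearman correlation matrix can be computed by ranking each column and then taking the Pearson correlation of the ranks. It is a standard-library illustration only: the function names and the dict-of-lists input are assumptions for this example, not the library's implementation, which works on a pd.DataFrame.

```python
from itertools import combinations
from math import sqrt


def rank(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman_matrix(columns):
    """columns: dict name -> list of floats. Returns a nested dict."""
    ranked = {name: rank(vals) for name, vals in columns.items()}
    corr = {name: {name: 1.0} for name in columns}
    for a, b in combinations(columns, 2):
        corr[a][b] = corr[b][a] = pearson(ranked[a], ranked[b])
    return corr


data = {
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [1.0, 4.0, 9.0, 16.0, 25.0],  # monotone but non-linear in x
    "z": [5.0, 4.0, 3.0, 2.0, 1.0],    # x reversed
}
corr = spearman_matrix(data)
```

Because Spearman works on ranks, the non-linear but monotone relation between x and y still yields a correlation of 1.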

static compute_correlated_features(correlations, correlation_th, feature_names, verbose=False)[source]#

Return an empty mapping (correlation pruning is deprecated).

Parameters:
  • correlations (pd.DataFrame) – The correlation matrix.

  • correlation_th (float) – Deprecated; retained for backward compatibility.

  • feature_names (List[str]) – The list of feature names.

  • verbose (bool, optional) – Whether to print additional information. Default is False.

Returns:

correlated_features – Empty mapping kept for backward compatibility.

Return type:

defaultdict(list)

expand_clusters_perm_importance(pi, ground_truth=None)[source]#

Expand the clusters of the linkage matrix to include the features that are in the same cluster in the permutation importance matrix. It expands, for each cluster, with the metrics related to correlation, deltas, backward PI, etc. Used to determine if some criteria can be extracted.

Parameters:
  • pi (pd.DataFrame) – Permutation importance matrix.

  • ground_truth (pd.DataFrame, optional) – Ground truth matrix.

Return type:

None

hierarchical_dissimilarities()[source]#

Compute the dissimilarities between features in a hierarchical clustering.

Returns:

hierarchical_dissimilarity – Dissimilarities between features.

Return type:

pd.DataFrame

connect_isolated_nodes(G, linkage_mat, feature_names, verbose=False)[source]#

Connect isolated nodes in the graph, based on their relationship in the hierarchical clustering provided through the linkage_mat.
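The idea of attaching isolated nodes to their closest relative can be sketched as follows. This is an assumption-laden simplification: it uses a plain dissimilarity lookup instead of a linkage matrix, treats edges as undirected, and the helper connect_isolated is hypothetical rather than part of the package.

```python
def connect_isolated(nodes, edges, dist):
    """Attach every node without edges to its closest neighbour.

    nodes: list of labels.
    edges: set of frozenset({a, b}) undirected edges.
    dist: dict mapping frozenset({a, b}) -> dissimilarity.
    Returns a new edge set; the input set is left untouched.
    """
    connected = {n for edge in edges for n in edge}
    new_edges = set(edges)
    for node in nodes:
        if node in connected:
            continue
        others = [n for n in nodes if n != node]
        nearest = min(others, key=lambda n: dist[frozenset((node, n))])
        new_edges.add(frozenset((node, nearest)))
    return new_edges


nodes = ["a", "b", "c"]
edges = {frozenset(("a", "b"))}
dist = {
    frozenset(("a", "b")): 0.1,
    frozenset(("a", "c")): 0.8,
    frozenset(("b", "c")): 0.7,
}
result = connect_isolated(nodes, edges, dist)
```

Here c has no edges, so it is linked to b, the node with the smallest dissimilarity to it.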

connect_hierarchies(G, linkage_mat, feature_names, verbose=False)[source]#
plot_dendogram_correlations(correlations, feature_names, **kwargs)[source]#

Plot the dendrogram of the correlation matrix.

Parameters:
  • correlations (pd.DataFrame) – Correlation matrix.

  • feature_names (List[str]) – List of feature names.

  • **kwargs – Keyword arguments to be passed to the plot_dendogram function.

Permutation Importance for feature selection. Wrapper over SKLearn’s PermutationImportance and own implementation of the vanilla version of the algorithm to run over models trained with PyTorch.

    1. Renero 2022, 2023

Parameters:#

models: dict

A dictionary of models, where the keys are the target variables and the values are the models trained to predict the target variables.

n_repeats: int

The number of times to repeat the permutation importance algorithm.

mean_pi_percentile: float

The percentile of the mean permutation importance to use as a threshold for feature selection.

random_state: int

The random state to use for the permutation importance algorithm.

prog_bar: bool

Whether to display a progress bar or not.

verbose: bool

Whether to display explanations on the process or not.

silent: bool

Whether to display anything or not.

class PermutationImportance(models, discrepancies=None, correlation_th=None, n_repeats=10, mean_pi_percentile=0.8, exhaustive=False, threshold=None, random_state=42, prog_bar=True, verbose=False, silent=False)[source]#

Bases: BaseEstimator

Permutation Importance for feature selection. Wrapper over SKLearn’s PermutationImportance and own implementation of the vanilla version of the algorithm to run over models trained with PyTorch.

device = 'cpu'#
__init__(models, discrepancies=None, correlation_th=None, n_repeats=10, mean_pi_percentile=0.8, exhaustive=False, threshold=None, random_state=42, prog_bar=True, verbose=False, silent=False)[source]#
fit(X)[source]#

Implementation of the fit method for the PermutationImportance class. If the model is a PyTorch model, the fit method will compute the base loss for each feature. If the model is a SKLearn model, the fit method will compute the permutation importance for each feature.
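The vanilla permutation importance algorithm mentioned here can be sketched with the standard library alone. The code below is an illustration under assumptions: the model is any callable on a single row, the loss is mean squared error, and the function names are made up for this example; it is not the class's actual code path.

```python
import random


def mse(model, rows, target):
    """Mean squared error of model over the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, target)) / len(rows)


def permutation_importance(model, rows, target, n_repeats=10, seed=42):
    """Mean increase in MSE when each feature column is shuffled.

    model: callable mapping one row (list of floats) to a prediction.
    Returns one importance value per feature column.
    """
    rng = random.Random(seed)
    base_loss = mse(model, rows, target)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            permuted = [r[:col] + [v] + r[col + 1:]
                        for r, v in zip(rows, shuffled)]
            increases.append(mse(model, permuted, target) - base_loss)
        importances.append(sum(increases) / n_repeats)
    return importances


def model(row):
    # Toy model that only looks at the first feature.
    return 3.0 * row[0]


rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
target = [3.0 * r[0] for r in rows]
importances = permutation_importance(model, rows, target)
```

Shuffling the first column breaks the relation the toy model relies on, so its importance is positive, while the ignored second column scores exactly zero.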

predict(X=None, root_causes=None, prior=None)[source]#

Implementation of the predict method for the PermutationImportance class.

fit_predict(X, root_causes)[source]#
plot(**kwargs)[source]#

Plot the permutation importance for each feature, by calling the internal _plot_perm_imp method.

class RegQuality[source]#

Bases: BaseEstimator

__init__()[source]#
static predict(scores, gamma_shape=1, gamma_scale=1, threshold=0.9, verbose=False)[source]#

Returns the indices of features that are both gamma and outliers. Both criteria are applied to the given scores to determine if the MSE error obtained from the regression is bad compared with the rest of regressions for the other features in the dataset, and thus the feature should be considered a parent node.

Parameters:

scores (List[float]) – List of scores

Returns:

List of indices of features that are both gamma and outliers

Return type:

Set[int]

This module builds the causal graph based on the information that we derived from the SHAP values. The main idea is to use the SHAP values to compute the discrepancy between the SHAP values and the target values. This discrepancy is then used to build the graph.

class ShapDiscrepancy(target, parent, shap_heteroskedasticity, parent_heteroskedasticity, shap_p_value, parent_p_value, shap_model, parent_model, shap_discrepancy, shap_correlation, shap_gof, ks_pvalue, ks_result)[source]#

Bases: object

A class representing the discrepancy between the SHAP value and the parent value for a given feature.

- target

The name of the target feature.

Type:

str

- parent

The name of the parent feature.

Type:

str

- shap_heteroskedasticity

Whether the SHAP value exhibits heteroskedasticity.

Type:

bool

- parent_heteroskedasticity

Whether the parent value exhibits heteroskedasticity.

Type:

bool

- shap_p_value

The p-value for the SHAP value.

Type:

float

- parent_p_value

The p-value for the parent value.

Type:

float

- shap_model

The regression model for the SHAP value.

Type:

sm.regression.linear_model.RegressionResultsWrapper

- parent_model

The regression model for the parent value.

Type:

sm.regression.linear_model.RegressionResultsWrapper

- shap_discrepancy

The discrepancy between the SHAP value and the parent value.

Type:

float

- shap_correlation

The correlation between the SHAP value and the parent value.

Type:

float

- shap_gof

The goodness of fit for the SHAP value.

Type:

float

- ks_pvalue

The p-value for the Kolmogorov-Smirnov test.

Type:

float

- ks_result

The result of the Kolmogorov-Smirnov test.

Type:

str

target: str#
parent: str#
shap_heteroskedasticity: bool#
parent_heteroskedasticity: bool#
shap_p_value: float#
parent_p_value: float#
shap_model: RegressionResultsWrapper#
parent_model: RegressionResultsWrapper#
shap_discrepancy: float#
shap_correlation: float#
shap_gof: float#
ks_pvalue: float#
ks_result: str#
__init__(target, parent, shap_heteroskedasticity, parent_heteroskedasticity, shap_p_value, parent_p_value, shap_model, parent_model, shap_discrepancy, shap_correlation, shap_gof, ks_pvalue, ks_result)#
class ShapRunDiagnostics(backend: str, mode: Literal['no_sampling', 'single_sample', 'multi_sample'], m: int, n_background: int, K: int, seeds: List[int], n_explain: int, stability: Dict[str, Any], warnings: List[str])[source]#

Bases: object

backend: str#
mode: Literal['no_sampling', 'single_sample', 'multi_sample']#
m: int#
n_background: int#
K: int#
seeds: List[int]#
n_explain: int#
stability: Dict[str, Any]#
warnings: List[str]#
__init__(backend, mode, m, n_background, K, seeds, n_explain, stability, warnings)#
sample_rows(X, n, stratify=None, seed=None, return_indices=False)[source]#

Sample rows from X with optional stratification.

Parameters:
  • X (Any) – Input data to sample from.

  • n (int | None) – Number of rows to sample (None uses all rows).

  • stratify (Any | None) – Optional labels for stratified sampling.

  • seed (int | None) – Random seed for deterministic sampling.

  • return_indices (bool) – Whether to return sampled row indices.

Returns:

Sampled data, and optionally the sampled indices.

Return type:

Any | Tuple[Any, ndarray]
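A rough sketch of seeded, optionally stratified row sampling is shown below. It returns indices only and uses proportional allocation per stratum; both choices, and the name sample_rows_sketch, are assumptions for illustration and may not match how sample_rows behaves.

```python
import random
from collections import defaultdict


def sample_rows_sketch(rows, n, stratify=None, seed=None):
    """Return sorted indices of at most n sampled rows.

    With stratify labels, each stratum gets a share proportional to its
    size (at least one row), so rounding may return slightly fewer than n.
    """
    rng = random.Random(seed)
    total = len(rows)
    if n is None or n >= total:
        return list(range(total))
    if stratify is None:
        return sorted(rng.sample(range(total), n))

    groups = defaultdict(list)
    for idx, label in enumerate(stratify):
        groups[label].append(idx)

    chosen = []
    for label in sorted(groups):
        members = groups[label]
        quota = max(1, round(n * len(members) / total))
        chosen.extend(rng.sample(members, min(quota, len(members))))
    # Rounding can overshoot n; trim at random to respect the cap.
    if len(chosen) > n:
        chosen = rng.sample(chosen, n)
    return sorted(chosen)


rows = list(range(100))
labels = ["a"] * 80 + ["b"] * 20
picked = sample_rows_sketch(rows, 10, stratify=labels, seed=0)
```

With an 80/20 split and n=10, the sketch picks eight rows from stratum "a" and two from "b", and the same seed always yields the same indices.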

build_kernel_explainer(model, X_bg)[source]#

Create a KernelExplainer with a resolved model callable.

Parameters:
  • model (Any) – Model to explain.

  • X_bg (Any) – Background data for KernelExplainer.

Returns:

An initialized KernelExplainer instance.

Return type:

Any

compute_kernel_shap(explainer, X_explain, n_features, nsamples=None, class_index=None)[source]#

Compute Kernel SHAP values with optional class selection.

Parameters:
  • explainer (Any) – KernelExplainer instance.

  • X_explain (Any) – Data to explain.

  • n_features (int) – Number of features in X_explain.

  • nsamples (int | None) – KernelExplainer nsamples parameter.

  • class_index (int | None) – Optional class index for multi-class outputs.

Returns:

SHAP values in backend-specific format.

Return type:

Any

build_gradient_explainer(model, X_bg)[source]#

Create a GradientExplainer with backend validation.

Parameters:
  • model (Any) – Differentiable model to explain.

  • X_bg (Any) – Background data for GradientExplainer.

Returns:

An initialized GradientExplainer instance.

Return type:

Any

compute_gradient_shap(explainer, X_explain, batch_size=128)[source]#

Compute Gradient SHAP values in batches.

Parameters:
  • explainer (Any) – GradientExplainer instance.

  • X_explain (Any) – Data to explain.

  • batch_size (int) – Batch size for memory-safe execution.

Returns:

SHAP values in backend-specific format.

Return type:

Any
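The batching pattern behind the batch_size parameter can be sketched generically. The helper batched_apply and the stand-in fake_explain below are hypothetical; a real run would pass a function that asks a SHAP explainer for values on each slice.

```python
def batched_apply(fn, rows, batch_size=128):
    """Apply fn to consecutive slices of rows and concatenate the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    out = []
    for start in range(0, len(rows), batch_size):
        out.extend(fn(rows[start:start + batch_size]))
    return out


calls = []


def fake_explain(batch):
    # Stand-in for an explainer call: records the batch size and returns
    # one list of per-feature values per input row.
    calls.append(len(batch))
    return [[2.0 * v for v in row] for row in batch]


rows = [[float(i), float(i + 1)] for i in range(300)]
values = batched_apply(fake_explain, rows, batch_size=128)
```

Only one slice is held by the explainer at a time, which bounds peak memory, and the output keeps the original row order.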

build_generic_explainer(model, X_bg)[source]#

Create a generic shap.Explainer with a tabular masker.

Parameters:
  • model (Any) – Model or callable to explain.

  • X_bg (Any) – Background data for the masker.

Returns:

A shap.Explainer instance.

Return type:

Any

compute_generic_shap(explainer, X_explain)[source]#

Compute SHAP values for generic explainers.

Parameters:
  • explainer (Any) – shap.Explainer instance.

  • X_explain (Any) – Data to explain.

Returns:

SHAP values in backend-specific format.

Return type:

Any

compute_shap_adaptive(X, model, backend, y=None, max_shap_samples=512, K_max=5, max_explain_samples=None, random_state=None, stratify=None, warn_threshold_cv=0.1, warn_threshold_rankcorr=0.9, topN_important=20, verbose=False, kernel_nsamples=None, batch_size=128, class_index=None, adaptive_shap_sampling=True)[source]#

Compute SHAP values with adaptive background sampling controls.

Parameters:
  • X (Any) – Input data to explain.

  • model (Any) – Trained model to explain.

  • backend (Literal['kernel', 'gradient', 'explainer']) – SHAP backend name (“kernel”, “gradient”, “explainer”).

  • y (Any | None) – Optional target values for stratification or diagnostics.

  • max_shap_samples (int) – Background cap for adaptive sampling.

  • K_max (int) – Maximum number of repeated sampling runs.

  • max_explain_samples (int | None) – Optional cap for explained rows.

  • random_state (int | None) – Random seed for deterministic sampling.

  • stratify (Any | None) – Optional stratification labels for sampling.

  • warn_threshold_cv (float) – CV threshold for stability warnings.

  • warn_threshold_rankcorr (float) – Rank correlation threshold for warnings.

  • topN_important (int) – Number of top features to track.

  • verbose (bool) – Whether to print diagnostic warnings.

  • kernel_nsamples (int | None) – Optional KernelExplainer nsamples override.

  • batch_size (int) – Batch size for gradient explainer runs.

  • class_index (int | None) – Optional class index for multi-class outputs.

  • adaptive_shap_sampling (bool) – Enable adaptive sampling and stability checks.

Returns:

A tuple of (shap_result, diagnostics) with backend-specific outputs.

Return type:

Tuple[Any, ShapRunDiagnostics]
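One plausible reading of the stability checks is to repeat the SHAP computation over several background samples and flag features whose mean |SHAP| varies too much across runs. The sketch below computes a per-feature coefficient of variation and compares it to warn_threshold_cv; the function names, the input layout, and the exact statistic are assumptions, not the library's documented diagnostics.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    return pstdev(values) / m if m else float("inf")


def stability_report(runs, warn_threshold_cv=0.1):
    """runs: list of K lists holding one mean |SHAP| value per feature."""
    n_features = len(runs[0])
    cv = [coefficient_of_variation([run[j] for run in runs])
          for j in range(n_features)]
    warnings = [f"feature {j}: CV {c:.2f} exceeds {warn_threshold_cv}"
                for j, c in enumerate(cv) if c > warn_threshold_cv]
    return {"cv": cv, "warnings": warnings}


# Three hypothetical runs over two features.
runs = [
    [0.50, 0.10],
    [0.50, 0.20],
    [0.50, 0.30],
]
report = stability_report(runs)
```

The first feature is perfectly stable across runs, while the second one swings enough to trigger a warning.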

compute_shap(X, model, backend, y=None, adaptive_shap_sampling=True, **kwargs)[source]#

High-level wrapper for SHAP computation with optional adaptive sampling.

Parameters:
  • X (Any) – Input data to explain.

  • model (Any) – Trained model to explain.

  • backend (Literal['kernel', 'gradient', 'explainer']) – SHAP backend name (“kernel”, “gradient”, “explainer”).

  • y (Any | None) – Optional target values for stratification or diagnostics.

  • adaptive_shap_sampling (bool) – Enable adaptive sampling behavior.

  • **kwargs (Any) – Forwarded keyword arguments to compute_shap_adaptive.

Returns:

A tuple of (shap_result, diagnostics).

Return type:

Tuple[Any, ShapRunDiagnostics]

class ShapEstimator(explainer='explainer', models=None, correlation_th=None, mean_shap_percentile=0.8, iters=20, reciprocity=False, min_impact=1e-06, exhaustive=False, background_size=200, background_method='sample', background_seed=None, parallel_jobs=0, on_gpu=False, verbose=False, prog_bar=True, silent=False)[source]#

Bases: BaseEstimator

A class for computing SHAP values and building a causal graph from them.

Parameters:
  • explainer (str, default="explainer") – The SHAP explainer to use. Possible values are “kernel”, “gradient”, “explainer”, and “tree”.

  • models (BaseEstimator, default=None) – The models to use for computing SHAP values. If None, a linear regression model is used for each feature.

  • correlation_th (float, default=None) – Deprecated; retained for backward compatibility. No features are dropped.

  • mean_shap_percentile (float, default=0.8) – The percentile threshold for selecting features based on their mean SHAP value.

  • iters (int, default=20) – The number of iterations to use for the feature selection method.

  • reciprocity (bool, default=False) – Whether to enforce reciprocity in the causal graph.

  • min_impact (float, default=1e-06) – The minimum impact threshold for selecting features.

  • exhaustive (bool, default=False) – Whether to use the exhaustive (recursive) method for selecting features. If this is True, the threshold parameter must be provided, and the clustering is performed until remaining values to be clustered are below the given threshold.

  • threshold (float, default=None) – The threshold to use when exhaustive is True. If None, exception is raised.

  • on_gpu (bool, default=False) – Whether to use the GPU for computing SHAP values.

  • verbose (bool, default=False) – Whether to print verbose output.

  • prog_bar (bool, default=True) – Whether to show a progress bar.

  • silent (bool, default=False) – Whether to suppress all output.

device = 'cpu'#
shap_discrepancies = None#
__init__(explainer='explainer', models=None, correlation_th=None, mean_shap_percentile=0.8, iters=20, reciprocity=False, min_impact=1e-06, exhaustive=False, background_size=200, background_method='sample', background_seed=None, parallel_jobs=0, on_gpu=False, verbose=False, prog_bar=True, silent=False)[source]#

Initialize the ShapEstimator object.

Parameters:
  • explainer (str, default="explainer") – The SHAP explainer to use. Possible values are “kernel”, “gradient”, “explainer”, and “tree”.

  • models (BaseEstimator, default=None) – The models to use for computing SHAP values. If None, a linear regression model is used for each feature.

  • correlation_th (float, default=None) – Deprecated; retained for backward compatibility. No features are dropped.

  • mean_shap_percentile (float, default=0.8) – The percentile threshold for selecting features based on their mean SHAP value.

  • iters (int, default=20) – The number of iterations to use for the feature selection method.

  • reciprocity (bool, default=False) – Whether to enforce reciprocity in the causal graph.

  • min_impact (float, default=1e-06) – The minimum impact threshold for selecting features.

  • exhaustive (bool, default=False) – Whether to use the exhaustive (recursive) method for selecting features. If this is True, the threshold parameter must be provided, and the clustering is performed until remaining values to be clustered are below the given threshold.

  • background_size (int, optional) – Maximum number of background samples used by SHAP explainers. If None or larger than the available samples, all rows are used.

  • background_method (str, default="sample") – Background selection strategy: “sample” or “kmeans”.

  • background_seed (int, optional) – Random seed for background sampling when using “sample”.

  • threshold (float, default=None) – The threshold to use when exhaustive is True. If None, exception is raised.

  • on_gpu (bool, default=False) – Whether to use the GPU for computing SHAP values.

  • verbose (bool, default=False) – Whether to print verbose output.

  • prog_bar (bool, default=True) – Whether to show a progress bar.

  • silent (bool, default=False) – Whether to suppress all output.

  • parallel_jobs (int, default=0) – Parallel worker count.

explainer = 'explainer'#
models = None#
correlation_th = None#
mean_shap_percentile = 0.8#
iters = 20#
reciprocity = False#
min_impact = 1e-06#
exhaustive = False#
background_size = 200#
background_method = 'sample'#
background_seed = None#
parallel_jobs = 0#
on_gpu = False#
verbose = False#
prog_bar = True#
silent = False#
__str__()[source]#

Return a compact string representation for logging.

Parameters:

None.

Returns:

String representation of the estimator.

Return type:

str

fit(X)[source]#

Fit the ShapEstimator to the given dataset.

Parameters:

X (DataFrame) – Input dataset used to compute SHAP values.

Returns:

The fitted estimator instance.

Return type:

ShapEstimator
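To make concrete what a SHAP value is, the sketch below computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over all feature orderings. This is the textbook definition, not the approximation used by the SHAP explainers the class relies on; the function exact_shapley, the toy model, and the all-zeros baseline are assumptions for illustration.

```python
from itertools import permutations


def exact_shapley(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features not yet added take their baseline value. The cost grows
    factorially with the number of features, so this only suits tiny inputs.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for j in order:
            # Switch feature j from its baseline to its actual value and
            # credit it with the resulting change in the prediction.
            current[j] = x[j]
            value = model(current)
            phi[j] += value - previous
            previous = value
    return [p / len(orderings) for p in phi]


def model(row):
    # Toy model: one additive term and one interaction term.
    return 2.0 * row[0] + row[1] * row[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(model, x, baseline)
```

The additive feature receives its full effect, the interaction term is split evenly between the two interacting features, and the values sum to model(x) - model(baseline).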

predict(X, root_causes=None, prior=None)[source]#

Builds a causal graph from the shap values using a selection mechanism based on clustering, knee or abrupt methods.

Parameters:
  • X (pd.DataFrame) – The input data. Consists of all the features in a pandas DataFrame.

  • root_causes (List[str], optional) – The root causes of the graph. If None, all features are considered as root causes, by default None.

  • prior (List[List[str]], optional) – The prior knowledge about the connections between the features. If None, all features are considered as valid candidates for the connections, by default None.

Returns:

nx.DiGraph – The inferred causal graph.

Return type:

DiGraph
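The step from per-target SHAP scores to directed edges can be illustrated with a deliberately simple rule. The sketch below swaps the clustering, knee and abrupt selectors for a global nearest-rank percentile cut-off, loosely inspired by mean_shap_percentile and min_impact; the function select_parents, the global scope of the cut-off, and the toy scores are all assumptions.

```python
import math


def select_parents(mean_abs_shap, percentile=0.8, min_impact=1e-6):
    """Pick candidate parents whose mean |SHAP| reaches a percentile cut-off.

    mean_abs_shap: dict target -> dict candidate -> mean |SHAP|.
    Returns a sorted list of directed edges (parent, target).
    """
    values = sorted(v for scores in mean_abs_shap.values()
                    for v in scores.values())
    # Nearest-rank percentile over all candidate scores.
    rank = max(1, math.ceil(percentile * len(values)))
    cutoff = values[rank - 1]
    edges = []
    for target, scores in mean_abs_shap.items():
        for parent, score in scores.items():
            if score >= cutoff and score > min_impact:
                edges.append((parent, target))
    return sorted(edges)


# Hypothetical mean |SHAP| of each candidate when predicting each target.
scores = {
    "y": {"x": 0.90, "z": 0.05},
    "z": {"x": 0.02, "y": 0.01},
    "x": {"y": 0.03, "z": 0.00},
}
edges = select_parents(scores)
```

With these toy scores the cut-off lands at 0.05, so only the two candidates for y survive and become its parents.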

adjust(graph, increase_tolerance=0.0, sd_upper=0.1)[source]#

Adjust graph edges based on SHAP discrepancy thresholds.

Parameters:
  • graph (DiGraph) – Graph to adjust.

  • increase_tolerance (float) – Tolerance scaling applied to discrepancy bounds.

  • sd_upper (float) – Upper bound for discrepancy difference.

Returns:

Adjusted directed graph.

Return type:

DiGraph

compute_error_contribution()[source]#

Computes the error contribution of each feature for each target. A positive value means that, on average, the presence of the feature in the model leads to a higher error, so the prediction would generally have been better without it: the feature is doing more harm than good. Conversely, the more negative the value, the more beneficial the feature is for the predictions, since its presence leads to smaller errors.
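One common way to obtain such a quantity is to subtract a feature's SHAP value from the prediction, which approximates the prediction without that feature, and compare the absolute errors. The sketch below follows that recipe; the function name, the use of absolute error, and the toy numbers are assumptions, and the method in this package may compute it differently.

```python
def error_contribution(y_true, y_pred, shap_values):
    """Per-feature change in mean absolute error attributable to the feature.

    shap_values: one list per row with a SHAP value per feature.
    Positive output: the feature increases the error on average.
    """
    n_rows, n_features = len(y_true), len(shap_values[0])
    abs_error = [abs(t - p) for t, p in zip(y_true, y_pred)]
    contributions = []
    for j in range(n_features):
        # Prediction the model would have made without feature j's push.
        without_j = [p - row[j] for p, row in zip(y_pred, shap_values)]
        abs_error_without = [abs(t - w) for t, w in zip(y_true, without_j)]
        contributions.append(
            sum(e - ew for e, ew in zip(abs_error, abs_error_without)) / n_rows
        )
    return contributions


# Hypothetical targets, predictions and SHAP values for two rows.
y_true = [10.0, 12.0]
y_pred = [11.0, 12.0]
shap_values = [[2.0, 1.0], [3.0, 0.0]]
contrib = error_contribution(y_true, y_pred, shap_values)
```

In this toy data the first feature lowers the error (negative contribution) and the second raises it (positive contribution).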

custom_main(exp_name, path='/Users/renero/phd/data/RC4/', output_path='/Users/renero/phd/output/RC4/', scale=False)[source]#

Runs a custom main function for the given experiment name.

Parameters:
  • exp_name (str) – The name of the experiment to run.

  • path (str) – The path to the data files.

  • output_path (str) – The path to the output files.

  • scale (bool) – Whether to scale data before running.

Returns:

None.

Return type:

None

sachs_main()[source]#

Run a demo experiment on the Sachs dataset.

Parameters:

None.

Returns:

None.

Return type:

None

Module contents#

Explainability techniques used for causal discovery.

This module contains various techniques and tools for explaining and interpreting causal discovery results:

  • shapley: Implements Shapley value-based methods for attributing importance to features in causal models.

  • regression_quality: Provides metrics and tools for assessing the quality of regression models used in causal discovery.

  • perm_importance: Implements permutation importance methods for feature importance in causal models.

  • hierarchies: Contains tools for analyzing and visualizing hierarchical structures in causal relationships.

These submodules offer a range of approaches to enhance the interpretability and understanding of causal discovery results, aiding in the validation and refinement of causal models.