causalexplain package
Subpackages
- causalexplain.common package
- Submodules
BaseExperiment, Experiment, ShapSummary, ShapDiscrepancy, setup_plot(), add_grid(), subplots(), format_graph(), draw_graph_subplot(), cleanup_graph(), set_colormap(), dag2dot(), values_distribution(), correlation_matrix(), hierarchies(), dag(), dags(), shap_values(), shap_discrepancies(), deprecated_dags(), score_by_method(), scores_by_method(), score_by_subtype(), combined_metrics(), latex_table_by_datatype(), latex_table_by_method(), save_experiment(), load_experiment(), valid_output_name(), graph_from_dot_file(), graph_from_dictionary(), graph_from_adjacency(), graph_from_adjacency_file(), graph_to_adjacency(), graph_to_adjacency_file(), graph_to_dot_file(), select_device(), resolve_device(), graph_intersection(), graph_union(), digraph_from_connected_features(), correct_edge_from_prior(), valid_candidates_from_prior(), break_cycles_using_prior(), potential_misoriented_edges(), break_cycles_if_present(), stringfy_object(), get_feature_names(), get_feature_types(), cast_categoricals_to_int(), find_crossing_point(), format_time(), combine_dags(), list_files(), read_json_file(), pretty_print()
- Module contents
- Submodules
- causalexplain.estimators package
- causalexplain.explainability package
- Submodules
Hierarchies, connect_isolated_nodes(), connect_hierarchies(), plot_dendogram_correlations()
PermutationImportance, RegQuality, ShapDiscrepancy, ShapDiscrepancy.target, ShapDiscrepancy.parent, ShapDiscrepancy.shap_heteroskedasticity, ShapDiscrepancy.parent_heteroskedasticity, ShapDiscrepancy.shap_p_value, ShapDiscrepancy.parent_p_value, ShapDiscrepancy.shap_model, ShapDiscrepancy.parent_model, ShapDiscrepancy.shap_discrepancy, ShapDiscrepancy.shap_correlation, ShapDiscrepancy.shap_gof, ShapDiscrepancy.ks_pvalue, ShapDiscrepancy.ks_result, ShapDiscrepancy.__init__()
ShapRunDiagnostics, sample_rows(), build_kernel_explainer(), compute_kernel_shap(), build_gradient_explainer(), compute_gradient_shap(), build_generic_explainer(), compute_generic_shap(), compute_shap_adaptive(), compute_shap(), ShapEstimator, ShapEstimator.device, ShapEstimator.shap_discrepancies, ShapEstimator.__init__(), ShapEstimator.explainer, ShapEstimator.models, ShapEstimator.correlation_th, ShapEstimator.mean_shap_percentile, ShapEstimator.iters, ShapEstimator.reciprocity, ShapEstimator.min_impact, ShapEstimator.exhaustive, ShapEstimator.background_size, ShapEstimator.background_method, ShapEstimator.background_seed, ShapEstimator.parallel_jobs, ShapEstimator.on_gpu, ShapEstimator.verbose, ShapEstimator.prog_bar, ShapEstimator.silent, ShapEstimator.__str__(), ShapEstimator.fit(), ShapEstimator.predict(), ShapEstimator.adjust(), ShapEstimator.compute_error_contribution()
custom_main(), sachs_main()
- Module contents
- Submodules
- causalexplain.generators package
- causalexplain.gui package
- Subpackages
- Submodules
run_gui(), ensure_cytoscape_assets(), clean_node_name(), normalize_graph(), dag_is_valid(), graph_from_dot(), ensure_file(), ensure_output_dir(), normalize_output_value(), sanitize_output_name(), update_metrics_log(), overlay_status_message(), render_cytoscape_overlay(), render_cytoscape_graph(), default_train_settings(), default_load_settings(), default_generate_settings(), merge_settings(), app_styles_path(), read_app_styles(), register_app_styles(), update_settings(), bind_setting(), set_input_value(), save_upload(), make_upload_handler(), normalize_output_path()
- Module contents
- causalexplain.independence package
- Submodules
ConditionalIndependencies, SufficientSets, get_backdoor_paths(), get_paths(), find_colliders_in_path(), get_sufficient_sets_for_pair(), get_sufficient_sets(), get_conditional_independencies(), custom_main(), main(), dag_main(), get_edge_orientation(), estimate(), estimate_edge(), main(), select_features(), find_cluster_change_point(), main(), test(), GraphIndependence, HSIC_Values, HSIC, rbf_dot(), kernel_Delta_norm(), kernel_Delta(), kernel_Gaussian(), pairwise_mic(), fit_and_get_residuals(), run_feature_selection()
- Module contents
- Submodules
- causalexplain.metrics package
- causalexplain.models package
- Submodules
MLP, DFF, MDN, ColumnsDataset, RBF, MMDLoss, BaseModel, BaseModel.model, BaseModel.all_columns, BaseModel.callbacks, BaseModel.columns, BaseModel.logger, BaseModel.extra_trainer_args, BaseModel.scaler, BaseModel.train_loader, BaseModel.val_loader, BaseModel.n_rows, BaseModel.device, BaseModel.__init__(), BaseModel.init_logger(), BaseModel.init_callbacks(), BaseModel.init_data(), BaseModel.override_extras()
MLPModel, extract_weights(), see_weights_to_hidden(), see_weights_from_input(), plot_feature(), plot_features(), layer_weights(), summarize_weights(), identify_relationships(), infer_causal_relationships(), NNRegressor, custom_main(), GBTRegressor, custom_main()
- Module contents
- Submodules
Submodules
- parse_args()
Parse CLI arguments for the causal discovery runner.
- Parameters:
None.
- Returns:
Parsed command-line arguments.
- Return type:
argparse.Namespace
- check_args_validity(args)
Validate CLI arguments and derive runtime configuration values.
This performs file existence checks and computes defaults that drive the end-to-end experiment run.
- Parameters:
args (argparse.Namespace) – Parsed command-line arguments.
- Returns:
A dictionary of validated run values.
- Return type:
Dict[str, Any]
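The parse-then-validate flow described for parse_args and check_args_validity can be sketched with the standard library alone. This is a minimal illustration, not the package's actual CLI: the flag names (--dataset, --method, --seed), the function names, and the keys of the returned dictionary are all hypothetical. Only the list of method names comes from the model_type options documented on this page.

```python
import argparse
import os
from typing import Any, Dict, List, Optional

# Method names taken from the documented model_type options.
VALID_METHODS = ("rex", "pc", "fci", "ges", "lingam", "cam", "notears")


def parse_args_sketch(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse a small, hypothetical subset of CLI flags."""
    parser = argparse.ArgumentParser(description="Causal discovery runner (sketch)")
    # Flag names are illustrative assumptions, not the package's real options.
    parser.add_argument("--dataset", required=True, help="Path to the input CSV file")
    parser.add_argument("--method", default="rex", choices=VALID_METHODS)
    parser.add_argument("--seed", type=int, default=42)
    return parser.parse_args(argv)


def check_args_sketch(args: argparse.Namespace) -> Dict[str, Any]:
    """Check that the dataset exists and derive default run values."""
    if not os.path.isfile(args.dataset):
        raise FileNotFoundError(f"Dataset not found: {args.dataset}")
    # Derive a dataset name from the file name (hypothetical default).
    name = os.path.splitext(os.path.basename(args.dataset))[0]
    return {
        "dataset_filepath": args.dataset,
        "dataset_name": name,
        "estimator": args.method,
        "seed": args.seed,
    }
```

Keeping validation separate from parsing means the derived dictionary can be printed or tested without touching sys.argv.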
- header_()
Print the ASCII header banner for CLI output.
The banner was created with the “Ogre” font from https://patorjk.com/software/taag/.
- Parameters:
None.
- Returns:
This method does not return a value.
- Return type:
None
- show_run_values(run_values)
Print resolved run values for debugging or transparency.
- Parameters:
run_values (Dict[str, Any]) – A dictionary of run values.
- Returns:
This method does not return a value.
- Return type:
None
- main()
Run the CLI entry point for causal discovery experiments.
This orchestrates argument parsing, model loading or training, evaluation, and optional persistence of outputs.
- Parameters:
None.
- Returns:
This method does not return a value.
- Return type:
None
This module contains the GraphDiscovery class, which is responsible for creating, fitting, and evaluating causal discovery experiments.
- class GraphDiscovery(experiment_name=None, model_type='rex', csv_filename=None, true_dag_filename=None, verbose=False, seed=42, device=None, parallel_jobs=0, bootstrap_parallel_jobs=0, max_shap_samples=None)
Bases: object
- __init__(experiment_name=None, model_type='rex', csv_filename=None, true_dag_filename=None, verbose=False, seed=42, device=None, parallel_jobs=0, bootstrap_parallel_jobs=0, max_shap_samples=None)
Initialize a graph discovery workflow and optionally load dataset metadata.
This constructor sets up the estimator, loads the CSV metadata, and prepares train/test splits when both an experiment name and CSV path are provided. If neither is provided, it leaves the instance in an empty state so it can be configured later.
- Parameters:
experiment_name (str, optional) – The name of the experiment.
model_type (str, optional) – The type of model to use. Valid options are: ‘rex’, ‘pc’, ‘fci’, ‘ges’, ‘lingam’, ‘cam’, ‘notears’.
csv_filename (str, optional) – The filename of the CSV file containing the data.
true_dag_filename (str, optional) – The filename of the DOT file containing the true causal graph.
verbose (bool, optional) – Whether to print verbose output.
seed (int, optional) – The random seed for reproducibility.
device (Optional[str], optional) – Device selection for regressors.
parallel_jobs (int, optional) – Number of parallel jobs for CPU training.
bootstrap_parallel_jobs (int, optional) – Number of parallel jobs for bootstrap.
max_shap_samples (Optional[int], optional) – Cap for SHAP background samples.
- Returns:
This method does not return a value.
- Return type:
None
- create_experiments()
Create an Experiment object for each regressor configured on the instance.
This uses the dataset metadata and train/test indices prepared during initialization, and prepares the trainer map without fitting models.
- Parameters:
None.
- Returns:
A dictionary of Experiment objects keyed by trainer name.
- Return type:
Dict[str, Experiment]
- fit_experiments(hpo_iterations=None, bootstrap_iterations=None, prior=None, bootstrap_tolerance=None, quiet=False, **kwargs)
Fit the Experiment objects prepared by create_experiments.
This configures estimator-specific options (ReX vs. other methods) and forwards any additional keyword arguments to fit_predict.
- Parameters:
hpo_iterations (Optional[int]) – Number of HPO trials for ReX.
bootstrap_iterations (Optional[int]) – Number of bootstrap trials for ReX.
prior (Optional[List[List[str]]]) – Optional prior constraints.
bootstrap_tolerance (Optional[float]) – Threshold for bootstrapped adjacency matrix filtering.
quiet (bool) – Disable verbose output and progress indicators.
**kwargs (Any) – Additional keyword arguments forwarded to fit_predict.
- Returns:
This method does not return a value.
- Return type:
None
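To make the bootstrap_tolerance parameter concrete, here is a minimal sketch of thresholding a bootstrapped adjacency matrix. It assumes that each bootstrap trial yields a 0/1 adjacency matrix, that the matrices are averaged into edge frequencies, and that an edge is kept when its frequency is at least the tolerance. Those details, and the function names, are assumptions for illustration. The package may aggregate or compare differently.

```python
from typing import List

Matrix = List[List[float]]


def edge_frequencies(adjacencies: List[List[List[int]]]) -> Matrix:
    """Average 0/1 adjacency matrices from several bootstrap runs."""
    n_runs = len(adjacencies)
    size = len(adjacencies[0])
    return [
        [sum(adj[i][j] for adj in adjacencies) / n_runs for j in range(size)]
        for i in range(size)
    ]


def filter_by_tolerance(freq: Matrix, tolerance: float) -> List[List[int]]:
    """Keep an edge only if its bootstrap frequency reaches the tolerance."""
    return [[1 if value >= tolerance else 0 for value in row] for row in freq]


# Three hypothetical bootstrap runs over the nodes (0, 1, 2).
runs = [
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 0, 1], [0, 0, 0]],
]
freq = edge_frequencies(runs)
kept = filter_by_tolerance(freq, tolerance=0.5)
```

In this toy input the edge 0 → 1 appears in every run, 1 → 2 in two of three, and 0 → 2 in one of three, so a tolerance of 0.5 keeps the first two edges and drops the third.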
- combine_and_evaluate_dags(prior=None, combine_op='union')
Combine or select DAGs from experiments and compute evaluation metrics.
For non-ReX estimators this simply selects the single model DAG. For ReX, it combines multiple DAGs (currently the first two) using the requested union or intersection operation before evaluation.
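- Parameters:
prior (Optional[List[List[str]]], optional) – Optional prior constraints.
combine_op (str, optional) – Operation used to combine DAGs in ReX, either union or intersection. Defaults to ‘union’.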
- Returns:
The experiment object with the final DAG and metrics.
- Return type:
Experiment
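The union and intersection operations named by combine_op can be illustrated on plain edge sets. The sketch below is independent of the package: it represents each DAG as a set of (source, target) pairs and uses hypothetical function names. It also shows that the union of two DAGs is not guaranteed to be acyclic, which is presumably why helpers such as break_cycles_if_present() are listed for the common subpackage.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def combine_edges(first: Set[Edge], second: Set[Edge], combine_op: str = "union") -> Set[Edge]:
    """Combine two edge sets with the requested set operation."""
    if combine_op == "union":
        return first | second
    if combine_op == "intersection":
        return first & second
    raise ValueError(f"Unsupported combine_op: {combine_op!r}")


def is_acyclic(edges: Set[Edge]) -> bool:
    """Kahn's algorithm: the graph is a DAG if every node can be removed."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [node for node in nodes if indegree[node] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for src, dst in edges:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return visited == len(nodes)


dag_a = {("A", "B"), ("B", "C")}
dag_b = {("A", "B"), ("C", "A")}
union_edges = combine_edges(dag_a, dag_b, "union")
common_edges = combine_edges(dag_a, dag_b, "intersection")
```

Here the union contains the cycle A → B → C → A, while the intersection keeps only the edge both graphs agree on.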
- run(hpo_iterations=None, bootstrap_iterations=None, prior=None, bootstrap_tolerance=None, quiet=False, combine_op='union', **kwargs)
Run the full experiment pipeline from creation to evaluation.
This is a convenience wrapper that creates experiments, fits them, and combines/evaluates the resulting DAGs in one call.
- Parameters:
hpo_iterations (int, optional) – Number of HPO trials for ReX. Defaults to None.
bootstrap_iterations (int, optional) – Number of bootstrap trials for ReX. Defaults to None.
prior (Optional[List[List[str]]], optional) – Optional prior constraints to pass to ReX.
bootstrap_tolerance (float, optional) – Threshold to apply to the bootstrapped adjacency matrix. Defaults to None.
quiet (bool, optional) – Disable verbose output and progress indicators. Defaults to False.
combine_op (str, optional) – Operation used to combine DAGs in ReX. Defaults to ‘union’.
**kwargs (Any) – Additional keyword arguments forwarded to fit_experiments.
- Returns:
This method does not return a value.
- Return type:
None
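A typical call sequence, using only the signatures documented on this page, might look like the sketch below. The wrapper function, the experiment name, and the iteration counts are hypothetical placeholders. The import is done inside the function so the snippet loads even where the package is not installed.

```python
def discover_and_export(csv_path: str, dot_path: str, name: str = "my_experiment"):
    """Run the documented pipeline end to end and export the resulting DAG."""
    # Lazy import: the module defining this function loads without the package.
    from causalexplain import GraphDiscovery

    discoverer = GraphDiscovery(
        experiment_name=name,
        model_type="rex",
        csv_filename=csv_path,
        seed=42,
    )
    # run() is documented as a wrapper that creates the experiments, fits them,
    # and combines/evaluates the resulting DAGs. The counts are placeholders.
    discoverer.run(hpo_iterations=10, bootstrap_iterations=20, combine_op="union")
    # export() is documented as an alias for export_dag().
    discoverer.export(dot_path)
    return discoverer.model
```

Calling the three steps individually (create_experiments, fit_experiments, combine_and_evaluate_dags) should be preferable when the fitted experiments need to be inspected before the DAGs are combined.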
- save(full_filename_path)
Save the current trainer state to disk.
This is a convenience alias for save_model.
- Parameters:
full_filename_path (str) – Full path to the output pickle file.
- Returns:
This method does not return a value.
- Return type:
None
- save_model(full_filename_path)
Save the model as an Experiment object.
Use this after fitting to persist the trainer state for later reuse or analysis.
- Parameters:
full_filename_path (str) – Full path, including the filename, where the model is saved.
- Returns:
This method does not return a value.
- Return type:
None
- load(model_path)
Load a saved trainer state from disk.
This is a convenience alias for load_model.
- Parameters:
model_path (str) – Path to the pickle file.
- Returns:
The loaded trainer dictionary.
- Return type:
Dict[str, Experiment]
- load_model(model_path)
Load the model from a pickle file.
This restores the trainer dictionary and updates the cached DAG/metrics on the current instance.
- Parameters:
model_path (str) – Path to the pickle file containing the model.
- Returns:
The loaded trainer dictionary.
- Return type:
Dict[str, Experiment]
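The save and load methods are described as writing and reading a pickle file that holds the trainer dictionary. A minimal round trip with the standard pickle module looks like this. The helper names and the stand-in dictionary (plain dicts keyed by the hypothetical trainer names "nn" and "gbt") are assumptions for illustration. The real values would be Experiment objects.

```python
import os
import pickle
import tempfile
from typing import Any, Dict


def save_trainer(trainer: Dict[str, Any], full_filename_path: str) -> None:
    """Serialize the trainer dictionary to a pickle file."""
    with open(full_filename_path, "wb") as fh:
        pickle.dump(trainer, fh)


def load_trainer(model_path: str) -> Dict[str, Any]:
    """Read a trainer dictionary back from a pickle file."""
    with open(model_path, "rb") as fh:
        return pickle.load(fh)


# Stand-in for a Dict[str, Experiment]; keys and contents are hypothetical.
trainer = {
    "nn": {"dag_edges": [("A", "B")]},
    "gbt": {"dag_edges": [("A", "B"), ("B", "C")]},
}
path = os.path.join(tempfile.mkdtemp(), "trainer.pickle")
save_trainer(trainer, path)
restored = load_trainer(path)
```

Unpickling can execute arbitrary code, so pickle files should only be loaded when they come from a trusted source.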
- printout_results(graph, metrics, combine_op)
Print the DAG and metrics to stdout in a readable, hierarchical format.
This is intended for CLI runs where the user needs a quick textual summary of the discovered graph, optional evaluation metrics, and the sampling strategy used during estimation.
- export(output_file)
Export the current DAG to a DOT file.
This is a convenience alias for export_dag.
- Parameters:
output_file (str) – Path to the output DOT file.
- Returns:
This method does not return a value.
- Return type:
None
- export_dag(output_file)
Export the most recent DAG to a DOT file.
This is typically called after training to persist the discovered causal graph for external inspection or visualization.
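For readers unfamiliar with the DOT format that export and export_dag write, the sketch below serializes a list of directed edges by hand. It is not the package's exporter, and the function names are hypothetical. The exact layout of the file the package produces may differ.

```python
from typing import Iterable, Tuple


def edges_to_dot(edges: Iterable[Tuple[str, str]], name: str = "G") -> str:
    """Render directed edges as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    # Sort edges so the output is stable across runs.
    for src, dst in sorted(edges):
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


def write_dot(edges: Iterable[Tuple[str, str]], output_file: str) -> None:
    """Write the DOT text to a file."""
    with open(output_file, "w", encoding="utf-8") as fh:
        fh.write(edges_to_dot(edges))


dot_text = edges_to_dot([("B", "C"), ("A", "B")])
```

The resulting file can be rendered with the Graphviz command-line tools, for example `dot -Tpdf graph.dot -o graph.pdf`.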
- plot(show_metrics=False, show_node_fill=True, title=None, ax=None, figsize=(5, 5), dpi=75, save_to_pdf=None, layout='dot', **kwargs)
Plot the current DAG using networkx and matplotlib utilities.
Use this to visualize the discovered graph after training, optionally overlaying evaluation metrics or saving the figure to a PDF file.
- Parameters:
show_metrics (bool, optional) – Whether to show metrics on the plot.
show_node_fill (bool, optional) – Whether to fill nodes with color.
title (Optional[str], optional) – Title for the plot.
ax (Optional[Axes], optional) – Matplotlib axes to draw on.
figsize (Tuple[int, int], optional) – Figure size in inches.
dpi (int, optional) – Figure DPI.
save_to_pdf (Optional[str], optional) – Path to save the plot as PDF.
layout (str, optional) – Layout engine to use (‘dot’ or ‘circular’).
**kwargs (Any) – Additional keyword arguments forwarded to plot.dag.
- Returns:
This method does not return a value.
- Return type:
None
- plot_interactive(ui_parent, show_metrics=False, show_node_fill=True, title=None, layout='dagre', rank_dir='TB', width='900px', height='500px', persist_positions=True, on_node_click=None, on_edge_click=None, root_causes=None, **kwargs)
Render the current DAG in a NiceGUI container using Cytoscape.js.
- Example (within a NiceGUI page):
>>> from nicegui import ui
>>> with ui.column() as container:
...     discoverer.plot_interactive(container, layout="dagre", rank_dir="LR")
- Parameters:
ui_parent (Any) – NiceGUI container to attach the visualization.
show_metrics (bool, optional) – Reserved for parity with plot().
show_node_fill (bool, optional) – Whether to apply node fill based on scores.
title (Optional[str], optional) – Title to show above the graph.
layout (str, optional) – “dagre” or “elk”.
rank_dir (str, optional) – Layout direction (“LR”, “RL”, “TB”, “BT”).
width (str, optional) – CSS width of the graph container.
height (str, optional) – CSS height of the graph container.
persist_positions (bool, optional) – Persist node positions on drag.
on_node_click (Callable, optional) – Callback receiving the node id.
on_edge_click (Callable, optional) – Callback receiving edge id and classes.
root_causes (Optional[List[str]]) – Nodes to emphasize with a thicker border.
**kwargs (Any) – Reserved for future styling options.
- Returns:
Persisted node positions keyed by id.
- Return type:
- property model: Experiment
Return the most recent Experiment from the trainer map.
This is commonly used after training to access the final DAG or metrics.
- Parameters:
None.
- Returns:
The most recently added Experiment instance.
- Return type:
Experiment
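One simple way to obtain "the most recently added" entry of a trainer map relies on the fact that Python dictionaries preserve insertion order (guaranteed since Python 3.7). The helper below is a hypothetical sketch of that idea, not the property's actual implementation.

```python
from typing import Dict, TypeVar

T = TypeVar("T")


def most_recent(trainer: Dict[str, T]) -> T:
    """Return the value that was inserted last into the dictionary."""
    if not trainer:
        raise ValueError("trainer map is empty")
    # Dict values are iterated in insertion order, so the last one is newest.
    return list(trainer.values())[-1]


# Hypothetical trainer names mapped to placeholder values.
latest = most_recent({"nn": "first experiment", "gbt": "second experiment"})
```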
Module contents
CausalExplain: A Python package for causal discovery and inference.
This package provides tools for discovering and analyzing causal relationships in data using various methods and algorithms.
- class GraphDiscovery(experiment_name=None, model_type='rex', csv_filename=None, true_dag_filename=None, verbose=False, seed=42, device=None, parallel_jobs=0, bootstrap_parallel_jobs=0, max_shap_samples=None)
Bases: object
Exported at the package level. Its constructor, methods, and model property are the same as those documented for GraphDiscovery under Submodules above.