causalexplain package

Subpackages

Submodules

parse_args()

Parse CLI arguments for the causal discovery runner.

Parameters:

None.

Returns:

Parsed command-line arguments.

Return type:

argparse.Namespace
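As a rough illustration of what a parser of this kind can look like, the sketch below builds an argparse parser and returns a Namespace. The helper name build_parser and the flags --dataset, --method, --seed and --verbose are hypothetical and are not taken from the causalexplain CLI; only the argparse calls themselves are standard.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with illustrative (hypothetical) flags."""
    parser = argparse.ArgumentParser(description="Causal discovery runner (sketch)")
    parser.add_argument("--dataset", required=True, help="Path to the input CSV file")
    parser.add_argument("--method", default="rex", help="Estimator name")
    parser.add_argument("--seed", type=int, default=42, help="Random seed")
    parser.add_argument("--verbose", action="store_true", help="Print verbose output")
    return parser


# Passing an explicit list avoids reading sys.argv, which is convenient in tests.
args = build_parser().parse_args(["--dataset", "data.csv", "--seed", "7"])
print(args.dataset, args.method, args.seed, args.verbose)
```

Flags that are not supplied fall back to their declared defaults, so the returned Namespace always carries every attribute.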

check_args_validity(args)

Validate CLI arguments and derive runtime configuration values.

This performs file existence checks and computes defaults that drive the end-to-end experiment run.

Parameters:

args (argparse.Namespace) – Parsed command-line arguments.

Returns:

A dictionary of validated run values.

Return type:

Dict[str, Any]
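The validation step described above can be sketched as a function that checks that an input file exists and derives defaults into a dictionary. This is a minimal sketch, not the package's implementation: the function name validate_args, the Namespace attributes (dataset, name, seed) and the dictionary keys are all assumptions made for illustration.

```python
import argparse
import os
import tempfile
from typing import Any, Dict


def validate_args(args: argparse.Namespace) -> Dict[str, Any]:
    """Check that the dataset exists and derive a few defaults (sketch)."""
    if not os.path.isfile(args.dataset):
        raise FileNotFoundError(f"Dataset not found: {args.dataset}")
    # Derive an experiment name from the file name when none is given.
    name = args.name or os.path.splitext(os.path.basename(args.dataset))[0]
    return {"dataset_filepath": args.dataset, "experiment_name": name, "seed": args.seed}


# Usage with a temporary file standing in for a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "toy.csv")
    with open(csv_path, "w") as fh:
        fh.write("a,b\n1,2\n")
    run_values = validate_args(argparse.Namespace(dataset=csv_path, name=None, seed=42))
```

Failing early on a missing file keeps the error close to the user's input rather than deep inside training.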

header_()

Print the ASCII header banner for CLI output.

The banner was created with the “Ogre” font from https://patorjk.com/software/taag/.

Parameters:

None.

Returns:

This function does not return a value.

Return type:

None

show_run_values(run_values)

Print resolved run values for debugging or transparency.

Parameters:

run_values (Dict[str, Any]) – A dictionary of run values.

Returns:

This function does not return a value.

Return type:

None

main()

Run the CLI entry point for causal discovery experiments.

This orchestrates argument parsing, model loading or training, evaluation, and optional persistence of outputs.

Parameters:

None.

Returns:

This function does not return a value.

Return type:

None

This module contains the GraphDiscovery class, which is responsible for creating, fitting, and evaluating causal discovery experiments.

class GraphDiscovery(experiment_name=None, model_type='rex', csv_filename=None, true_dag_filename=None, verbose=False, seed=42, device=None, parallel_jobs=0, bootstrap_parallel_jobs=0, max_shap_samples=None)

Bases: object

__init__(experiment_name=None, model_type='rex', csv_filename=None, true_dag_filename=None, verbose=False, seed=42, device=None, parallel_jobs=0, bootstrap_parallel_jobs=0, max_shap_samples=None)

Initialize a graph discovery workflow and optionally load dataset metadata.

This constructor sets up the estimator, loads the CSV metadata, and prepares train/test splits when both an experiment name and CSV path are provided. If neither is provided, it leaves the instance in an empty state so it can be configured later.

Parameters:
  • experiment_name (str, optional) – The name of the experiment.

  • model_type (str, optional) – The type of model to use. Valid options are: ‘rex’, ‘pc’, ‘fci’, ‘ges’, ‘lingam’, ‘cam’, ‘notears’.

  • csv_filename (str, optional) – The filename of the CSV file containing the data.

  • true_dag_filename (str, optional) – The filename of the DOT file containing the true causal graph.

  • verbose (bool, optional) – Whether to print verbose output.

  • seed (int, optional) – The random seed for reproducibility.

  • device (Optional[str], optional) – Device selection for regressors.

  • parallel_jobs (int, optional) – Number of parallel jobs for CPU training.

  • bootstrap_parallel_jobs (int, optional) – Number of parallel jobs for bootstrap.

  • max_shap_samples (Optional[int], optional) – Cap for SHAP background samples.

Returns:

This method does not return a value.

Return type:

None

create_experiments()

Create an Experiment object for each regressor configured on the instance.

This uses the dataset metadata and train/test indices prepared during initialization, and prepares the trainer map without fitting models.

Parameters:

None.

Returns:

A dictionary of Experiment objects keyed by trainer name.

Return type:

Dict[str, Experiment]

fit_experiments(hpo_iterations=None, bootstrap_iterations=None, prior=None, bootstrap_tolerance=None, quiet=False, **kwargs)

Fit the Experiment objects prepared by create_experiments.

This configures estimator-specific options (ReX vs. other methods) and forwards any additional keyword arguments to fit_predict.

Parameters:
  • hpo_iterations (Optional[int]) – Number of HPO trials for ReX.

  • bootstrap_iterations (Optional[int]) – Number of bootstrap trials for ReX.

  • prior (Optional[List[List[str]]]) – Optional prior constraints.

  • bootstrap_tolerance (Optional[float]) – Threshold for bootstrapped adjacency matrix filtering.

  • quiet (bool) – Disable verbose output and progress indicators.

  • **kwargs (Any) – Additional keyword arguments forwarded to fit_predict.

Returns:

This method does not return a value.

Return type:

None
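To make the role of bootstrap_tolerance more concrete, the sketch below thresholds a matrix of edge frequencies. It rests on two assumptions that the text above does not confirm: that the bootstrapped adjacency matrix holds, for each ordered pair of variables, the fraction of bootstrap trials in which the edge appeared, and that an edge is kept when that fraction is greater than or equal to the tolerance. The function name filter_adjacency is hypothetical.

```python
from typing import List


def filter_adjacency(freq: List[List[float]], tolerance: float) -> List[List[int]]:
    """Keep edge i -> j only when its bootstrap frequency reaches the tolerance."""
    return [[1 if value >= tolerance else 0 for value in row] for row in freq]


# Hypothetical edge frequencies over bootstrap trials for three variables.
frequencies = [
    [0.0, 0.9, 0.2],
    [0.1, 0.0, 0.6],
    [0.0, 0.3, 0.0],
]
adjacency = filter_adjacency(frequencies, tolerance=0.5)
```

Under these assumptions, a higher tolerance yields a sparser graph because fewer edges appear often enough across trials to survive the filter.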

combine_and_evaluate_dags(prior=None, combine_op='union')

Combine or select DAGs from experiments and compute evaluation metrics.

For non-ReX estimators, this simply selects the single model DAG. For ReX, it combines multiple DAGs (currently the first two) using the requested union or intersection operation before evaluation.

Parameters:
  • prior (List[List[str]], optional) – The prior to use for ReX. Defaults to None.

  • combine_op (str, optional) – Operation used to combine DAGs in ReX. Supported values are: ‘union’ and ‘intersection’.

Returns:

The experiment object with the final DAG and metrics.

Return type:

Experiment
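The difference between the two combine_op values can be illustrated on plain edge sets. This is a sketch of the set operations only, not the package's implementation: the helper combine_edges and the toy graphs are hypothetical, and the sketch does not verify that the combined result is still acyclic.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]


def combine_edges(first: Set[Edge], second: Set[Edge], combine_op: str = "union") -> Set[Edge]:
    """Combine two directed edge sets with the requested operation."""
    if combine_op == "union":
        return first | second
    if combine_op == "intersection":
        return first & second
    raise ValueError(f"Unsupported combine_op: {combine_op!r}")


# Two toy graphs over the same variables.
dag_a = {("X", "Y"), ("Y", "Z")}
dag_b = {("X", "Y"), ("X", "Z")}
union_edges = combine_edges(dag_a, dag_b, "union")
common_edges = combine_edges(dag_a, dag_b, "intersection")
```

A union keeps every edge proposed by either graph, which tends to favor recall; an intersection keeps only the edges both graphs agree on, which tends to favor precision.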

run(hpo_iterations=None, bootstrap_iterations=None, prior=None, bootstrap_tolerance=None, quiet=False, combine_op='union', **kwargs)

Run the full experiment pipeline from creation to evaluation.

This is a convenience wrapper that creates experiments, fits them, and combines/evaluates the resulting DAGs in one call.

Parameters:
  • hpo_iterations (int, optional) – Number of HPO trials for ReX. Defaults to None.

  • bootstrap_iterations (int, optional) – Number of bootstrap trials for ReX. Defaults to None.

  • prior (Optional[List[List[str]]], optional) – Optional prior constraints to pass to ReX.

  • bootstrap_tolerance (float, optional) – Threshold to apply to the bootstrapped adjacency matrix. Defaults to None.

  • quiet (bool, optional) – Disable verbose output and progress indicators. Defaults to False.

  • combine_op (str, optional) – Operation used to combine DAGs in ReX. Defaults to ‘union’.

  • **kwargs (Any) – Additional keyword arguments forwarded to fit_experiments.

Returns:

This method does not return a value.

Return type:

None
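A possible end-to-end call sequence, using only the constructor and run() signatures documented on this page, might look like the sketch below. The wrapper function discover, the experiment name "toy" and the file paths passed to it are hypothetical, and the import is performed inside the function so that the sketch can be loaded even where the package is not installed.

```python
from typing import Any, Optional


def discover(csv_path: str, dag_path: Optional[str] = None) -> Any:
    """Run the documented pipeline end to end and return the fitted object."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from causalexplain import GraphDiscovery

    discoverer = GraphDiscovery(
        experiment_name="toy",  # hypothetical experiment name
        model_type="rex",
        csv_filename=csv_path,
        true_dag_filename=dag_path,
        seed=42,
    )
    # run() creates the experiments, fits them, and combines/evaluates the DAGs.
    discoverer.run(combine_op="union")
    return discoverer
```

Because run() is documented as returning None, the sketch returns the GraphDiscovery instance itself so that callers can go on to use it, for example through the model property.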

save(full_filename_path)

Save the current trainer state to disk.

This is a convenience alias for save_model.

Parameters:

full_filename_path (str) – Full path to the output pickle file.

Returns:

This method does not return a value.

Return type:

None

save_model(full_filename_path)

Save the model as an Experiment object.

Use this after fitting to persist the trainer state for later reuse or analysis.

Parameters:

full_filename_path (str) – Full path, including the filename, where the model is saved.

Returns:

This method does not return a value.

Return type:

None

load(model_path)

Load a saved trainer state from disk.

This is a convenience alias for load_model.

Parameters:

model_path (str) – Path to the pickle file.

Returns:

The loaded trainer dictionary.

Return type:

Dict[str, Experiment]

load_model(model_path)

Load the model from a pickle file.

This restores the trainer dictionary and updates the cached DAG/metrics on the current instance.

Parameters:

model_path (str) – Path to the pickle file containing the model.

Returns:

The loaded trainer dictionary.

Return type:

Dict[str, Experiment]
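The save/load pair described above is a pickle round trip over a dictionary of trainers. The sketch below shows that pattern with the standard pickle module; the helpers save_state and load_state and the plain dictionaries standing in for Experiment objects are hypothetical, and the package may store additional state.

```python
import os
import pickle
import tempfile
from typing import Any, Dict


def save_state(trainer: Dict[str, Any], path: str) -> None:
    """Serialize the trainer dictionary to a pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(trainer, fh)


def load_state(path: str) -> Dict[str, Any]:
    """Restore a trainer dictionary from a pickle file."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Plain dictionaries stand in for fitted Experiment objects.
trainer = {"toy_nn": {"edges": [("X", "Y")]}, "toy_gbt": {"edges": [("Y", "Z")]}}
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "toy.pickle")
    save_state(trainer, model_path)
    restored = load_state(model_path)
```

Unpickling can execute arbitrary code, so saved models should only be loaded from trusted sources.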

printout_results(graph, metrics, combine_op)

Print the DAG and metrics to stdout in a readable, hierarchical format.

This is intended for CLI runs where the user needs a quick textual summary of the discovered graph, optional evaluation metrics, and the sampling strategy used during estimation.

Parameters:
  • graph (nx.DiGraph) – The DAG to print.

  • metrics (Optional[Metrics]) – Optional metrics summary to display.

  • combine_op (str) – The DAG combination operation used for labeling.

Returns:

This method does not return a value.

Return type:

None

export(output_file)

Export the current DAG to a DOT file.

This is a convenience alias for export_dag.

Parameters:

output_file (str) – Path to the output DOT file.

Returns:

This method does not return a value.

Return type:

None

export_dag(output_file)

Export the most recent DAG to a DOT file.

This is typically called after training to persist the discovered causal graph for external inspection or visualization.

Parameters:

output_file (str) – Path to the output DOT file.

Returns:

The path to the output DOT file.

Return type:

str
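For readers unfamiliar with the DOT format, the sketch below writes a small directed graph to a DOT file and returns the path, mirroring the return value documented above. The helper write_dot is hypothetical and uses only the standard library; the package may rely on a graph library and emit additional attributes.

```python
import os
import tempfile
from typing import Iterable, Tuple


def write_dot(edges: Iterable[Tuple[str, str]], output_file: str) -> str:
    """Write directed edges as a DOT digraph and return the output path."""
    lines = ["digraph G {"]
    lines += [f'    "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    with open(output_file, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    return output_file


# Write a toy graph and read it back to inspect the result.
with tempfile.TemporaryDirectory() as tmp:
    dot_path = write_dot([("X", "Y"), ("Y", "Z")], os.path.join(tmp, "toy.dot"))
    with open(dot_path) as fh:
        dot_text = fh.read()
```

A file in this form can be rendered with Graphviz, for example with the dot command-line tool.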

plot(show_metrics=False, show_node_fill=True, title=None, ax=None, figsize=(5, 5), dpi=75, save_to_pdf=None, layout='dot', **kwargs)

Plot the current DAG using networkx and matplotlib utilities.

Use this to visualize the discovered graph after training, optionally overlaying evaluation metrics or saving the figure to a PDF file.

Parameters:
  • show_metrics (bool, optional) – Whether to show metrics on the plot.

  • show_node_fill (bool, optional) – Whether to fill nodes with color.

  • title (Optional[str], optional) – Title for the plot.

  • ax (Optional[Axes], optional) – Matplotlib axes to draw on.

  • figsize (Tuple[int, int], optional) – Figure size in inches.

  • dpi (int, optional) – Figure DPI.

  • save_to_pdf (Optional[str], optional) – Path to save the plot as PDF.

  • layout (str, optional) – Layout engine to use (‘dot’ or ‘circular’).

  • **kwargs (Any) – Additional keyword arguments forwarded to plot.dag.

Returns:

This method does not return a value.

Return type:

None

plot_interactive(ui_parent, show_metrics=False, show_node_fill=True, title=None, layout='dagre', rank_dir='TB', width='900px', height='500px', persist_positions=True, on_node_click=None, on_edge_click=None, root_causes=None, **kwargs)

Render the current DAG in a NiceGUI container using Cytoscape.js.

Example (within a NiceGUI page):

>>> from nicegui import ui
>>> with ui.column() as container:
...     discoverer.plot_interactive(container, layout="dagre", rank_dir="LR")

Parameters:
  • ui_parent (Any) – NiceGUI container to attach the visualization.

  • show_metrics (bool, optional) – Reserved for parity with plot().

  • show_node_fill (bool, optional) – Whether to apply node fill based on scores.

  • title (Optional[str], optional) – Title to show above the graph.

  • layout (str, optional) – “dagre” or “elk”.

  • rank_dir (str, optional) – Layout direction (“LR”, “RL”, “TB”, “BT”).

  • width (str, optional) – CSS width of the graph container.

  • height (str, optional) – CSS height of the graph container.

  • persist_positions (bool, optional) – Persist node positions on drag.

  • on_node_click (Callable, optional) – Callback receiving the node id.

  • on_edge_click (Callable, optional) – Callback receiving edge id and classes.

  • root_causes (Optional[List[str]]) – Nodes to emphasize with a thicker border.

  • **kwargs (Any) – Reserved for future styling options.

Returns:

Persisted node positions keyed by id.

Return type:

Dict[str, Dict[str, float]]

property model: Experiment

Return the most recent Experiment from the trainer map.

This is commonly used after training to access the final DAG or metrics.

Parameters:

None.

Returns:

The most recently added Experiment instance.

Return type:

Experiment
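One way to read "most recently added" is as the last entry of an insertion-ordered dictionary. The sketch below shows that lookup; it assumes the trainer map is a regular dict whose insertion order reflects the order in which experiments were added, which this page does not state explicitly. The helper most_recent is hypothetical.

```python
from typing import Any, Dict


def most_recent(trainer: Dict[str, Any]) -> Any:
    """Return the value that was inserted last into the dictionary."""
    # Regular dicts preserve insertion order in Python 3.7 and later.
    return list(trainer.values())[-1]


# Strings stand in for Experiment objects.
trainer = {"first": "exp_1", "second": "exp_2"}
latest = most_recent(trainer)
```

With an empty dictionary this sketch raises IndexError, so a caller would need to fit or load a model before asking for the latest entry.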

Module contents

CausalExplain: A Python package for causal discovery and inference.

This package provides tools for discovering and analyzing causal relationships in data using various methods and algorithms.

class GraphDiscovery(experiment_name=None, model_type='rex', csv_filename=None, true_dag_filename=None, verbose=False, seed=42, device=None, parallel_jobs=0, bootstrap_parallel_jobs=0, max_shap_samples=None)
