causalexplain.estimators.ges package#
Submodules#
Module containing the DecomposableScore class, inherited by all classes which implement a locally decomposable score for directed acyclic graphs. By default, the class also caches the results of computing local scores.
NOTE: It is not mandatory to inherit this class when developing custom scores to use with the GES implementation in ges.py. The only requirement is that the class defines:
the local_score function (see below),
an attribute “p” for the total number of variables.
- class DecomposableScore(data, cache=True, debug=0)[source]#
Bases:
object
Methods
local_score
(x, pa)Return the local score of a given node and a set of parents.
The main module, containing the implementation of GES, including the logic for the insert, delete and turn operators. The implementation is directly based on two papers:
1. The 2002 GES paper by Chickering, “Optimal Structure Identification With Greedy Search” - https://www.jmlr.org/papers/volume3/chickering02b/chickering02b.pdf
2. For the turn operator, the 2012 GIES paper by Hauser & Bühlmann, “Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs” - https://www.jmlr.org/papers/volume13/hauser12a/hauser12a.pdf
Further credit is given where due.
Additional modules / packages:
ges.utils contains auxiliary functions and the logic to transform a PDAG into a CPDAG, used after each application of an operator.
- ges.scores contains the modules with the score classes:
ges.scores.decomposable_score contains the base class for decomposable score classes (see that module for more details).
ges.scores.gauss_obs_l0_pen contains a cached implementation of the gaussian likelihood BIC score used in the original GES paper.
ges.test contains the modules with the unit tests and tests comparing against the algorithm’s implementation in the R package ‘pcalg’.
- class GES(name, phases=None, iterate=False, debug=0, verbose=False)[source]#
Bases:
object
The GES (Greedy Equivalence Search) algorithm for learning causal graphs from observational data.
- Parameters:
phases (list, optional) – Which phases of the GES procedure are run, and in which order. Defaults to [‘forward’, ‘backward’, ‘turning’].
iterate (bool, default=False) – Indicates whether the given phases should be iterated more than once.
debug (int, optional) – Debug level for printing traces. Higher values correspond to increased verbosity.
- dag#
The adjacency matrix of the estimated CPDAG (completed partially directed acyclic graph).
- Type:
- ges_adjmat#
The adjacency matrix of the estimated PDAG (partially directed acyclic graph) before completion.
- Type:
- fit_bic(data, A0=None, iterate=False, debug=0)[source]#
Run GES on the given data using the Gaussian BIC score.
Examples
>>> import numpy as np >>> from causalexplain.estimators.ges import GES >>> data = np.array([[3.23125779, 3.24950062, 13.430682, 24.67939513], ... [1.90913354, -0.06843781, 6.93957057, 16.10164608], ... [2.68547149, 1.88351553, 8.78711076, 17.18557716], ... [0.16850822, 1.48067393, 5.35871419, 11.82895779], ... [0.07355872, 1.06857039, 2.05006096, 3.07611922]]) >>> ges = GES() >>> ges.fit(data)
- Attributes:
- dag
- feature_names
- ges_adjmat
- ges_score
- iterate
- metrics
- phases
Methods
backward_step
(A, cache[, debug])Scores all valid delete operators that can be applied to the current CPDAG A, and applies the highest scoring one.
delete_operator
(x, y, H, A)Applies the delete operator: 1) deletes the edge x -> y or x - y 2) for every node h in H * orients the edge y -> h * if the edge with x is undirected, orients it as x -> h
fit_bic
(data[, A0])Run GES on the given data, using the Gaussian BIC score (l0-penalized Gaussian Likelihood).
fit_predict
(train, test[, ref_graph, A0])Fit the GES algorithm to the given data and return the adjacency matrix of the estimated CPDAG.
forward_step
(A, cache[, debug])Scores all valid insert operators that can be applied to the current CPDAG A, and applies the highest scoring one.
insert
(x, y, T, A)Applies the insert operator: 1) adds the edge x -> y 2) for all t in T, orients the previously undirected edge t -> y
score_valid_delete_operators
(x, y, A, cache)Generate and score all valid delete(x,y,H) operators involving the edge x -> y or x - y, and all possible subsets H of neighbors of y which are adjacent to x.
score_valid_insert_operators
(x, y, A, cache)Generate and score all valid insert(x,y,T) operators involving the edge x-> y, and all possible subsets T of neighbors of y which are NOT adjacent to x.
score_valid_turn_operators
(x, y, A, cache[, ...])Generate and score all valid turn(x,y,C) operators that can be applied to the edge x <- y or x - y, iterating through the valid subsets C of neighbors of y.
score_valid_turn_operators_dir
(x, y, A, cache)Logic for finding and scoring the valid turn operators that can be applied to the edge x <- y.
score_valid_turn_operators_undir
(x, y, A, cache)Logic for finding and scoring the valid turn operators that can be applied to the edge x - y.
turn
(x, y, C, A)Applies the turning operator: For an edge x - y or x <- y, 1) orients the edge as x -> y 2) for all c in C, orients the previously undirected edge c -> y
turning_step
(A, cache[, debug])Scores all valid turn operators that can be applied to the current CPDAG A, and applies the highest scoring one.
fit
- dag = None#
- ges_adjmat = None#
- ges_score = None#
- is_fitted_ = False#
- feature_names = None#
- metrics = None#
- phases = None#
- iterate = None#
- debug = 0#
- fit_predict(train, test, ref_graph=None, A0=None)[source]#
Fit the GES algorithm to the given data and return the adjacency matrix of the estimated CPDAG.
- Parameters:
data (numpy.ndarray) – The n x p array containing the observations, where columns correspond to variables and rows to observations.
labels (list, optional) – The list of variable names. If not given, the columns of data are used as labels.
A0 (numpy.ndarray, optional) – The initial CPDAG on which GES will run, where where A0[i,j] != 0 implies the edge i -> j and A[i,j] != 0 & A[j,i] != 0 implies the edge i - j. Defaults to the empty graph (i.e. matrix of zeros).
- Returns:
estimate – The adjacency matrix of the estimated CPDAG.
- Return type:
- Raises:
TypeError: – If the type of some of the parameters was not expected, e.g. if data is not a numpy array.
ValueError: – If the value of some of the parameters is not appropriate, e.g. a wrong phase is specified.
- fit_bic(data, A0=None)[source]#
Run GES on the given data, using the Gaussian BIC score (l0-penalized Gaussian Likelihood). The data is not assumed to be centered, i.e. an intercept is fitted.
To use a custom score, see ges.fit.
- Parameters:
data (numpy.ndarray) – The n x p array containing the observations, where columns correspond to variables and rows to observations.
A0 (numpy.ndarray, optional) – The initial CPDAG on which GES will run, where where A0[i,j] != 0 implies the edge i -> j and A[i,j] != 0 & A[j,i] != 0 implies the edge i - j. Defaults to the empty graph (i.e. matrix of zeros).
- Returns:
estimate (numpy.ndarray) – The adjacency matrix of the estimated CPDAG.
total_score (float) – The score of the estimate.
- Raises:
TypeError: – If the type of some of the parameters was not expected, e.g. if data is not a numpy array.
ValueError: – If the value of some of the parameters is not appropriate, e.g. a wrong phase is specified.
Example
Data from a linear-gaussian SCM (generated using sempler)
>>> import numpy as np >>> data = np.array([[3.23125779, 3.24950062, 13.430682, 24.67939513], ... [1.90913354, -0.06843781, 6.93957057, 16.10164608], ... [2.68547149, 1.88351553, 8.78711076, 17.18557716], ... [0.16850822, 1.48067393, 5.35871419, 11.82895779], ... [0.07355872, 1.06857039, 2.05006096, 3.07611922]])
Run GES using the gaussian BIC score:
>>> from causalexplain.estimators.ges import GES >>> ges = GES() >>> ges.fit_bic(data) (array([[0, 1, 1, 0], [0, 0, 0, 0], [1, 1, 0, 1], [0, 1, 1, 0]]), 15.674267611628233)
- forward_step(A, cache, debug=0)[source]#
Scores all valid insert operators that can be applied to the current CPDAG A, and applies the highest scoring one.
- Parameters:
A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.
debug (int, optional) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity
- Returns:
score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).
new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).
- backward_step(A, cache, debug=0)[source]#
Scores all valid delete operators that can be applied to the current CPDAG A, and applies the highest scoring one.
- Parameters:
A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity
- Returns:
score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).
new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).
- turning_step(A, cache, debug=0)[source]#
Scores all valid turn operators that can be applied to the current CPDAG A, and applies the highest scoring one.
- Parameters:
A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity
- Returns:
score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).
new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).
- insert(x, y, T, A)[source]#
Applies the insert operator: 1) adds the edge x -> y 2) for all t in T, orients the previously undirected edge t -> y
- Parameters:
- Returns:
new_A – the adjacency matrix resulting from applying the operator
- Return type:
np.array
- score_valid_insert_operators(x, y, A, cache, debug=0)[source]#
Generate and score all valid insert(x,y,T) operators involving the edge x-> y, and all possible subsets T of neighbors of y which are NOT adjacent to x.
- Parameters:
x (int) – the origin node (i.e. x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity
- Returns:
valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix
- Return type:
list of tuples
- delete_operator(x, y, H, A)[source]#
Applies the delete operator: 1) deletes the edge x -> y or x - y 2) for every node h in H
orients the edge y -> h
if the edge with x is undirected, orients it as x -> h
Note that H must be a subset of the neighbors of y which are adjacent to x. A ValueError exception is thrown otherwise.
- Parameters:
- Returns:
new_A – the adjacency matrix resulting from applying the operator
- Return type:
np.array
- score_valid_delete_operators(x, y, A, cache, debug=0)[source]#
Generate and score all valid delete(x,y,H) operators involving the edge x -> y or x - y, and all possible subsets H of neighbors of y which are adjacent to x.
- Parameters:
x (int) – the “origin” node (i.e. x -> y or x - y)
y (int) – the “target” node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity
- Returns:
valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix
- Return type:
list of tuples
- turn(x, y, C, A)[source]#
Applies the turning operator: For an edge x - y or x <- y, 1) orients the edge as x -> y 2) for all c in C, orients the previously undirected edge c -> y
- score_valid_turn_operators(x, y, A, cache, debug=0)[source]#
Generate and score all valid turn(x,y,C) operators that can be applied to the edge x <- y or x - y, iterating through the valid subsets C of neighbors of y.
- Parameters:
x (int) – the origin node (i.e. orient x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity
- Returns:
valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix
- Return type:
list of tuples
- score_valid_turn_operators_dir(x, y, A, cache, debug=0)[source]#
Logic for finding and scoring the valid turn operators that can be applied to the edge x <- y.
- Parameters:
x (int) – the origin node (i.e. x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (bool or string) – if debug traces should be printed (True/False). If a non-empty string is passed, traces are printed with the given string as prefix (useful for indenting the prints from a calling function)
- Returns:
valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix
- Return type:
list of tuples
- score_valid_turn_operators_undir(x, y, A, cache, debug=0)[source]#
Logic for finding and scoring the valid turn operators that can be applied to the edge x - y.
- Parameters:
x (int) – the origin node (i.e. x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (bool or string) – if debug traces should be printed (True/False). If a non-empty string is passed, traces are printed with the given string as prefix (useful for indenting the prints from a calling function)
- Returns:
valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix
- Return type:
list of tuples
- main(dataset_name, input_path='/Users/renero/phd/data/sachs/', output_path='/Users/renero/phd/output/RC4/sachs/compared', save=False)[source]#
- class GaussObsL0Pen(data, lmbda=None, method='scatter', cache=True, debug=0)[source]#
Bases:
DecomposableScore
Implements a cached l0-penalized gaussian likelihood score.
Methods
full_score
(A)Given a DAG adjacency A, return the l0-penalized log-likelihood of a sample from a single environment, by finding the maximum likelihood estimates of the corresponding connectivity matrix (weights) and noise term variances.
local_score
(x, pa)Return the local score of a given node and a set of parents.
- __init__(data, lmbda=None, method='scatter', cache=True, debug=0)[source]#
Creates a new instance of the class.
- Parameters:
data (numpy.ndarray) – the nxp matrix containing the observations of each variable (each column corresponds to a variable).
lmbda (float or NoneType, optional) – the regularization parameter. If None, defaults to the BIC score, i.e. lmbda = 1/2 * log(n), where n is the number of observations.
method ({'scatter', 'raw'}, optional) – the method used to compute the likelihood. If ‘scatter’, the empirical covariance matrix (i.e. scatter matrix) is used. If ‘raw’, the likelihood is computed from the raw data. In both cases an intercept is fitted.
cache (bool, optional) – if computations of the local score should be cached for future calls. Defaults to True.
debug (int, optional) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity.
- full_score(A)[source]#
Given a DAG adjacency A, return the l0-penalized log-likelihood of a sample from a single environment, by finding the maximum likelihood estimates of the corresponding connectivity matrix (weights) and noise term variances.
- Parameters:
A (np.array) – The adjacency matrix of a DAG, where A[i,j] != 0 => i -> j.
- Returns:
score – the penalized log-likelihood score.
- Return type:
Module containing the auxiliary functions used in the implementation of GES, including the PDAG to CPDAG conversion algorithm described in Chickering’s original GES paper from 2002.
- neighbors(i, A)[source]#
The neighbors of i in A, i.e. all nodes connected to i by an undirected edge.
- adj(i, A)[source]#
The adjacent nodes of i in A, i.e. all nodes connected by a directed or undirected edge. :param i: the node’s index :type i: int :param A: the adjacency matrix of the graph, where A[i,j] != 0 => i -> j
and A[i,j] != 0 & A[j,i] != 0 => i - j.
- Returns:
nodes – the adjacent nodes
- Return type:
set of ints
- ch(i, A)[source]#
The children of i in A.
- Parameters:
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
- Returns:
nodes – the children nodes
- Return type:
set of ints
- is_dag(A)[source]#
Checks wether the given adjacency matrix corresponds to a DAG.
- Parameters:
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.
- Returns:
is_dag – if the adjacency corresponds to a DAG
- Return type:
- topological_ordering(A)[source]#
Return a topological ordering for the DAG with adjacency matrix A, using Kahn’s 1962 algorithm.
Raises a ValueError exception if the given adjacency does not correspond to a DAG.
- Parameters:
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.
- Returns:
ordering – a topological ordering for the DAG
- Return type:
list of ints
- Raises:
ValueError : – If the given adjacency does not correspond to a DAG.
- semi_directed_paths(fro, to, A)[source]#
Return all paths from i to j in A. Note: a path is a sequence (a_1,…,a_n) of non-repeating nodes where either a_i -> a_i+1 or a_i - a_i+1 are edges in the PDAG A.
- separates(S, A, B, G)[source]#
Returns true if the set S separates A from B in G, i.e. if all paths in G from nodes in A to nodes in B contain a node in S. Exception is raised if S,A and B are not pairwise disjoint.
- Parameters:
- Returns:
separated – if S separates A from B in G
- Return type:
- chain_component(i, G)[source]#
Return all nodes in the connected component of node i after dropping all directed edges in G.
- induced_subgraph(S, G)[source]#
Remove all edges which are not between nodes in S.
- Parameters:
S (set of ints) – a set of node indices
G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j
- Returns:
subgraph – the adjacency matrix of the resulting graph where all edges between nodes not in S are removed. Note that this is not really the subgraph, as the nodes not in S still appear as disconnected nodes.
- Return type:
np.array
- vstructures(A)[source]#
Return the v-structures of a DAG or PDAG, given its adjacency matrix.
- Parameters:
A (np.array) – The adjacency of the (P)DAG, where A[i,j] != 0 => i->j
- Returns:
vstructs – the set of v-structures, where every v-structure is a three element tuple, e.g. (i,j,k) represents the v-structure i -> j <- k, where i < j for consistency.
- Return type:
set()
- only_directed(P)[source]#
Return the graph with the same nodes as P and only its directed edges.
- Parameters:
P (np.array) – adjacency matrix of a graph
- Returns:
G – adjacency matrix of the graph with the same nodes as P and only its directed edges
- Return type:
np.array
- only_undirected(P)[source]#
Return the graph with the same nodes as P and only its undirected edges.
- Parameters:
P (np.array) – adjacency matrix of a graph
- Returns:
G – adjacency matrix of the graph with the same nodes as P and only its undirected edges
- Return type:
np.array
- skeleton(A)[source]#
Return the skeleton of a given graph.
- Parameters:
A (np.array) – adjacency matrix of a graph
- Returns:
S – adjacency matrix of the skeleton, i.e. the graph resulting from dropping all edge orientations
- Return type:
np.array
- is_consistent_extension(G, P, debug=False)[source]#
Returns True if the DAG G is a consistent extension of the PDAG P. Will raise a ValueError exception if the graph G is not a DAG (i.e. cycles or undirected edges).
- pdag_to_cpdag(pdag)[source]#
Transform a PDAG into its corresponding CPDAG. Returns a ValueError exception if the given PDAG does not admit a consistent extension.
- Parameters:
pdag (np.array) – the adjacency matrix of a given PDAG where pdag[i,j] != 0 if i -> j and i - j if also pdag[j,i] != 0.
- Returns:
cpdag – the adjacency matrix of the corresponding CPDAG
- Return type:
np.array
- dag_to_cpdag(G)[source]#
Return the completed partially directed acyclic graph (CPDAG) that represents the Markov equivalence class of a given DAG. Returns a ValueError exception if the given graph is not a DAG.
- Parameters:
G (np.array) – the adjacency matrix of the given graph, where G[i,j] != 0 iff i -> j
- Returns:
cpdag – the adjacency matrix of the corresponding CPDAG
- Return type:
np.array
- pdag_to_dag(P, debug=False)[source]#
Find a consistent extension of the given PDAG. Return a ValueError exception if the PDAG does not admit a consistent extension.
- Parameters:
P (np.array) – adjacency matrix representing the PDAG connectivity, where P[i,j] = 1 => i->j
debug (bool, optional) – if debugging traces should be printed
- Returns:
G – the adjacency matrix of a DAG which is a consistent extension (i.e. same v-structures and skeleton) of P.
- Return type:
np.array
- order_edges(G)[source]#
Find a total ordering of the edges in DAG G, as an intermediate step to obtaining the CPDAG representing the Markov equivalence class to which it belongs. Raises a ValueError exception if G is not a DAG.
- Parameters:
G (np.array) – the adjacency matrix of a graph G, where G[i,j] != 0 iff i -> j.
- Returns:
ordered – the adjacency matrix of the graph G, but with labelled edges, i.e. i -> j is has label x iff ordered[i,j] = x.
- Return type:
np.array
- label_edges(ordered)[source]#
Given a DAG with edges labelled according to a total ordering, label each edge as being compelled or reverisble.
- Parameters:
ordered (np.array) – the adjacency matrix of a graph, with the edges labelled according to a total ordering.
- Returns:
labelled –
- the adjacency matrix of G but with labelled edges, where
labelled[i,j] = 1 iff i -> j is compelled, and
labelled[i,j] = -1 iff i -> j is reversible.
- Return type:
np.array
- cartesian(arrays, out=None, dtype=<class 'numpy.int8'>)[source]#
Generate a cartesian product of input arrays.
- Parameters:
arrays (list of array-like) – 1-D arrays to form the cartesian product of.
out (ndarray) – Array to place the cartesian product in.
- Returns:
out – 2-D array of shape (M, len(arrays)) containing cartesian products formed of input arrays.
- Return type:
ndarray
Examples
>>> cartesian(([1, 2, 3], [4, 5], [6, 7])) array([[1, 4, 6], [1, 4, 7], [1, 5, 6], [1, 5, 7], [2, 4, 6], [2, 4, 7], [2, 5, 6], [2, 5, 7], [3, 4, 6], [3, 4, 7], [3, 5, 6], [3, 5, 7]], dtype=int8)
- sort(L, order=None)[source]#
Sort the elements in an iterable according to its pre-defined ‘sorted’ function, or according to a given order: i will precede j if i precedes j in the order.
- Parameters:
L (iterable) – the iterable to be sorted
order (iterable or None, optional) – a given ordering. In the sorted result, i will precede j if i precedes j in order. If None, the predefined ‘sorted’ function of the iterator will be used. Defaults to None.
- Returns:
ordered – a list containing the elements of L, sorted from lesser to greater or according to the given order.
- Return type:
- subsets(S)[source]#
Return an iterator with all possible subsets of the set S.
- Parameters:
S (set) – a given set
- Returns:
subsets – an iterable over the subsets of S, in increasing size
- Return type:
iterable
- delete(array, mask, axis=None)[source]#
Wrapper for numpy.delete, which adapts the call depending on the numpy version (the API changed on 1.19.0)
- Parameters:
array (array_like) – Input array.
mask (boolean array) – Specifies the sub-arrays to remove along the given axis
axis (int) – The axis along which to delete the subarrays specified by mask
- Returns:
out – a copy of array with the elements specified by mask removed
- Return type:
ndarray