causalexplain.estimators.ges package#

Submodules#

Utilities for locally decomposable scores used by GES.

class DecomposableScore(data, cache=True, debug=0)[source]#

Bases: object

__init__(data, cache=True, debug=0)[source]#
local_score(x, pa)[source]#

Return the local score of a given node and a set of parents. If self.cache=True, will use previously computed score if possible.

Parameters:
  • x (int) – a node

  • pa (set of ints) – the node’s parents

Returns:

score – the corresponding score

Return type:

float

Greedy Equivalence Search (GES) implementation.

class GES(name, phases=None, iterate=False, debug=0, verbose=False)[source]#

Bases: object

Greedy Equivalence Search (GES) estimator.

dag = None#
ges_adjmat = None#
ges_score = None#
is_fitted_ = False#
feature_names = None#
metrics = None#
__init__(name, phases=None, iterate=False, debug=0, verbose=False)[source]#
phases = None#
iterate = None#
debug = 0#
fit(X, A0=None)[source]#
fit_predict(train, test, ref_graph=None, A0=None)[source]#

Fit the GES algorithm to the given data and return the adjacency matrix of the estimated CPDAG.

Parameters:
  • data (numpy.ndarray) – The n x p array containing the observations, where columns correspond to variables and rows to observations.

  • labels (list, optional) – The list of variable names. If not given, the columns of data are used as labels.

  • A0 (numpy.ndarray, optional) – The initial CPDAG on which GES will run, where where A0[i,j] != 0 implies the edge i -> j and A[i,j] != 0 & A[j,i] != 0 implies the edge i - j. Defaults to the empty graph (i.e. matrix of zeros).

Returns:

estimate – The adjacency matrix of the estimated CPDAG.

Return type:

numpy.ndarray

Raises:
  • TypeError: – If the type of some of the parameters was not expected, e.g. if data is not a numpy array.

  • ValueError: – If the value of some of the parameters is not appropriate, e.g. a wrong phase is specified.

fit_bic(data, A0=None)[source]#

Run GES on the given data, using the Gaussian BIC score (l0-penalized Gaussian Likelihood). The data is not assumed to be centered, i.e. an intercept is fitted.

To use a custom score, see ges.fit.

Parameters:
  • data (numpy.ndarray) – The n x p array containing the observations, where columns correspond to variables and rows to observations.

  • A0 (numpy.ndarray, optional) – The initial CPDAG on which GES will run, where where A0[i,j] != 0 implies the edge i -> j and A[i,j] != 0 & A[j,i] != 0 implies the edge i - j. Defaults to the empty graph (i.e. matrix of zeros).

Returns:

  • estimate (numpy.ndarray) – The adjacency matrix of the estimated CPDAG.

  • total_score (float) – The score of the estimate.

Raises:
  • TypeError: – If the type of some of the parameters was not expected, e.g. if data is not a numpy array.

  • ValueError: – If the value of some of the parameters is not appropriate, e.g. a wrong phase is specified.

Example

Data from a linear-gaussian SCM (generated using sempler)

>>> import numpy as np
>>> data = np.array([[3.23125779, 3.24950062, 13.430682, 24.67939513],
...                  [1.90913354, -0.06843781, 6.93957057, 16.10164608],
...                  [2.68547149, 1.88351553, 8.78711076, 17.18557716],
...                  [0.16850822, 1.48067393, 5.35871419, 11.82895779],
...                  [0.07355872, 1.06857039, 2.05006096, 3.07611922]])

Run GES using the gaussian BIC score:

>>> from causalexplain.estimators.ges import GES  
>>> ges = GES()                                 
>>> ges.fit_bic(data)                           
(array([[0, 1, 1, 0],                           
    [0, 0, 0, 0],                               
    [1, 1, 0, 1],                               
    [0, 1, 1, 0]]), 15.674267611628233)         
forward_step(A, cache, debug=0)[source]#

Scores all valid insert operators that can be applied to the current CPDAG A, and applies the highest scoring one.

Parameters:
  • A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

  • cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.

  • debug (int, optional) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

  • score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).

  • new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).

backward_step(A, cache, debug=0)[source]#

Scores all valid delete operators that can be applied to the current CPDAG A, and applies the highest scoring one.

Parameters:
  • A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

  • cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.

  • debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

  • score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).

  • new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).

turning_step(A, cache, debug=0)[source]#

Scores all valid turn operators that can be applied to the current CPDAG A, and applies the highest scoring one.

Parameters:
  • A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

  • cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.

  • debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

  • score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).

  • new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).

insert(x, y, T, A)[source]#

Applies the insert operator: 1) adds the edge x -> y 2) for all t in T, orients the previously undirected edge t -> y

Parameters:
  • x (int) – the origin node (i.e. x -> y)

  • y (int) – the target node

  • T (iterable of ints) – a subset of the neighbors of y which are not adjacent to x

  • A (np.array) – the current adjacency matrix

Returns:

new_A – the adjacency matrix resulting from applying the operator

Return type:

np.array

score_valid_insert_operators(x, y, A, cache, debug=0)[source]#

Generate and score all valid insert(x,y,T) operators involving the edge x-> y, and all possible subsets T of neighbors of y which are NOT adjacent to x.

Parameters:
  • x (int) – the origin node (i.e. x -> y)

  • y (int) – the target node

  • A (np.array) – the current adjacency matrix

  • cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid

  • debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

delete_operator(x, y, H, A)[source]#

Apply the delete operator for edge (x, y) with conditioning set H.

score_valid_delete_operators(x, y, A, cache, debug=0)[source]#

Generate and score all valid delete(x,y,H) operators involving the edge x -> y or x - y, and all possible subsets H of neighbors of y which are adjacent to x.

Parameters:
  • x (int) – the “origin” node (i.e. x -> y or x - y)

  • y (int) – the “target” node

  • A (np.array) – the current adjacency matrix

  • cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid

  • debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

turn(x, y, C, A)[source]#

Applies the turning operator: For an edge x - y or x <- y, 1) orients the edge as x -> y 2) for all c in C, orients the previously undirected edge c -> y

Parameters:
  • x (int) – the origin node (i.e. x -> y)

  • y (int) – the target node

  • C (iterable of ints) – a subset of the neighbors of y

  • A (np.array) – the current adjacency matrix

Returns:

new_A – the adjacency matrix resulting from applying the operator

Return type:

np.array

score_valid_turn_operators(x, y, A, cache, debug=0)[source]#

Generate and score all valid turn(x,y,C) operators that can be applied to the edge x <- y or x - y, iterating through the valid subsets C of neighbors of y.

Parameters:
  • x (int) – the origin node (i.e. orient x -> y)

  • y (int) – the target node

  • A (np.array) – the current adjacency matrix

  • cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid

  • debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

score_valid_turn_operators_dir(x, y, A, cache, debug=0)[source]#

Logic for finding and scoring the valid turn operators that can be applied to the edge x <- y.

Parameters:
  • x (int) – the origin node (i.e. x -> y)

  • y (int) – the target node

  • A (np.array) – the current adjacency matrix

  • cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid

  • debug (bool or string) – if debug traces should be printed (True/False). If a non-empty string is passed, traces are printed with the given string as prefix (useful for indenting the prints from a calling function)

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

score_valid_turn_operators_undir(x, y, A, cache, debug=0)[source]#

Logic for finding and scoring the valid turn operators that can be applied to the edge x - y.

Parameters:
  • x (int) – the origin node (i.e. x -> y)

  • y (int) – the target node

  • A (np.array) – the current adjacency matrix

  • cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid

  • debug (bool or string) – if debug traces should be printed (True/False). If a non-empty string is passed, traces are printed with the given string as prefix (useful for indenting the prints from a calling function)

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

main(dataset_name, input_path='/Users/renero/phd/data/sachs/', output_path='/Users/renero/phd/output/RC4/sachs/compared', save=False)[source]#
class GaussObsL0Pen(data, lmbda=None, method='scatter', cache=True, debug=0)[source]#

Bases: DecomposableScore

Implements a cached l0-penalized gaussian likelihood score.

__init__(data, lmbda=None, method='scatter', cache=True, debug=0)[source]#

Creates a new instance of the class.

Parameters:
  • data (numpy.ndarray) – the nxp matrix containing the observations of each variable (each column corresponds to a variable).

  • lmbda (float or NoneType, optional) – the regularization parameter. If None, defaults to the BIC score, i.e. lmbda = 1/2 * log(n), where n is the number of observations.

  • method ({'scatter', 'raw'}, optional) – the method used to compute the likelihood. If ‘scatter’, the empirical covariance matrix (i.e. scatter matrix) is used. If ‘raw’, the likelihood is computed from the raw data. In both cases an intercept is fitted.

  • cache (bool, optional) – if computations of the local score should be cached for future calls. Defaults to True.

  • debug (int, optional) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity.

full_score(A)[source]#

Given a DAG adjacency A, return the l0-penalized log-likelihood of a sample from a single environment, by finding the maximum likelihood estimates of the corresponding connectivity matrix (weights) and noise term variances.

Parameters:

A (np.array) – The adjacency matrix of a DAG, where A[i,j] != 0 => i -> j.

Returns:

score – the penalized log-likelihood score.

Return type:

float

Module containing the auxiliary functions used in the implementation of GES, including the PDAG to CPDAG conversion algorithm described in Chickering’s original GES paper from 2002.

na(y, x, A)[source]#

Return all neighbors of y which are adjacent to x in A.

Parameters:
  • y (int) – the node’s index

  • x (int) – the node’s index

  • A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

nodes – the resulting nodes

Return type:

set of ints

neighbors(i, A)[source]#

The neighbors of i in A, i.e. all nodes connected to i by an undirected edge.

Parameters:
  • i (int) – the node’s index

  • A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

nodes – the neighbor nodes

Return type:

set of ints

adj(i, A)[source]#

Return nodes adjacent to node i in adjacency matrix A.

pa(i, A, directed_only=False)[source]#

The parents of i in A.

Parameters:
  • i (int) – the node’s index

  • A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

  • directed_only (bool, optional) – if True, only return strictly directed parents (j -> i and i -> j does not hold). If False, return all adjacent nodes (incoming or outgoing) to match a looser notion of parents sometimes used in PDAG utilities.

Returns:

nodes – the parent nodes

Return type:

set of ints

ch(i, A, directed_only=False)[source]#

The children of i in A.

Parameters:
  • A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

  • directed_only (bool, optional) – if True, only return strictly directed children (i -> j and j -> i does not hold). If False, return all adjacent nodes (incoming or outgoing) to match a looser notion of children sometimes used in PDAG utilities.

Returns:

nodes – the children nodes

Return type:

set of ints

is_clique(S, A)[source]#

Check if the subgraph of A induced by nodes S is a clique.

Parameters:
  • S (set of ints) – set containing the nodes’ indices

  • A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

is_clique – if the subgraph induced by S is a clique in A

Return type:

bool

is_dag(A)[source]#

Checks wether the given adjacency matrix corresponds to a DAG.

Parameters:

A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.

Returns:

is_dag – if the adjacency corresponds to a DAG

Return type:

bool

topological_ordering(A)[source]#

Return a topological ordering for the DAG with adjacency matrix A, using Kahn’s 1962 algorithm.

Raises a ValueError exception if the given adjacency does not correspond to a DAG.

Parameters:

A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.

Returns:

ordering – a topological ordering for the DAG

Return type:

list of ints

Raises:

ValueError : – If the given adjacency does not correspond to a DAG.

semi_directed_paths(fro, to, A)[source]#

Return all paths from i to j in A. Note: a path is a sequence (a_1,…,a_n) of non-repeating nodes where either a_i -> a_i+1 or a_i - a_i+1 are edges in the PDAG A.

Parameters:
  • fro (int) – the index of the starting node

  • to (int) – the index of the target node

  • A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.

Returns:

paths – all the paths between the two nodes

Return type:

list of lists

separates(S, A, B, G)[source]#

Returns true if the set S separates A from B in G, i.e. if all paths in G from nodes in A to nodes in B contain a node in S. Exception is raised if S,A and B are not pairwise disjoint.

Parameters:
  • S (set of ints) – a set of nodes in G

  • A (set of ints) – a set of nodes in G

  • B (set of ints) – a set of nodes in G

  • G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j.

Returns:

separated – if S separates A from B in G

Return type:

bool

chain_component(i, G)[source]#

Return all nodes in the connected component of node i after dropping all directed edges in G.

Parameters:
  • i (int) – the node’s index

  • G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j

Returns:

visited – the nodes in the chain component of i

Return type:

set of ints

induced_subgraph(S, G)[source]#

Remove all edges which are not between nodes in S.

Parameters:
  • S (set of ints) – a set of node indices

  • G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j

Returns:

subgraph – the adjacency matrix of the resulting graph where all edges between nodes not in S are removed. Note that this is not really the subgraph, as the nodes not in S still appear as disconnected nodes.

Return type:

np.array

vstructures(A)[source]#

Return the v-structures of a DAG or PDAG, given its adjacency matrix.

Parameters:

A (np.array) – The adjacency of the (P)DAG, where A[i,j] != 0 => i->j

Returns:

vstructs – the set of v-structures, where every v-structure is a three element tuple, e.g. (i,j,k) represents the v-structure i -> j <- k, where i < j for consistency.

Return type:

set()

only_directed(P)[source]#

Return the graph with the same nodes as P and only its directed edges.

Parameters:

P (np.array) – adjacency matrix of a graph

Returns:

G – adjacency matrix of the graph with the same nodes as P and only its directed edges

Return type:

np.array

only_undirected(P)[source]#

Return the graph with the same nodes as P and only its undirected edges.

Parameters:

P (np.array) – adjacency matrix of a graph

Returns:

G – adjacency matrix of the graph with the same nodes as P and only its undirected edges

Return type:

np.array

skeleton(A)[source]#

Return the skeleton of a given graph.

Parameters:

A (np.array) – adjacency matrix of a graph

Returns:

S – adjacency matrix of the skeleton, i.e. the graph resulting from dropping all edge orientations

Return type:

np.array

is_consistent_extension(G, P, debug=False)[source]#

Returns True if the DAG G is a consistent extension of the PDAG P. Will raise a ValueError exception if the graph G is not a DAG (i.e. cycles or undirected edges).

Parameters:
  • G (np.array) – the adjacency matrix of DAG

  • P (np.array) – the adjacency matrix of PDAG

  • debug (bool) – if debugging traces should be outputted

Returns:

consistent – True if G is a consistent extension of P (see below)

Return type:

bool

pdag_to_cpdag(pdag)[source]#

Transform a PDAG into its corresponding CPDAG. Returns a ValueError exception if the given PDAG does not admit a consistent extension.

Parameters:

pdag (np.array) – the adjacency matrix of a given PDAG where pdag[i,j] != 0 if i -> j and i - j if also pdag[j,i] != 0.

Returns:

cpdag – the adjacency matrix of the corresponding CPDAG

Return type:

np.array

dag_to_cpdag(G)[source]#

Return the completed partially directed acyclic graph (CPDAG) that represents the Markov equivalence class of a given DAG. Returns a ValueError exception if the given graph is not a DAG.

Parameters:

G (np.array) – the adjacency matrix of the given graph, where G[i,j] != 0 iff i -> j

Returns:

cpdag – the adjacency matrix of the corresponding CPDAG

Return type:

np.array

pdag_to_dag(P, debug=False)[source]#

Find a consistent extension of the given PDAG. Return a ValueError exception if the PDAG does not admit a consistent extension.

Parameters:
  • P (np.array) – adjacency matrix representing the PDAG connectivity, where P[i,j] = 1 => i->j

  • debug (bool, optional) – if debugging traces should be printed

Returns:

G – the adjacency matrix of a DAG which is a consistent extension (i.e. same v-structures and skeleton) of P.

Return type:

np.array

order_edges(G)[source]#

Find a total ordering of the edges in DAG G, as an intermediate step to obtaining the CPDAG representing the Markov equivalence class to which it belongs. Raises a ValueError exception if G is not a DAG.

Parameters:

G (np.array) – the adjacency matrix of a graph G, where G[i,j] != 0 iff i -> j.

Returns:

ordered – the adjacency matrix of the graph G, but with labelled edges, i.e. i -> j is has label x iff ordered[i,j] = x.

Return type:

np.array

label_edges(ordered)[source]#

Given a DAG with edges labelled according to a total ordering, label each edge as being compelled or reverisble.

Parameters:

ordered (np.array) – the adjacency matrix of a graph, with the edges labelled according to a total ordering.

Returns:

labelled

the adjacency matrix of G but with labelled edges, where
  • labelled[i,j] = 1 iff i -> j is compelled, and

  • labelled[i,j] = -1 iff i -> j is reversible.

Return type:

np.array

cartesian(arrays, out=None, dtype=<class 'numpy.int8'>)[source]#

Generate a cartesian product of input arrays.

Parameters:
  • arrays (list of array-like) – 1-D arrays to form the cartesian product of.

  • out (ndarray) – Array to place the cartesian product in.

Returns:

out – 2-D array of shape (M, len(arrays)) containing cartesian products formed of input arrays.

Return type:

ndarray

Examples

>>> cartesian(([1, 2, 3], [4, 5], [6, 7]))
array([[1, 4, 6],
       [1, 4, 7],
       [1, 5, 6],
       [1, 5, 7],
       [2, 4, 6],
       [2, 4, 7],
       [2, 5, 6],
       [2, 5, 7],
       [3, 4, 6],
       [3, 4, 7],
       [3, 5, 6],
       [3, 5, 7]], dtype=int8)
sort(L, order=None)[source]#

Sort the elements in an iterable according to its pre-defined ‘sorted’ function, or according to a given order: i will precede j if i precedes j in the order.

Parameters:
  • L (iterable) – the iterable to be sorted

  • order (iterable or None, optional) – a given ordering. In the sorted result, i will precede j if i precedes j in order. If None, the predefined ‘sorted’ function of the iterator will be used. Defaults to None.

Returns:

ordered – a list containing the elements of L, sorted from lesser to greater or according to the given order.

Return type:

list

subsets(S)[source]#

Return an iterator with all possible subsets of the set S.

Parameters:

S (set) – a given set

Returns:

subsets – an iterable over the subsets of S, in increasing size

Return type:

iterable

member(L, A)[source]#

Return the index of the first appearance of array A in L.

Parameters:
  • L (list of np.array) – list on which to perform the search

  • A (np.array) – the target array

Returns:

position – the index of the first appearance of array A in list L, or None if A is not in L.

Return type:

int or None

delete(array, mask, axis=None)[source]#

Wrapper for numpy.delete, which adapts the call depending on the numpy version (the API changed on 1.19.0)

Parameters:
  • array (array_like) – Input array.

  • mask (boolean array) – Specifies the sub-arrays to remove along the given axis

  • axis (int) – The axis along which to delete the subarrays specified by mask

Returns:

out – a copy of array with the elements specified by mask removed

Return type:

ndarray

Module contents#