causalexplain.estimators.ges package#

Submodules#

Module containing the DecomposableScore class, inherited by all classes which implement a locally decomposable score for directed acyclic graphs. By default, the class also caches the results of computing local scores.

NOTE: It is not mandatory to inherit this class when developing custom scores to use with the GES implementation in ges.py. The only requirement is that the class defines:

the local_score function (see below),

an attribute “p” for the total number of variables.

class DecomposableScore(data, cache=True, debug=0)[source]#

Bases: object

Methods

local_score(x, pa)

Return the local score of a given node and a set of parents.

__init__(data, cache=True, debug=0)[source]#

local_score(x, pa)[source]#

Return the local score of a given node and a set of parents. If self.cache=True, will use previously computed score if possible.

Parameters:

x (int) – a node
pa (set of ints) – the node’s parents

Returns:

score – the corresponding score

Return type:

float

The main module, containing the implementation of GES, including the logic for the insert, delete and turn operators. The implementation is directly based on two papers:

1. The 2002 GES paper by Chickering, “Optimal Structure Identification With Greedy Search” - https://www.jmlr.org/papers/volume3/chickering02b/chickering02b.pdf

2. For the turn operator, the 2012 GIES paper by Hauser & Bühlmann, “Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs” - https://www.jmlr.org/papers/volume13/hauser12a/hauser12a.pdf

Further credit is given where due.

Additional modules / packages:

ges.utils contains auxiliary functions and the logic to transform a PDAG into a CPDAG, used after each application of an operator.

ges.scores contains the modules with the score classes:

ges.scores.decomposable_score contains the base class for decomposable score classes (see that module for more details).

ges.scores.gauss_obs_l0_pen contains a cached implementation of the gaussian likelihood BIC score used in the original GES paper.

ges.test contains the modules with the unit tests and tests comparing against the algorithm’s implementation in the R package ‘pcalg’.

class GES(name, phases=None, iterate=False, debug=0, verbose=False)[source]#

Bases: object

The GES (Greedy Equivalence Search) algorithm for learning causal graphs from observational data.

Parameters:

phases (list, optional) – Which phases of the GES procedure are run, and in which order. Defaults to [‘forward’, ‘backward’, ‘turning’].
iterate (bool, default=False) – Indicates whether the given phases should be iterated more than once.
debug (int, optional) – Debug level for printing traces. Higher values correspond to increased verbosity.

dag#

The adjacency matrix of the estimated CPDAG (completed partially directed acyclic graph).

Type:: numpy.ndarray

ges_adjmat#

The adjacency matrix of the estimated PDAG (partially directed acyclic graph) before completion.

Type:: numpy.ndarray

ges_score#

The score of the estimated CPDAG.

Type:: float

phases#

The phases of the GES procedure that are run, and in which order.

Type:: list

iterate#

Indicates whether the given phases should be iterated more than once.

Type:: bool

fit(data, labels=None, A0=None)[source]#: Fit the GES algorithm to the given data.

fit_bic(data, A0=None, iterate=False, debug=0)[source]#: Run GES on the given data using the Gaussian BIC score.

forward_step(A, cache, debug=0)[source]#: Score and apply the highest scoring insert operator.

backward_step(A, cache, debug=0)[source]#: Score and apply the highest scoring delete operator.

Examples

>>> import numpy as np  
>>> from causalexplain.estimators.ges import GES  
>>> data = np.array([[3.23125779, 3.24950062, 13.430682, 24.67939513],  
...                  [1.90913354, -0.06843781, 6.93957057, 16.10164608],
...                  [2.68547149, 1.88351553, 8.78711076, 17.18557716], 
...                  [0.16850822, 1.48067393, 5.35871419, 11.82895779], 
...                  [0.07355872, 1.06857039, 2.05006096, 3.07611922]]) 
>>> ges = GES() 
>>> ges.fit(data)   

Attributes:

dag
feature_names
ges_adjmat
ges_score
iterate
metrics
phases

Methods

`backward_step`(A, cache[, debug])	Scores all valid delete operators that can be applied to the current CPDAG A, and applies the highest scoring one.
`delete_operator`(x, y, H, A)	Applies the delete operator: 1) deletes the edge x -> y or x - y 2) for every node h in H * orients the edge y -> h * if the edge with x is undirected, orients it as x -> h
`fit_bic`(data[, A0])	Run GES on the given data, using the Gaussian BIC score (l0-penalized Gaussian Likelihood).
`fit_predict`(train, test[, ref_graph, A0])	Fit the GES algorithm to the given data and return the adjacency matrix of the estimated CPDAG.
`forward_step`(A, cache[, debug])	Scores all valid insert operators that can be applied to the current CPDAG A, and applies the highest scoring one.
`insert`(x, y, T, A)	Applies the insert operator: 1) adds the edge x -> y 2) for all t in T, orients the previously undirected edge t -> y
`score_valid_delete_operators`(x, y, A, cache)	Generate and score all valid delete(x,y,H) operators involving the edge x -> y or x - y, and all possible subsets H of neighbors of y which are adjacent to x.
`score_valid_insert_operators`(x, y, A, cache)	Generate and score all valid insert(x,y,T) operators involving the edge x-> y, and all possible subsets T of neighbors of y which are NOT adjacent to x.
`score_valid_turn_operators`(x, y, A, cache[, ...])	Generate and score all valid turn(x,y,C) operators that can be applied to the edge x <- y or x - y, iterating through the valid subsets C of neighbors of y.
`score_valid_turn_operators_dir`(x, y, A, cache)	Logic for finding and scoring the valid turn operators that can be applied to the edge x <- y.
`score_valid_turn_operators_undir`(x, y, A, cache)	Logic for finding and scoring the valid turn operators that can be applied to the edge x - y.
`turn`(x, y, C, A)	Applies the turning operator: For an edge x - y or x <- y, 1) orients the edge as x -> y 2) for all c in C, orients the previously undirected edge c -> y
`turning_step`(A, cache[, debug])	Scores all valid turn operators that can be applied to the current CPDAG A, and applies the highest scoring one.

fit

dag = None#

ges_adjmat = None#

ges_score = None#

is_fitted_ = False#

feature_names = None#

metrics = None#

__init__(name, phases=None, iterate=False, debug=0, verbose=False)[source]#

phases = None#

iterate = None#

debug = 0#

fit(X, A0=None)[source]#

fit_predict(train, test, ref_graph=None, A0=None)[source]#

Fit the GES algorithm to the given data and return the adjacency matrix of the estimated CPDAG.

Parameters:

data (numpy.ndarray) – The n x p array containing the observations, where columns correspond to variables and rows to observations.
labels (list, optional) – The list of variable names. If not given, the columns of data are used as labels.
A0 (numpy.ndarray, optional) – The initial CPDAG on which GES will run, where where A0[i,j] != 0 implies the edge i -> j and A[i,j] != 0 & A[j,i] != 0 implies the edge i - j. Defaults to the empty graph (i.e. matrix of zeros).

Returns:

estimate – The adjacency matrix of the estimated CPDAG.

Return type:

numpy.ndarray

Raises:

TypeError: – If the type of some of the parameters was not expected, e.g. if data is not a numpy array.
ValueError: – If the value of some of the parameters is not appropriate, e.g. a wrong phase is specified.

fit_bic(data, A0=None)[source]#

Run GES on the given data, using the Gaussian BIC score (l0-penalized Gaussian Likelihood). The data is not assumed to be centered, i.e. an intercept is fitted.

To use a custom score, see ges.fit.

Parameters:

data (numpy.ndarray) – The n x p array containing the observations, where columns correspond to variables and rows to observations.
A0 (numpy.ndarray, optional) – The initial CPDAG on which GES will run, where where A0[i,j] != 0 implies the edge i -> j and A[i,j] != 0 & A[j,i] != 0 implies the edge i - j. Defaults to the empty graph (i.e. matrix of zeros).

Returns:

estimate (numpy.ndarray) – The adjacency matrix of the estimated CPDAG.
total_score (float) – The score of the estimate.

Raises:

TypeError: – If the type of some of the parameters was not expected, e.g. if data is not a numpy array.
ValueError: – If the value of some of the parameters is not appropriate, e.g. a wrong phase is specified.

Example

Data from a linear-gaussian SCM (generated using sempler)

>>> import numpy as np
>>> data = np.array([[3.23125779, 3.24950062, 13.430682, 24.67939513],
...                  [1.90913354, -0.06843781, 6.93957057, 16.10164608],
...                  [2.68547149, 1.88351553, 8.78711076, 17.18557716],
...                  [0.16850822, 1.48067393, 5.35871419, 11.82895779],
...                  [0.07355872, 1.06857039, 2.05006096, 3.07611922]])

Run GES using the gaussian BIC score:

>>> from causalexplain.estimators.ges import GES  
>>> ges = GES()                                 
>>> ges.fit_bic(data)                           
(array([[0, 1, 1, 0],                           
    [0, 0, 0, 0],                               
    [1, 1, 0, 1],                               
    [0, 1, 1, 0]]), 15.674267611628233)         

forward_step(A, cache, debug=0)[source]#

Scores all valid insert operators that can be applied to the current CPDAG A, and applies the highest scoring one.

Parameters:

A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.
debug (int, optional) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).
new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).

backward_step(A, cache, debug=0)[source]#

Scores all valid delete operators that can be applied to the current CPDAG A, and applies the highest scoring one.

Parameters:

A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).
new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).

turning_step(A, cache, debug=0)[source]#

Scores all valid turn operators that can be applied to the current CPDAG A, and applies the highest scoring one.

Parameters:

A (np.array) – the adjacency matrix of a CPDAG, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
cache (DecomposableScore) – an instance of the score class, which computes the change in score and acts as cache.
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

score (float) – the change in score resulting from applying the highest scoring operator (note, can be smaller than 0).
new_A (np.array) – the adjacency matrix of the PDAG resulting from applying the operator (not yet a CPDAG).

insert(x, y, T, A)[source]#

Applies the insert operator: 1) adds the edge x -> y 2) for all t in T, orients the previously undirected edge t -> y

Parameters:

x (int) – the origin node (i.e. x -> y)
y (int) – the target node
T (iterable of ints) – a subset of the neighbors of y which are not adjacent to x
A (np.array) – the current adjacency matrix

Returns:

new_A – the adjacency matrix resulting from applying the operator

Return type:

np.array

score_valid_insert_operators(x, y, A, cache, debug=0)[source]#

Generate and score all valid insert(x,y,T) operators involving the edge x-> y, and all possible subsets T of neighbors of y which are NOT adjacent to x.

Parameters:

x (int) – the origin node (i.e. x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

delete_operator(x, y, H, A)[source]#

Applies the delete operator: 1) deletes the edge x -> y or x - y 2) for every node h in H

orients the edge y -> h

if the edge with x is undirected, orients it as x -> h

Note that H must be a subset of the neighbors of y which are adjacent to x. A ValueError exception is thrown otherwise.

Parameters:

x (int) – the “origin” node (i.e. x -> y or x - y)
y (int) – the “target” node
H (iterable of ints) – a subset of the neighbors of y which are adjacent to x
A (np.array) – the current adjacency matrix

Returns:

new_A – the adjacency matrix resulting from applying the operator

Return type:

np.array

score_valid_delete_operators(x, y, A, cache, debug=0)[source]#

Generate and score all valid delete(x,y,H) operators involving the edge x -> y or x - y, and all possible subsets H of neighbors of y which are adjacent to x.

Parameters:

x (int) – the “origin” node (i.e. x -> y or x - y)
y (int) – the “target” node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

turn(x, y, C, A)[source]#

Applies the turning operator: For an edge x - y or x <- y, 1) orients the edge as x -> y 2) for all c in C, orients the previously undirected edge c -> y

Parameters:

x (int) – the origin node (i.e. x -> y)
y (int) – the target node
C (iterable of ints) – a subset of the neighbors of y
A (np.array) – the current adjacency matrix

Returns:

new_A – the adjacency matrix resulting from applying the operator

Return type:

np.array

score_valid_turn_operators(x, y, A, cache, debug=0)[source]#

Generate and score all valid turn(x,y,C) operators that can be applied to the edge x <- y or x - y, iterating through the valid subsets C of neighbors of y.

Parameters:

x (int) – the origin node (i.e. orient x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (int) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

score_valid_turn_operators_dir(x, y, A, cache, debug=0)[source]#

Logic for finding and scoring the valid turn operators that can be applied to the edge x <- y.

Parameters:

x (int) – the origin node (i.e. x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (bool or string) – if debug traces should be printed (True/False). If a non-empty string is passed, traces are printed with the given string as prefix (useful for indenting the prints from a calling function)

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

score_valid_turn_operators_undir(x, y, A, cache, debug=0)[source]#

Logic for finding and scoring the valid turn operators that can be applied to the edge x - y.

Parameters:

x (int) – the origin node (i.e. x -> y)
y (int) – the target node
A (np.array) – the current adjacency matrix
cache (instance of ges.scores.DecomposableScore) – the score cache to compute the score of the operators that are valid
debug (bool or string) – if debug traces should be printed (True/False). If a non-empty string is passed, traces are printed with the given string as prefix (useful for indenting the prints from a calling function)

Returns:

valid_operators – a list of tubles, each containing a valid operator, its score and the resulting connectivity matrix

Return type:

list of tuples

main(dataset_name, input_path='/Users/renero/phd/data/sachs/', output_path='/Users/renero/phd/output/RC4/sachs/compared', save=False)[source]#

class GaussObsL0Pen(data, lmbda=None, method='scatter', cache=True, debug=0)[source]#

Bases: DecomposableScore

Implements a cached l0-penalized gaussian likelihood score.

Methods

`full_score`(A)	Given a DAG adjacency A, return the l0-penalized log-likelihood of a sample from a single environment, by finding the maximum likelihood estimates of the corresponding connectivity matrix (weights) and noise term variances.
`local_score`(x, pa)	Return the local score of a given node and a set of parents.

__init__(data, lmbda=None, method='scatter', cache=True, debug=0)[source]#

Creates a new instance of the class.

Parameters:

data (numpy.ndarray) – the nxp matrix containing the observations of each variable (each column corresponds to a variable).
lmbda (float or NoneType, optional) – the regularization parameter. If None, defaults to the BIC score, i.e. lmbda = 1/2 * log(n), where n is the number of observations.
method ({'scatter', 'raw'}, optional) – the method used to compute the likelihood. If ‘scatter’, the empirical covariance matrix (i.e. scatter matrix) is used. If ‘raw’, the likelihood is computed from the raw data. In both cases an intercept is fitted.
cache (bool, optional) – if computations of the local score should be cached for future calls. Defaults to True.
debug (int, optional) – if larger than 0, debug are traces printed. Higher values correspond to increased verbosity.

full_score(A)[source]#

Given a DAG adjacency A, return the l0-penalized log-likelihood of a sample from a single environment, by finding the maximum likelihood estimates of the corresponding connectivity matrix (weights) and noise term variances.

Parameters:: A (np.array) – The adjacency matrix of a DAG, where A[i,j] != 0 => i -> j.
Returns:: score – the penalized log-likelihood score.
Return type:: float

Module containing the auxiliary functions used in the implementation of GES, including the PDAG to CPDAG conversion algorithm described in Chickering’s original GES paper from 2002.

na(y, x, A)[source]#

Return all neighbors of y which are adjacent to x in A.

Parameters:

y (int) – the node’s index
x (int) – the node’s index
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

nodes – the resulting nodes

Return type:

set of ints

neighbors(i, A)[source]#

The neighbors of i in A, i.e. all nodes connected to i by an undirected edge.

Parameters:

i (int) – the node’s index
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

nodes – the neighbor nodes

Return type:

set of ints

adj(i, A)[source]#

The adjacent nodes of i in A, i.e. all nodes connected by a directed or undirected edge. :param i: the node’s index :type i: int :param A: the adjacency matrix of the graph, where A[i,j] != 0 => i -> j

and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:: nodes – the adjacent nodes
Return type:: set of ints

pa(i, A)[source]#

The parents of i in A.

Parameters:

i (int) – the node’s index
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

nodes – the parent nodes

Return type:

set of ints

ch(i, A)[source]#

The children of i in A.

Parameters:: A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.
Returns:: nodes – the children nodes
Return type:: set of ints

is_clique(S, A)[source]#

Check if the subgraph of A induced by nodes S is a clique.

Parameters:

S (set of ints) – set containing the nodes’ indices
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j and A[i,j] != 0 & A[j,i] != 0 => i - j.

Returns:

is_clique – if the subgraph induced by S is a clique in A

Return type:

bool

is_dag(A)[source]#

Checks wether the given adjacency matrix corresponds to a DAG.

Parameters:: A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.
Returns:: is_dag – if the adjacency corresponds to a DAG
Return type:: bool

topological_ordering(A)[source]#

Return a topological ordering for the DAG with adjacency matrix A, using Kahn’s 1962 algorithm.

Raises a ValueError exception if the given adjacency does not correspond to a DAG.

Parameters:: A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.
Returns:: ordering – a topological ordering for the DAG
Return type:: list of ints
Raises:: ValueError : – If the given adjacency does not correspond to a DAG.

semi_directed_paths(fro, to, A)[source]#

Return all paths from i to j in A. Note: a path is a sequence (a_1,…,a_n) of non-repeating nodes where either a_i -> a_i+1 or a_i - a_i+1 are edges in the PDAG A.

Parameters:

fro (int) – the index of the starting node
to (int) – the index of the target node
A (np.array) – the adjacency matrix of the graph, where A[i,j] != 0 => i -> j.

Returns:

paths – all the paths between the two nodes

Return type:

list of lists

separates(S, A, B, G)[source]#

Returns true if the set S separates A from B in G, i.e. if all paths in G from nodes in A to nodes in B contain a node in S. Exception is raised if S,A and B are not pairwise disjoint.

Parameters:

S (set of ints) – a set of nodes in G
A (set of ints) – a set of nodes in G
B (set of ints) – a set of nodes in G
G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j.

Returns:

separated – if S separates A from B in G

Return type:

bool

chain_component(i, G)[source]#

Return all nodes in the connected component of node i after dropping all directed edges in G.

Parameters:

i (int) – the node’s index
G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j

Returns:

visited – the nodes in the chain component of i

Return type:

set of ints

induced_subgraph(S, G)[source]#

Remove all edges which are not between nodes in S.

Parameters:

S (set of ints) – a set of node indices
G (np.array) – the adjacency matrix of the graph, where G[i,j] != 0 => i -> j and G[i,j] != 0 and G[j,i] != 0 => i - j

Returns:

subgraph – the adjacency matrix of the resulting graph where all edges between nodes not in S are removed. Note that this is not really the subgraph, as the nodes not in S still appear as disconnected nodes.

Return type:

np.array

vstructures(A)[source]#

Return the v-structures of a DAG or PDAG, given its adjacency matrix.

Parameters:: A (np.array) – The adjacency of the (P)DAG, where A[i,j] != 0 => i->j
Returns:: vstructs – the set of v-structures, where every v-structure is a three element tuple, e.g. (i,j,k) represents the v-structure i -> j <- k, where i < j for consistency.
Return type:: set()

only_directed(P)[source]#

Return the graph with the same nodes as P and only its directed edges.

Parameters:: P (np.array) – adjacency matrix of a graph
Returns:: G – adjacency matrix of the graph with the same nodes as P and only its directed edges
Return type:: np.array

only_undirected(P)[source]#

Return the graph with the same nodes as P and only its undirected edges.

Parameters:: P (np.array) – adjacency matrix of a graph
Returns:: G – adjacency matrix of the graph with the same nodes as P and only its undirected edges
Return type:: np.array

skeleton(A)[source]#

Return the skeleton of a given graph.

Parameters:: A (np.array) – adjacency matrix of a graph
Returns:: S – adjacency matrix of the skeleton, i.e. the graph resulting from dropping all edge orientations
Return type:: np.array

is_consistent_extension(G, P, debug=False)[source]#

Returns True if the DAG G is a consistent extension of the PDAG P. Will raise a ValueError exception if the graph G is not a DAG (i.e. cycles or undirected edges).

Parameters:

G (np.array) – the adjacency matrix of DAG
P (np.array) – the adjacency matrix of PDAG
debug (bool) – if debugging traces should be outputted

Returns:

consistent – True if G is a consistent extension of P (see below)

Return type:

bool

pdag_to_cpdag(pdag)[source]#

Transform a PDAG into its corresponding CPDAG. Returns a ValueError exception if the given PDAG does not admit a consistent extension.

Parameters:: pdag (np.array) – the adjacency matrix of a given PDAG where pdag[i,j] != 0 if i -> j and i - j if also pdag[j,i] != 0.
Returns:: cpdag – the adjacency matrix of the corresponding CPDAG
Return type:: np.array

dag_to_cpdag(G)[source]#

Return the completed partially directed acyclic graph (CPDAG) that represents the Markov equivalence class of a given DAG. Returns a ValueError exception if the given graph is not a DAG.

Parameters:: G (np.array) – the adjacency matrix of the given graph, where G[i,j] != 0 iff i -> j
Returns:: cpdag – the adjacency matrix of the corresponding CPDAG
Return type:: np.array

pdag_to_dag(P, debug=False)[source]#

Find a consistent extension of the given PDAG. Return a ValueError exception if the PDAG does not admit a consistent extension.

Parameters:

P (np.array) – adjacency matrix representing the PDAG connectivity, where P[i,j] = 1 => i->j
debug (bool, optional) – if debugging traces should be printed

Returns:

G – the adjacency matrix of a DAG which is a consistent extension (i.e. same v-structures and skeleton) of P.

Return type:

np.array

order_edges(G)[source]#

Find a total ordering of the edges in DAG G, as an intermediate step to obtaining the CPDAG representing the Markov equivalence class to which it belongs. Raises a ValueError exception if G is not a DAG.

Parameters:: G (np.array) – the adjacency matrix of a graph G, where G[i,j] != 0 iff i -> j.
Returns:: ordered – the adjacency matrix of the graph G, but with labelled edges, i.e. i -> j is has label x iff ordered[i,j] = x.
Return type:: np.array

label_edges(ordered)[source]#

Given a DAG with edges labelled according to a total ordering, label each edge as being compelled or reverisble.

Parameters:

ordered (np.array) – the adjacency matrix of a graph, with the edges labelled according to a total ordering.

Returns:

labelled –

the adjacency matrix of G but with labelled edges, where

labelled[i,j] = 1 iff i -> j is compelled, and
labelled[i,j] = -1 iff i -> j is reversible.

Return type:

np.array

cartesian(arrays, out=None, dtype=<class 'numpy.int8'>)[source]#

Generate a cartesian product of input arrays.

Parameters:

arrays (list of array-like) – 1-D arrays to form the cartesian product of.
out (ndarray) – Array to place the cartesian product in.

Returns:

out – 2-D array of shape (M, len(arrays)) containing cartesian products formed of input arrays.

Return type:

ndarray

Examples

>>> cartesian(([1, 2, 3], [4, 5], [6, 7]))
array([[1, 4, 6],
       [1, 4, 7],
       [1, 5, 6],
       [1, 5, 7],
       [2, 4, 6],
       [2, 4, 7],
       [2, 5, 6],
       [2, 5, 7],
       [3, 4, 6],
       [3, 4, 7],
       [3, 5, 6],
       [3, 5, 7]], dtype=int8)

sort(L, order=None)[source]#

Sort the elements in an iterable according to its pre-defined ‘sorted’ function, or according to a given order: i will precede j if i precedes j in the order.

Parameters:

L (iterable) – the iterable to be sorted
order (iterable or None, optional) – a given ordering. In the sorted result, i will precede j if i precedes j in order. If None, the predefined ‘sorted’ function of the iterator will be used. Defaults to None.

Returns:

ordered – a list containing the elements of L, sorted from lesser to greater or according to the given order.

Return type:

list

subsets(S)[source]#

Return an iterator with all possible subsets of the set S.

Parameters:: S (set) – a given set
Returns:: subsets – an iterable over the subsets of S, in increasing size
Return type:: iterable

member(L, A)[source]#

Return the index of the first appearance of array A in L.

Parameters:

L (list of np.array) – list on which to perform the search
A (np.array) – the target array

Returns:

position – the index of the first appearance of array A in list L, or None if A is not in L.

Return type:

int or None

delete(array, mask, axis=None)[source]#

Wrapper for numpy.delete, which adapts the call depending on the numpy version (the API changed on 1.19.0)

Parameters:

array (array_like) – Input array.
mask (boolean array) – Specifies the sub-arrays to remove along the given axis
axis (int) – The axis along which to delete the subarrays specified by mask

Returns:

out – a copy of array with the elements specified by mask removed

Return type:

ndarray

causalexplain.estimators.ges package#

Submodules#

Module contents#

This Page