causalexplain.models package#

Submodules#

class MLP(input_size, layers_dimensions, activation, batch_size, lr, loss, dropout)[source]#

Bases: LightningModule

device = 'cpu'#
class Block(d_in, d_out, activation, bias, dropout, device)[source]#

Bases: Module

The main building block of the MLP.

__init__(d_in, d_out, activation, bias, dropout, device)[source]#

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
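The hook-dispatch behavior this note describes can be illustrated with a plain-Python analogue of Module.__call__ (a simplified sketch, not torch's actual implementation): calling the instance runs registered forward hooks, while calling forward() directly silently skips them.

```python
class MiniModule:
    """Toy stand-in for nn.Module's __call__/forward contract (illustration only)."""

    def __init__(self):
        self.hooks = []  # forward hooks registered on this module

    def register_forward_hook(self, fn):
        self.hooks.append(fn)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        for hook in self.hooks:  # hooks only run when calling the instance
            hook(self, x, out)
        return out

calls = []
m = MiniModule()
m.register_forward_hook(lambda mod, inp, out: calls.append(out))
m(3)          # runs the hook: calls == [6]
m.forward(3)  # bypasses __call__, so the hook is silently skipped
assert calls == [6]
```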

__init__(input_size, layers_dimensions, activation, batch_size, lr, loss, dropout)[source]#
forward(x)[source]#

Same as torch.nn.Module.forward().

Parameters:
  • *args – Whatever you decide to pass into the forward method.

  • **kwargs – Keyword arguments are also possible.

Returns:

Your model’s output

predict_step(batch, batch_idx, **kwargs)[source]#

Step function called during predict(). By default, it calls forward(). Override to add any processing logic.

predict_step() is used to scale inference across multiple devices.

To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or at epoch end.

The BasePredictionWriter should be used with spawn-based strategies, e.g. Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned otherwise.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

Predicted output (optional).

Example

class MyModel(LightningModule):

    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
training_step(batch, batch_idx)[source]#

Here you compute and return the training loss and any additional metrics, e.g. for the progress bar or logger.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.

  • None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss

To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:

def __init__(self):
    super().__init__()
    self.automatic_optimization = False

# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()

Note

When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.

validation_step(batch, batch_idx)[source]#

Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary. Can include any keys, but must include the key 'loss'.

  • None - Skip to the next batch.

# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...

# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...

Examples:

# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})

If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    x, y = batch

    # implement your own
    out = self(x)

    if dataloader_idx == 0:
        loss = self.loss0(out, y)
    else:
        loss = self.loss1(out, y)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs separately for each dataloader
    self.log_dict({f"val_loss_{dataloader_idx}": loss, f"val_acc_{dataloader_idx}": acc})

Note

If you don’t need to validate you don’t need to implement this method.

Note

When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.

configure_optimizers()[source]#

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.

Returns:

Any of these 6 options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size; could also be 'step'.
    # 'epoch' updates the scheduler at epoch end, whereas 'step'
    # updates it after each optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified by 'monitor'
    # is available when the scheduler is updated, stopping training
    # if it is not found. If set to `False`, only a warning is produced.
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated",
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }

# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by logging them with self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:

  • Lightning calls .backward() and .step() automatically in case of automatic optimization.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.

  • If you need to control how often the optimizer steps, override the optimizer_step() hook.

predict(x)[source]#
class DFF(input_size, hidden_size, batch_size, lr, loss)[source]#

Bases: LightningModule

__init__(input_size, hidden_size, batch_size, lr, loss)[source]#
forward(x)[source]#

Same as torch.nn.Module.forward().

Parameters:
  • *args – Whatever you decide to pass into the forward method.

  • **kwargs – Keyword arguments are also possible.

Returns:

Your model’s output

predict_step(batch, batch_idx, **kwargs)[source]#

Step function called during predict(). By default, it calls forward(). Override to add any processing logic.

predict_step() is used to scale inference across multiple devices.

To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or at epoch end.

The BasePredictionWriter should be used with spawn-based strategies, e.g. Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned otherwise.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

Predicted output (optional).

Example

class MyModel(LightningModule):

    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
training_step(batch, batch_idx)[source]#

Here you compute and return the training loss and any additional metrics, e.g. for the progress bar or logger.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.

  • None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss

To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:

def __init__(self):
    super().__init__()
    self.automatic_optimization = False

# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()

Note

When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.

validation_step(batch, batch_idx)[source]#

Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary. Can include any keys, but must include the key 'loss'.

  • None - Skip to the next batch.

# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...

# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...

Examples:

# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})

If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    x, y = batch

    # implement your own
    out = self(x)

    if dataloader_idx == 0:
        loss = self.loss0(out, y)
    else:
        loss = self.loss1(out, y)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs separately for each dataloader
    self.log_dict({f"val_loss_{dataloader_idx}": loss, f"val_acc_{dataloader_idx}": acc})

Note

If you don’t need to validate you don’t need to implement this method.

Note

When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.

configure_optimizers()[source]#

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.

Returns:

Any of these 6 options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size; could also be 'step'.
    # 'epoch' updates the scheduler at epoch end, whereas 'step'
    # updates it after each optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified by 'monitor'
    # is available when the scheduler is updated, stopping training
    # if it is not found. If set to `False`, only a warning is produced.
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated",
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }

# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by logging them with self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:

  • Lightning calls .backward() and .step() automatically in case of automatic optimization.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.

  • If you need to control how often the optimizer steps, override the optimizer_step() hook.

class MDN(input_size, hidden_size, num_gaussians, lr, batch_size, loss_function='loglikelihood')[source]#

Bases: LightningModule

__init__(input_size, hidden_size, num_gaussians, lr, batch_size, loss_function='loglikelihood')[source]#

Init function for the MDN

Parameters:
  • input_size (int) – the number of dimensions in the input

  • hidden_size (int) – the number of dimensions in the hidden layer

  • num_gaussians (int) – the number of Gaussians per output dimensions

  • lr (float) – learning rate

  • batch_size (int) – Batch size.

  • loss_function (str) – Loss function: either ‘loglikelihood’ or ‘mmd’ (Maximum Mean Discrepancy)

Input:

minibatch (BxD): B is the batch size and D is the number of input dimensions.

Output:

(pi, sigma, mu) (BxG, BxGxO, BxGxO): B is the batch size, G is the number of Gaussians, and O is the number of output dimensions per Gaussian. pi is a multinomial distribution over the Gaussians; sigma is the standard deviation of each Gaussian; mu is the mean of each Gaussian.
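The documented shapes can be illustrated with a small NumPy sketch (a hypothetical head output, not the package's own forward code): pi is softmax-normalized over the G components, while sigma is kept positive via an exponential.

```python
import numpy as np

B, D, G, O = 4, 3, 5, 1  # batch size, input dims, Gaussians, output dims

rng = np.random.default_rng(0)
x = rng.normal(size=(B, D))  # minibatch (BxD)

# Hypothetical MDN head output with the documented shapes:
logits = rng.normal(size=(B, G))
pi = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # (BxG), rows sum to 1
sigma = np.exp(rng.normal(size=(B, G, O)))                       # (BxGxO), strictly positive
mu = rng.normal(size=(B, G, O))                                  # (BxGxO)

assert pi.shape == (B, G) and sigma.shape == (B, G, O) and mu.shape == (B, G, O)
assert np.allclose(pi.sum(axis=1), 1.0)  # each row of pi is a valid categorical
```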

configure_optimizers()[source]#

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.

Returns:

Any of these 6 options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size; could also be 'step'.
    # 'epoch' updates the scheduler at epoch end, whereas 'step'
    # updates it after each optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified by 'monitor'
    # is available when the scheduler is updated, stopping training
    # if it is not found. If set to `False`, only a warning is produced.
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated",
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }

# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by logging them with self.log('metric_to_track', metric_val) in your LightningModule.

Note

Some things to know:

  • Lightning calls .backward() and .step() automatically in case of automatic optimization.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.

  • If you need to control how often the optimizer steps, override the optimizer_step() hook.

training_step(batch, batch_idx)[source]#

Here you compute and return the training loss and any additional metrics, e.g. for the progress bar or logger.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.

  • None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss

To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:

def __init__(self):
    super().__init__()
    self.automatic_optimization = False

# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()

Note

When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.

validation_step(batch, batch_idx)[source]#

Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.

Parameters:
  • batch – The output of your data iterable, normally a DataLoader.

  • batch_idx – The index of this batch.

  • dataloader_idx – The index of the dataloader that produced this batch. (only if multiple dataloaders used)

Returns:

  • Tensor - The loss tensor

  • dict - A dictionary. Can include any keys, but must include the key 'loss'.

  • None - Skip to the next batch.

# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...

# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...

Examples:

# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})

If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    x, y = batch

    # implement your own
    out = self(x)

    if dataloader_idx == 0:
        loss = self.loss0(out, y)
    else:
        loss = self.loss1(out, y)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs separately for each dataloader
    self.log_dict({f"val_loss_{dataloader_idx}": loss, f"val_acc_{dataloader_idx}": acc})

Note

If you don’t need to validate you don’t need to implement this method.

Note

When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.

common_step(batch)[source]#
forward(x)[source]#

Same as torch.nn.Module.forward().

Parameters:
  • *args – Whatever you decide to pass into the forward method.

  • **kwargs – Keyword arguments are also possible.

Returns:

Your model’s output

mdn_loss(pi, sigma, mu, y)[source]#

Calculates the error given the MoG parameters and the target. The loss is the negative log-likelihood of the data under the MoG parameters.
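The formula can be sketched in NumPy (an illustrative reference, not the package's implementation): each component's density is the product of independent Gaussians over the O output dimensions, the components are mixed with weights pi, and the loss is the negative mean log mixture density.

```python
import numpy as np

def mdn_nll(pi, sigma, mu, y):
    """Negative log-likelihood of y under a mixture of Gaussians.

    pi: (B, G) mixture weights; sigma, mu: (B, G, O); y: (B, O).
    """
    y = y[:, None, :]  # (B, 1, O) so it broadcasts over the G components
    # Per-dimension Gaussian density, multiplied over the O output dims
    comp = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    comp = comp.prod(axis=2)            # (B, G) per-component density
    mixture = (pi * comp).sum(axis=1)   # (B,) mixture density
    return -np.log(mixture).mean()

# Sanity check: a single standard Gaussian at y=0 gives NLL = 0.5*log(2*pi)
pi = np.array([[1.0]])
sigma = np.ones((1, 1, 1))
mu = np.zeros((1, 1, 1))
y = np.zeros((1, 1))
print(round(mdn_nll(pi, sigma, mu, y), 4))  # -> 0.9189
```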

static gaussian_probability(y, mu, sigma)[source]#
static mmd_loss(x, y, kernel)[source]#

https://www.kaggle.com/onurtunali/maximum-mean-discrepancy

Empirical maximum mean discrepancy. The lower the result, the more evidence that the two distributions are the same.

Parameters:
  • x – first sample, distribution P

  • y – second sample, distribution Q

  • kernel – kernel type such as “multiscale” or “rbf”
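The estimator can be sketched in NumPy as follows (an illustrative “rbf” version with a fixed bandwidth; the package's implementation, which also supports a “multiscale” kernel, may differ in detail):

```python
import numpy as np

def mmd_rbf(x, y, bandwidth=1.0):
    """Biased empirical MMD^2 between samples x ~ P and y ~ Q (n x d arrays)."""
    def kernel(a, b):
        # Pairwise squared L2 distances, passed through a Gaussian RBF kernel
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

rng = np.random.default_rng(0)
p = rng.normal(0.0, 1.0, size=(200, 2))
q = rng.normal(2.0, 1.0, size=(200, 2))
assert mmd_rbf(p, p) < 1e-9           # identical samples: no discrepancy
assert mmd_rbf(p, q) > mmd_rbf(p, p)  # shifted distribution: larger MMD
```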

static add_noise(x)[source]#
static g_sample(pi, sigma, mu)[source]#

Gumbel sampling adapted from hardmaru/pytorch_notebooks.

static sample(pi, sigma, mu)[source]#

Draw samples from a MoG.
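Drawing a sample amounts to picking a component index from the categorical distribution pi (here via the Gumbel-max trick that g_sample refers to) and then drawing from the chosen Gaussian. A NumPy illustration, not the package's implementation:

```python
import numpy as np

def sample_mog(pi, sigma, mu, rng):
    """Draw one sample per batch row from a mixture of Gaussians.

    pi: (B, G); sigma, mu: (B, G, O). Returns (B, O).
    """
    B, G = pi.shape
    # Gumbel-max trick: argmax(log pi + Gumbel noise) ~ Categorical(pi)
    gumbel = -np.log(-np.log(rng.uniform(size=(B, G))))
    k = np.argmax(np.log(pi) + gumbel, axis=1)  # chosen component per row
    rows = np.arange(B)
    # Reparameterized draw from the selected Gaussian
    return mu[rows, k] + sigma[rows, k] * rng.normal(size=mu[rows, k].shape)

rng = np.random.default_rng(0)
# One dominant, near-deterministic component: samples land close to its mean
pi = np.array([[1e-12, 1.0 - 2e-12, 1e-12]])
mu = np.array([[[-5.0], [3.0], [5.0]]])
sigma = np.full((1, 3, 1), 1e-6)
s = sample_mog(pi, sigma, mu, rng)
assert abs(s[0, 0] - 3.0) < 1e-3
```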

class ColumnsDataset(target_name, df)[source]#

Bases: Dataset

__init__(target_name, df)[source]#
class RBF(n_kernels=5, mul_factor=2.0, bandwidth=None)[source]#

Bases: Module

__init__(n_kernels=5, mul_factor=2.0, bandwidth=None)[source]#

Initialize internal Module state, shared by both nn.Module and ScriptModule.

get_bandwidth(L2_distances)[source]#

Get the bandwidth of the RBF kernel.

forward(X)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class MMDLoss(kernel=RBF())[source]#

Bases: Module

__init__(kernel=RBF())[source]#

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(X, Y)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Training helpers for neural models used in causal discovery.

class BaseModel(target, dataframe, test_size, batch_size, tb_suffix, seed, early_stop=True, patience=10, min_delta=0.001)[source]#

Bases: object

Base class for all models in the causalexplain package.

Parameters:
  • target (str) – The target variable name.

  • dataframe (pd.DataFrame) – The input dataframe.

  • test_size (float) – The proportion of the data to use for testing.

  • batch_size (int) – The batch size for training.

  • tb_suffix (str) – The suffix to append to the TensorBoard log directory.

  • seed (int) – The random seed for reproducibility.

  • early_stop (bool, optional) – Whether to use early stopping during training. Defaults to True.

  • patience (int, optional) – The patience value for early stopping. Defaults to 10.

  • min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
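
The patience/min_delta semantics described above can be made concrete with a minimal early-stopping sketch (illustrative, not the package's class -- in practice a Lightning EarlyStopping callback plays this role):

```python
class EarlyStopper:
    """Stop when the monitored loss fails to improve by at least
    min_delta for `patience` consecutive epochs."""

    def __init__(self, patience=10, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            # Improvement large enough: record it and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```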

model = None#
all_columns = None#
callbacks = None#
columns = None#
logger = None#
extra_trainer_args = None#
scaler = None#
train_loader = None#
val_loader = None#
n_rows = 0#
device = 'cpu'#
__init__(target, dataframe, test_size, batch_size, tb_suffix, seed, early_stop=True, patience=10, min_delta=0.001)[source]#
init_logger(suffix)[source]#

Initialize the logger for TensorBoard.

Parameters:

suffix (str) – The suffix to append to the logger name.

init_callbacks(early_stop=True, min_delta=0.001, patience=10, prog_bar=False)[source]#

Initialize the callbacks for the training process.

Parameters:
  • early_stop (bool, optional) – Whether to use early stopping. Defaults to True.

  • min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.

  • patience (int, optional) – The patience value for early stopping. Defaults to 10.

  • prog_bar (bool, optional) – Whether to use a progress bar during training. Defaults to False.

init_data()[source]#

Initialize the data loaders for training and validation.

override_extras(**kwargs)[source]#

Override the extra trainer arguments.

Parameters:

**kwargs – Additional keyword arguments to override the default values.

class MLPModel(target, input_size, hidden_dim, activation, learning_rate, batch_size, loss_fn, dropout, num_epochs, dataframe, test_size, device, seed, early_stop=True, patience=10, min_delta=0.001, **kwargs)[source]#

Bases: BaseModel

Implementation of the Multi-Layer Perceptron (MLP) model.

Parameters:
  • target (str) – The target variable name.

  • input_size (int) – The size of the input features.

  • hidden_dim (List[int]) – The dimensions of the hidden layers.

  • activation (nn.Module) – The activation function to use in the hidden layers.

  • learning_rate (float) – The learning rate for training.

  • batch_size (int) – The batch size for training.

  • loss_fn (str) – The loss function to use.

  • dropout (float) – The dropout rate.

  • num_epochs (int) – The number of training epochs.

  • dataframe (pd.DataFrame) – The input dataframe.

  • test_size (float) – The proportion of the data to use for testing.

  • device (Union[int, str]) – The device to use for training.

  • seed (int) – The random seed for reproducibility.

  • early_stop (bool, optional) – Whether to use early stopping during training. Defaults to True.

  • patience (int, optional) – The patience value for early stopping. Defaults to 10.

  • min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.

  • **kwargs – Additional keyword arguments to override the default values.

__init__(target, input_size, hidden_dim, activation, learning_rate, batch_size, loss_fn, dropout, num_epochs, dataframe, test_size, device, seed, early_stop=True, patience=10, min_delta=0.001, **kwargs)[source]#
train()[source]#

Train the MLP model.

Shared helpers for handling Optuna storage paths.

This module contains functions to extract and visualize the weights of a neural network model. The weights are visualized in several ways to help identify relationships between the input features and the target variable.

extract_weights(model, verbose=False)[source]#

Extracts the weights from a given model.

Parameters:
  • model – The model from which to extract the weights.

  • verbose – If True, prints the names of the weights being extracted.

Returns:

weights – A list of the extracted weights.
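
A minimal sketch of weight extraction from a PyTorch model, collecting the weight matrix of every Linear layer in definition order (illustrative; the package's implementation may traverse the model differently):

```python
import torch.nn as nn

def extract_weights(model, verbose=False):
    """Collect the weight matrices of every nn.Linear layer, input-first."""
    weights = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            if verbose:
                print(f"extracting {name}.weight {tuple(module.weight.shape)}")
            # Detach from the graph and move to CPU before converting.
            weights.append(module.weight.detach().cpu().numpy())
    return weights
```

Note that PyTorch stores Linear weights with shape (out_features, in_features).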

see_weights_to_hidden(weights_matrix, input_names, target)[source]#

Visualizes the weights connecting the input layer to the hidden layer.

Parameters:
  • W (np.ndarray) – The weight matrix of shape (num_hidden, num_inputs) representing the connections between the input layer and the hidden layer.

  • input_names (List[str]) – A list of input names corresponding to each input feature.

  • target (str) – The target variable.

Returns:

None

Return type:

None

see_weights_from_input(W, input_names, target)[source]#
plot_feature(result, axis=None)[source]#
plot_features(results, n_rows, n_cols, all_columns)[source]#
layer_weights(dff_net, target, layer=0)[source]#
summarize_weights(weights, feature_names, layer=0, scale=True)[source]#

Summarize the weights of a neural network model by calculating the mean, median, and positive semidefinite values of the weights for each feature.

Parameters:
  • weights – The weights of the neural network model.

  • feature_names – A list of feature names.

  • layer – The layer of the neural network model from which to extract the weights.

  • scale – If True, scale the summary values.

Returns:

psd – A DataFrame containing the summary values of the weights for each feature.

identify_relationships(weights, feature_names, eps=0.5, min_counts=2, plot=True)[source]#

Identify feature relationships using clustered weight summaries.
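
The eps and min_counts parameters suggest a density-based grouping of the weight summaries. The sketch below uses scikit-learn's DBSCAN as a stand-in; this is an assumption, and the function name and return shape are illustrative, not the package's API.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_weight_summaries(summary, eps=0.5, min_samples=2):
    """Group features whose weight summaries fall in the same density
    cluster.

    summary: array of shape (n_features, n_stats), e.g. one row of
    mean/median summary values per feature.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(summary)
    groups = {}
    for feature_idx, label in enumerate(labels):
        if label == -1:          # DBSCAN noise: feature joins no group
            continue
        groups.setdefault(label, []).append(feature_idx)
    return groups
```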

infer_causal_relationships(trained_models, feature_names, prune=False, verbose=False, plot=False, prog_bar=True, silent=False)[source]#

Infer causal relationships from SHAP values of trained models.

A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.

© 2022, 2023, 2024, 2025 J. Renero

class NNRegressor(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False, parallel_jobs=0)[source]#

Bases: BaseEstimator

A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.

__init__(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False, parallel_jobs=0)[source]#

Train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.

Parameters:
  • data (pandas.DataFrame) – The dataframe with the continuous variables.

  • model_type (str) – The type of model to use. Either ‘dff’ or ‘mlp’.

  • hidden_dim (int) – The dimension(s) of the hidden layer(s). This value can be a single integer for DFF or an array with the dimension of each hidden layer for the MLP case.

  • activation (str) – The activation function to use, either ‘relu’ or ‘selu’. Default is ‘relu’.

  • learning_rate (float) – The learning rate for the optimizer.

  • dropout (float) – The dropout rate for the dropout layer.

  • batch_size (int) – The batch size for the optimizer.

  • num_epochs (int) – The number of epochs for the optimizer.

  • loss_fn (str) – The loss function to use. Default is “mse”.

  • device (str) – The device to use. Either “cpu”, “cuda”, or “mps”. Default is “cpu”.

  • test_size (float) – The proportion of the data to use for testing. Default is 0.1.

  • seed (int) – The seed for the random number generator. Default is 1234.

  • early_stop (bool) – Whether to use early stopping. Default is False.

  • patience (int) – The patience for early stopping. Default is 10.

  • min_delta (float) – The minimum delta for early stopping. Default is 0.001.

  • prog_bar (bool) – Whether to enable the progress bar. Default is True.

  • parallel_jobs (int) – Number of parallel jobs to use for CPU training. Default is 0 (sequential).

Returns:

A dictionary with the trained DFF networks, using the name of the variables as the key.

Return type:

dict
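
The per-variable training scheme described above -- predict each column from the remaining columns plus one fresh noise column -- can be sketched as follows (hypothetical helper, not the class's code):

```python
import numpy as np
import pandas as pd

def per_target_splits(df, rng=None):
    """For each column of df, build (X, y) where X contains every other
    column plus one column of Gaussian noise as an extra predictor."""
    rng = rng or np.random.default_rng()
    splits = {}
    for target in df.columns:
        X = df.drop(columns=[target]).copy()
        X["noise"] = rng.normal(size=len(df))  # the random-noise source
        splits[target] = (X, df[target])
    return splits
```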

fit(X)[source]#

Fit a separate network for each variable in X, using the remaining variables as predictors.

Parameters:

X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The training input samples.

Returns:

self – Returns self.

Return type:

object

predict(X)[source]#

Predicts the values for each target variable.

Parameters:

X (pd.DataFrame) – The input data to make predictions on.

Returns:

The predictions for each target variable.

Return type:

np.ndarray

score(X)[source]#

Scores the model using the loss function. It returns the list of losses for each target variable.

__repr__()[source]#

Return a readable snapshot of user-facing attributes.

Parameters:

None.

Returns:

A formatted summary of non-callable attributes.

Return type:

str

tune(training_data, test_data, study_name=None, min_loss=0.05, storage='sqlite:///rex_tuning.db', load_if_exists=True, n_trials=20)[source]#

Tune the hyperparameters of the model using Optuna.

tune_fit(X, hpo_study_name=None, hpo_min_loss=0.05, hpo_storage='sqlite:///rex_tuning.db', hpo_load_if_exists=True, hpo_n_trials=20)[source]#

Tune the hyperparameters of the model using Optuna, and then fit the model with the best parameters.

custom_main(score=False, tune=False)[source]#

Run a small local workflow for scoring or tuning.

Parameters:
  • score (bool) – Whether to load and score an existing model.

  • tune (bool) – Whether to run hyperparameter tuning.

Returns:

This function does not return a value.

Return type:

None

This module contains the GBTRegressor class, a wrapper around scikit-learn’s GradientBoostingRegressor. Its fit, predict, and score methods fit, predict with, and score a separate model for each feature in the dataframe.

The class also implements a tune method that tunes the model’s hyperparameters using Optuna. The nested Objective class defines the objective function for that optimization.

The module’s main function takes an experiment name, loads the data and reference graph for that experiment, splits the data into train and test sets, and runs the tune method. The module can be run as a script with the experiment name as an argument, or imported and used from other modules.

Example

$ python gbt.py rex_generated_linear_6

This will run the GBTRegressor class with the tune method for the experiment ‘rex_generated_linear_6’.

The module can also be imported and used in other modules or scripts to run the GBTRegressor class with the tune method for any experiment.

Example

from causalexplain.models.gbt import custom_main

custom_main("rex_generated_linear_6")

class GBTRegressor(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False, parallel_jobs=0)[source]#

Bases: GradientBoostingRegressor

__init__(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False, parallel_jobs=0)[source]#

Initialize the gradient boosting regressor wrapper.

Parameters:
  • loss (str) – Loss function for the estimator.

  • learning_rate (float) – Learning rate for boosting.

  • n_estimators (int) – Number of boosting stages.

  • subsample (float) – Subsample ratio for training.

  • criterion (str) – Split quality criterion.

  • min_samples_split (int) – Minimum samples to split a node.

  • min_samples_leaf (int) – Minimum samples required at a leaf.

  • min_weight_fraction_leaf (float) – Minimum weighted fraction at a leaf.

  • max_depth (int) – Maximum depth of individual estimators.

  • min_impurity_decrease (float) – Impurity decrease threshold.

  • init (object) – Optional initialization estimator.

  • random_state (int) – Random seed for reproducibility.

  • max_features (str|int|float|None) – Features considered at each split.

  • max_leaf_nodes (int|None) – Maximum number of leaf nodes.

  • warm_start (bool) – Reuse previous solution if set.

  • validation_fraction (float) – Fraction for early stopping validation.

  • n_iter_no_change (int|None) – Early stopping patience.

  • tol (float) – Tolerance for early stopping.

  • ccp_alpha (float) – Complexity parameter for pruning.

  • correlation_th (float|None) – Deprecated; ignored.

  • verbose (bool) – Enable verbose logging.

  • silent (bool) – Suppress output and progress bars.

  • prog_bar (bool) – Enable progress bar.

  • optuna_prog_bar (bool) – Enable Optuna progress bar.

  • parallel_jobs (int) – Number of parallel jobs for CPU training.

Returns:

This method does not return a value.

Return type:

None

random_state = 42#
fit(X)[source]#

Call the fit method of the parent class with every feature from the “X” dataframe as a target variable. This will fit a separate model for each feature in the dataframe.

predict(X)[source]#

Call the predict method of the parent class with every feature from the “X” dataframe as a target variable. This will predict a separate value for each feature in the dataframe.

score(X)[source]#

Call the score method of the parent class with every feature from the “X” dataframe as a target variable. This will score a separate model for each feature in the dataframe.
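
The per-feature scheme behind fit, predict, and score can be sketched as below: one GradientBoostingRegressor per column, trained on the remaining columns. The helper name is hypothetical; the class itself subclasses GradientBoostingRegressor rather than composing it this way.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

def fit_per_feature(df, **gbt_params):
    """Fit one GradientBoostingRegressor per column, using the remaining
    columns as predictors."""
    models = {}
    for target in df.columns:
        X = df.drop(columns=[target])
        model = GradientBoostingRegressor(**gbt_params)
        models[target] = model.fit(X, df[target])
    return models
```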

tune(training_data, test_data, study_name=None, min_loss=0.05, storage='sqlite:///rex_tuning.db', load_if_exists=True, n_trials=20)[source]#

Tune the hyperparameters of the model using Optuna.

tune_fit(X, hpo_study_name=None, hpo_min_loss=0.05, hpo_storage='sqlite:///rex_tuning.db', hpo_load_if_exists=True, hpo_n_trials=20)[source]#

Tune the hyperparameters of the model using Optuna, and then fit the model with the best parameters.

custom_main(experiment_name='custom_rex', score=False, tune=False)[source]#

Run a local GBT workflow for scoring or tuning.

Parameters:
  • experiment_name (str) – Name of the experiment dataset.

  • score (bool) – Whether to load and score an existing model.

  • tune (bool) – Whether to run hyperparameter tuning.

Returns:

This function does not return a value.

Return type:

None

Module contents#

Model wrappers used by causal discovery estimators.

The GBTRegressor and NNRegressor classes are re-exported at package level; see their full documentation in the submodule sections above.