causalexplain.models package#
Submodules#
- class MLP(input_size, layers_dimensions, activation, batch_size, lr, loss, dropout)[source]#
Bases:
LightningModule
- Attributes:
automatic_optimization: If set to False you are responsible for calling .backward(), .step(), .zero_grad().
current_epoch: The current epoch in the Trainer, or 0 if not attached.
dtype
example_input_array: The example input array is a specification of what the module can consume in the forward() method.
fabric
global_rank: The index of the current process across all nodes and devices.
global_step: Total training batches seen across all epochs.
hparams: The collection of hyperparameters saved with save_hyperparameters().
hparams_initial: The collection of hyperparameters saved with save_hyperparameters().
local_rank: The index of the current process within a single node.
logger: Reference to the logger object in the Trainer.
loggers: Reference to the list of loggers in the Trainer.
on_gpu: Returns True if this model is currently located on a GPU.
trainer
Methods
Block(d_in, d_out, activation, bias, ...): The main building block of MLP.
add_module(name, module): Add a child module to the current module.
all_gather(data[, group, sync_grads]): Gather tensors or collections of tensors from multiple processes.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
backward(loss, *args, **kwargs): Called to perform backward on the loss returned in training_step().
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
clip_gradients(optimizer[, ...]): Handles gradient clipping internally.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
configure_callbacks(): Configure model-specific callbacks.
configure_gradient_clipping(optimizer[, ...]): Perform gradient clipping for the optimizer parameters.
configure_optimizers(): Choose what optimizers and learning-rate schedulers to use in your optimization.
configure_sharded_model(): Hook to create modules in a distributed aware context.
cpu(): See torch.nn.Module.cpu().
cuda([device]): Moves all model parameters and buffers to the GPU.
double(): See torch.nn.Module.double().
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): See torch.nn.Module.float().
forward(x): Same as torch.nn.Module.forward().
freeze(): Freeze all params for inference.
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): See torch.nn.Module.half().
ipu([device]): Move all model parameters and buffers to the IPU.
load_from_checkpoint(checkpoint_path[, ...]): Primary way of loading a model from a checkpoint.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
log(name, value[, prog_bar, logger, ...]): Log a key, value pair.
log_dict(dictionary[, prog_bar, logger, ...]): Log a dictionary of values at once.
lr_scheduler_step(scheduler, metric): Override this method to adjust the default way the Trainer calls each scheduler.
lr_schedulers(): Returns the learning rate scheduler(s) that are being used during training.
manual_backward(loss, *args, **kwargs): Call this directly from your training_step() when doing optimizations manually.
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
on_after_backward(): Called after loss.backward() and before optimizers are stepped.
on_after_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch after it is transferred to the device.
on_before_backward(loss): Called before loss.backward().
on_before_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch before it is transferred to the device.
on_before_optimizer_step(optimizer): Called before optimizer.step().
on_before_zero_grad(optimizer): Called after training_step() and before optimizer.zero_grad().
on_fit_end(): Called at the very end of fit.
on_fit_start(): Called at the very beginning of fit.
on_load_checkpoint(checkpoint): Called by Lightning to restore your model.
on_predict_batch_end(outputs, batch, batch_idx): Called in the predict loop after the batch.
on_predict_batch_start(batch, batch_idx[, ...]): Called in the predict loop before anything happens for that batch.
on_predict_end(): Called at the end of predicting.
on_predict_epoch_end(): Called at the end of predicting.
on_predict_epoch_start(): Called at the beginning of predicting.
on_predict_model_eval(): Sets the model to eval during the predict loop.
on_predict_start(): Called at the beginning of predicting.
on_save_checkpoint(checkpoint): Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
on_test_batch_end(outputs, batch, batch_idx): Called in the test loop after the batch.
on_test_batch_start(batch, batch_idx[, ...]): Called in the test loop before anything happens for that batch.
on_test_end(): Called at the end of testing.
on_test_epoch_end(): Called in the test loop at the very end of the epoch.
on_test_epoch_start(): Called in the test loop at the very beginning of the epoch.
on_test_model_eval(): Sets the model to eval during the test loop.
on_test_model_train(): Sets the model to train during the test loop.
on_test_start(): Called at the beginning of testing.
on_train_batch_end(outputs, batch, batch_idx): Called in the training loop after the batch.
on_train_batch_start(batch, batch_idx): Called in the training loop before anything happens for that batch.
on_train_end(): Called at the end of training before logger experiment is closed.
on_train_epoch_end(): Called in the training loop at the very end of the epoch.
on_train_epoch_start(): Called in the training loop at the very beginning of the epoch.
on_train_start(): Called at the beginning of training after sanity check.
on_validation_batch_end(outputs, batch, ...): Called in the validation loop after the batch.
on_validation_batch_start(batch, batch_idx): Called in the validation loop before anything happens for that batch.
on_validation_end(): Called at the end of validation.
on_validation_epoch_end(): Called in the validation loop at the very end of the epoch.
on_validation_epoch_start(): Called in the validation loop at the very beginning of the epoch.
on_validation_model_eval(): Sets the model to eval during the val loop.
on_validation_model_train(): Sets the model to train during the val loop.
on_validation_start(): Called at the beginning of validation.
optimizer_step(epoch, batch_idx, optimizer): Override this method to adjust the default way the Trainer calls the optimizer.
optimizer_zero_grad(epoch, batch_idx, optimizer): Override this method to change the default behaviour of optimizer.zero_grad().
optimizers([use_pl_optimizer]): Returns the optimizer(s) that are being used during training.
parameters([recurse]): Return an iterator over module parameters.
predict(x)
predict_dataloader(): An iterable or collection of iterables specifying prediction samples.
predict_step(batch, batch_idx, **kwargs): Step function called during predict().
prepare_data(): Use this to download and prepare data.
print(*args, **kwargs): Prints only from process 0.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
save_hyperparameters(*args[, ignore, frame, ...]): Save arguments to hparams attribute.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
setup(stage): Called at the beginning of fit (train + validate), validate, test, or predict.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
teardown(stage): Called at the end of fit (train + validate), validate, test, or predict.
test_dataloader(): An iterable or collection of iterables specifying test samples.
test_step(*args, **kwargs): Operates on a single batch of data from the test set.
to(*args, **kwargs): See torch.nn.Module.to().
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
to_onnx(file_path[, input_sample]): Saves the model in ONNX format.
to_torchscript([file_path, method, ...]): By default compiles the whole model to a ScriptModule.
toggle_optimizer(optimizer): Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
train([mode]): Set the module in training mode.
train_dataloader(): An iterable or collection of iterables specifying training samples.
training_step(batch, batch_idx): Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
transfer_batch_to_device(batch, device, ...): Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
type(dst_type): See torch.nn.Module.type().
unfreeze(): Unfreeze all parameters for training.
untoggle_optimizer(optimizer): Resets the state of required gradients that were toggled with toggle_optimizer().
val_dataloader(): An iterable or collection of iterables specifying validation samples.
validation_step(batch, batch_idx): Operates on a single batch of data from the validation set.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
- device = 'cpu'#
- class Block(d_in, d_out, activation, bias, dropout, device)[source]#
Bases:
Module
The main building block of MLP.
Methods
add_module(name, module): Add a child module to the current module.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
cpu(): Move all model parameters and buffers to the CPU.
cuda([device]): Move all model parameters and buffers to the GPU.
double(): Casts all floating point parameters and buffers to double datatype.
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): Casts all floating point parameters and buffers to float datatype.
forward(x): Define the computation performed at every call.
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): Casts all floating point parameters and buffers to half datatype.
ipu([device]): Move all model parameters and buffers to the IPU.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters([recurse]): Return an iterator over module parameters.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
to(*args, **kwargs): Move and/or cast the parameters and buffers.
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
train([mode]): Set the module in training mode.
type(dst_type): Casts all parameters and buffers to dst_type.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
- __init__(d_in, d_out, activation, bias, dropout, device)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
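To make the Block signature above concrete, here is a minimal usage sketch. It assumes Block wraps a linear projection from d_in to d_out followed by the supplied activation and dropout, that activation is passed as a module instance, and that the class is reachable through the MLP namespace; none of these details are confirmed by this page, so treat them as assumptions and adjust to the actual implementation.

import torch
from causalexplain.models import MLP  # illustrative import path; adjust to the actual submodule

# Hypothetical standalone use of the building block: 8 inputs -> 16 outputs.
block = MLP.Block(d_in=8, d_out=16, activation=torch.nn.ReLU(),
                  bias=True, dropout=0.1, device="cpu")

x = torch.randn(4, 8)   # batch of 4 samples with 8 features
y = block(x)            # calling the module runs forward(x) plus any registered hooks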
- forward(x)[source]#
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- predict_step(batch, batch_idx, **kwargs)[source]#
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.
The predict_step() is used to scale inference across multiple devices.
To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or on epoch end.
The BasePredictionWriter should be used while using a spawn-based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned.
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- Parameters:
batch – Current batch.
batch_idx – Index of current batch.
dataloader_idx – Index of the current dataloader.
- Returns:
Predicted output
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
- Returns:
Any of the following:
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'
None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple val dataloaders used)
- Returns:
Any object or value
None - Validation will skip to the next batch
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
None - Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
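Putting the pieces above together, the sketch below shows one way to drive this LightningModule with a Trainer. The import path, the meaning of layers_dimensions (hidden-layer widths), the kind of object expected for activation and loss, and the (x, y) batch layout are all assumptions inferred from the constructor signature, not confirmed by this page.

import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer
from causalexplain.models import MLP  # illustrative import path

# Toy regression data; the (x, y) batch layout is an assumption about training_step().
X = torch.randn(256, 7)
y = X.sum(dim=1, keepdim=True)
train_loader = DataLoader(TensorDataset(X, y), batch_size=32)

model = MLP(
    input_size=7,
    layers_dimensions=[64, 32],           # assumed: widths of the hidden layers
    activation=torch.nn.ReLU(),           # assumed: an activation module instance
    batch_size=32,
    lr=1e-3,
    loss=torch.nn.functional.mse_loss,    # assumed: a loss callable
    dropout=0.1,
)

trainer = Trainer(max_epochs=5, accelerator="cpu")
trainer.fit(model, train_dataloaders=train_loader)
predictions = trainer.predict(model, dataloaders=train_loader)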
- class DFF(input_size, hidden_size, batch_size, lr, loss)[source]#
Bases:
LightningModule
- Attributes:
automatic_optimization: If set to False you are responsible for calling .backward(), .step(), .zero_grad().
current_epoch: The current epoch in the Trainer, or 0 if not attached.
device
dtype
example_input_array: The example input array is a specification of what the module can consume in the forward() method.
fabric
global_rank: The index of the current process across all nodes and devices.
global_step: Total training batches seen across all epochs.
hparams: The collection of hyperparameters saved with save_hyperparameters().
hparams_initial: The collection of hyperparameters saved with save_hyperparameters().
local_rank: The index of the current process within a single node.
logger: Reference to the logger object in the Trainer.
loggers: Reference to the list of loggers in the Trainer.
on_gpu: Returns True if this model is currently located on a GPU.
trainer
Methods
add_module(name, module): Add a child module to the current module.
all_gather(data[, group, sync_grads]): Gather tensors or collections of tensors from multiple processes.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
backward(loss, *args, **kwargs): Called to perform backward on the loss returned in training_step().
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
clip_gradients(optimizer[, ...]): Handles gradient clipping internally.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
configure_callbacks(): Configure model-specific callbacks.
configure_gradient_clipping(optimizer[, ...]): Perform gradient clipping for the optimizer parameters.
configure_optimizers(): Choose what optimizers and learning-rate schedulers to use in your optimization.
configure_sharded_model(): Hook to create modules in a distributed aware context.
cpu(): See torch.nn.Module.cpu().
cuda([device]): Moves all model parameters and buffers to the GPU.
double(): See torch.nn.Module.double().
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): See torch.nn.Module.float().
forward(x): Same as torch.nn.Module.forward().
freeze(): Freeze all params for inference.
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): See torch.nn.Module.half().
ipu([device]): Move all model parameters and buffers to the IPU.
load_from_checkpoint(checkpoint_path[, ...]): Primary way of loading a model from a checkpoint.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
log(name, value[, prog_bar, logger, ...]): Log a key, value pair.
log_dict(dictionary[, prog_bar, logger, ...]): Log a dictionary of values at once.
lr_scheduler_step(scheduler, metric): Override this method to adjust the default way the Trainer calls each scheduler.
lr_schedulers(): Returns the learning rate scheduler(s) that are being used during training.
manual_backward(loss, *args, **kwargs): Call this directly from your training_step() when doing optimizations manually.
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
on_after_backward(): Called after loss.backward() and before optimizers are stepped.
on_after_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch after it is transferred to the device.
on_before_backward(loss): Called before loss.backward().
on_before_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch before it is transferred to the device.
on_before_optimizer_step(optimizer): Called before optimizer.step().
on_before_zero_grad(optimizer): Called after training_step() and before optimizer.zero_grad().
on_fit_end(): Called at the very end of fit.
on_fit_start(): Called at the very beginning of fit.
on_load_checkpoint(checkpoint): Called by Lightning to restore your model.
on_predict_batch_end(outputs, batch, batch_idx): Called in the predict loop after the batch.
on_predict_batch_start(batch, batch_idx[, ...]): Called in the predict loop before anything happens for that batch.
on_predict_end(): Called at the end of predicting.
on_predict_epoch_end(): Called at the end of predicting.
on_predict_epoch_start(): Called at the beginning of predicting.
on_predict_model_eval(): Sets the model to eval during the predict loop.
on_predict_start(): Called at the beginning of predicting.
on_save_checkpoint(checkpoint): Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
on_test_batch_end(outputs, batch, batch_idx): Called in the test loop after the batch.
on_test_batch_start(batch, batch_idx[, ...]): Called in the test loop before anything happens for that batch.
on_test_end(): Called at the end of testing.
on_test_epoch_end(): Called in the test loop at the very end of the epoch.
on_test_epoch_start(): Called in the test loop at the very beginning of the epoch.
on_test_model_eval(): Sets the model to eval during the test loop.
on_test_model_train(): Sets the model to train during the test loop.
on_test_start(): Called at the beginning of testing.
on_train_batch_end(outputs, batch, batch_idx): Called in the training loop after the batch.
on_train_batch_start(batch, batch_idx): Called in the training loop before anything happens for that batch.
on_train_end(): Called at the end of training before logger experiment is closed.
on_train_epoch_end(): Called in the training loop at the very end of the epoch.
on_train_epoch_start(): Called in the training loop at the very beginning of the epoch.
on_train_start(): Called at the beginning of training after sanity check.
on_validation_batch_end(outputs, batch, ...): Called in the validation loop after the batch.
on_validation_batch_start(batch, batch_idx): Called in the validation loop before anything happens for that batch.
on_validation_end(): Called at the end of validation.
on_validation_epoch_end(): Called in the validation loop at the very end of the epoch.
on_validation_epoch_start(): Called in the validation loop at the very beginning of the epoch.
on_validation_model_eval(): Sets the model to eval during the val loop.
on_validation_model_train(): Sets the model to train during the val loop.
on_validation_start(): Called at the beginning of validation.
optimizer_step(epoch, batch_idx, optimizer): Override this method to adjust the default way the Trainer calls the optimizer.
optimizer_zero_grad(epoch, batch_idx, optimizer): Override this method to change the default behaviour of optimizer.zero_grad().
optimizers([use_pl_optimizer]): Returns the optimizer(s) that are being used during training.
parameters([recurse]): Return an iterator over module parameters.
predict_dataloader(): An iterable or collection of iterables specifying prediction samples.
predict_step(batch, batch_idx, **kwargs): Step function called during predict().
prepare_data(): Use this to download and prepare data.
print(*args, **kwargs): Prints only from process 0.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
save_hyperparameters(*args[, ignore, frame, ...]): Save arguments to hparams attribute.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
setup(stage): Called at the beginning of fit (train + validate), validate, test, or predict.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
teardown(stage): Called at the end of fit (train + validate), validate, test, or predict.
test_dataloader(): An iterable or collection of iterables specifying test samples.
test_step(*args, **kwargs): Operates on a single batch of data from the test set.
to(*args, **kwargs): See torch.nn.Module.to().
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
to_onnx(file_path[, input_sample]): Saves the model in ONNX format.
to_torchscript([file_path, method, ...]): By default compiles the whole model to a ScriptModule.
toggle_optimizer(optimizer): Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
train([mode]): Set the module in training mode.
train_dataloader(): An iterable or collection of iterables specifying training samples.
training_step(batch, batch_idx): Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
transfer_batch_to_device(batch, device, ...): Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
type(dst_type): See torch.nn.Module.type().
unfreeze(): Unfreeze all parameters for training.
untoggle_optimizer(optimizer): Resets the state of required gradients that were toggled with toggle_optimizer().
val_dataloader(): An iterable or collection of iterables specifying validation samples.
validation_step(batch, batch_idx): Operates on a single batch of data from the validation set.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
- forward(x)[source]#
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- predict_step(batch, batch_idx, **kwargs)[source]#
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.
The predict_step() is used to scale inference across multiple devices.
To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or on epoch end.
The BasePredictionWriter should be used while using a spawn-based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned.
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- Parameters:
batch – Current batch.
batch_idx – Index of current batch.
dataloader_idx – Index of the current dataloader.
- Returns:
Predicted output
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
- Returns:
Any of the following:
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'
None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple val dataloaders used)
- Returns:
Any object or value
None - Validation will skip to the next batch
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
None - Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
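A brief instantiation sketch for DFF; hidden_size is assumed to be the width of the hidden layer, loss a callable, and the import path illustrative. Training with a Trainer follows the same pattern shown for MLP above.

import torch
from causalexplain.models import DFF  # illustrative import path

dff = DFF(
    input_size=7,
    hidden_size=64,                       # assumed: width of the hidden layer
    batch_size=32,
    lr=1e-3,
    loss=torch.nn.functional.mse_loss,    # assumed: a loss callable
)

x = torch.randn(32, 7)
out = dff(x)   # __call__ runs forward(x), i.e. torch.nn.Module.forward()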
- class MDN(input_size, hidden_size, num_gaussians, lr, batch_size, loss_function='loglikelihood')[source]#
Bases:
LightningModule
- Attributes:
automatic_optimization: If set to False you are responsible for calling .backward(), .step(), .zero_grad().
current_epoch: The current epoch in the Trainer, or 0 if not attached.
device
dtype
example_input_array: The example input array is a specification of what the module can consume in the forward() method.
fabric
global_rank: The index of the current process across all nodes and devices.
global_step: Total training batches seen across all epochs.
hparams: The collection of hyperparameters saved with save_hyperparameters().
hparams_initial: The collection of hyperparameters saved with save_hyperparameters().
local_rank: The index of the current process within a single node.
logger: Reference to the logger object in the Trainer.
loggers: Reference to the list of loggers in the Trainer.
on_gpu: Returns True if this model is currently located on a GPU.
trainer
Methods
add_module(name, module): Add a child module to the current module.
all_gather(data[, group, sync_grads]): Gather tensors or collections of tensors from multiple processes.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
backward(loss, *args, **kwargs): Called to perform backward on the loss returned in training_step().
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
clip_gradients(optimizer[, ...]): Handles gradient clipping internally.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
configure_callbacks(): Configure model-specific callbacks.
configure_gradient_clipping(optimizer[, ...]): Perform gradient clipping for the optimizer parameters.
configure_optimizers(): Choose what optimizers and learning-rate schedulers to use in your optimization.
configure_sharded_model(): Hook to create modules in a distributed aware context.
cpu(): See torch.nn.Module.cpu().
cuda([device]): Moves all model parameters and buffers to the GPU.
double(): See torch.nn.Module.double().
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): See torch.nn.Module.float().
forward(x): Same as torch.nn.Module.forward().
freeze(): Freeze all params for inference.
g_sample(pi, sigma, mu): Gumbel sampling comes from here: hardmaru/pytorch_notebooks
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): See torch.nn.Module.half().
ipu([device]): Move all model parameters and buffers to the IPU.
load_from_checkpoint(checkpoint_path[, ...]): Primary way of loading a model from a checkpoint.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
log(name, value[, prog_bar, logger, ...]): Log a key, value pair.
log_dict(dictionary[, prog_bar, logger, ...]): Log a dictionary of values at once.
lr_scheduler_step(scheduler, metric): Override this method to adjust the default way the Trainer calls each scheduler.
lr_schedulers(): Returns the learning rate scheduler(s) that are being used during training.
manual_backward(loss, *args, **kwargs): Call this directly from your training_step() when doing optimizations manually.
mdn_loss(pi, sigma, mu, y): Calculates the error, given the MoG parameters and the target. The loss is the negative log likelihood of the data given the MoG parameters.
mmd_loss(x, y, kernel)
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
on_after_backward(): Called after loss.backward() and before optimizers are stepped.
on_after_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch after it is transferred to the device.
on_before_backward(loss): Called before loss.backward().
on_before_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch before it is transferred to the device.
on_before_optimizer_step(optimizer): Called before optimizer.step().
on_before_zero_grad(optimizer): Called after training_step() and before optimizer.zero_grad().
on_fit_end(): Called at the very end of fit.
on_fit_start(): Called at the very beginning of fit.
on_load_checkpoint(checkpoint): Called by Lightning to restore your model.
on_predict_batch_end(outputs, batch, batch_idx): Called in the predict loop after the batch.
on_predict_batch_start(batch, batch_idx[, ...]): Called in the predict loop before anything happens for that batch.
on_predict_end(): Called at the end of predicting.
on_predict_epoch_end(): Called at the end of predicting.
on_predict_epoch_start(): Called at the beginning of predicting.
on_predict_model_eval(): Sets the model to eval during the predict loop.
on_predict_start(): Called at the beginning of predicting.
on_save_checkpoint(checkpoint): Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
on_test_batch_end(outputs, batch, batch_idx): Called in the test loop after the batch.
on_test_batch_start(batch, batch_idx[, ...]): Called in the test loop before anything happens for that batch.
on_test_end(): Called at the end of testing.
on_test_epoch_end(): Called in the test loop at the very end of the epoch.
on_test_epoch_start(): Called in the test loop at the very beginning of the epoch.
on_test_model_eval(): Sets the model to eval during the test loop.
on_test_model_train(): Sets the model to train during the test loop.
on_test_start(): Called at the beginning of testing.
on_train_batch_end(outputs, batch, batch_idx): Called in the training loop after the batch.
on_train_batch_start(batch, batch_idx): Called in the training loop before anything happens for that batch.
on_train_end(): Called at the end of training before logger experiment is closed.
on_train_epoch_end(): Called in the training loop at the very end of the epoch.
on_train_epoch_start(): Called in the training loop at the very beginning of the epoch.
on_train_start(): Called at the beginning of training after sanity check.
on_validation_batch_end(outputs, batch, ...): Called in the validation loop after the batch.
on_validation_batch_start(batch, batch_idx): Called in the validation loop before anything happens for that batch.
on_validation_end(): Called at the end of validation.
on_validation_epoch_end(): Called in the validation loop at the very end of the epoch.
on_validation_epoch_start(): Called in the validation loop at the very beginning of the epoch.
on_validation_model_eval(): Sets the model to eval during the val loop.
on_validation_model_train(): Sets the model to train during the val loop.
on_validation_start(): Called at the beginning of validation.
optimizer_step(epoch, batch_idx, optimizer): Override this method to adjust the default way the Trainer calls the optimizer.
optimizer_zero_grad(epoch, batch_idx, optimizer): Override this method to change the default behaviour of optimizer.zero_grad().
optimizers([use_pl_optimizer]): Returns the optimizer(s) that are being used during training.
parameters([recurse]): Return an iterator over module parameters.
predict_dataloader(): An iterable or collection of iterables specifying prediction samples.
predict_step(batch, batch_idx[, dataloader_idx]): Step function called during predict().
prepare_data(): Use this to download and prepare data.
print(*args, **kwargs): Prints only from process 0.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
sample(pi, sigma, mu): Draw samples from a MoG.
save_hyperparameters(*args[, ignore, frame, ...]): Save arguments to hparams attribute.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
setup(stage): Called at the beginning of fit (train + validate), validate, test, or predict.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
teardown(stage): Called at the end of fit (train + validate), validate, test, or predict.
test_dataloader(): An iterable or collection of iterables specifying test samples.
test_step(*args, **kwargs): Operates on a single batch of data from the test set.
to(*args, **kwargs): See torch.nn.Module.to().
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
to_onnx(file_path[, input_sample]): Saves the model in ONNX format.
to_torchscript([file_path, method, ...]): By default compiles the whole model to a ScriptModule.
toggle_optimizer(optimizer): Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
train([mode]): Set the module in training mode.
train_dataloader(): An iterable or collection of iterables specifying training samples.
training_step(batch, batch_idx): Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
transfer_batch_to_device(batch, device, ...): Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
type(dst_type): See torch.nn.Module.type().
unfreeze(): Unfreeze all parameters for training.
untoggle_optimizer(optimizer): Resets the state of required gradients that were toggled with toggle_optimizer().
val_dataloader(): An iterable or collection of iterables specifying validation samples.
validation_step(batch, batch_idx): Operates on a single batch of data from the validation set.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
add_noise
common_step
gaussian_probability
- __init__(input_size, hidden_size, num_gaussians, lr, batch_size, loss_function='loglikelihood')[source]#
Init function for the MDN
- Parameters:
input_size (int) – the number of dimensions in the input
hidden_size (int) – the number of dimensions in the hidden layer
num_gaussians (int) – the number of Gaussians per output dimensions
lr (float) – learning rate
batch_size (int) – Batch size.
loss_function (str) – Loss function can be either ‘loglikelihood’ or ‘mmd’ for Maximal Mean Discrepancy
- Input:
minibatch (BxD): B is the batch size and D is the number of input dimensions.
- Output:
(pi, sigma, mu) (BxG, BxGxO, BxGxO): B is the batch size, G is the number of Gaussians, and O is the number of dimensions for each Gaussian. Pi is a multinomial distribution of the Gaussians. Sigma is the standard deviation of each Gaussian. Mu is the mean of each Gaussian.
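A minimal sketch of the forward/sampling round trip described by the Input/Output specification above; the import path is illustrative, and the shapes in the comments follow that specification.

import torch
from causalexplain.models import MDN  # illustrative import path

mdn = MDN(input_size=3, hidden_size=32, num_gaussians=5, lr=1e-3, batch_size=16)

x = torch.randn(16, 3)                # minibatch of shape (B, D)
pi, sigma, mu = mdn(x)                # (B, G), (B, G, O), (B, G, O) per the spec above
samples = mdn.sample(pi, sigma, mu)   # draw values from the mixture of Gaussians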
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple
lr_scheduler_config
).Dictionary, with an
"optimizer"
key, and (optionally) a"lr_scheduler"
key whose value is a single LR scheduler orlr_scheduler_config
.None - Fit will run without any optimizer.
The
lr_scheduler_config
is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.lr_scheduler_config = { # REQUIRED: The scheduler instance "scheduler": lr_scheduler, # The unit of the scheduler's step size, could also be 'step'. # 'epoch' updates the scheduler on epoch end whereas 'step' # updates it after a optimizer update. "interval": "epoch", # How many epochs/steps should pass between calls to # `scheduler.step()`. 1 corresponds to updating the learning # rate after every epoch/step. "frequency": 1, # Metric to to monitor for schedulers like `ReduceLROnPlateau` "monitor": "val_loss", # If set to `True`, will enforce that the value specified 'monitor' # is available when the scheduler is updated, thus stopping # training if not found. If set to `False`, it will only produce a warning "strict": True, # If using the `LearningRateMonitor` callback to monitor the # learning rate progress, this keyword can be used to specify # a custom logged name "name": None, }
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler's .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to 'manual optimization' mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
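Putting the pieces above together, a generic Lightning-style sketch (not necessarily what this module implements) of a single optimizer paired with a ReduceLROnPlateau scheduler conditioned on a logged metric:

import torch

def configure_optimizers(self):
    # A single optimizer paired with a scheduler whose .step() depends on a
    # monitored metric; "val_loss" must be logged with self.log("val_loss", ...).
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)  # illustrative learning rate
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": scheduler,
            "interval": "epoch",
            "frequency": 1,
            "monitor": "val_loss",
        },
    }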
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics, e.g., for the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
- Returns:
Any of the following.
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'
None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple val dataloaders used)
- Returns:
Any object or value
None
- Validation will skip to the next batch
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders,
validation_step()
will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the
validation_step()
is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- forward(x)[source]#
Same as
torch.nn.Module.forward()
.- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- mdn_loss(pi, sigma, mu, y)[source]#
Calculates the error, given the MoG parameters and the target. The loss is the negative log likelihood of the data given the MoG parameters.
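For reference, the negative log-likelihood of a mixture of Gaussians can be computed along these lines (a sketch of the idea, not necessarily the exact implementation of mdn_loss):

import torch

def mog_nll(pi, sigma, mu, y, eps=1e-12):
    """Negative log-likelihood of y under a mixture of Gaussians.

    pi: (B, G) mixture weights; sigma, mu: (B, G, O); y: (B, O).
    """
    dist = torch.distributions.Normal(loc=mu, scale=sigma)
    # Log-density of y under each Gaussian, summed over output dims -> (B, G)
    log_prob = dist.log_prob(y.unsqueeze(1).expand_as(mu)).sum(dim=-1)
    # Log of the mixture density: logsumexp over components, weighted by pi
    log_mix = torch.logsumexp(torch.log(pi + eps) + log_prob, dim=-1)
    return -log_mix.mean()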
- static mmd_loss(x, y, kernel)[source]#
https://www.kaggle.com/onurtunali/maximum-mean-discrepancy
Empirical maximum mean discrepancy. The lower the result, the more evidence that the distributions are the same.
- Parameters:
x – first sample, distribution P
y – second sample, distribution Q
kernel – kernel type such as “multiscale” or “rbf”
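A sketch of the empirical MMD^2 estimate with a plain RBF kernel, for orientation (the actual implementation also supports a 'multiscale' kernel and may differ in details):

import torch

def mmd_rbf(x, y, sigma=1.0):
    """Biased empirical MMD^2 between samples x ~ P and y ~ Q with an RBF kernel."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances -> Gaussian kernel values
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2 * sigma ** 2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()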
- static g_sample(pi, sigma, mu)[source]#
Gumbel sampling comes from here: hardmaru/pytorch_notebooks
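The Gumbel-max trick selects one mixture component per row by adding Gumbel noise to log pi and taking the argmax; a hedged sketch of sampling from the mixture this way (g_sample may differ in details):

import torch

def sample_from_mog(pi, sigma, mu):
    """Draw one sample per row from the mixture using the Gumbel-max trick.

    pi: (B, G); sigma, mu: (B, G, O); returns a (B, O) tensor.
    """
    u = torch.rand_like(pi).clamp_min(1e-12)
    gumbel = -torch.log(-torch.log(u))                   # Gumbel(0, 1) noise
    k = torch.argmax(torch.log(pi) + gumbel, dim=-1)     # chosen component, (B,)
    idx = k.view(-1, 1, 1).expand(-1, 1, mu.size(-1))    # (B, 1, O) gather index
    mu_k = torch.gather(mu, 1, idx).squeeze(1)           # (B, O)
    sigma_k = torch.gather(sigma, 1, idx).squeeze(1)     # (B, O)
    return mu_k + sigma_k * torch.randn_like(mu_k)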
- class RBF(n_kernels=5, mul_factor=2.0, bandwidth=None)[source]#
Bases:
Module
Methods
add_module
(name, module)Add a child module to the current module.
apply
(fn)Apply
fn
recursively to every submodule (as returned by.children()
) as well as self.bfloat16
()Casts all floating point parameters and buffers to
bfloat16
datatype.buffers
([recurse])Return an iterator over module buffers.
children
()Return an iterator over immediate children modules.
compile
(*args, **kwargs)Compile this Module's forward using
torch.compile()
.cpu
()Move all model parameters and buffers to the CPU.
cuda
([device])Move all model parameters and buffers to the GPU.
double
()Casts all floating point parameters and buffers to
double
datatype.eval
()Set the module in evaluation mode.
extra_repr
()Return the extra representation of the module.
float
()Casts all floating point parameters and buffers to
float
datatype.forward
(X)Define the computation performed at every call.
get_bandwidth
(L2_distances)Get the bandwidth of the RBF kernel.
get_buffer
(target)Return the buffer given by
target
if it exists, otherwise throw an error.get_extra_state
()Return any extra state to include in the module's state_dict.
get_parameter
(target)Return the parameter given by
target
if it exists, otherwise throw an error.get_submodule
(target)Return the submodule given by
target
if it exists, otherwise throw an error.half
()Casts all floating point parameters and buffers to
half
datatype.ipu
([device])Move all model parameters and buffers to the IPU.
load_state_dict
(state_dict[, strict, assign])Copy parameters and buffers from
state_dict
into this module and its descendants.modules
()Return an iterator over all modules in the network.
mtia
([device])Move all model parameters and buffers to the MTIA.
named_buffers
([prefix, recurse, ...])Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children
()Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules
([memo, prefix, remove_duplicate])Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters
([prefix, recurse, ...])Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters
([recurse])Return an iterator over module parameters.
register_backward_hook
(hook)Register a backward hook on the module.
register_buffer
(name, tensor[, persistent])Add a buffer to the module.
register_forward_hook
(hook, *[, prepend, ...])Register a forward hook on the module.
register_forward_pre_hook
(hook, *[, ...])Register a forward pre-hook on the module.
register_full_backward_hook
(hook[, prepend])Register a backward hook on the module.
register_full_backward_pre_hook
(hook[, prepend])Register a backward pre-hook on the module.
register_load_state_dict_post_hook
(hook)Register a post-hook to be run after module's
load_state_dict()
is called.register_load_state_dict_pre_hook
(hook)Register a pre-hook to be run before module's
load_state_dict()
is called.register_module
(name, module)Alias for
add_module()
.register_parameter
(name, param)Add a parameter to the module.
register_state_dict_post_hook
(hook)Register a post-hook for the
state_dict()
method.register_state_dict_pre_hook
(hook)Register a pre-hook for the
state_dict()
method.requires_grad_
([requires_grad])Change if autograd should record operations on parameters in this module.
set_extra_state
(state)Set extra state contained in the loaded state_dict.
set_submodule
(target, module[, strict])Set the submodule given by
target
if it exists, otherwise throw an error.share_memory
()See
torch.Tensor.share_memory_()
.state_dict
(*args[, destination, prefix, ...])Return a dictionary containing references to the whole state of the module.
to
(*args, **kwargs)Move and/or cast the parameters and buffers.
to_empty
(*, device[, recurse])Move the parameters and buffers to the specified device without copying storage.
train
([mode])Set the module in training mode.
type
(dst_type)Casts all parameters and buffers to
dst_type
.xpu
([device])Move all model parameters and buffers to the XPU.
zero_grad
([set_to_none])Reset gradients of all model parameters.
__call__
- __init__(n_kernels=5, mul_factor=2.0, bandwidth=None)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X)[source]#
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class MMDLoss(kernel=RBF())[source]#
Bases:
Module
Methods
add_module
(name, module)Add a child module to the current module.
apply
(fn)Apply
fn
recursively to every submodule (as returned by.children()
) as well as self.bfloat16
()Casts all floating point parameters and buffers to
bfloat16
datatype.buffers
([recurse])Return an iterator over module buffers.
children
()Return an iterator over immediate children modules.
compile
(*args, **kwargs)Compile this Module's forward using
torch.compile()
.cpu
()Move all model parameters and buffers to the CPU.
cuda
([device])Move all model parameters and buffers to the GPU.
double
()Casts all floating point parameters and buffers to
double
datatype.eval
()Set the module in evaluation mode.
extra_repr
()Return the extra representation of the module.
float
()Casts all floating point parameters and buffers to
float
datatype.forward
(X, Y)Define the computation performed at every call.
get_buffer
(target)Return the buffer given by
target
if it exists, otherwise throw an error.get_extra_state
()Return any extra state to include in the module's state_dict.
get_parameter
(target)Return the parameter given by
target
if it exists, otherwise throw an error.get_submodule
(target)Return the submodule given by
target
if it exists, otherwise throw an error.half
()Casts all floating point parameters and buffers to
half
datatype.ipu
([device])Move all model parameters and buffers to the IPU.
load_state_dict
(state_dict[, strict, assign])Copy parameters and buffers from
state_dict
into this module and its descendants.modules
()Return an iterator over all modules in the network.
mtia
([device])Move all model parameters and buffers to the MTIA.
named_buffers
([prefix, recurse, ...])Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children
()Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules
([memo, prefix, remove_duplicate])Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters
([prefix, recurse, ...])Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters
([recurse])Return an iterator over module parameters.
register_backward_hook
(hook)Register a backward hook on the module.
register_buffer
(name, tensor[, persistent])Add a buffer to the module.
register_forward_hook
(hook, *[, prepend, ...])Register a forward hook on the module.
register_forward_pre_hook
(hook, *[, ...])Register a forward pre-hook on the module.
register_full_backward_hook
(hook[, prepend])Register a backward hook on the module.
register_full_backward_pre_hook
(hook[, prepend])Register a backward pre-hook on the module.
register_load_state_dict_post_hook
(hook)Register a post-hook to be run after module's
load_state_dict()
is called.register_load_state_dict_pre_hook
(hook)Register a pre-hook to be run before module's
load_state_dict()
is called.register_module
(name, module)Alias for
add_module()
.register_parameter
(name, param)Add a parameter to the module.
register_state_dict_post_hook
(hook)Register a post-hook for the
state_dict()
method.register_state_dict_pre_hook
(hook)Register a pre-hook for the
state_dict()
method.requires_grad_
([requires_grad])Change if autograd should record operations on parameters in this module.
set_extra_state
(state)Set extra state contained in the loaded state_dict.
set_submodule
(target, module[, strict])Set the submodule given by
target
if it exists, otherwise throw an error.share_memory
()See
torch.Tensor.share_memory_()
.state_dict
(*args[, destination, prefix, ...])Return a dictionary containing references to the whole state of the module.
to
(*args, **kwargs)Move and/or cast the parameters and buffers.
to_empty
(*, device[, recurse])Move the parameters and buffers to the specified device without copying storage.
train
([mode])Set the module in training mode.
type
(dst_type)Casts all parameters and buffers to
dst_type
.xpu
([device])Move all model parameters and buffers to the XPU.
zero_grad
([set_to_none])Reset gradients of all model parameters.
__call__
- __init__(kernel=RBF())[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X, Y)[source]#
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
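Together, RBF and MMDLoss act as a distributional loss between two batches of samples. A minimal usage sketch (the import path is an assumption and the shapes are illustrative):

import torch

from causalexplain.models import RBF, MMDLoss  # hypothetical import path; adjust to the actual module

mmd = MMDLoss(kernel=RBF())          # multi-bandwidth RBF kernel by default
x = torch.randn(128, 10)             # a batch of samples from P
y = torch.randn(128, 10) + 0.5       # a batch of samples from Q
loss = mmd(x, y)                     # smaller values suggest P and Q are closer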
This module contains the implementation of the BaseModel and MLPModel classes.
The BaseModel class serves as the base class for all models in the causalexplain package. It provides common functionality such as data initialization, logger initialization, and callback initialization.
The MLPModel class is a specific implementation of the BaseModel class, representing a Multi-Layer Perceptron (MLP) model. It defines the architecture and training process for the MLP model.
- Example usage:
    data = pd.read_csv("~/phd/data/generated_linear_10.csv")
    mlp = MLPModel(
        target='V0', input_size=data.shape[1], hidden_dim=[64, 128, 64],
        activation=nn.ReLU(), learning_rate=0.05, batch_size=32,
        loss_fn="mse", dropout=0.05, num_epochs=200, dataframe=data,
        test_size=0.1, device="auto", seed=1234, early_stop=False)
    mlp.train()
- class BaseModel(target, dataframe, test_size, batch_size, tb_suffix, seed, early_stop=True, patience=10, min_delta=0.001)[source]#
Bases:
object
Base class for all models in the causalexplain package.
- Parameters:
target (str) – The target variable name.
dataframe (pd.DataFrame) – The input dataframe.
test_size (float) – The proportion of the data to use for testing.
batch_size (int) – The batch size for training.
tb_suffix (str) – The suffix to append to the TensorBoard log directory.
seed (int) – The random seed for reproducibility.
early_stop (bool, optional) – Whether to use early stopping during training. Defaults to True.
patience (int, optional) – The patience value for early stopping. Defaults to 10.
min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
- Attributes:
- all_columns
- callbacks
- columns
- extra_trainer_args
- logger
- model
- scaler
- train_loader
- val_loader
Methods
init_callbacks
([early_stop, min_delta, ...])Initialize the callbacks for the training process.
init_data
()Initialize the data loaders for training and validation.
init_logger
(suffix)Initialize the logger for TensorBoard.
override_extras
(**kwargs)Override the extra trainer arguments.
- model = None#
- all_columns = None#
- callbacks = None#
- columns = None#
- logger = None#
- extra_trainer_args = None#
- scaler = None#
- train_loader = None#
- val_loader = None#
- n_rows = 0#
- device = 'cpu'#
- __init__(target, dataframe, test_size, batch_size, tb_suffix, seed, early_stop=True, patience=10, min_delta=0.001)[source]#
- init_logger(suffix)[source]#
Initialize the logger for TensorBoard.
- Parameters:
suffix (str) – The suffix to append to the logger name.
- init_callbacks(early_stop=True, min_delta=0.001, patience=10, prog_bar=False)[source]#
Initialize the callbacks for the training process.
- Parameters:
early_stop (bool, optional) – Whether to use early stopping. Defaults to True.
min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
patience (int, optional) – The patience value for early stopping. Defaults to 10.
prog_bar (bool, optional) – Whether to use a progress bar during training. Defaults to False.
- class MLPModel(target, input_size, hidden_dim, activation, learning_rate, batch_size, loss_fn, dropout, num_epochs, dataframe, test_size, device, seed, early_stop=True, patience=10, min_delta=0.001, **kwargs)[source]#
Bases:
BaseModel
Implementation of the Multi-Layer Perceptron (MLP) model.
- Parameters:
target (str) – The target variable name.
input_size (int) – The size of the input features.
hidden_dim (List[int]) – The dimensions of the hidden layers.
activation (nn.Module) – The activation function to use in the hidden layers.
learning_rate (float) – The learning rate for training.
batch_size (int) – The batch size for training.
loss_fn (str) – The loss function to use.
dropout (float) – The dropout rate.
num_epochs (int) – The number of training epochs.
dataframe (pd.DataFrame) – The input dataframe.
test_size (float) – The proportion of the data to use for testing.
seed (int) – The random seed for reproducibility.
early_stop (bool, optional) – Whether to use early stopping during training. Defaults to True.
patience (int, optional) – The patience value for early stopping. Defaults to 10.
min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
**kwargs – Additional keyword arguments to override the default values.
- Attributes:
- all_columns
- callbacks
- columns
- extra_trainer_args
- logger
- model
- scaler
- train_loader
- val_loader
Methods
init_callbacks
([early_stop, min_delta, ...])Initialize the callbacks for the training process.
init_data
()Initialize the data loaders for training and validation.
init_logger
(suffix)Initialize the logger for TensorBoard.
override_extras
(**kwargs)Override the extra trainer arguments.
train
()Train the MLP model.
This module contains functions to extract and visualize the weights of a neural network model. The weights are extracted from the model and then visualized in different ways to help understand, and identify, relationships between the input features and the target variable.
- extract_weights(model, verbose=False)[source]#
Extracts the weights from a given model.
Parameters:
- model: The model from which to extract the weights.
- verbose: If True, prints the names of the weights being extracted.
Returns:
- weights: A list of the extracted weights.
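A sketch of what such an extraction typically looks like for a PyTorch model (illustrative only; extract_weights may filter or order the layers differently):

import torch.nn as nn

def collect_weight_matrices(model: nn.Module, verbose: bool = False):
    """Collect the 2-D weight matrices of a model, input layer first."""
    weights = []
    for name, param in model.named_parameters():
        if "weight" in name and param.dim() == 2:   # skip biases and 1-D parameters
            if verbose:
                print(f"extracting {name} with shape {tuple(param.shape)}")
            weights.append(param.detach().cpu().numpy())
    return weights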
Visualizes the weights connecting the input layer to the hidden layer.
- Parameters:
- Returns:
None
- Return type:
None
- summarize_weights(weights, feature_names, layer=0, scale=True)[source]#
Summarize the weights of a neural network model by calculating the mean, median, and positive semidefinite values of the weights for each feature.
Parameters:
- weights: The weights of the neural network model.
- feature_names: A list of feature names.
- layer: The layer of the neural network model from which to extract the weights.
- scale: If True, scale the summary values.
Returns:
- psd: A DataFrame containing the summary values of the weights for each feature.
- identify_relationships(weights, feature_names, eps=0.5, min_counts=2, plot=True)[source]#
Run a clustering algorithm on the summary values of the weights coming out of each input cell in the neural network. The summary values are the mean, the median and the positive semidefinite values. Those three dimensions are then clustered, and clusters with at most min_counts elements are considered relevant to produce the regression for the feature the network has been trained for.
Parameters:
- weights: The weights of the neural network model.
- feature_names: A list of feature names.
- eps: The maximum distance between two samples for one to be considered as in the neighborhood of the other.
- min_counts: The minimum number of elements in a cluster to consider it relevant.
- plot: If True, plot the clusters.
Returns:
- rels: A dictionary containing the relevant features for each target feature.
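The eps parameter reads like scikit-learn's DBSCAN parameter of the same name, so, under that assumption, the clustering step could be sketched as follows (illustrative, not the actual implementation):

import numpy as np
from sklearn.cluster import DBSCAN

def small_clusters(summary: np.ndarray, eps: float = 0.5, min_counts: int = 2):
    """Cluster per-feature weight summaries (one row per input feature, e.g. the
    mean/median/third statistic) and return the indices of rows that fall in
    clusters with at most `min_counts` members -- the candidate relevant features."""
    labels = DBSCAN(eps=eps, min_samples=1).fit_predict(summary)
    sizes = {label: int((labels == label).sum()) for label in set(labels)}
    return [i for i, label in enumerate(labels) if sizes[label] <= min_counts]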
- infer_causal_relationships(trained_models, feature_names, prune=False, verbose=False, plot=False, prog_bar=True, silent=False)[source]#
Infer causal relationships between the input features and the target variable based on the SHAP values of the trained models.
Parameters:
- trained_models: A dictionary of trained models, where the keys are the target variable names and the values are the trained models.
- feature_names: A list of input feature names.
- prune: If True, remove asymmetric edges from the graph.
- verbose: If True, print additional information.
- plot: If True, plot the results.
- prog_bar: If True, show a progress bar.
- silent: If True, do not show any output.
Returns:
- A dictionary containing the SHAP values, the average SHAP values, the thresholds, the raw graph, and the oriented graph.
A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
2022,2023,2024, J. Renero
- class NNRegressor(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Bases:
BaseEstimator
A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
Methods
fit
(X)A reference implementation of a fitting function.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Predicts the values for each target variable.
score
(X)Scores the model using the loss function.
set_params
(**params)Set the parameters of this estimator.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
- Parameters:
data (pandas.DataFrame) – The dataframe with the continuous variables.
model_type (str) – The type of model to use. Either ‘dff’ or ‘mlp’.
hidden_dim (int) – The dimension(s) of the hidden layer(s). This value can be a single integer for DFF or an array with the dimension of each hidden layer for the MLP case.
activation (str) – The activation function to use, either ‘relu’ or ‘selu’. Default is ‘relu’.
learning_rate (float) – The learning rate for the optimizer.
dropout (float) – The dropout rate for the dropout layer.
batch_size (int) – The batch size for the optimizer.
num_epochs (int) – The number of epochs for the optimizer.
loss_fn (str) – The loss function to use. Default is “mse”.
device (str) – The device to use. Either “cpu”, “cuda”, or “mps”. Default is “cpu”.
test_size (float) – The proportion of the data to use for testing. Default is 0.1.
random_state (int) – The seed for the random number generator. Default is 1234.
early_stop (bool) – Whether to use early stopping. Default is False.
patience (int) – The patience for early stopping. Default is 10.
min_delta (float) – The minimum delta for early stopping. Default is 0.001.
prog_bar (bool) – Whether to enable the progress bar. Default is True.
- Returns:
- A dictionary with the trained DFF networks, using the name of the
variables as the key.
- Return type:
- fit(X)[source]#
A reference implementation of a fitting function.
- Parameters:
X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values (class labels in classification, real numbers in regression).
- Returns:
self – Returns self.
- Return type:
- predict(X)[source]#
Predicts the values for each target variable.
- Parameters:
X (pd.DataFrame) – The input data to make predictions on.
- Returns:
The predictions for each target variable.
- Return type:
np.ndarray
- score(X)[source]#
Scores the model using the loss function. It returns the list of losses for each target variable.
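A minimal end-to-end sketch (the submodule in the import and the synthetic data are assumptions, for illustration only):

import numpy as np
import pandas as pd

from causalexplain.models.dnn import NNRegressor  # assumed submodule; adjust if needed

rng = np.random.default_rng(1234)
X = pd.DataFrame(rng.normal(size=(500, 4)), columns=["V0", "V1", "V2", "V3"])

reg = NNRegressor(hidden_dim=[32, 16], num_epochs=20, prog_bar=False)
reg.fit(X)                 # one network per column, using the rest as predictors
preds = reg.predict(X)     # predictions for every target variable
losses = reg.score(X)      # one loss per target variable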
This module contains the GBTRegressor class, a wrapper around the GradientBoostingRegressor class from the scikit-learn library. Its fit, predict, and score methods fit, predict, and score a separate model for each feature in the dataframe.
The class also implements a tune method to tune the hyperparameters of the model using Optuna. The nested Objective class defines the objective function for the hyperparameter optimization.
The module also contains a main function that takes the name of an experiment, loads the data and the reference graph for that experiment, splits the data into train and test sets, and runs the tune method to tune the hyperparameters of the model.
The module can be run as a script; the experiment name is passed as an argument and forwarded to the main function.
Example
$ python gbt.py rex_generated_linear_6
This will run the GBTRegressor class with the tune method for the experiment ‘rex_generated_linear_6’.
The module can also be imported and used from other modules or scripts.
Example
from causalexplain.models.gbt import custom_main
custom_main("rex_generated_linear_6")
- class GBTRegressor(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
Bases:
GradientBoostingRegressor
- Attributes:
feature_importances_
The impurity-based feature importances.
Methods
apply
(X)Apply trees in the ensemble to X, return leaf indices.
fit
(X)Call the fit method of the parent class with every feature from the "X" dataframe as a target variable.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Call the predict method of the parent class with every feature from the "X" dataframe as a target variable.
score
(X)Call the score method of the parent class with every feature from the "X" dataframe as a target variable.
set_fit_request
(*[, monitor, sample_weight])Request metadata passed to the
fit
method.set_params
(**params)Set the parameters of this estimator.
set_score_request
(*[, sample_weight])Request metadata passed to the
score
method.staged_predict
(X)Predict regression target at each stage for X.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
- random_state = 42#
- fit(X)[source]#
Call the fit method of the parent class with every feature from the “X” dataframe as a target variable. This will fit a separate model for each feature in the dataframe.
- predict(X)[source]#
Call the predict method of the parent class with every feature from the “X” dataframe as a target variable. This will predict a separate value for each feature in the dataframe.
- score(X)[source]#
Call the score method of the parent class with every feature from the “X” dataframe as a target variable. This will score a separate model for each feature in the dataframe.
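Analogously to NNRegressor, a minimal usage sketch with synthetic data (illustrative only):

import numpy as np
import pandas as pd

from causalexplain.models.gbt import GBTRegressor

rng = np.random.default_rng(42)
data = pd.DataFrame(rng.normal(size=(500, 4)), columns=["V0", "V1", "V2", "V3"])

gbt = GBTRegressor(n_estimators=100, max_depth=3, prog_bar=False)
gbt.fit(data)              # fits one boosted model per column of the dataframe
preds = gbt.predict(data)  # per-feature predictions
scores = gbt.score(data)   # per-feature scores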
Module contents#
CausalExplain Models#
This module contains the models for causal discovery methods. All the models are implemented in the scikit-learn style, with a fit method to fit the model to the data, a predict method to make predictions, and a score method to evaluate the model performance.
The models are:
GBTRegressor: Gradient Boosting Trees Regressor
NNRegressor: Neural Network Regressor
- class GBTRegressor(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
Bases:
GradientBoostingRegressor
- Attributes:
feature_importances_
The impurity-based feature importances.
Methods
apply
(X)Apply trees in the ensemble to X, return leaf indices.
fit
(X)Call the fit method of the parent class with every feature from the "X" dataframe as a target variable.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Call the predict method of the parent class with every feature from the "X" dataframe as a target variable.
score
(X)Call the score method of the parent class with every feature from the "X" dataframe as a target variable.
set_fit_request
(*[, monitor, sample_weight])Request metadata passed to the
fit
method.set_params
(**params)Set the parameters of this estimator.
set_score_request
(*[, sample_weight])Request metadata passed to the
score
method.staged_predict
(X)Predict regression target at each stage for X.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
- random_state = 42#
- fit(X)[source]#
Call the fit method of the parent class with every feature from the “X” dataframe as a target variable. This will fit a separate model for each feature in the dataframe.
- predict(X)[source]#
Call the predict method of the parent class with every feature from the “X” dataframe as a target variable. This will predict a separate value for each feature in the dataframe.
- score(X)[source]#
Call the score method of the parent class with every feature from the “X” dataframe as a target variable. This will score a separate model for each feature in the dataframe.
- class NNRegressor(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Bases:
BaseEstimator
A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
Methods
fit
(X)A reference implementation of a fitting function.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Predicts the values for each target variable.
score
(X)Scores the model using the loss function.
set_params
(**params)Set the parameters of this estimator.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
- Parameters:
data (pandas.DataFrame) – The dataframe with the continuous variables.
model_type (str) – The type of model to use. Either ‘dff’ or ‘mlp’.
hidden_dim (int) – The dimension(s) of the hidden layer(s). This value can be a single integer for DFF or an array with the dimension of each hidden layer for the MLP case.
activation (str) – The activation function to use, either ‘relu’ or ‘selu’. Default is ‘relu’.
learning_rate (float) – The learning rate for the optimizer.
dropout (float) – The dropout rate for the dropout layer.
batch_size (int) – The batch size for the optimizer.
num_epochs (int) – The number of epochs for the optimizer.
loss_fn (str) – The loss function to use. Default is “mse”.
device (str) – The device to use. Either “cpu”, “cuda”, or “mps”. Default is “cpu”.
test_size (float) – The proportion of the data to use for testing. Default is 0.1.
random_state (int) – The seed for the random number generator. Default is 1234.
early_stop (bool) – Whether to use early stopping. Default is False.
patience (int) – The patience for early stopping. Default is 10.
min_delta (float) – The minimum delta for early stopping. Default is 0.001.
prog_bar (bool) – Whether to enable the progress bar. Default is True.
- Returns:
- A dictionary with the trained DFF networks, using the name of the
variables as the key.
- Return type:
- fit(X)[source]#
A reference implementation of a fitting function.
- Parameters:
X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values (class labels in classification, real numbers in regression).
- Returns:
self – Returns self.
- Return type:
- predict(X)[source]#
Predicts the values for each target variable.
- Parameters:
X (pd.DataFrame) – The input data to make predictions on.
- Returns:
The predictions for each target variable.
- Return type:
np.ndarray
- score(X)[source]#
Scores the model using the loss function. It returns the list of losses for each target variable.