causalexplain.models package#
Submodules#
- class MLP(input_size, layers_dimensions, activation, batch_size, lr, loss, dropout)[source]#
Bases:
LightningModule
- Attributes:
automatic_optimization: If set to False you are responsible for calling .backward(), .step(), .zero_grad().
current_epoch: The current epoch in the Trainer, or 0 if not attached.
dtype
example_input_array: The example input array is a specification of what the module can consume in the forward() method.
fabric
global_rank: The index of the current process across all nodes and devices.
global_step: Total training batches seen across all epochs.
hparams: The collection of hyperparameters saved with save_hyperparameters().
hparams_initial: The collection of hyperparameters saved with save_hyperparameters().
local_rank: The index of the current process within a single node.
logger: Reference to the logger object in the Trainer.
loggers: Reference to the list of loggers in the Trainer.
on_gpu: Returns True if this model is currently located on a GPU.
trainer
Methods
Block(d_in, d_out, activation, bias, ...): The main building block of MLP.
add_module(name, module): Add a child module to the current module.
all_gather(data[, group, sync_grads]): Gather tensors or collections of tensors from multiple processes.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
backward(loss, *args, **kwargs): Called to perform backward on the loss returned in training_step().
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
clip_gradients(optimizer[, ...]): Handles gradient clipping internally.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
configure_callbacks(): Configure model-specific callbacks.
configure_gradient_clipping(optimizer[, ...]): Perform gradient clipping for the optimizer parameters.
configure_optimizers(): Choose what optimizers and learning-rate schedulers to use in your optimization.
configure_sharded_model(): Hook to create modules in a distributed aware context.
cpu(): See torch.nn.Module.cpu().
cuda([device]): Moves all model parameters and buffers to the GPU.
double(): See torch.nn.Module.double().
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): See torch.nn.Module.float().
forward(x): Same as torch.nn.Module.forward().
freeze(): Freeze all params for inference.
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): See torch.nn.Module.half().
ipu([device]): Move all model parameters and buffers to the IPU.
load_from_checkpoint(checkpoint_path[, ...]): Primary way of loading a model from a checkpoint.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
log(name, value[, prog_bar, logger, ...]): Log a key, value pair.
log_dict(dictionary[, prog_bar, logger, ...]): Log a dictionary of values at once.
lr_scheduler_step(scheduler, metric): Override this method to adjust the default way the Trainer calls each scheduler.
lr_schedulers(): Returns the learning rate scheduler(s) that are being used during training.
manual_backward(loss, *args, **kwargs): Call this directly from your training_step() when doing optimizations manually.
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
on_after_backward(): Called after loss.backward() and before optimizers are stepped.
on_after_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch after it is transferred to the device.
on_before_backward(loss): Called before loss.backward().
on_before_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch before it is transferred to the device.
on_before_optimizer_step(optimizer): Called before optimizer.step().
on_before_zero_grad(optimizer): Called after training_step() and before optimizer.zero_grad().
on_fit_end(): Called at the very end of fit.
on_fit_start(): Called at the very beginning of fit.
on_load_checkpoint(checkpoint): Called by Lightning to restore your model.
on_predict_batch_end(outputs, batch, batch_idx): Called in the predict loop after the batch.
on_predict_batch_start(batch, batch_idx[, ...]): Called in the predict loop before anything happens for that batch.
on_predict_end(): Called at the end of predicting.
on_predict_epoch_end(): Called at the end of predicting.
on_predict_epoch_start(): Called at the beginning of predicting.
on_predict_model_eval(): Sets the model to eval during the predict loop.
on_predict_start(): Called at the beginning of predicting.
on_save_checkpoint(checkpoint): Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
on_test_batch_end(outputs, batch, batch_idx): Called in the test loop after the batch.
on_test_batch_start(batch, batch_idx[, ...]): Called in the test loop before anything happens for that batch.
on_test_end(): Called at the end of testing.
on_test_epoch_end(): Called in the test loop at the very end of the epoch.
on_test_epoch_start(): Called in the test loop at the very beginning of the epoch.
on_test_model_eval(): Sets the model to eval during the test loop.
on_test_model_train(): Sets the model to train during the test loop.
on_test_start(): Called at the beginning of testing.
on_train_batch_end(outputs, batch, batch_idx): Called in the training loop after the batch.
on_train_batch_start(batch, batch_idx): Called in the training loop before anything happens for that batch.
on_train_end(): Called at the end of training before logger experiment is closed.
on_train_epoch_end(): Called in the training loop at the very end of the epoch.
on_train_epoch_start(): Called in the training loop at the very beginning of the epoch.
on_train_start(): Called at the beginning of training after sanity check.
on_validation_batch_end(outputs, batch, ...): Called in the validation loop after the batch.
on_validation_batch_start(batch, batch_idx): Called in the validation loop before anything happens for that batch.
on_validation_end(): Called at the end of validation.
on_validation_epoch_end(): Called in the validation loop at the very end of the epoch.
on_validation_epoch_start(): Called in the validation loop at the very beginning of the epoch.
on_validation_model_eval(): Sets the model to eval during the val loop.
on_validation_model_train(): Sets the model to train during the val loop.
on_validation_start(): Called at the beginning of validation.
optimizer_step(epoch, batch_idx, optimizer): Override this method to adjust the default way the Trainer calls the optimizer.
optimizer_zero_grad(epoch, batch_idx, optimizer): Override this method to change the default behaviour of optimizer.zero_grad().
optimizers([use_pl_optimizer]): Returns the optimizer(s) that are being used during training.
parameters([recurse]): Return an iterator over module parameters.
predict(x)
predict_dataloader(): An iterable or collection of iterables specifying prediction samples.
predict_step(batch, batch_idx, **kwargs): Step function called during predict().
prepare_data(): Use this to download and prepare data.
print(*args, **kwargs): Prints only from process 0.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
save_hyperparameters(*args[, ignore, frame, ...]): Save arguments to hparams attribute.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
setup(stage): Called at the beginning of fit (train + validate), validate, test, or predict.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
teardown(stage): Called at the end of fit (train + validate), validate, test, or predict.
test_dataloader(): An iterable or collection of iterables specifying test samples.
test_step(*args, **kwargs): Operates on a single batch of data from the test set.
to(*args, **kwargs): See torch.nn.Module.to().
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
to_onnx(file_path[, input_sample]): Saves the model in ONNX format.
to_torchscript([file_path, method, ...]): By default compiles the whole model to a ScriptModule.
toggle_optimizer(optimizer): Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
train([mode]): Set the module in training mode.
train_dataloader(): An iterable or collection of iterables specifying training samples.
training_step(batch, batch_idx): Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
transfer_batch_to_device(batch, device, ...): Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
type(dst_type): See torch.nn.Module.type().
unfreeze(): Unfreeze all parameters for training.
untoggle_optimizer(optimizer): Resets the state of required gradients that were toggled with toggle_optimizer().
val_dataloader(): An iterable or collection of iterables specifying validation samples.
validation_step(batch, batch_idx): Operates on a single batch of data from the validation set.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
- device = 'cpu'#
- class Block(d_in, d_out, activation, bias, dropout, device)[source]#
Bases:
Module
The main building block of MLP.
Methods
add_module(name, module): Add a child module to the current module.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
cpu(): Move all model parameters and buffers to the CPU.
cuda([device]): Move all model parameters and buffers to the GPU.
double(): Casts all floating point parameters and buffers to double datatype.
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): Casts all floating point parameters and buffers to float datatype.
forward(x): Define the computation performed at every call.
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): Casts all floating point parameters and buffers to half datatype.
ipu([device]): Move all model parameters and buffers to the IPU.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters([recurse]): Return an iterator over module parameters.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
to(*args, **kwargs): Move and/or cast the parameters and buffers.
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
train([mode]): Set the module in training mode.
type(dst_type): Casts all parameters and buffers to dst_type.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
- __init__(d_in, d_out, activation, bias, dropout, device)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
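To make the Block signature above concrete, here is a minimal usage sketch. It assumes Block wraps a linear projection from d_in to d_out followed by the supplied activation and dropout, that activation is passed as a module instance, and that the class is reachable through the MLP namespace; none of these details are confirmed by this page, so treat them as assumptions and adjust to the actual implementation.

import torch
from causalexplain.models import MLP  # illustrative import path; adjust to the actual submodule

# Hypothetical standalone use of the building block: 8 inputs -> 16 outputs.
block = MLP.Block(d_in=8, d_out=16, activation=torch.nn.ReLU(),
                  bias=True, dropout=0.1, device="cpu")

x = torch.randn(4, 8)   # batch of 4 samples with 8 features
y = block(x)            # calling the module runs forward(x) plus any registered hooks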
- forward(x)[source]#
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- predict_step(batch, batch_idx, **kwargs)[source]#
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.
The predict_step() is used to scale inference across multiple devices.
To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or on epoch end.
The BasePredictionWriter should be used while using a spawn-based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned.
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- Parameters:
batch – Current batch.
batch_idx – Index of current batch.
dataloader_idx – Index of the current dataloader.
- Returns:
Predicted output
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
- Returns:
Any of the following:
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'
None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple val dataloaders used)
- Returns:
Any object or value
None - Validation will skip to the next batch
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
None - Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
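Putting the pieces above together, the sketch below shows one way to drive this LightningModule with a Trainer. The import path, the meaning of layers_dimensions (hidden-layer widths), the kind of object expected for activation and loss, and the (x, y) batch layout are all assumptions inferred from the constructor signature, not confirmed by this page.

import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer
from causalexplain.models import MLP  # illustrative import path

# Toy regression data; the (x, y) batch layout is an assumption about training_step().
X = torch.randn(256, 7)
y = X.sum(dim=1, keepdim=True)
train_loader = DataLoader(TensorDataset(X, y), batch_size=32)

model = MLP(
    input_size=7,
    layers_dimensions=[64, 32],           # assumed: widths of the hidden layers
    activation=torch.nn.ReLU(),           # assumed: an activation module instance
    batch_size=32,
    lr=1e-3,
    loss=torch.nn.functional.mse_loss,    # assumed: a loss callable
    dropout=0.1,
)

trainer = Trainer(max_epochs=5, accelerator="cpu")
trainer.fit(model, train_dataloaders=train_loader)
predictions = trainer.predict(model, dataloaders=train_loader)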
- class DFF(input_size, hidden_size, batch_size, lr, loss)[source]#
Bases:
LightningModule
- Attributes:
automatic_optimization: If set to False you are responsible for calling .backward(), .step(), .zero_grad().
current_epoch: The current epoch in the Trainer, or 0 if not attached.
device
dtype
example_input_array: The example input array is a specification of what the module can consume in the forward() method.
fabric
global_rank: The index of the current process across all nodes and devices.
global_step: Total training batches seen across all epochs.
hparams: The collection of hyperparameters saved with save_hyperparameters().
hparams_initial: The collection of hyperparameters saved with save_hyperparameters().
local_rank: The index of the current process within a single node.
logger: Reference to the logger object in the Trainer.
loggers: Reference to the list of loggers in the Trainer.
on_gpu: Returns True if this model is currently located on a GPU.
trainer
Methods
add_module(name, module): Add a child module to the current module.
all_gather(data[, group, sync_grads]): Gather tensors or collections of tensors from multiple processes.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
backward(loss, *args, **kwargs): Called to perform backward on the loss returned in training_step().
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
clip_gradients(optimizer[, ...]): Handles gradient clipping internally.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
configure_callbacks(): Configure model-specific callbacks.
configure_gradient_clipping(optimizer[, ...]): Perform gradient clipping for the optimizer parameters.
configure_optimizers(): Choose what optimizers and learning-rate schedulers to use in your optimization.
configure_sharded_model(): Hook to create modules in a distributed aware context.
cpu(): See torch.nn.Module.cpu().
cuda([device]): Moves all model parameters and buffers to the GPU.
double(): See torch.nn.Module.double().
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): See torch.nn.Module.float().
forward(x): Same as torch.nn.Module.forward().
freeze(): Freeze all params for inference.
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): See torch.nn.Module.half().
ipu([device]): Move all model parameters and buffers to the IPU.
load_from_checkpoint(checkpoint_path[, ...]): Primary way of loading a model from a checkpoint.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
log(name, value[, prog_bar, logger, ...]): Log a key, value pair.
log_dict(dictionary[, prog_bar, logger, ...]): Log a dictionary of values at once.
lr_scheduler_step(scheduler, metric): Override this method to adjust the default way the Trainer calls each scheduler.
lr_schedulers(): Returns the learning rate scheduler(s) that are being used during training.
manual_backward(loss, *args, **kwargs): Call this directly from your training_step() when doing optimizations manually.
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
on_after_backward(): Called after loss.backward() and before optimizers are stepped.
on_after_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch after it is transferred to the device.
on_before_backward(loss): Called before loss.backward().
on_before_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch before it is transferred to the device.
on_before_optimizer_step(optimizer): Called before optimizer.step().
on_before_zero_grad(optimizer): Called after training_step() and before optimizer.zero_grad().
on_fit_end(): Called at the very end of fit.
on_fit_start(): Called at the very beginning of fit.
on_load_checkpoint(checkpoint): Called by Lightning to restore your model.
on_predict_batch_end(outputs, batch, batch_idx): Called in the predict loop after the batch.
on_predict_batch_start(batch, batch_idx[, ...]): Called in the predict loop before anything happens for that batch.
on_predict_end(): Called at the end of predicting.
on_predict_epoch_end(): Called at the end of predicting.
on_predict_epoch_start(): Called at the beginning of predicting.
on_predict_model_eval(): Sets the model to eval during the predict loop.
on_predict_start(): Called at the beginning of predicting.
on_save_checkpoint(checkpoint): Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
on_test_batch_end(outputs, batch, batch_idx): Called in the test loop after the batch.
on_test_batch_start(batch, batch_idx[, ...]): Called in the test loop before anything happens for that batch.
on_test_end(): Called at the end of testing.
on_test_epoch_end(): Called in the test loop at the very end of the epoch.
on_test_epoch_start(): Called in the test loop at the very beginning of the epoch.
on_test_model_eval(): Sets the model to eval during the test loop.
on_test_model_train(): Sets the model to train during the test loop.
on_test_start(): Called at the beginning of testing.
on_train_batch_end(outputs, batch, batch_idx): Called in the training loop after the batch.
on_train_batch_start(batch, batch_idx): Called in the training loop before anything happens for that batch.
on_train_end(): Called at the end of training before logger experiment is closed.
on_train_epoch_end(): Called in the training loop at the very end of the epoch.
on_train_epoch_start(): Called in the training loop at the very beginning of the epoch.
on_train_start(): Called at the beginning of training after sanity check.
on_validation_batch_end(outputs, batch, ...): Called in the validation loop after the batch.
on_validation_batch_start(batch, batch_idx): Called in the validation loop before anything happens for that batch.
on_validation_end(): Called at the end of validation.
on_validation_epoch_end(): Called in the validation loop at the very end of the epoch.
on_validation_epoch_start(): Called in the validation loop at the very beginning of the epoch.
on_validation_model_eval(): Sets the model to eval during the val loop.
on_validation_model_train(): Sets the model to train during the val loop.
on_validation_start(): Called at the beginning of validation.
optimizer_step(epoch, batch_idx, optimizer): Override this method to adjust the default way the Trainer calls the optimizer.
optimizer_zero_grad(epoch, batch_idx, optimizer): Override this method to change the default behaviour of optimizer.zero_grad().
optimizers([use_pl_optimizer]): Returns the optimizer(s) that are being used during training.
parameters([recurse]): Return an iterator over module parameters.
predict_dataloader(): An iterable or collection of iterables specifying prediction samples.
predict_step(batch, batch_idx, **kwargs): Step function called during predict().
prepare_data(): Use this to download and prepare data.
print(*args, **kwargs): Prints only from process 0.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
save_hyperparameters(*args[, ignore, frame, ...]): Save arguments to hparams attribute.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
setup(stage): Called at the beginning of fit (train + validate), validate, test, or predict.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
teardown(stage): Called at the end of fit (train + validate), validate, test, or predict.
test_dataloader(): An iterable or collection of iterables specifying test samples.
test_step(*args, **kwargs): Operates on a single batch of data from the test set.
to(*args, **kwargs): See torch.nn.Module.to().
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
to_onnx(file_path[, input_sample]): Saves the model in ONNX format.
to_torchscript([file_path, method, ...]): By default compiles the whole model to a ScriptModule.
toggle_optimizer(optimizer): Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
train([mode]): Set the module in training mode.
train_dataloader(): An iterable or collection of iterables specifying training samples.
training_step(batch, batch_idx): Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
transfer_batch_to_device(batch, device, ...): Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
type(dst_type): See torch.nn.Module.type().
unfreeze(): Unfreeze all parameters for training.
untoggle_optimizer(optimizer): Resets the state of required gradients that were toggled with toggle_optimizer().
val_dataloader(): An iterable or collection of iterables specifying validation samples.
validation_step(batch, batch_idx): Operates on a single batch of data from the validation set.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
- forward(x)[source]#
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- predict_step(batch, batch_idx, **kwargs)[source]#
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.
The predict_step() is used to scale inference across multiple devices.
To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or on epoch end.
The BasePredictionWriter should be used while using a spawn-based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned.
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)

dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- Parameters:
batch – Current batch.
batch_idx – Index of current batch.
dataloader_idx – Index of the current dataloader.
- Returns:
Predicted output
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
- Returns:
Any of the following:
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'
None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple val dataloaders used)
- Returns:
Any object or value
None - Validation will skip to the next batch
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
None - Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
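A brief instantiation sketch for DFF; hidden_size is assumed to be the width of the hidden layer, loss a callable, and the import path illustrative. Training with a Trainer follows the same pattern shown for MLP above.

import torch
from causalexplain.models import DFF  # illustrative import path

dff = DFF(
    input_size=7,
    hidden_size=64,                       # assumed: width of the hidden layer
    batch_size=32,
    lr=1e-3,
    loss=torch.nn.functional.mse_loss,    # assumed: a loss callable
)

x = torch.randn(32, 7)
out = dff(x)   # __call__ runs forward(x), i.e. torch.nn.Module.forward()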
- class MDN(input_size, hidden_size, num_gaussians, lr, batch_size, loss_function='loglikelihood')[source]#
Bases:
LightningModule
- Attributes:
automatic_optimization: If set to False you are responsible for calling .backward(), .step(), .zero_grad().
current_epoch: The current epoch in the Trainer, or 0 if not attached.
device
dtype
example_input_array: The example input array is a specification of what the module can consume in the forward() method.
fabric
global_rank: The index of the current process across all nodes and devices.
global_step: Total training batches seen across all epochs.
hparams: The collection of hyperparameters saved with save_hyperparameters().
hparams_initial: The collection of hyperparameters saved with save_hyperparameters().
local_rank: The index of the current process within a single node.
logger: Reference to the logger object in the Trainer.
loggers: Reference to the list of loggers in the Trainer.
on_gpu: Returns True if this model is currently located on a GPU.
trainer
Methods
add_module(name, module): Add a child module to the current module.
all_gather(data[, group, sync_grads]): Gather tensors or collections of tensors from multiple processes.
apply(fn): Apply fn recursively to every submodule (as returned by .children()) as well as self.
backward(loss, *args, **kwargs): Called to perform backward on the loss returned in training_step().
bfloat16(): Casts all floating point parameters and buffers to bfloat16 datatype.
buffers([recurse]): Return an iterator over module buffers.
children(): Return an iterator over immediate children modules.
clip_gradients(optimizer[, ...]): Handles gradient clipping internally.
compile(*args, **kwargs): Compile this Module's forward using torch.compile().
configure_callbacks(): Configure model-specific callbacks.
configure_gradient_clipping(optimizer[, ...]): Perform gradient clipping for the optimizer parameters.
configure_optimizers(): Choose what optimizers and learning-rate schedulers to use in your optimization.
configure_sharded_model(): Hook to create modules in a distributed aware context.
cpu(): See torch.nn.Module.cpu().
cuda([device]): Moves all model parameters and buffers to the GPU.
double(): See torch.nn.Module.double().
eval(): Set the module in evaluation mode.
extra_repr(): Return the extra representation of the module.
float(): See torch.nn.Module.float().
forward(x): Same as torch.nn.Module.forward().
freeze(): Freeze all params for inference.
g_sample(pi, sigma, mu): Gumbel sampling comes from here: hardmaru/pytorch_notebooks
get_buffer(target): Return the buffer given by target if it exists, otherwise throw an error.
get_extra_state(): Return any extra state to include in the module's state_dict.
get_parameter(target): Return the parameter given by target if it exists, otherwise throw an error.
get_submodule(target): Return the submodule given by target if it exists, otherwise throw an error.
half(): See torch.nn.Module.half().
ipu([device]): Move all model parameters and buffers to the IPU.
load_from_checkpoint(checkpoint_path[, ...]): Primary way of loading a model from a checkpoint.
load_state_dict(state_dict[, strict, assign]): Copy parameters and buffers from state_dict into this module and its descendants.
log(name, value[, prog_bar, logger, ...]): Log a key, value pair.
log_dict(dictionary[, prog_bar, logger, ...]): Log a dictionary of values at once.
lr_scheduler_step(scheduler, metric): Override this method to adjust the default way the Trainer calls each scheduler.
lr_schedulers(): Returns the learning rate scheduler(s) that are being used during training.
manual_backward(loss, *args, **kwargs): Call this directly from your training_step() when doing optimizations manually.
mdn_loss(pi, sigma, mu, y): Calculates the error, given the MoG parameters and the target. The loss is the negative log likelihood of the data given the MoG parameters.
mmd_loss(x, y, kernel)
modules(): Return an iterator over all modules in the network.
mtia([device]): Move all model parameters and buffers to the MTIA.
named_buffers([prefix, recurse, ...]): Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children(): Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules([memo, prefix, remove_duplicate]): Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters([prefix, recurse, ...]): Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
on_after_backward(): Called after loss.backward() and before optimizers are stepped.
on_after_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch after it is transferred to the device.
on_before_backward(loss): Called before loss.backward().
on_before_batch_transfer(batch, dataloader_idx): Override to alter or apply batch augmentations to your batch before it is transferred to the device.
on_before_optimizer_step(optimizer): Called before optimizer.step().
on_before_zero_grad(optimizer): Called after training_step() and before optimizer.zero_grad().
on_fit_end(): Called at the very end of fit.
on_fit_start(): Called at the very beginning of fit.
on_load_checkpoint(checkpoint): Called by Lightning to restore your model.
on_predict_batch_end(outputs, batch, batch_idx): Called in the predict loop after the batch.
on_predict_batch_start(batch, batch_idx[, ...]): Called in the predict loop before anything happens for that batch.
on_predict_end(): Called at the end of predicting.
on_predict_epoch_end(): Called at the end of predicting.
on_predict_epoch_start(): Called at the beginning of predicting.
on_predict_model_eval(): Sets the model to eval during the predict loop.
on_predict_start(): Called at the beginning of predicting.
on_save_checkpoint(checkpoint): Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
on_test_batch_end(outputs, batch, batch_idx): Called in the test loop after the batch.
on_test_batch_start(batch, batch_idx[, ...]): Called in the test loop before anything happens for that batch.
on_test_end(): Called at the end of testing.
on_test_epoch_end(): Called in the test loop at the very end of the epoch.
on_test_epoch_start(): Called in the test loop at the very beginning of the epoch.
on_test_model_eval(): Sets the model to eval during the test loop.
on_test_model_train(): Sets the model to train during the test loop.
on_test_start(): Called at the beginning of testing.
on_train_batch_end(outputs, batch, batch_idx): Called in the training loop after the batch.
on_train_batch_start(batch, batch_idx): Called in the training loop before anything happens for that batch.
on_train_end(): Called at the end of training before logger experiment is closed.
on_train_epoch_end(): Called in the training loop at the very end of the epoch.
on_train_epoch_start(): Called in the training loop at the very beginning of the epoch.
on_train_start(): Called at the beginning of training after sanity check.
on_validation_batch_end(outputs, batch, ...): Called in the validation loop after the batch.
on_validation_batch_start(batch, batch_idx): Called in the validation loop before anything happens for that batch.
on_validation_end(): Called at the end of validation.
on_validation_epoch_end(): Called in the validation loop at the very end of the epoch.
on_validation_epoch_start(): Called in the validation loop at the very beginning of the epoch.
on_validation_model_eval(): Sets the model to eval during the val loop.
on_validation_model_train(): Sets the model to train during the val loop.
on_validation_start(): Called at the beginning of validation.
optimizer_step(epoch, batch_idx, optimizer): Override this method to adjust the default way the Trainer calls the optimizer.
optimizer_zero_grad(epoch, batch_idx, optimizer): Override this method to change the default behaviour of optimizer.zero_grad().
optimizers([use_pl_optimizer]): Returns the optimizer(s) that are being used during training.
parameters([recurse]): Return an iterator over module parameters.
predict_dataloader(): An iterable or collection of iterables specifying prediction samples.
predict_step(batch, batch_idx[, dataloader_idx]): Step function called during predict().
prepare_data(): Use this to download and prepare data.
print(*args, **kwargs): Prints only from process 0.
register_backward_hook(hook): Register a backward hook on the module.
register_buffer(name, tensor[, persistent]): Add a buffer to the module.
register_forward_hook(hook, *[, prepend, ...]): Register a forward hook on the module.
register_forward_pre_hook(hook, *[, ...]): Register a forward pre-hook on the module.
register_full_backward_hook(hook[, prepend]): Register a backward hook on the module.
register_full_backward_pre_hook(hook[, prepend]): Register a backward pre-hook on the module.
register_load_state_dict_post_hook(hook): Register a post-hook to be run after module's load_state_dict() is called.
register_load_state_dict_pre_hook(hook): Register a pre-hook to be run before module's load_state_dict() is called.
register_module(name, module): Alias for add_module().
register_parameter(name, param): Add a parameter to the module.
register_state_dict_post_hook(hook): Register a post-hook for the state_dict() method.
register_state_dict_pre_hook(hook): Register a pre-hook for the state_dict() method.
requires_grad_([requires_grad]): Change if autograd should record operations on parameters in this module.
sample(pi, sigma, mu): Draw samples from a MoG.
save_hyperparameters(*args[, ignore, frame, ...]): Save arguments to hparams attribute.
set_extra_state(state): Set extra state contained in the loaded state_dict.
set_submodule(target, module[, strict]): Set the submodule given by target if it exists, otherwise throw an error.
setup(stage): Called at the beginning of fit (train + validate), validate, test, or predict.
share_memory(): See torch.Tensor.share_memory_().
state_dict(*args[, destination, prefix, ...]): Return a dictionary containing references to the whole state of the module.
teardown(stage): Called at the end of fit (train + validate), validate, test, or predict.
test_dataloader(): An iterable or collection of iterables specifying test samples.
test_step(*args, **kwargs): Operates on a single batch of data from the test set.
to(*args, **kwargs): See torch.nn.Module.to().
to_empty(*, device[, recurse]): Move the parameters and buffers to the specified device without copying storage.
to_onnx(file_path[, input_sample]): Saves the model in ONNX format.
to_torchscript([file_path, method, ...]): By default compiles the whole model to a ScriptModule.
toggle_optimizer(optimizer): Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
train([mode]): Set the module in training mode.
train_dataloader(): An iterable or collection of iterables specifying training samples.
training_step(batch, batch_idx): Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
transfer_batch_to_device(batch, device, ...): Override this hook if your DataLoader returns tensors wrapped in a custom data structure.
type(dst_type): See torch.nn.Module.type().
unfreeze(): Unfreeze all parameters for training.
untoggle_optimizer(optimizer): Resets the state of required gradients that were toggled with toggle_optimizer().
val_dataloader(): An iterable or collection of iterables specifying validation samples.
validation_step(batch, batch_idx): Operates on a single batch of data from the validation set.
xpu([device]): Move all model parameters and buffers to the XPU.
zero_grad([set_to_none]): Reset gradients of all model parameters.
__call__
add_noise
common_step
gaussian_probability
- __init__(input_size, hidden_size, num_gaussians, lr, batch_size, loss_function='loglikelihood')[source]#
Init function for the MDN
- Parameters:
input_size (int) – the number of dimensions in the input
hidden_size (int) – the number of dimensions in the hidden layer
num_gaussians (int) – the number of Gaussians per output dimensions
lr (float) – learning rate
batch_size (int) – Batch size.
loss_function (str) – Loss function can be either ‘loglikelihood’ or ‘mmd’ for Maximal Mean Discrepancy
- Input:
minibatch (BxD): B is the batch size and D is the number of input dimensions.
- Output:
(pi, sigma, mu) (BxG, BxGxO, BxGxO): B is the batch size, G is the number of Gaussians, and O is the number of dimensions for each Gaussian. Pi is a multinomial distribution of the Gaussians. Sigma is the standard deviation of each Gaussian. Mu is the mean of each Gaussian.
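A minimal sketch of the forward/sampling round trip described by the Input/Output specification above; the import path is illustrative, and the shapes in the comments follow that specification.

import torch
from causalexplain.models import MDN  # illustrative import path

mdn = MDN(input_size=3, hidden_size=32, num_gaussians=5, lr=1e-3, batch_size=16)

x = torch.randn(16, 3)                # minibatch of shape (B, D)
pi, sigma, mu = mdn(x)                # (B, G), (B, G, O), (B, G, O) per the spec above
samples = mdn.sample(pi, sigma, mu)   # draw values from the mixture of Gaussians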
- configure_optimizers()[source]#
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple
lr_scheduler_config
).Dictionary, with an
"optimizer"
key, and (optionally) a"lr_scheduler"
key whose value is a single LR scheduler orlr_scheduler_config
.None - Fit will run without any optimizer.
The
lr_scheduler_config
is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.lr_scheduler_config = { # REQUIRED: The scheduler instance "scheduler": lr_scheduler, # The unit of the scheduler's step size, could also be 'step'. # 'epoch' updates the scheduler on epoch end whereas 'step' # updates it after a optimizer update. "interval": "epoch", # How many epochs/steps should pass between calls to # `scheduler.step()`. 1 corresponds to updating the learning # rate after every epoch/step. "frequency": 1, # Metric to to monitor for schedulers like `ReduceLROnPlateau` "monitor": "val_loss", # If set to `True`, will enforce that the value specified 'monitor' # is available when the scheduler is updated, thus stopping # training if not found. If set to `False`, it will only produce a warning "strict": True, # If using the `LearningRateMonitor` callback to monitor the # learning rate progress, this keyword can be used to specify # a custom logged name "name": None, }
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler's .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to 'manual optimization' mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
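Putting the pieces above together, a generic Lightning-style sketch (not necessarily what this module implements) of a single optimizer paired with a ReduceLROnPlateau scheduler conditioned on a logged metric:

import torch

def configure_optimizers(self):
    # A single optimizer paired with a scheduler whose .step() depends on a
    # monitored metric; "val_loss" must be logged with self.log("val_loss", ...).
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)  # illustrative learning rate
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": scheduler,
            "interval": "epoch",
            "frequency": 1,
            "monitor": "val_loss",
        },
    }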
- training_step(batch, batch_idx)[source]#
Here you compute and return the training loss and some additional metrics, e.g., for the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
- Returns:
Any of the following.
Tensor - The loss tensor
dict - A dictionary. Can include any keys, but must include the key 'loss'
None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()

    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]#
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (only if multiple val dataloaders used)
- Returns:
Any object or value
None
- Validation will skip to the next batch
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders,
validation_step()
will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When the
validation_step()
is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- forward(x)[source]#
Same as
torch.nn.Module.forward()
.- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- mdn_loss(pi, sigma, mu, y)[source]#
Calculates the error, given the MoG parameters and the target. The loss is the negative log likelihood of the data given the MoG parameters.
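For reference, the negative log-likelihood of a mixture of Gaussians can be computed along these lines (a sketch of the idea, not necessarily the exact implementation of mdn_loss):

import torch

def mog_nll(pi, sigma, mu, y, eps=1e-12):
    """Negative log-likelihood of y under a mixture of Gaussians.

    pi: (B, G) mixture weights; sigma, mu: (B, G, O); y: (B, O).
    """
    dist = torch.distributions.Normal(loc=mu, scale=sigma)
    # Log-density of y under each Gaussian, summed over output dims -> (B, G)
    log_prob = dist.log_prob(y.unsqueeze(1).expand_as(mu)).sum(dim=-1)
    # Log of the mixture density: logsumexp over components, weighted by pi
    log_mix = torch.logsumexp(torch.log(pi + eps) + log_prob, dim=-1)
    return -log_mix.mean()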
- static mmd_loss(x, y, kernel)[source]#
https://www.kaggle.com/onurtunali/maximum-mean-discrepancy
Empirical maximum mean discrepancy. The lower the result, the more evidence that the distributions are the same.
- Parameters:
x – first sample, distribution P
y – second sample, distribution Q
kernel – kernel type such as “multiscale” or “rbf”
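A sketch of the empirical MMD^2 estimate with a plain RBF kernel, for orientation (the actual implementation also supports a 'multiscale' kernel and may differ in details):

import torch

def mmd_rbf(x, y, sigma=1.0):
    """Biased empirical MMD^2 between samples x ~ P and y ~ Q with an RBF kernel."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances -> Gaussian kernel values
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2 * sigma ** 2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()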
- static g_sample(pi, sigma, mu)[source]#
Gumbel sampling comes from here: hardmaru/pytorch_notebooks
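The Gumbel-max trick selects one mixture component per row by adding Gumbel noise to log pi and taking the argmax; a hedged sketch of sampling from the mixture this way (g_sample may differ in details):

import torch

def sample_from_mog(pi, sigma, mu):
    """Draw one sample per row from the mixture using the Gumbel-max trick.

    pi: (B, G); sigma, mu: (B, G, O); returns a (B, O) tensor.
    """
    u = torch.rand_like(pi).clamp_min(1e-12)
    gumbel = -torch.log(-torch.log(u))                   # Gumbel(0, 1) noise
    k = torch.argmax(torch.log(pi) + gumbel, dim=-1)     # chosen component, (B,)
    idx = k.view(-1, 1, 1).expand(-1, 1, mu.size(-1))    # (B, 1, O) gather index
    mu_k = torch.gather(mu, 1, idx).squeeze(1)           # (B, O)
    sigma_k = torch.gather(sigma, 1, idx).squeeze(1)     # (B, O)
    return mu_k + sigma_k * torch.randn_like(mu_k)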
- class RBF(n_kernels=5, mul_factor=2.0, bandwidth=None)[source]#
Bases:
Module
Methods
add_module
(name, module)Add a child module to the current module.
apply
(fn)Apply
fn
recursively to every submodule (as returned by.children()
) as well as self.bfloat16
()Casts all floating point parameters and buffers to
bfloat16
datatype.buffers
([recurse])Return an iterator over module buffers.
children
()Return an iterator over immediate children modules.
compile
(*args, **kwargs)Compile this Module's forward using
torch.compile()
.cpu
()Move all model parameters and buffers to the CPU.
cuda
([device])Move all model parameters and buffers to the GPU.
double
()Casts all floating point parameters and buffers to
double
datatype.eval
()Set the module in evaluation mode.
extra_repr
()Return the extra representation of the module.
float
()Casts all floating point parameters and buffers to
float
datatype.forward
(X)Define the computation performed at every call.
get_bandwidth
(L2_distances)Get the bandwidth of the RBF kernel.
get_buffer
(target)Return the buffer given by
target
if it exists, otherwise throw an error.get_extra_state
()Return any extra state to include in the module's state_dict.
get_parameter
(target)Return the parameter given by
target
if it exists, otherwise throw an error.get_submodule
(target)Return the submodule given by
target
if it exists, otherwise throw an error.half
()Casts all floating point parameters and buffers to
half
datatype.ipu
([device])Move all model parameters and buffers to the IPU.
load_state_dict
(state_dict[, strict, assign])Copy parameters and buffers from
state_dict
into this module and its descendants.modules
()Return an iterator over all modules in the network.
mtia
([device])Move all model parameters and buffers to the MTIA.
named_buffers
([prefix, recurse, ...])Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children
()Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules
([memo, prefix, remove_duplicate])Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters
([prefix, recurse, ...])Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters
([recurse])Return an iterator over module parameters.
register_backward_hook
(hook)Register a backward hook on the module.
register_buffer
(name, tensor[, persistent])Add a buffer to the module.
register_forward_hook
(hook, *[, prepend, ...])Register a forward hook on the module.
register_forward_pre_hook
(hook, *[, ...])Register a forward pre-hook on the module.
register_full_backward_hook
(hook[, prepend])Register a backward hook on the module.
register_full_backward_pre_hook
(hook[, prepend])Register a backward pre-hook on the module.
register_load_state_dict_post_hook
(hook)Register a post-hook to be run after module's
load_state_dict()
is called.register_load_state_dict_pre_hook
(hook)Register a pre-hook to be run before module's
load_state_dict()
is called.register_module
(name, module)Alias for
add_module()
.register_parameter
(name, param)Add a parameter to the module.
register_state_dict_post_hook
(hook)Register a post-hook for the
state_dict()
method.register_state_dict_pre_hook
(hook)Register a pre-hook for the
state_dict()
method.requires_grad_
([requires_grad])Change if autograd should record operations on parameters in this module.
set_extra_state
(state)Set extra state contained in the loaded state_dict.
set_submodule
(target, module[, strict])Set the submodule given by
target
if it exists, otherwise throw an error.share_memory
()See
torch.Tensor.share_memory_()
.state_dict
(*args[, destination, prefix, ...])Return a dictionary containing references to the whole state of the module.
to
(*args, **kwargs)Move and/or cast the parameters and buffers.
to_empty
(*, device[, recurse])Move the parameters and buffers to the specified device without copying storage.
train
([mode])Set the module in training mode.
type
(dst_type)Casts all parameters and buffers to
dst_type
.xpu
([device])Move all model parameters and buffers to the XPU.
zero_grad
([set_to_none])Reset gradients of all model parameters.
__call__
- __init__(n_kernels=5, mul_factor=2.0, bandwidth=None)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X)[source]#
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class MMDLoss(kernel=RBF())[source]#
Bases:
Module
Methods
add_module
(name, module)Add a child module to the current module.
apply
(fn)Apply
fn
recursively to every submodule (as returned by.children()
) as well as self.bfloat16
()Casts all floating point parameters and buffers to
bfloat16
datatype.buffers
([recurse])Return an iterator over module buffers.
children
()Return an iterator over immediate children modules.
compile
(*args, **kwargs)Compile this Module's forward using
torch.compile()
.cpu
()Move all model parameters and buffers to the CPU.
cuda
([device])Move all model parameters and buffers to the GPU.
double
()Casts all floating point parameters and buffers to
double
datatype.eval
()Set the module in evaluation mode.
extra_repr
()Return the extra representation of the module.
float
()Casts all floating point parameters and buffers to
float
datatype.forward
(X, Y)Define the computation performed at every call.
get_buffer
(target)Return the buffer given by
target
if it exists, otherwise throw an error.get_extra_state
()Return any extra state to include in the module's state_dict.
get_parameter
(target)Return the parameter given by
target
if it exists, otherwise throw an error.get_submodule
(target)Return the submodule given by
target
if it exists, otherwise throw an error.half
()Casts all floating point parameters and buffers to
half
datatype.ipu
([device])Move all model parameters and buffers to the IPU.
load_state_dict
(state_dict[, strict, assign])Copy parameters and buffers from
state_dict
into this module and its descendants.modules
()Return an iterator over all modules in the network.
mtia
([device])Move all model parameters and buffers to the MTIA.
named_buffers
([prefix, recurse, ...])Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
named_children
()Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
named_modules
([memo, prefix, remove_duplicate])Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
named_parameters
([prefix, recurse, ...])Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
parameters
([recurse])Return an iterator over module parameters.
register_backward_hook
(hook)Register a backward hook on the module.
register_buffer
(name, tensor[, persistent])Add a buffer to the module.
register_forward_hook
(hook, *[, prepend, ...])Register a forward hook on the module.
register_forward_pre_hook
(hook, *[, ...])Register a forward pre-hook on the module.
register_full_backward_hook
(hook[, prepend])Register a backward hook on the module.
register_full_backward_pre_hook
(hook[, prepend])Register a backward pre-hook on the module.
register_load_state_dict_post_hook
(hook)Register a post-hook to be run after module's
load_state_dict()
is called.register_load_state_dict_pre_hook
(hook)Register a pre-hook to be run before module's
load_state_dict()
is called.register_module
(name, module)Alias for
add_module()
.register_parameter
(name, param)Add a parameter to the module.
register_state_dict_post_hook
(hook)Register a post-hook for the
state_dict()
method.register_state_dict_pre_hook
(hook)Register a pre-hook for the
state_dict()
method.requires_grad_
([requires_grad])Change if autograd should record operations on parameters in this module.
set_extra_state
(state)Set extra state contained in the loaded state_dict.
set_submodule
(target, module[, strict])Set the submodule given by
target
if it exists, otherwise throw an error.share_memory
()See
torch.Tensor.share_memory_()
.state_dict
(*args[, destination, prefix, ...])Return a dictionary containing references to the whole state of the module.
to
(*args, **kwargs)Move and/or cast the parameters and buffers.
to_empty
(*, device[, recurse])Move the parameters and buffers to the specified device without copying storage.
train
([mode])Set the module in training mode.
type
(dst_type)Casts all parameters and buffers to
dst_type
.xpu
([device])Move all model parameters and buffers to the XPU.
zero_grad
([set_to_none])Reset gradients of all model parameters.
__call__
- __init__(kernel=RBF())[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(X, Y)[source]#
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
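Together, RBF and MMDLoss act as a distributional loss between two batches of samples. A minimal usage sketch (the import path is an assumption and the shapes are illustrative):

import torch

from causalexplain.models import RBF, MMDLoss  # hypothetical import path; adjust to the actual module

mmd = MMDLoss(kernel=RBF())          # multi-bandwidth RBF kernel by default
x = torch.randn(128, 10)             # a batch of samples from P
y = torch.randn(128, 10) + 0.5       # a batch of samples from Q
loss = mmd(x, y)                     # smaller values suggest P and Q are closer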
This module contains the implementation of the BaseModel and MLPModel classes.
The BaseModel class serves as the base class for all models in the causalexplain package. It provides common functionality such as data initialization, logger initialization, and callback initialization.
The MLPModel class is a specific implementation of the BaseModel class, representing a Multi-Layer Perceptron (MLP) model. It defines the architecture and training process for the MLP model.
- Example usage:
    data = pd.read_csv("~/phd/data/generated_linear_10.csv")
    mlp = MLPModel(
        target='V0', input_size=data.shape[1], hidden_dim=[64, 128, 64],
        activation=nn.ReLU(), learning_rate=0.05, batch_size=32,
        loss_fn="mse", dropout=0.05, num_epochs=200, dataframe=data,
        test_size=0.1, device="auto", seed=1234, early_stop=False)
    mlp.train()
- class BaseModel(target, dataframe, test_size, batch_size, tb_suffix, seed, early_stop=True, patience=10, min_delta=0.001)[source]#
Bases:
object
Base class for all models in the causalexplain package.
- Parameters:
target (str) – The target variable name.
dataframe (pd.DataFrame) – The input dataframe.
test_size (float) – The proportion of the data to use for testing.
batch_size (int) – The batch size for training.
tb_suffix (str) – The suffix to append to the TensorBoard log directory.
seed (int) – The random seed for reproducibility.
early_stop (bool, optional) – Whether to use early stopping during training. Defaults to True.
patience (int, optional) – The patience value for early stopping. Defaults to 10.
min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
- Attributes:
- all_columns
- callbacks
- columns
- extra_trainer_args
- logger
- model
- scaler
- train_loader
- val_loader
Methods
init_callbacks
([early_stop, min_delta, ...])Initialize the callbacks for the training process.
init_data
()Initialize the data loaders for training and validation.
init_logger
(suffix)Initialize the logger for TensorBoard.
override_extras
(**kwargs)Override the extra trainer arguments.
- model = None#
- all_columns = None#
- callbacks = None#
- columns = None#
- logger = None#
- extra_trainer_args = None#
- scaler = None#
- train_loader = None#
- val_loader = None#
- n_rows = 0#
- device = 'cpu'#
- __init__(target, dataframe, test_size, batch_size, tb_suffix, seed, early_stop=True, patience=10, min_delta=0.001)[source]#
- init_logger(suffix)[source]#
Initialize the logger for TensorBoard.
- Parameters:
suffix (str) – The suffix to append to the logger name.
- init_callbacks(early_stop=True, min_delta=0.001, patience=10, prog_bar=False)[source]#
Initialize the callbacks for the training process.
- Parameters:
early_stop (bool, optional) – Whether to use early stopping. Defaults to True.
min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
patience (int, optional) – The patience value for early stopping. Defaults to 10.
prog_bar (bool, optional) – Whether to use a progress bar during training. Defaults to False.
- class MLPModel(target, input_size, hidden_dim, activation, learning_rate, batch_size, loss_fn, dropout, num_epochs, dataframe, test_size, device, seed, early_stop=True, patience=10, min_delta=0.001, **kwargs)[source]#
Bases:
BaseModel
Implementation of the Multi-Layer Perceptron (MLP) model.
- Parameters:
target (str) – The target variable name.
input_size (int) – The size of the input features.
hidden_dim (List[int]) – The dimensions of the hidden layers.
activation (nn.Module) – The activation function to use in the hidden layers.
learning_rate (float) – The learning rate for training.
batch_size (int) – The batch size for training.
loss_fn (str) – The loss function to use.
dropout (float) – The dropout rate.
num_epochs (int) – The number of training epochs.
dataframe (pd.DataFrame) – The input dataframe.
test_size (float) – The proportion of the data to use for testing.
seed (int) – The random seed for reproducibility.
early_stop (bool, optional) – Whether to use early stopping during training. Defaults to True.
patience (int, optional) – The patience value for early stopping. Defaults to 10.
min_delta (float, optional) – The minimum change in the monitored metric to be considered an improvement for early stopping. Defaults to 0.001.
**kwargs – Additional keyword arguments to override the default values.
- Attributes:
- all_columns
- callbacks
- columns
- extra_trainer_args
- logger
- model
- scaler
- train_loader
- val_loader
Methods
init_callbacks
([early_stop, min_delta, ...])Initialize the callbacks for the training process.
init_data
()Initialize the data loaders for training and validation.
init_logger
(suffix)Initialize the logger for TensorBoard.
override_extras
(**kwargs)Override the extra trainer arguments.
train
()Train the MLP model.
This module contains functions to extract and visualize the weights of a neural network model. The weights are extracted from the model and then visualized in different ways to help understand, and identify, relationships between the input features and the target variable.
- extract_weights(model, verbose=False)[source]#
Extracts the weights from a given model.
Parameters:
- model: The model from which to extract the weights.
- verbose: If True, prints the names of the weights being extracted.
Returns:
- weights: A list of the extracted weights.
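A sketch of what such an extraction typically looks like for a PyTorch model (illustrative only; extract_weights may filter or order the layers differently):

import torch.nn as nn

def collect_weight_matrices(model: nn.Module, verbose: bool = False):
    """Collect the 2-D weight matrices of a model, input layer first."""
    weights = []
    for name, param in model.named_parameters():
        if "weight" in name and param.dim() == 2:   # skip biases and 1-D parameters
            if verbose:
                print(f"extracting {name} with shape {tuple(param.shape)}")
            weights.append(param.detach().cpu().numpy())
    return weights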
Visualizes the weights connecting the input layer to the hidden layer.
- Parameters:
- Returns:
None
- Return type:
None
- summarize_weights(weights, feature_names, layer=0, scale=True)[source]#
Summarize the weights of a neural network model by calculating the mean, median, and positive semidefinite values of the weights for each feature.
Parameters:
- weights: The weights of the neural network model.
- feature_names: A list of feature names.
- layer: The layer of the neural network model from which to extract the weights.
- scale: If True, scale the summary values.
Returns:
- psd: A DataFrame containing the summary values of the weights for each feature.
- identify_relationships(weights, feature_names, eps=0.5, min_counts=2, plot=True)[source]#
Run a clustering algorithm on the summary values of the weights coming out of each input cell in the neural network. The summary values are the mean, the median and the positive semidefinite values. Those three dimensions are then clustered, and clusters with at most min_counts elements are considered relevant to produce the regression for the feature the network has been trained for.
Parameters:
- weights: The weights of the neural network model.
- feature_names: A list of feature names.
- eps: The maximum distance between two samples for one to be considered as in the neighborhood of the other.
- min_counts: The minimum number of elements in a cluster to consider it relevant.
- plot: If True, plot the clusters.
Returns:
- rels: A dictionary containing the relevant features for each target feature.
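The eps parameter reads like scikit-learn's DBSCAN parameter of the same name, so, under that assumption, the clustering step could be sketched as follows (illustrative, not the actual implementation):

import numpy as np
from sklearn.cluster import DBSCAN

def small_clusters(summary: np.ndarray, eps: float = 0.5, min_counts: int = 2):
    """Cluster per-feature weight summaries (one row per input feature, e.g. the
    mean/median/third statistic) and return the indices of rows that fall in
    clusters with at most `min_counts` members -- the candidate relevant features."""
    labels = DBSCAN(eps=eps, min_samples=1).fit_predict(summary)
    sizes = {label: int((labels == label).sum()) for label in set(labels)}
    return [i for i, label in enumerate(labels) if sizes[label] <= min_counts]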
- infer_causal_relationships(trained_models, feature_names, prune=False, verbose=False, plot=False, prog_bar=True, silent=False)[source]#
Infer causal relationships between the input features and the target variable based on the SHAP values of the trained models.
Parameters:
- trained_models: A dictionary of trained models, where the keys are the target variable names and the values are the trained models.
- feature_names: A list of input feature names.
- prune: If True, remove asymmetric edges from the graph.
- verbose: If True, print additional information.
- plot: If True, plot the results.
- prog_bar: If True, show a progress bar.
- silent: If True, do not show any output.
Returns:
- A dictionary containing the SHAP values, the average SHAP values, the thresholds, the raw graph, and the oriented graph.
A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
2022,2023,2024, J. Renero
- class NNRegressor(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Bases:
BaseEstimator
A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
Methods
fit
(X)A reference implementation of a fitting function.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Predicts the values for each target variable.
score
(X)Scores the model using the loss function.
set_params
(**params)Set the parameters of this estimator.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
- Parameters:
data (pandas.DataFrame) – The dataframe with the continuous variables.
model_type (str) – The type of model to use. Either ‘dff’ or ‘mlp’.
hidden_dim (int) – The dimension(s) of the hidden layer(s). This value can be a single integer for DFF or an array with the dimension of each hidden layer for the MLP case.
activation (str) – The activation function to use, either ‘relu’ or ‘selu’. Default is ‘relu’.
learning_rate (float) – The learning rate for the optimizer.
dropout (float) – The dropout rate for the dropout layer.
batch_size (int) – The batch size for the optimizer.
num_epochs (int) – The number of epochs for the optimizer.
loss_fn (str) – The loss function to use. Default is “mse”.
device (str) – The device to use. Either “cpu”, “cuda”, or “mps”. Default is “cpu”.
test_size (float) – The proportion of the data to use for testing. Default is 0.1.
random_state (int) – The seed for the random number generator. Default is 1234.
early_stop (bool) – Whether to use early stopping. Default is False.
patience (int) – The patience for early stopping. Default is 10.
min_delta (float) – The minimum delta for early stopping. Default is 0.001.
prog_bar (bool) – Whether to enable the progress bar. Default is True.
- Returns:
- A dictionary with the trained DFF networks, using the name of the
variables as the key.
- Return type:
- fit(X)[source]#
A reference implementation of a fitting function.
- Parameters:
X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values (class labels in classification, real numbers in regression).
- Returns:
self – Returns self.
- Return type:
- predict(X)[source]#
Predicts the values for each target variable.
- Parameters:
X (pd.DataFrame) – The input data to make predictions on.
- Returns:
The predictions for each target variable.
- Return type:
np.ndarray
- score(X)[source]#
Scores the model using the loss function. It returns the list of losses for each target variable.
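A minimal end-to-end sketch (the submodule in the import and the synthetic data are assumptions, for illustration only):

import numpy as np
import pandas as pd

from causalexplain.models.dnn import NNRegressor  # assumed submodule; adjust if needed

rng = np.random.default_rng(1234)
X = pd.DataFrame(rng.normal(size=(500, 4)), columns=["V0", "V1", "V2", "V3"])

reg = NNRegressor(hidden_dim=[32, 16], num_epochs=20, prog_bar=False)
reg.fit(X)                 # one network per column, using the rest as predictors
preds = reg.predict(X)     # predictions for every target variable
losses = reg.score(X)      # one loss per target variable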
This module contains the GBTRegressor class, a wrapper around the GradientBoostingRegressor class from the scikit-learn library. Its fit, predict, and score methods fit, predict, and score a separate model for each feature in the dataframe.
The class also implements a tune method to tune the hyperparameters of the model using Optuna. The nested Objective class defines the objective function for the hyperparameter optimization.
The module also contains a main function that takes the name of an experiment, loads the data and the reference graph for that experiment, splits the data into train and test sets, and runs the tune method to tune the hyperparameters of the model.
The module can be run as a script; the experiment name is passed as an argument and forwarded to the main function.
Example
$ python gbt.py rex_generated_linear_6
This will run the GBTRegressor class with the tune method for the experiment ‘rex_generated_linear_6’.
The module can also be imported and used from other modules or scripts.
Example
from causalexplain.models.gbt import custom_main
custom_main("rex_generated_linear_6")
- class GBTRegressor(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
Bases:
GradientBoostingRegressor
- Attributes:
feature_importances_
The impurity-based feature importances.
Methods
apply
(X)Apply trees in the ensemble to X, return leaf indices.
fit
(X)Call the fit method of the parent class with every feature from the "X" dataframe as a target variable.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Call the predict method of the parent class with every feature from the "X" dataframe as a target variable.
score
(X)Call the score method of the parent class with every feature from the "X" dataframe as a target variable.
set_fit_request
(*[, monitor, sample_weight])Request metadata passed to the
fit
method.set_params
(**params)Set the parameters of this estimator.
set_score_request
(*[, sample_weight])Request metadata passed to the
score
method.staged_predict
(X)Predict regression target at each stage for X.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
- random_state = 42#
- fit(X)[source]#
Call the fit method of the parent class with every feature from the “X” dataframe as a target variable. This will fit a separate model for each feature in the dataframe.
- predict(X)[source]#
Call the predict method of the parent class with every feature from the “X” dataframe as a target variable. This will predict a separate value for each feature in the dataframe.
- score(X)[source]#
Call the score method of the parent class with every feature from the “X” dataframe as a target variable. This will score a separate model for each feature in the dataframe.
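Analogously to NNRegressor, a minimal usage sketch with synthetic data (illustrative only):

import numpy as np
import pandas as pd

from causalexplain.models.gbt import GBTRegressor

rng = np.random.default_rng(42)
data = pd.DataFrame(rng.normal(size=(500, 4)), columns=["V0", "V1", "V2", "V3"])

gbt = GBTRegressor(n_estimators=100, max_depth=3, prog_bar=False)
gbt.fit(data)              # fits one boosted model per column of the dataframe
preds = gbt.predict(data)  # per-feature predictions
scores = gbt.score(data)   # per-feature scores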
Module contents#
CausalExplain Models#
This module contains the models for causal discovery methods. All the models are implemented in the scikit-learn style, with a fit method to fit the model to the data, a predict method to make predictions, and a score method to evaluate the model performance.
The models are:
GBTRegressor: Gradient Boosting Trees Regressor
NNRegressor: Neural Network Regressor
- class GBTRegressor(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
Bases:
GradientBoostingRegressor
- Attributes:
feature_importances_
The impurity-based feature importances.
Methods
apply
(X)Apply trees in the ensemble to X, return leaf indices.
fit
(X)Call the fit method of the parent class with every feature from the "X" dataframe as a target variable.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Call the predict method of the parent class with every feature from the "X" dataframe as a target variable.
score
(X)Call the score method of the parent class with every feature from the "X" dataframe as a target variable.
set_fit_request
(*[, monitor, sample_weight])Request metadata passed to the
fit
method.set_params
(**params)Set the parameters of this estimator.
set_score_request
(*[, sample_weight])Request metadata passed to the
score
method.staged_predict
(X)Predict regression target at each stage for X.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, init=None, random_state=42, max_features=None, max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0, correlation_th=None, verbose=False, silent=False, prog_bar=True, optuna_prog_bar=False)[source]#
- random_state = 42#
- fit(X)[source]#
Call the fit method of the parent class with every feature from the “X” dataframe as a target variable. This will fit a separate model for each feature in the dataframe.
- predict(X)[source]#
Call the predict method of the parent class with every feature from the “X” dataframe as a target variable. This will predict a separate value for each feature in the dataframe.
- score(X)[source]#
Call the score method of the parent class with every feature from the “X” dataframe as a target variable. This will score a separate model for each feature in the dataframe.
- class NNRegressor(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Bases:
BaseEstimator
A class to train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
Methods
fit
(X)A reference implementation of a fitting function.
get_metadata_routing
()Get metadata routing of this object.
get_params
([deep])Get parameters for this estimator.
predict
(X)Predicts the values for each target variable.
score
(X)Scores the model using the loss function.
set_params
(**params)Set the parameters of this estimator.
tune
(training_data, test_data[, study_name, ...])Tune the hyperparameters of the model using Optuna.
tune_fit
(X[, hpo_study_name, hpo_min_loss, ...])Tune the hyperparameters of the model using Optuna, and the fit the model with the best parameters.
- __init__(hidden_dim=[75, 17], activation='relu', learning_rate=0.0046, dropout=0.001, batch_size=44, num_epochs=40, loss_fn='mse', device='cpu', test_size=0.1, early_stop=False, patience=10, min_delta=0.001, correlation_th=None, random_state=1234, verbose=False, prog_bar=True, silent=False, optuna_prog_bar=False)[source]#
Train DFF networks for all variables in data. Each network will be trained to predict one of the variables in the data, using the rest as predictors plus one source of random noise.
- Parameters:
data (pandas.DataFrame) – The dataframe with the continuous variables.
model_type (str) – The type of model to use. Either ‘dff’ or ‘mlp’.
hidden_dim (int) – The dimension(s) of the hidden layer(s). This value can be a single integer for DFF or an array with the dimension of each hidden layer for the MLP case.
activation (str) – The activation function to use, either ‘relu’ or ‘selu’. Default is ‘relu’.
learning_rate (float) – The learning rate for the optimizer.
dropout (float) – The dropout rate for the dropout layer.
batch_size (int) – The batch size for the optimizer.
num_epochs (int) – The number of epochs for the optimizer.
loss_fn (str) – The loss function to use. Default is “mse”.
device (str) – The device to use. Either “cpu”, “cuda”, or “mps”. Default is “cpu”.
test_size (float) – The proportion of the data to use for testing. Default is 0.1.
random_state (int) – The seed for the random number generator. Default is 1234.
early_stop (bool) – Whether to use early stopping. Default is False.
patience (int) – The patience for early stopping. Default is 10.
min_delta (float) – The minimum delta for early stopping. Default is 0.001.
prog_bar (bool) – Whether to enable the progress bar. Default is True.
- Returns:
- A dictionary with the trained DFF networks, using the name of the
variables as the key.
- Return type:
- fit(X)[source]#
A reference implementation of a fitting function.
- Parameters:
X ({array-like, sparse matrix}, shape (n_samples, n_features)) – The training input samples.
y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values (class labels in classification, real numbers in regression).
- Returns:
self – Returns self.
- Return type:
- predict(X)[source]#
Predicts the values for each target variable.
- Parameters:
X (pd.DataFrame) – The input data to make predictions on.
- Returns:
The predictions for each target variable.
- Return type:
np.ndarray
- score(X)[source]#
Scores the model using the loss function. It returns the list of losses for each target variable.