pytagi.nn.sequential

Classes

Sequential

A sequential container for layers.

Module Contents

class pytagi.nn.sequential.Sequential(*layers: pytagi.nn.base_layer.BaseLayer)

A sequential container for layers.

Layers are added to the container in the order they are passed in the constructor. This class acts as a Python wrapper for the C++/CUDA backend cutagi.Sequential.

Example

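>>> import numpy as np
>>> import pytagi.nn as nn
>>> # Build a small network; layers run in the order given.
>>> model = nn.Sequential(
...     nn.Linear(10, 20),
...     nn.ReLU(),
...     nn.Linear(20, 5)
... )
>>> # One sample with 10 features: input means and non-negative variances.
>>> mu_in = np.random.randn(1, 10)
>>> var_in = np.abs(np.random.randn(1, 10))
>>> mu_out, var_out = model(mu_in, var_in)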

Initializes the Sequential model with a sequence of layers.

Parameters:

layers (BaseLayer) – A variable number of layer instances (e.g., Linear, ReLU) that will be executed in sequence.
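To make "executed in sequence" concrete, here is a minimal pure-Python sketch of a container that threads a (mean, variance) pair through its layers in order. `ToySequential`, `scale_by`, and `shift_by` are hypothetical stand-ins written for this illustration. They are not part of pytagi and say nothing about how the C++/CUDA backend is implemented. Treating a missing variance as zeros is also an assumption of the sketch.

```python
from typing import Callable, List, Optional, Tuple

Moments = Tuple[List[float], List[float]]
Layer = Callable[[List[float], List[float]], Moments]


class ToySequential:
    """Hypothetical stand-in for a sequential container (not pytagi's class)."""

    def __init__(self, *layers: Layer) -> None:
        # Layers are stored in the order they were passed.
        self.layers = list(layers)

    def __call__(self, mu: List[float], var: Optional[List[float]] = None) -> Moments:
        if var is None:
            # Assumption of this sketch: no variance means a deterministic input.
            var = [0.0] * len(mu)
        # Feed the output moments of each layer into the next one.
        for layer in self.layers:
            mu, var = layer(mu, var)
        return mu, var


def scale_by(k: float) -> Layer:
    """Toy layer z = k * a, so Var[z] = k**2 * Var[a]."""
    def layer(mu: List[float], var: List[float]) -> Moments:
        return [k * m for m in mu], [k * k * v for v in var]
    return layer


def shift_by(c: float) -> Layer:
    """Toy layer z = a + c; a constant shift leaves the variance unchanged."""
    def layer(mu: List[float], var: List[float]) -> Moments:
        return [m + c for m in mu], list(var)
    return layer


model = ToySequential(scale_by(2.0), shift_by(1.0))
mu_out, var_out = model([1.0, -0.5], [0.25, 1.0])

# Swapping the two layers changes the means, which is why order matters.
swapped = ToySequential(shift_by(1.0), scale_by(2.0))
mu_swapped, var_swapped = swapped([1.0, -0.5], [0.25, 1.0])
```

The two toy models share the same layers but produce different means, which is the only point of the sketch: the constructor's argument order defines the computation.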

__call__(mu_x: numpy.ndarray, var_x: numpy.ndarray = None) → Tuple[numpy.ndarray, numpy.ndarray]

An alias for forward(): calling the model instance runs a forward pass.

Parameters:

mu_x (np.ndarray) – The mean of the input data.
var_x (np.ndarray, optional) – The variance of the input data. Defaults to None.

Returns:

A tuple containing the mean and variance of the output.

Return type:

Tuple[np.ndarray, np.ndarray]

property layers: List[pytagi.nn.base_layer.BaseLayer]

The list of layers in the model.

property output_z_buffer: pytagi.nn.data_struct.BaseHiddenStates

The output hidden states buffer from the forward pass.

property input_delta_z_buffer: pytagi.nn.data_struct.BaseDeltaStates

The input delta states buffer used in the backward pass.

property output_delta_z_buffer: pytagi.nn.data_struct.BaseDeltaStates

The output delta states buffer from the backward pass.

property z_buffer_size: int

The size of the hidden state (z) buffer.

property z_buffer_block_size: int

The block size of the hidden state (z) buffer.

property device: str

The computational device (‘cpu’ or ‘cuda’) the model is on.

property input_state_update: bool

Flag indicating if the input state should be updated.

property num_samples: int

The number of samples used for Monte Carlo estimation. This is used for debugging purposes.

to_device(device: str)

Moves the model and its parameters to a specified device.

Parameters:

device (str) – The target device, e.g., ‘cpu’ or ‘cuda:0’.

params_to_device()

Moves the model parameters to the currently configured CUDA device.

params_to_host()

Moves the model parameters from the CUDA device to the host (CPU).

set_threads(num_threads: int)

Sets the number of CPU threads to use for computation.

Parameters:

num_threads (int) – The number of threads.

train()

Sets the model to training mode.

eval()

Sets the model to evaluation mode.

forward(mu_x: numpy.ndarray, var_x: numpy.ndarray = None) → Tuple[numpy.ndarray, numpy.ndarray]

Performs a forward pass through the network.

Parameters:

mu_x (np.ndarray) – The mean of the input data.
var_x (np.ndarray, optional) – The variance of the input data. Defaults to None.

Returns:

A tuple containing the mean and variance of the output.

Return type:

Tuple[np.ndarray, np.ndarray]
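The forward pass takes and returns means and variances rather than point values. As a rough illustration of what propagating those two moments involves, the sketch below computes the output mean and variance of a single linear unit z = sum_i(w_i * a_i) + b, assuming the weights, inputs, and bias are mutually independent. The function name `linear_unit_moments` and the independence assumption are illustrative only; this is not pytagi's backend code.

```python
from typing import Sequence, Tuple


def linear_unit_moments(
    mu_a: Sequence[float],
    var_a: Sequence[float],
    mu_w: Sequence[float],
    var_w: Sequence[float],
    mu_b: float,
    var_b: float,
) -> Tuple[float, float]:
    """Mean and variance of z = sum_i(w_i * a_i) + b for independent terms."""
    mu_z = mu_b
    var_z = var_b
    for ma, va, mw, vw in zip(mu_a, var_a, mu_w, var_w):
        # E[w * a] = E[w] * E[a] when w and a are independent.
        mu_z += mw * ma
        # Var[w * a] = var_w*var_a + var_w*mu_a^2 + var_a*mu_w^2 under independence.
        var_z += vw * va + vw * ma * ma + va * mw * mw
    return mu_z, var_z


# Deterministic inputs (zero input variance) still yield an uncertain output,
# because the weights themselves carry variance.
mu_z, var_z = linear_unit_moments(
    mu_a=[1.0, 2.0],
    var_a=[0.0, 0.0],
    mu_w=[0.5, -1.0],
    var_w=[0.25, 0.5],
    mu_b=0.0,
    var_b=0.0,
)
```

Here the output mean is 0.5 * 1 - 1 * 2 = -1.5 and the output variance is 0.25 * 1 + 0.5 * 4 = 2.25, all of it contributed by the weight variances.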

backward()

Performs a backward pass to update the network parameters.

smoother() → Tuple[numpy.ndarray, numpy.ndarray]

Performs a smoother pass (e.g., Rauch-Tung-Striebel smoother).

This is used with the SLSTM to refine estimates by running backwards through time.

Returns:

A tuple containing the mean and variance of the smoothed output.

Return type:

Tuple[np.ndarray, np.ndarray]
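For readers unfamiliar with the Rauch-Tung-Striebel idea, the following is a minimal scalar sketch of a forward Kalman filter followed by a backward smoothing pass. It assumes a random-walk state model (x_t = x_{t-1} + noise, y_t = x_t + noise) chosen purely for illustration. The function names and the model are hypothetical and unrelated to how the SLSTM smoother is implemented in the backend.

```python
from typing import List, Sequence, Tuple


def kalman_filter(
    obs: Sequence[float], q: float, r: float, mu0: float, var0: float
) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Forward pass for a scalar random walk with process variance q and observation variance r."""
    mu_pred, var_pred, mu_filt, var_filt = [], [], [], []
    mu, var = mu0, var0
    for y in obs:
        # Predict: the state mean carries over, uncertainty grows by q.
        mu_p, var_p = mu, var + q
        # Update: blend the prediction with the observation.
        gain = var_p / (var_p + r)
        mu = mu_p + gain * (y - mu_p)
        var = (1.0 - gain) * var_p
        mu_pred.append(mu_p)
        var_pred.append(var_p)
        mu_filt.append(mu)
        var_filt.append(var)
    return mu_pred, var_pred, mu_filt, var_filt


def rts_smoother(
    mu_pred: Sequence[float],
    var_pred: Sequence[float],
    mu_filt: Sequence[float],
    var_filt: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Backward pass: refine each filtered estimate using later time steps."""
    mu_s, var_s = list(mu_filt), list(var_filt)
    for t in range(len(mu_filt) - 2, -1, -1):
        # Smoother gain; the transition coefficient is 1 for a random walk.
        j = var_filt[t] / var_pred[t + 1]
        mu_s[t] = mu_filt[t] + j * (mu_s[t + 1] - mu_pred[t + 1])
        var_s[t] = var_filt[t] + j * j * (var_s[t + 1] - var_pred[t + 1])
    return mu_s, var_s


observations = [1.0, 1.2, 0.9, 1.1]
mu_p, var_p, mu_f, var_f = kalman_filter(observations, q=0.1, r=0.5, mu0=0.0, var0=1.0)
mu_smooth, var_smooth = rts_smoother(mu_p, var_p, mu_f, var_f)
```

The last time step is unchanged by smoothing, while earlier steps end up with a variance no larger than their filtered variance, since they also benefit from later observations.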

step()

Performs a single step of inference to update the parameters.

reset_lstm_states()

Resets the hidden and cell states of all LSTM layers in the model.

output_to_host() → List[float]

Copies the raw output data from the device to the host.

Returns:

A list of floating-point values representing the flattened output.

Return type:

List[float]

delta_z_to_host() → List[float]

Copies the raw delta Z (error signal) data from the device to the host.

Returns:

A list of floating-point values representing the flattened delta Z.

Return type:

List[float]

set_delta_z(delta_mu: numpy.ndarray, delta_var: numpy.ndarray)

Sets the delta Z (error signal) on the device for the backward pass.

Parameters:

delta_mu (np.ndarray) – The mean of the error signal.
delta_var (np.ndarray) – The variance of the error signal.

get_layer_stack_info() → str

Gets a string representation of the layer stack architecture.

Returns:

A descriptive string of the model’s layers.

Return type:

str

preinit_layer()

Pre-initializes the layers in the model.

get_neg_var_w_counter() → dict

Counts the number of negative variance weights in each layer.

Returns:

A dictionary where keys are layer names and values are the counts of negative variances.

Return type:

dict
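A variance can never legitimately be negative, so a non-zero count points to numerical trouble in the corresponding layer. The sketch below shows the same kind of check in plain Python over a dictionary shaped like the one described for state_dict(), i.e. layer name mapped to (mu_w, var_w, mu_b, var_b). The helper `count_negative_var_w`, the layer names, and the values are hypothetical; plain lists stand in for arrays, and this is not the library's implementation.

```python
from typing import Dict, Sequence, Tuple

LayerParams = Tuple[Sequence[float], Sequence[float], Sequence[float], Sequence[float]]


def count_negative_var_w(state: Dict[str, LayerParams]) -> Dict[str, int]:
    """Count weight variances below zero for each layer."""
    counts = {}
    for name, (_mu_w, var_w, _mu_b, _var_b) in state.items():
        # Only the weight variances are inspected, mirroring the method's description.
        counts[name] = sum(1 for v in var_w if v < 0.0)
    return counts


# Made-up parameters for two hypothetical layers.
toy_state = {
    "Linear.0": ([0.1, -0.2], [0.01, -0.003], [0.0], [0.02]),
    "Linear.2": ([0.3], [0.05], [0.1], [0.01]),
}
neg_counts = count_negative_var_w(toy_state)
```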

save(filename: str)

Saves the model’s state to a binary file.

Parameters:

filename (str) – The path to the file where the model will be saved.

load(filename: str)

Loads the model’s state from a binary file.

Parameters:

filename (str) – The path to the file from which to load the model.

save_csv(filename: str)

Saves the model parameters to a CSV file.

Parameters:

filename (str) – The base path for the CSV file(s).

load_csv(filename: str)

Loads the model parameters from a CSV file.

Parameters:

filename (str) – The base path of the CSV file(s).

parameters() → List[Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]]

Gets all model parameters.

Returns:

A list where each element is a tuple containing the parameters for a layer: (mu_w, var_w, mu_b, var_b).

Return type:

List[Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]]

load_state_dict(state_dict: dict)

Loads the model’s parameters from a state dictionary.

Parameters:

state_dict (dict) – A dictionary containing the model’s state.

state_dict() → dict

Gets the model’s parameters as a state dictionary.

Returns:

A dictionary where each key is the layer name and the value is a tuple of parameters: (mu_w, var_w, mu_b, var_b).

Return type:

dict
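Because every entry follows the same (mu_w, var_w, mu_b, var_b) layout, a state dictionary is easy to summarise with ordinary Python. The sketch below counts weights and biases per layer on a hand-made dictionary. The helper `count_parameters`, the layer names, and the sizes are hypothetical, and the sketch assumes each entry is a flat sequence (plain lists stand in for arrays here).

```python
from typing import Dict, Sequence, Tuple

LayerParams = Tuple[Sequence[float], Sequence[float], Sequence[float], Sequence[float]]


def count_parameters(state: Dict[str, LayerParams]) -> Dict[str, int]:
    """Number of weights plus biases per layer (each has one mean and one variance)."""
    return {
        name: len(mu_w) + len(mu_b)
        for name, (mu_w, _var_w, mu_b, _var_b) in state.items()
    }


# Made-up state for a hypothetical 2 -> 3 -> 1 network with flattened parameters.
toy_state = {
    "Linear.0": ([0.0] * 6, [0.1] * 6, [0.0] * 3, [0.1] * 3),
    "Linear.2": ([0.0] * 3, [0.1] * 3, [0.0] * 1, [0.1] * 1),
}
per_layer = count_parameters(toy_state)
total = sum(per_layer.values())
```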

params_from(other: Sequential)

Copies parameters from another Sequential model.

Parameters:

other (Sequential) – The source model from which to copy parameters.

get_outputs() → Tuple[numpy.ndarray, numpy.ndarray]

Gets the outputs from the last forward pass.

Returns:

A tuple containing the mean and variance of the output.

Return type:

Tuple[np.ndarray, np.ndarray]

get_outputs_smoother() → Tuple[numpy.ndarray, numpy.ndarray]

Gets the outputs from the last smoother pass.

Returns:

A tuple containing the mean and variance of the smoothed output.

Return type:

Tuple[np.ndarray, np.ndarray]

get_input_states() → Tuple[numpy.ndarray, numpy.ndarray]

Gets the input states of the model.

Returns:

A tuple containing the mean and variance of the input states.

Return type:

Tuple[np.ndarray, np.ndarray]

get_norm_mean_var() → dict

Gets the mean and variance from normalization layers.

Returns:

A dictionary where each key is a normalization layer name and the value is a tuple of four arrays: (mu_batch, var_batch, mu_ema_batch, var_ema_batch).

Return type:

dict

get_lstm_states(time_step: int = -1) → dict

Gets the states of all LSTM layers as a dictionary.

Parameters:

time_step (int, optional) – The time step at which to retrieve the smoothed SLSTM states. If omitted or set to -1, the current unsmoothed LSTM states are retrieved instead.

Returns:

A dictionary mapping layer indices to a 4-tuple of numpy arrays: (mu_h_prior, var_h_prior, mu_c_prior, var_c_prior).

Return type:

dict

set_lstm_states(states: dict) → None

Sets the states for all LSTM layers.

Parameters:

states (dict) – A dictionary mapping layer indices to a 4-tuple of numpy arrays: (mu_h_prior, var_h_prior, mu_c_prior, var_c_prior).
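Since the states dictionary has a fixed shape, it can be worth checking a hand-built one before handing it to the model. The sketch below validates the layout described above on plain Python data. The helper `validate_lstm_states` is hypothetical, and the specific checks (integer keys, matching mean/variance lengths, non-negative variances) are assumptions about what a well-formed dictionary looks like, not rules enforced by pytagi.

```python
from typing import Dict, Sequence, Tuple

LstmState = Tuple[Sequence[float], Sequence[float], Sequence[float], Sequence[float]]


def validate_lstm_states(states: Dict[int, LstmState]) -> None:
    """Raise if `states` does not match the documented 4-tuple layout."""
    for idx, value in states.items():
        if not isinstance(idx, int):
            raise TypeError(f"layer index must be int, got {type(idx).__name__}")
        if len(value) != 4:
            raise ValueError(f"layer {idx}: expected a 4-tuple, got {len(value)} items")
        mu_h, var_h, mu_c, var_c = value
        # Each mean vector needs a variance vector of the same length.
        if len(mu_h) != len(var_h) or len(mu_c) != len(var_c):
            raise ValueError(f"layer {idx}: mean and variance lengths differ")
        # Variances cannot be negative.
        if any(v < 0.0 for v in list(var_h) + list(var_c)):
            raise ValueError(f"layer {idx}: negative variance")


# Made-up states for a hypothetical LSTM layer at index 0 with two hidden units.
good_states = {0: ([0.0, 0.1], [1.0, 1.0], [0.0, 0.0], [0.5, 0.5])}
validate_lstm_states(good_states)  # passes silently

bad_states = {0: ([0.0, 0.1], [1.0], [0.0, 0.0], [0.5, 0.5])}
try:
    validate_lstm_states(bad_states)
    bad_rejected = False
except ValueError:
    bad_rejected = True
```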