pytagi.nn.lstm#

Classes#

LSTM

A Long Short-Term Memory (LSTM) layer for recurrent neural networks (RNNs). It inherits from BaseLayer.

Module Contents#

class pytagi.nn.lstm.LSTM(input_size: int, output_size: int, seq_len: int, bias: bool = True, gain_weight: float = 1.0, gain_bias: float = 1.0, init_method: str = 'He')[source]#

Bases: pytagi.nn.base_layer.BaseLayer

A Long Short-Term Memory (LSTM) layer for recurrent neural networks (RNNs). It inherits from BaseLayer.

Initializes the LSTM layer.

Parameters:
  • input_size – The number of features in the input tensor at each time step.

  • output_size – The size of the hidden state (\(h_t\)), which is the number of features in the output tensor at each time step.

  • seq_len – The maximum length of the input sequence. This is often required for efficient memory allocation in C++/CUDA backends like cuTAGI.

  • bias – If True, the internal gates and cell state updates will include an additive bias vector. Defaults to True.

  • gain_weight – Scaling factor applied to the initialized weights (\(W\)). Defaults to 1.0.

  • gain_bias – Scaling factor applied to the initialized biases (\(b\)). Defaults to 1.0.

  • init_method – The method used for initializing the weights and biases (e.g., “He”, “Xavier”). Defaults to “He”.
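To illustrate how these parameters relate to the layer's internal state, the sketch below counts the gate parameters for a hypothetical configuration. The formula is the standard one for an LSTM with four gate blocks (input, forget, cell candidate, output); it is not read from the cuTAGI backend, and the example values (8, 16, 24) are arbitrary:

```python
# Hypothetical configuration mirroring the constructor signature above,
# e.g. LSTM(input_size=8, output_size=16, seq_len=24).
input_size, output_size = 8, 16

def lstm_param_count(input_size: int, hidden_size: int, bias: bool = True) -> int:
    """Standard LSTM parameter count: four gate blocks, each with an
    input-to-hidden matrix, a hidden-to-hidden matrix, and (when
    bias=True) a bias vector."""
    per_gate = input_size * hidden_size + hidden_size * hidden_size
    if bias:
        per_gate += hidden_size
    return 4 * per_gate

print(lstm_param_count(input_size, output_size))  # 4 * (8*16 + 16*16 + 16) = 1600
```

Note that `seq_len` does not appear in the count: it affects buffer allocation in the backend, not the number of learnable parameters.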

get_layer_info() str[source]#

Retrieves a descriptive string containing information about the layer’s configuration (e.g., input/output size, sequence length) from the C++ backend.

get_layer_name() str[source]#

Retrieves the name of the layer (e.g., ‘LSTM’) from the C++ backend.

init_weight_bias()[source]#

Initializes the various weight matrices and bias vectors used by the LSTM’s gates (input, forget, output) and cell state updates, using the specified method and gain factors. This task is delegated to the C++ backend.
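The backend implementation is not reproduced here, but the interplay of `init_method` and the gain factors can be sketched in pure Python. This assumes the common form of He initialization (zero-mean Gaussian with standard deviation \(\sqrt{2/\text{fan\_in}}\)) and treats `gain_weight`/`gain_bias` as multiplicative scale factors; the exact scheme used by the cuTAGI backend may differ, and the helper name `he_init` is invented for this sketch:

```python
import math
import random

def he_init(fan_in: int, fan_out: int, gain_weight: float = 1.0,
            gain_bias: float = 1.0, bias: bool = True):
    """Gain-scaled He initialization for one weight block (illustrative)."""
    # He initialization: std = sqrt(2 / fan_in), scaled by gain_weight.
    std = gain_weight * math.sqrt(2.0 / fan_in)
    weights = [random.gauss(0.0, std) for _ in range(fan_in * fan_out)]
    # Biases are commonly initialized to zero; gain_bias would scale any
    # nonzero bias initialization scheme.
    biases = [0.0 * gain_bias] * fan_out if bias else []
    return weights, biases

# One gate block for an LSTM with input_size=8, output_size=16.
w, b = he_init(fan_in=8, fan_out=16)
print(len(w), len(b))  # 128 16
```

In a full LSTM, this step would run once per weight block (input-to-hidden and hidden-to-hidden, for each of the four gates), which is what `init_weight_bias()` delegates to the backend.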