Class ConvLSTMLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Implements a Convolutional Long Short-Term Memory (ConvLSTM) layer for processing sequential spatial data.
public class ConvLSTMLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for computations (e.g., float, double).
- Inheritance
- LayerBase<T> → ConvLSTMLayer<T>
- Implements
- ILayer<T>
Remarks
ConvLSTM combines convolutional operations with LSTM (Long Short-Term Memory) to handle spatial-temporal data. It's particularly useful for tasks involving sequences of images or spatial data, such as video prediction, weather forecasting, and spatiotemporal sequence prediction.
Key features of ConvLSTM:
- Maintains spatial information throughout the processing
- Captures both spatial and temporal dependencies
- Uses convolutional operations instead of matrix multiplications in the LSTM cell
- Suitable for data with both spatial and temporal structure
For Beginners: ConvLSTM is like a smart video analyzer that remembers spatial patterns over time.
Imagine you're watching a video of clouds moving across the sky:
- ConvLSTM looks at each frame (like a photo) in the video sequence
- It remembers important spatial features (like cloud shapes) from previous frames
- It uses this memory to predict how these features might change in future frames
This layer is particularly good at:
- Predicting what might happen next in a video
- Analyzing patterns in weather maps over time
- Understanding how spatial arrangements change in a sequence
Unlike simpler layers that treat each frame independently, ConvLSTM connects the dots between frames, making it powerful for tasks involving moving images or changing spatial data.
Constructors
ConvLSTMLayer(int[], int, int, int, int, IActivationFunction<T>?)
public ConvLSTMLayer(int[] inputShape, int kernelSize, int filters, int padding = 1, int strides = 1, IActivationFunction<T>? activationFunction = null)
Parameters
inputShape (int[])
kernelSize (int)
filters (int)
padding (int)
strides (int)
activationFunction (IActivationFunction<T>)
ConvLSTMLayer(int[], int, int, int, int, IVectorActivationFunction<T>?)
Initializes a new instance of the ConvLSTMLayer class with a vector activation function.
public ConvLSTMLayer(int[] inputShape, int kernelSize, int filters, int padding = 1, int strides = 1, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputShape (int[]): The shape of the input tensor [batch, time, height, width, channels].
kernelSize (int): The size of the convolutional kernel (filter).
filters (int): The number of output filters (channels) for the layer.
padding (int): The padding added to the input.
strides (int): The stride of the convolution.
vectorActivationFunction (IVectorActivationFunction<T>): The vector activation function to use. Defaults to Tanh if not specified.
Remarks
This constructor allows using a vector activation function that can process entire tensors at once, which may be more efficient for certain operations.
For Beginners: This constructor is similar to the first one, but uses a special type of activation function.
A vector activation function:
- Processes entire groups of numbers at once, rather than one at a time
- Can be faster for large datasets
- Works the same way as the regular activation function, just with different internal machinery
You would use this version if you're working with very large datasets where processing speed is important, or if you have a specific vector activation function you want to use.
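How kernelSize, padding, and strides interact can be sketched with the standard convolution output-size rule. The helper below is illustrative only: ConvShapeSketch and OutputSize are hypothetical names that are not part of AiDotNet, and the sketch assumes the layer follows the usual floor((input + 2 * padding - kernel) / stride) + 1 formula, which this page does not confirm.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ConvShapeSketch
{
    // Output size along one spatial dimension (height or width),
    // assuming the standard convolution arithmetic.
    public static int OutputSize(int inputSize, int kernelSize, int padding, int strides)
    {
        if (strides <= 0)
            throw new ArgumentOutOfRangeException(nameof(strides), "Stride must be positive.");

        int span = inputSize + 2 * padding - kernelSize;
        if (span < 0)
            throw new ArgumentException("Kernel is larger than the padded input.");

        // Integer division floors because span is non-negative here.
        return span / strides + 1;
    }
}
```

Under this assumption, a 3x3 kernel with the default padding = 1 and strides = 1 keeps a 64-pixel dimension at 64, while a 5x5 kernel with the same defaults would shrink it to 62.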
Properties
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU-accelerated forward pass.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
Remarks
ConvLSTM supports GPU execution when a DirectGpuTensorEngine is available. The GPU implementation uses FusedConv2DGpu for convolutions and GPU-native gate operations.
SupportsGpuTraining
Gets a value indicating whether this layer supports GPU training.
public override bool SupportsGpuTraining { get; }
Property Value
- bool
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
Always true. ConvLSTMLayer exports a single-step LSTM cell computation with full Conv2D operations for all gates.
Remarks
JIT compilation for ConvLSTM exports a single timestep of the LSTM cell computation. The exported graph uses proper Conv2D operations for all gate computations, matching the behavior of the Forward method.
For processing sequences with the JIT-compiled graph:
- Initialize hidden and cell states to zero tensors
- For each timestep, call the compiled graph with (input, h_prev, c_prev)
- The output is the new hidden state h_t
- Track cell state c_t for the next iteration (available from intermediate computation)
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
true, indicating that the layer supports training; this value is always true for ConvLSTM layers.
Remarks
This property indicates whether the ConvLSTM layer can be trained through backpropagation. ConvLSTM layers always return true as they contain trainable parameters (weights and biases).
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer can adjust its internal values during training
- It will improve its performance as it sees more data
- It participates in the learning process
ConvLSTM layers always return true because they have parameters (like weights and biases) that can be updated during training to learn patterns in spatio-temporal data (like videos or weather data).
Methods
Backward(Tensor<T>)
Performs the backward pass of the ConvLSTM layer, computing gradients for all parameters.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient flowing back from the next layer with shape [batchSize, timeSteps, height, width, filters].
Returns
- Tensor<T>
Gradient with respect to the input with shape [batchSize, timeSteps, height, width, channels]
Remarks
This method implements backpropagation through time (BPTT) for the ConvLSTM layer:
1. Initializes gradient tensors for all parameters
2. Iterates backward through time steps
3. Computes gradients for each time step using BackwardStep
4. Accumulates gradients across all time steps
5. Stores gradients for later use in parameter updates
For Beginners: This method figures out how to improve the layer during training.
During the backward pass:
- The layer receives information about how to adjust its output to reduce errors
- It works backwards through the sequence (from the most recent frame to the earliest)
- It calculates how each of its internal values (weights and biases) should change
- It also calculates how the input should have been different to reduce errors
Think of it like a coach reviewing a game film backwards, noting what each player should have done differently at each moment to get a better outcome.
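The reverse-time loop and gradient accumulation can be sketched on a much smaller model. The example below is not the ConvLSTM backward pass: it uses a hypothetical scalar recurrence h[t] = w * h[t-1] + x[t] (BpttSketch is an invented name) purely to show how a gradient is carried from step t+1 back to step t and how the weight gradient is summed across time steps.

```csharp
using System;

// Hypothetical toy model for illustration; not the AiDotNet implementation.
public static class BpttSketch
{
    // Forward pass: h[t] = w * h[t-1] + x[t], with the initial state set to 0.
    public static double[] Forward(double[] x, double w)
    {
        var h = new double[x.Length];
        double prev = 0.0;
        for (int t = 0; t < x.Length; t++)
        {
            h[t] = w * prev + x[t];
            prev = h[t];
        }
        return h;
    }

    // Backward pass: outputGradient[t] is dL/dh[t] coming from the next layer.
    // Returns dL/dx for every step and the accumulated dL/dw.
    public static (double[] inputGradient, double weightGradient) Backward(
        double[] h, double w, double[] outputGradient)
    {
        var dx = new double[h.Length];
        double dw = 0.0;
        double carried = 0.0; // gradient arriving from step t+1 through the recurrence

        for (int t = h.Length - 1; t >= 0; t--)
        {
            double dh = outputGradient[t] + carried;
            double hPrev = t > 0 ? h[t - 1] : 0.0;

            dw += dh * hPrev;  // accumulate the parameter gradient across time steps
            dx[t] = dh;        // x[t] enters h[t] with coefficient 1
            carried = dh * w;  // hand the gradient back to h[t-1]
        }
        return (dx, dw);
    }
}
```

In the real layer the same pattern applies per gate and per convolution kernel, with the cell state carrying a second gradient alongside the hidden state.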
BackwardGpu(IGpuTensor<T>)
Performs GPU-accelerated backward pass for ConvLSTM using Backpropagation Through Time (BPTT).
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): GPU tensor with gradient from next layer [batch, timesteps, H, W, filters].
Returns
- IGpuTensor<T>
GPU tensor with input gradients [batch, timesteps, H, W, channels].
Remarks
This method implements full BPTT on GPU, computing gradients through all timesteps in reverse order. It uses the cached gate values, hidden states, and cell states from the forward pass.
Exceptions
- InvalidOperationException
Thrown when ForwardGpu has not been called in training mode.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the ConvLSTM computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to which input nodes will be added. The method adds:
- x_t: Current input tensor [batch, height, width, channels]
- h_prev: Previous hidden state [batch, height, width, filters]
- c_prev: Previous cell state [batch, height, width, filters]
Returns
- ComputationNode<T>
A computation node representing the new hidden state h_t.
Remarks
This method exports a single timestep of the ConvLSTM cell for JIT compilation. The computation graph implements the full ConvLSTM equations using Conv2D operations:
Gates (all use Conv2D operations):
- Forget gate: f_t = σ(Conv2D(x_t, W_fi) + Conv2D(h_{t-1}, W_fh) + b_f)
- Input gate: i_t = σ(Conv2D(x_t, W_ii) + Conv2D(h_{t-1}, W_ih) + b_i)
- Cell candidate: c̃_t = tanh(Conv2D(x_t, W_ci) + Conv2D(h_{t-1}, W_ch) + b_c)
- Output gate: o_t = σ(Conv2D(x_t, W_oi) + Conv2D(h_{t-1}, W_oh) + b_o)
State updates:
- Cell state: c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
- Hidden state: h_t = o_t ⊙ tanh(c_t)
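The gate and state equations above can be sketched element by element. This is a minimal illustration, not the exported graph: ConvLstmCellSketch and its parameter names are hypothetical, and each pre-activation array is assumed to already hold the flattened result of Conv2D(x_t, W) + Conv2D(h_{t-1}, W) + b for its gate.

```csharp
using System;

// Hypothetical element-wise cell update for illustration; not part of AiDotNet.
public static class ConvLstmCellSketch
{
    private static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // preF, preI, preC, preO: assumed precomputed gate pre-activations (flattened).
    // cPrev: previous cell state c_{t-1}, same length.
    public static (double[] h, double[] c) Step(
        double[] preF, double[] preI, double[] preC, double[] preO, double[] cPrev)
    {
        int n = cPrev.Length;
        if (preF.Length != n || preI.Length != n || preC.Length != n || preO.Length != n)
            throw new ArgumentException("All inputs must have the same length.");

        var h = new double[n];
        var c = new double[n];
        for (int k = 0; k < n; k++)
        {
            double f = Sigmoid(preF[k]);          // forget gate f_t
            double i = Sigmoid(preI[k]);          // input gate i_t
            double candidate = Math.Tanh(preC[k]); // cell candidate
            double o = Sigmoid(preO[k]);          // output gate o_t

            c[k] = f * cPrev[k] + i * candidate;  // c_t = f_t * c_{t-1} + i_t * candidate
            h[k] = o * Math.Tanh(c[k]);           // h_t = o_t * tanh(c_t)
        }
        return (h, c);
    }
}
```

Because every gate is applied per element, the spatial layout of the convolution outputs is preserved in both c_t and h_t.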
For Beginners: This method creates a blueprint for running ConvLSTM faster.
For processing sequences:
- Initialize h_prev and c_prev to zeros for the first timestep
- Call the JIT-compiled graph for each timestep in your sequence
- Pass the output hidden state as h_prev for the next timestep
- Track cell state separately if needed for stateful operation
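The per-timestep driver loop described above might look like the following sketch. The step delegate is a hypothetical stand-in for the compiled single-timestep graph; it is assumed here to return both h_t and c_t, whereas this page only states that the exported node yields h_t. SequenceLoopSketch and all parameter names are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical driver loop for illustration; not part of AiDotNet.
public static class SequenceLoopSketch
{
    // step: assumed signature (x_t, h_prev, c_prev) -> (h_t, c_t).
    public static List<double[]> Run(
        IReadOnlyList<double[]> frames,
        int stateSize,
        Func<double[], double[], double[], (double[] h, double[] c)> step)
    {
        // New arrays are zero-filled, matching the "initialize to zeros" step.
        var h = new double[stateSize];
        var c = new double[stateSize];
        var outputs = new List<double[]>(frames.Count);

        foreach (var x in frames)
        {
            // Feed the previous states in and keep the new ones for the next step.
            (h, c) = step(x, h, c);

            // Store a copy so later steps cannot alias earlier outputs.
            outputs.Add((double[])h.Clone());
        }
        return outputs;
    }
}
```

Keeping the state variables outside the loop is what makes the computation stateful; resetting them to zero arrays between unrelated sequences has the same effect as starting fresh.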
Forward(Tensor<T>)
Performs the forward pass of the ConvLSTM layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor with shape [batchSize, timeSteps, height, width, channels].
Returns
- Tensor<T>
The output tensor after processing through the ConvLSTM layer.
Remarks
The forward pass processes the input sequence through the ConvLSTM cells, updating the hidden state and cell state at each time step. It applies the convolutional operations within the LSTM structure to maintain spatial information.
For Beginners: This method is like running your video through the analyzer.
During the forward pass, for each frame in the sequence:
- The layer looks at the current frame and its memory of previous frames
- It updates its memory based on what it sees in the current frame
- It produces an output that combines information from the current frame and its memory
This process allows the layer to:
- Remember important features from earlier in the sequence
- Understand how spatial patterns are changing over time
- Produce outputs that consider both the current input and the history
The result is a new sequence that captures the layer's understanding of the spatial-temporal patterns in your input data.
ForwardGpu(params IGpuTensor<T>[])
Performs a GPU-resident forward pass of the ConvLSTM layer.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensor(s).
Returns
- IGpuTensor<T>
GPU-resident output tensor after ConvLSTM processing.
Remarks
For Beginners: This is the GPU-optimized version of the Forward method. All data stays on the GPU throughout the computation, avoiding expensive CPU-GPU transfers. The ConvLSTM gates are computed using GPU convolutions and element-wise operations.
During training (IsTrainingMode == true), this method caches gate values and state buffers needed by BackwardGpu to perform full BPTT on GPU.
Exceptions
- ArgumentException
Thrown when no input tensor is provided.
- InvalidOperationException
Thrown when GPU backend is unavailable.
GetParameters()
Retrieves all trainable parameters of the ConvLSTM layer as a flattened vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all weights and biases of the layer.
Remarks
This method flattens all trainable parameters into a single vector in the following order:
1. Input weights: _weightsFi, _weightsIi, _weightsCi, _weightsOi
2. Hidden weights: _weightsFh, _weightsIh, _weightsCh, _weightsOh
3. Biases: _biasF, _biasI, _biasC, _biasO
For Beginners: This method collects all the learnable values into one long list.
It's like taking all the knobs and dials from the control panel and listing them in a single row:
- First, it counts how many total numbers need to be stored
- Then it creates a vector (a one-dimensional array) of that size
- Finally, it copies all the weights and biases into this vector in a specific order
This is useful for:
- Saving all parameters to a file
- Loading parameters from a file
- Certain optimization techniques that work with all parameters at once
- Tracking how many learnable parameters the layer has in total
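The count-then-copy procedure can be sketched with plain arrays. ParameterLayoutSketch is a hypothetical helper, not the library's code. The Count method additionally assumes input kernels of shape [k, k, channels, filters], hidden kernels of shape [k, k, filters, filters], and one bias value per filter, none of which this page states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ParameterLayoutSketch
{
    // Concatenates the blocks in the order supplied, e.g. the four input
    // weights, then the four hidden weights, then the four biases.
    public static double[] Flatten(IReadOnlyList<double[]> blocks)
    {
        // First pass: count how many values are needed in total.
        int total = 0;
        foreach (var block in blocks) total += block.Length;

        // Second pass: copy each block to its offset in the flat vector.
        var flat = new double[total];
        int offset = 0;
        foreach (var block in blocks)
        {
            Array.Copy(block, 0, flat, offset, block.Length);
            offset += block.Length;
        }
        return flat;
    }

    // Total parameter count under the assumed kernel and bias shapes.
    public static int Count(int kernelSize, int channels, int filters)
    {
        int inputKernel = kernelSize * kernelSize * channels * filters;
        int hiddenKernel = kernelSize * kernelSize * filters * filters;
        return 4 * inputKernel + 4 * hiddenKernel + 4 * filters;
    }
}
```

Under those assumptions, a layer with a 3x3 kernel, 1 input channel, and 8 filters would hold 4 * 72 + 4 * 576 + 4 * 8 = 2624 values.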
ResetState()
Resets the internal state of the ConvLSTM layer.
public override void ResetState()
Remarks
This method clears all cached values and gradients from previous forward and backward passes:
1. Clears the cached input tensor (_lastInput)
2. Clears the cached hidden state (_lastHiddenState)
3. Clears the cached cell state (_lastCellState)
4. Clears all accumulated gradients
For Beginners: This method clears the layer's memory to start fresh.
It's like erasing a whiteboard to start a new lesson:
- The layer forgets the last input it processed
- It clears its internal memory states (hidden and cell states)
- It discards any stored gradients from previous training
This is important when:
- Starting to process a new, unrelated sequence
- Beginning a new training epoch
- Testing the model on different data
- Making sure that information from previous sequences does not influence the processing of new sequences
SetParameters(Vector<T>)
Sets all trainable parameters of the ConvLSTM layer from a flattened vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Vector containing all weights and biases to set.
Remarks
This method updates all trainable parameters from a single vector in the following order:
1. Input weights: _weightsFi, _weightsIi, _weightsCi, _weightsOi
2. Hidden weights: _weightsFh, _weightsIh, _weightsCh, _weightsOh
3. Biases: _biasF, _biasI, _biasC, _biasO
For Beginners: This method loads all learnable values from a single list.
It's the opposite of GetParameters():
- It takes a long list of numbers (the parameters vector)
- It distributes these numbers back into the appropriate weight and bias tensors
- It follows the same order that was used when creating the vector
This is useful when:
- Loading a previously saved model
- Initializing with pre-trained weights
- Testing with specific parameter values
- Implementing advanced optimization techniques
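Distributing a flat vector back into its blocks is the mirror image of flattening. The sketch below uses plain arrays and a hypothetical ParameterRestoreSketch class. The length check is this example's own choice, since the page does not say how SetParameters handles a vector of the wrong size.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ParameterRestoreSketch
{
    // Copies consecutive slices of flat into the target blocks, in order.
    // The targets must be listed in the same order used when flattening.
    public static void Unflatten(double[] flat, double[][] targets)
    {
        int expected = 0;
        foreach (var target in targets) expected += target.Length;

        // Assumed validation: reject vectors that do not match the layout.
        if (flat.Length != expected)
            throw new ArgumentException(
                $"Expected {expected} values but received {flat.Length}.");

        int offset = 0;
        foreach (var target in targets)
        {
            Array.Copy(flat, offset, target, 0, target.Length);
            offset += target.Length;
        }
    }
}
```

Because both directions rely only on block order and block length, a vector produced by one layer can be loaded into another layer only if the two were constructed with the same kernel size, channel count, and filter count.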
UpdateParameters(T)
Updates all trainable parameters of the layer using the computed gradients and specified learning rate.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate controlling how much to adjust parameters.
Remarks
This method applies gradient descent with momentum to update all weights and biases:
1. First checks if gradients are available from a previous backward pass
2. Updates all input weights (weightsFi, weightsIi, weightsCi, weightsOi)
3. Updates all hidden weights (weightsFh, weightsIh, weightsCh, weightsOh)
4. Updates all biases (biasF, biasI, biasC, biasO)
5. Clears gradients after all updates are complete
For Beginners: This method applies the calculated updates to all weights and biases.
After figuring out how parameters should change:
- The learningRate controls how big each adjustment is
- Smaller values make small, cautious changes
- Larger values make bigger, more aggressive changes
The method also uses "momentum," which is like inertia:
- If parameters have been moving in a certain direction, they tend to keep going
- This helps navigate flat regions and avoid getting stuck in local minima
- Think of it like rolling a ball downhill - it builds up speed in the right direction
After updating all parameters, the gradients are cleared to prepare for the next training batch.
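One common form of gradient descent with momentum is sketched below for a single parameter block. This is an assumption-laden illustration, not the layer's code: MomentumSketch is a hypothetical name, the exact update rule used by the library is not given on this page, and the momentum coefficient is passed explicitly here even though UpdateParameters accepts only a learning rate.

```csharp
using System;

// Hypothetical update rule for illustration; not part of AiDotNet.
public static class MomentumSketch
{
    // Classical momentum: velocity = momentum * velocity - learningRate * gradient,
    // then weights += velocity. The velocity array persists between calls.
    public static void Update(
        double[] weights, double[] gradients, double[] velocity,
        double learningRate, double momentum)
    {
        if (gradients.Length != weights.Length || velocity.Length != weights.Length)
            throw new ArgumentException("All arrays must have the same length.");

        for (int k = 0; k < weights.Length; k++)
        {
            velocity[k] = momentum * velocity[k] - learningRate * gradients[k];
            weights[k] += velocity[k];
        }

        // Mirror the documented final step: clear gradients once the update is done.
        Array.Clear(gradients, 0, gradients.Length);
    }
}
```

With a learning rate of 0.1 and momentum of 0.9, two consecutive updates with the same gradient of 2.0 move a weight from 1.0 to 0.8 and then to 0.42: the second step is larger because the velocity from the first step is partly retained.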
UpdateParametersGpu(IGpuOptimizerConfig)
GPU-resident parameter update with polymorphic optimizer support. Updates all weight tensors directly on GPU using the specified optimizer configuration.
public override void UpdateParametersGpu(IGpuOptimizerConfig config)
Parameters
config (IGpuOptimizerConfig): GPU optimizer configuration specifying the optimizer type and hyperparameters.