Class SequenceLastLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
A layer that extracts the last timestep from a sequence.
public class SequenceLastLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
Inheritance
LayerBase<T> → SequenceLastLayer<T>
Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
This layer is used after recurrent layers (RNN, LSTM, GRU) when the task requires a single output from the entire sequence, such as sequence classification.
For Beginners: When processing sequences (like sentences or time series), recurrent layers output a value for each timestep. For tasks like classification, we often only need the final output (after seeing the whole sequence). This layer extracts just that last output.
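For example, here is a minimal sketch of where the layer sits in a classification model; only the SequenceLastLayer<T> API is taken from this page, and the surrounding shapes and the recurrent stage are illustrative assumptions:

using AiDotNet.NeuralNetworks.Layers;

// Suppose a recurrent layer (LSTM, GRU, ...) has produced one 256-dimensional
// feature vector per timestep for a batch of 32 sequences, i.e. a tensor of
// shape [seqLen = 10, batch = 32, features = 256].
var last = new SequenceLastLayer<double>(featureSize: 256);

// last.Forward(recurrentOutputs) would then return a [32, 256] tensor:
// the final timestep only, ready for a dense classification head.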
Constructors
SequenceLastLayer(int)
Initializes a new SequenceLastLayer.
public SequenceLastLayer(int featureSize)
Parameters
featureSize (int): The size of the feature dimension (the last dimension of the input).
Properties
ParameterCount
Gets the total number of parameters in this layer.
public override int ParameterCount { get; }
Property Value
- int
The total number of trainable parameters.
Remarks
This property returns the total number of trainable parameters in the layer. By default, it returns the length of the Parameters vector, but derived classes can override this to calculate the number of parameters differently.
For Beginners: This tells you how many learnable values the layer has.
The parameter count:
- Shows how complex the layer is
- Indicates how many values need to be learned during training
- Can help estimate memory usage and computational requirements
Layers with more parameters can potentially learn more complex patterns but may also require more data to train effectively.
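For this layer the count is always zero, which you can confirm directly (a small sketch, assuming the AiDotNet package is referenced):

using System;
using AiDotNet.NeuralNetworks.Layers;

var last = new SequenceLastLayer<double>(featureSize: 128);
Console.WriteLine(last.ParameterCount);   // 0: nothing to learn
Console.WriteLine(last.SupportsTraining); // false (see SupportsTraining below)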
SupportsGpuExecution
Indicates whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsGpuTraining
Gets whether this layer has full GPU training support (forward, backward, and parameter updates).
public override bool SupportsGpuTraining { get; }
Property Value
- bool
Remarks
This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:
- ForwardGpu is implemented
- BackwardGpu is implemented
- UpdateParametersGpu is implemented (for layers with trainable parameters)
- GPU weight/bias/gradient buffers are properly managed
For Beginners: This tells you if training can happen entirely on GPU.
GPU-resident training is much faster because:
- Data stays on GPU between forward and backward passes
- No expensive CPU-GPU transfers during each training step
- GPU kernels handle all gradient computation
Only layers that return true here can participate in fully GPU-resident training.
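As a sketch, a training loop might gate GPU-resident execution on this flag; holding the stack as LayerBase<T> references, as done here, is an assumption for illustration rather than a documented pattern:

using System.Collections.Generic;
using System.Linq;
using AiDotNet.NeuralNetworks.Layers;

// Hypothetical gating check: fall back to CPU-side training if any layer
// in the stack cannot complete its full training cycle on GPU.
var layers = new List<LayerBase<float>>
{
    new SequenceLastLayer<float>(featureSize: 64),
    // ... remaining layers of the model
};
bool gpuResident = layers.All(l => l.SupportsGpuTraining);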
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
false, because this layer has no trainable parameters.
Methods
Backward(Tensor<T>)
Backward pass: distributes gradient to the last timestep only.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient from the next layer.
Returns
- Tensor<T>
Gradient with shape matching the original input (zeros except at last timestep).
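The routing is easiest to see on raw arrays; the helper below is an illustration of the rule, not library code:

// Illustration only: the backward rule on plain arrays. The incoming
// gradient has shape [batch, features]; the result has the input's shape
// [seqLen, batch, features] and is zero everywhere except the last timestep.
static double[,,] ScatterToLastTimestep(double[,] outputGradient, int seqLen)
{
    int batch = outputGradient.GetLength(0);
    int features = outputGradient.GetLength(1);
    var inputGradient = new double[seqLen, batch, features]; // zero-filled
    for (int b = 0; b < batch; b++)
        for (int f = 0; f < features; f++)
            inputGradient[seqLen - 1, b, f] = outputGradient[b, f];
    return inputGradient;
}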
BackwardGpu(IGpuTensor<T>)
Performs the backward pass of the layer on GPU.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): The GPU-resident gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's input.
Remarks
This method performs the layer's backward computation entirely on GPU, including:
- Computing input gradients to pass to previous layers
- Computing and storing weight gradients on GPU (for layers with trainable parameters)
- Computing and storing bias gradients on GPU
For Beginners: This is like Backward() but runs entirely on GPU.
During GPU training:
- Output gradients come in (on GPU)
- Input gradients are computed (stay on GPU)
- Weight/bias gradients are computed and stored (on GPU)
- Input gradients are returned for the previous layer
All data stays on GPU - no CPU round-trips needed!
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU training.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the computation graph for this layer.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>)
Returns
- ComputationNode<T>
Forward(Tensor<T>)
Extracts the last timestep from the input sequence.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor of shape [seqLen, features] or [seqLen, batch, features].
Returns
- Tensor<T>
Output tensor of shape [features] or [batch, features].
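Conceptually, the batched case reduces to the following (an illustration on raw arrays, not library code):

// Illustration only: the forward rule on plain arrays.
// input[t, b, f] for t in [0, seqLen) -> output[b, f] taken at t = seqLen - 1.
static double[,] TakeLastTimestep(double[,,] input)
{
    int seqLen = input.GetLength(0);
    int batch = input.GetLength(1);
    int features = input.GetLength(2);
    var output = new double[batch, features];
    for (int b = 0; b < batch; b++)
        for (int f = 0; f < features; f++)
            output[b, f] = input[seqLen - 1, b, f];
    return output;
}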
ForwardGpu(params IGpuTensor<T>[])
GPU-accelerated forward pass that extracts the last timestep from a sequence, using a zero-copy CreateView to take the last slice directly on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensors.
Returns
- IGpuTensor<T>
GPU-resident output tensor containing the last timestep.
GetParameters()
Returns an empty vector since this layer has no trainable parameters.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector.
ResetState()
Reset state is a no-op since this layer maintains no state between forward passes.
public override void ResetState()
UpdateParameters(T)
Update parameters is a no-op since this layer has no trainable parameters.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate (unused, since this layer has no parameters to update).