Class SequenceLastLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

A layer that extracts the last timestep from a sequence.

public class SequenceLastLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
LayerBase<T> → SequenceLastLayer<T>

Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Remarks

This layer is used after recurrent layers (RNN, LSTM, GRU) when the task requires a single output from the entire sequence, such as sequence classification.

For Beginners: When processing sequences (like sentences or time series), recurrent layers output a value for each timestep. For tasks like classification, we often only need the final output (after seeing the whole sequence). This layer extracts just that last output.
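The effect of the layer can be illustrated with plain arrays (a conceptual sketch of the semantics only, not the AiDotNet API; `LastTimestep` is a hypothetical helper):

```csharp
using System;

public static class SequenceLastDemo
{
    // Extracts the final row of a [seqLen, features] sequence,
    // mirroring what SequenceLastLayer<T>.Forward does conceptually.
    // Example: { {1,2}, {3,4}, {5,6} } -> { 5, 6 }
    public static double[] LastTimestep(double[,] sequence)
    {
        int seqLen = sequence.GetLength(0);
        int features = sequence.GetLength(1);
        var last = new double[features];
        for (int f = 0; f < features; f++)
            last[f] = sequence[seqLen - 1, f];
        return last;
    }
}
```

The sequence length does not change the output shape, only which slice is kept; the output always has the feature dimension alone.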

Constructors

SequenceLastLayer(int)

Initializes a new SequenceLastLayer.

public SequenceLastLayer(int featureSize)

Parameters

featureSize int

The size of the feature dimension (last dimension of input).

Properties

ParameterCount

Gets the total number of parameters in this layer.

public override int ParameterCount { get; }

Property Value

int

The total number of trainable parameters.

Remarks

This property returns the total number of trainable parameters in the layer. By default, it returns the length of the Parameters vector, but derived classes can override this to calculate the number of parameters differently.

For Beginners: This tells you how many learnable values the layer has.

The parameter count:

  • Shows how complex the layer is
  • Indicates how many values need to be learned during training
  • Can help estimate memory usage and computational requirements

Layers with more parameters can potentially learn more complex patterns but may also require more data to train effectively.

SupportsGpuExecution

Indicates whether this layer supports GPU execution.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsGpuTraining

Gets whether this layer has full GPU training support (forward, backward, and parameter updates).

public override bool SupportsGpuTraining { get; }

Property Value

bool

Remarks

This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:

  • ForwardGpu is implemented
  • BackwardGpu is implemented
  • UpdateParametersGpu is implemented (for layers with trainable parameters)
  • GPU weight/bias/gradient buffers are properly managed

For Beginners: This tells you if training can happen entirely on GPU.

GPU-resident training is much faster because:

  • Data stays on GPU between forward and backward passes
  • No expensive CPU-GPU transfers during each training step
  • GPU kernels handle all gradient computation

Only layers that return true here can participate in fully GPU-resident training.

SupportsJitCompilation

Gets a value indicating whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

false because this layer has no trainable parameters.

Methods

Backward(Tensor<T>)

Backward pass: distributes gradient to the last timestep only.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

Gradient from the next layer.

Returns

Tensor<T>

Gradient with shape matching the original input (zeros except at last timestep).
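The gradient routing described above can be sketched with plain arrays (conceptual only; `ScatterToLastTimestep` is a hypothetical helper, not part of AiDotNet):

```csharp
using System;

public static class SequenceLastBackwardDemo
{
    // Builds an input-shaped gradient that is zero everywhere except
    // at the last timestep, where the incoming gradient is placed.
    public static double[,] ScatterToLastTimestep(double[] outputGradient, int seqLen)
    {
        int features = outputGradient.Length;
        var inputGradient = new double[seqLen, features]; // zero-initialized by default
        for (int f = 0; f < features; f++)
            inputGradient[seqLen - 1, f] = outputGradient[f];
        return inputGradient;
    }
}
```

Because the forward pass simply selects one timestep, every other timestep contributed nothing to the output and therefore receives a zero gradient.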

BackwardGpu(IGpuTensor<T>)

Performs the backward pass of the layer on GPU.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The GPU-resident gradient of the loss with respect to the layer's output.

Returns

IGpuTensor<T>

The GPU-resident gradient of the loss with respect to the layer's input.

Remarks

This method performs the layer's backward computation entirely on GPU, including:

  • Computing input gradients to pass to previous layers
  • Computing and storing weight gradients on GPU (for layers with trainable parameters)
  • Computing and storing bias gradients on GPU

For Beginners: This is like Backward() but runs entirely on GPU.

During GPU training:

  1. Output gradients come in (on GPU)
  2. Input gradients are computed (stay on GPU)
  3. Weight/bias gradients are computed and stored (on GPU)
  4. Input gradients are returned for the previous layer

All data stays on the GPU, with no CPU round-trips needed.

Exceptions

NotSupportedException

Thrown when the layer does not support GPU training.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the computation graph for this layer.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

The input nodes to build this layer's computation graph from.

Returns

ComputationNode<T>

The root node of the exported computation graph.

Forward(Tensor<T>)

Extracts the last timestep from the input sequence.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

Input tensor of shape [seqLen, features] or [seqLen, batch, features].

Returns

Tensor<T>

Output tensor of shape [features] or [batch, features].
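For the batched 3-D case, the same selection applies along the first axis (a plain-array sketch of the shape transformation, not the AiDotNet API; `LastTimestepBatched` is hypothetical):

```csharp
using System;

public static class SequenceLastBatchedDemo
{
    // [seqLen, batch, features] -> [batch, features]:
    // keeps only the slice at timestep index seqLen - 1.
    public static double[,] LastTimestepBatched(double[,,] sequence)
    {
        int seqLen = sequence.GetLength(0);
        int batch = sequence.GetLength(1);
        int features = sequence.GetLength(2);
        var last = new double[batch, features];
        for (int b = 0; b < batch; b++)
            for (int f = 0; f < features; f++)
                last[b, f] = sequence[seqLen - 1, b, f];
        return last;
    }
}
```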

ForwardGpu(params IGpuTensor<T>[])

GPU-accelerated forward pass that extracts the last timestep from a sequence. Uses zero-copy CreateView to extract the last slice directly on GPU.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

GPU-resident input tensors.

Returns

IGpuTensor<T>

GPU-resident output tensor containing the last timestep.

GetParameters()

Returns an empty vector since this layer has no trainable parameters.

public override Vector<T> GetParameters()

Returns

Vector<T>

An empty vector, since this layer has no trainable parameters.

ResetState()

Reset state is a no-op since this layer maintains no state between forward passes.

public override void ResetState()

UpdateParameters(T)

Update parameters is a no-op since this layer has no trainable parameters.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T