Class SequenceLastLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
A layer that extracts the last timestep from a sequence.
public class SequenceLastLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
Inheritance
LayerBase<T> → SequenceLastLayer<T>
Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
This layer is used after recurrent layers (RNN, LSTM, GRU) when the task requires a single output from the entire sequence, such as sequence classification.
For Beginners: When processing sequences (like sentences or time series), recurrent layers output a value for each timestep. For tasks like classification, we often only need the final output (after seeing the whole sequence). This layer extracts just that last output.
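For example, here is a minimal sketch of where the layer sits in a classification model; only the SequenceLastLayer<T> API is taken from this page, and the surrounding shapes and the recurrent stage are illustrative assumptions:

using AiDotNet.NeuralNetworks.Layers;

// Suppose a recurrent layer (LSTM, GRU, ...) has produced one 256-dimensional
// feature vector per timestep for a batch of 32 sequences, i.e. a tensor of
// shape [seqLen = 10, batch = 32, features = 256].
var last = new SequenceLastLayer<double>(featureSize: 256);

// last.Forward(recurrentOutputs) would then return a [32, 256] tensor:
// the final timestep only, ready for a dense classification head.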
Constructors
SequenceLastLayer(int)
Initializes a new SequenceLastLayer.
public SequenceLastLayer(int featureSize)
Parameters
featureSize (int): The size of the feature dimension (the last dimension of the input).
Properties
ParameterCount
Gets the total number of parameters in this layer.
public override int ParameterCount { get; }
Property Value
- int
The total number of trainable parameters.
Remarks
This property returns the total number of trainable parameters in the layer. By default, it returns the length of the Parameters vector, but derived classes can override this to calculate the number of parameters differently.
For Beginners: This tells you how many learnable values the layer has.
The parameter count:
- Shows how complex the layer is
- Indicates how many values need to be learned during training
- Can help estimate memory usage and computational requirements
Layers with more parameters can potentially learn more complex patterns but may also require more data to train effectively.
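For this layer the count is always zero, which you can confirm directly (a small sketch, assuming the AiDotNet package is referenced):

using System;
using AiDotNet.NeuralNetworks.Layers;

var last = new SequenceLastLayer<double>(featureSize: 128);
Console.WriteLine(last.ParameterCount);   // 0: nothing to learn
Console.WriteLine(last.SupportsTraining); // false (see SupportsTraining below)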
SupportsGpuExecution
Indicates whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsGpuTraining
Gets whether this layer has full GPU training support (forward, backward, and parameter updates).
public override bool SupportsGpuTraining { get; }
Property Value
- bool
Remarks
This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:
- ForwardGpu is implemented
- BackwardGpu is implemented
- UpdateParametersGpu is implemented (for layers with trainable parameters)
- GPU weight/bias/gradient buffers are properly managed
For Beginners: This tells you if training can happen entirely on GPU.
GPU-resident training is much faster because:
- Data stays on GPU between forward and backward passes
- No expensive CPU-GPU transfers during each training step
- GPU kernels handle all gradient computation
Only layers that return true here can participate in fully GPU-resident training.
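As a sketch, a training loop might gate GPU-resident execution on this flag; holding the stack as LayerBase<T> references, as done here, is an assumption for illustration rather than a documented pattern:

using System.Collections.Generic;
using System.Linq;
using AiDotNet.NeuralNetworks.Layers;

// Hypothetical gating check: fall back to CPU-side training if any layer
// in the stack cannot complete its full training cycle on GPU.
var layers = new List<LayerBase<float>>
{
    new SequenceLastLayer<float>(featureSize: 64),
    // ... remaining layers of the model
};
bool gpuResident = layers.All(l => l.SupportsGpuTraining);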
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
false, because this layer has no trainable parameters.
Methods
Backward(Tensor<T>)
Backward pass: distributes gradient to the last timestep only.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient from the next layer.
Returns
- Tensor<T>
Gradient with shape matching the original input (zeros except at last timestep).
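The routing is easiest to see on raw arrays; the helper below is an illustration of the rule, not library code:

// Illustration only: the backward rule on plain arrays. The incoming
// gradient has shape [batch, features]; the result has the input's shape
// [seqLen, batch, features] and is zero everywhere except the last timestep.
static double[,,] ScatterToLastTimestep(double[,] outputGradient, int seqLen)
{
    int batch = outputGradient.GetLength(0);
    int features = outputGradient.GetLength(1);
    var inputGradient = new double[seqLen, batch, features]; // zero-filled
    for (int b = 0; b < batch; b++)
        for (int f = 0; f < features; f++)
            inputGradient[seqLen - 1, b, f] = outputGradient[b, f];
    return inputGradient;
}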
BackwardGpu(IGpuTensor<T>)
Performs the backward pass of the layer on GPU.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): The GPU-resident gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's input.
Remarks
This method performs the layer's backward computation entirely on GPU, including:
- Computing input gradients to pass to previous layers
- Computing and storing weight gradients on GPU (for layers with trainable parameters)
- Computing and storing bias gradients on GPU
For Beginners: This is like Backward() but runs entirely on GPU.
During GPU training:
- Output gradients come in (on GPU)
- Input gradients are computed (stay on GPU)
- Weight/bias gradients are computed and stored (on GPU)
- Input gradients are returned for the previous layer
All data stays on GPU - no CPU round-trips needed!
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU training.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the computation graph for this layer.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>)
Returns
- ComputationNode<T>
Forward(Tensor<T>)
Extracts the last timestep from the input sequence.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor of shape [seqLen, features] or [seqLen, batch, features].
Returns
- Tensor<T>
Output tensor of shape [features] or [batch, features].
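Conceptually, the batched case reduces to the following (an illustration on raw arrays, not library code):

// Illustration only: the forward rule on plain arrays.
// input[t, b, f] for t in [0, seqLen) -> output[b, f] taken at t = seqLen - 1.
static double[,] TakeLastTimestep(double[,,] input)
{
    int seqLen = input.GetLength(0);
    int batch = input.GetLength(1);
    int features = input.GetLength(2);
    var output = new double[batch, features];
    for (int b = 0; b < batch; b++)
        for (int f = 0; f < features; f++)
            output[b, f] = input[seqLen - 1, b, f];
    return output;
}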
ForwardGpu(params IGpuTensor<T>[])
GPU-accelerated forward pass that extracts the last timestep from a sequence, using a zero-copy CreateView to take the last slice directly on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensors.
Returns
- IGpuTensor<T>
GPU-resident output tensor containing the last timestep.
GetParameters()
Returns an empty vector since this layer has no trainable parameters.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector.
ResetState()
Reset state is a no-op since this layer maintains no state between forward passes.
public override void ResetState()
UpdateParameters(T)
Update parameters is a no-op since this layer has no trainable parameters.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate (unused, since this layer has no parameters to update).