Class ReadoutLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a readout layer that performs the final mapping from features to output in a neural network.
public class ReadoutLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → ReadoutLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
The ReadoutLayer is typically used as the final layer in a neural network to transform features extracted by previous layers into the desired output format. It applies a linear transformation (weights and bias) followed by an activation function. This layer is similar to a dense or fully connected layer but is specifically designed for outputting the final results.
For Beginners: This layer serves as the final "decision maker" in a neural network.
Think of the ReadoutLayer as a panel of judges in a competition:
- Each judge (output neuron) receives information from all contestants (input features)
- Each judge has their own preferences (weights) for different skills
- Judges combine all this information to make their final scores (outputs)
- An activation function then shapes these scores into the desired format
For example, in an image classification network:
- Previous layers extract features like edges, shapes, and patterns
- The ReadoutLayer takes all these features and combines them into class scores
- If there are 10 possible classes, the ReadoutLayer might have 10 outputs
- Each output represents the network's confidence that the image belongs to that class
This layer learns which features are most important for each output category during training.
Constructors
ReadoutLayer(int, int, IActivationFunction<T>)
Initializes a new instance of the ReadoutLayer<T> class with a scalar activation function.
public ReadoutLayer(int inputSize, int outputSize, IActivationFunction<T> scalarActivation)
Parameters
inputSize (int): The size of the input to the layer.
outputSize (int): The size of the output from the layer.
scalarActivation (IActivationFunction<T>): The activation function to apply to individual elements of the output.
Remarks
This constructor creates a new ReadoutLayer with the specified dimensions and a scalar activation function. The weights are initialized with small random values, and the biases are initialized to zero. A scalar activation function is applied element-wise to each output neuron independently.
For Beginners: This creates a new readout layer for your neural network using a simple activation function.
When you create this layer, you specify:
- inputSize: How many features come into the layer
- outputSize: How many outputs you want from the layer
- scalarActivation: How to transform each output (e.g., sigmoid, ReLU, tanh)
A scalar activation means each output is calculated independently. For example, for 10 independent yes/no outputs you might use inputSize=512 (512 features from previous layers), outputSize=10 (one per output), and a sigmoid activation. Note that softmax couples the outputs together, so for mutually exclusive class probabilities use the vector-activation constructor described below instead.
The layer starts with small random weights and zero biases that will be refined during training.
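A minimal construction sketch follows. SigmoidActivation<double> is an assumed placeholder, not a type taken from this page; substitute whatever IActivationFunction<T> implementation your build of AiDotNet provides.
using AiDotNet.NeuralNetworks.Layers;

// Map 512 extracted features to 10 independent scores.
// SigmoidActivation<double> is a placeholder; substitute any
// IActivationFunction<double> implementation available in your build.
var readout = new ReadoutLayer<double>(
    inputSize: 512,
    outputSize: 10,
    scalarActivation: new SigmoidActivation<double>());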
ReadoutLayer(int, int, IVectorActivationFunction<T>)
Initializes a new instance of the ReadoutLayer<T> class with a vector activation function.
public ReadoutLayer(int inputSize, int outputSize, IVectorActivationFunction<T> vectorActivation)
Parameters
inputSize (int): The size of the input to the layer.
outputSize (int): The size of the output from the layer.
vectorActivation (IVectorActivationFunction<T>): The activation function to apply to the entire output vector.
Remarks
This constructor creates a new ReadoutLayer with the specified dimensions and a vector activation function. The weights are initialized with small random values, and the biases are initialized to zero. A vector activation function is applied to the entire output vector at once, which allows for interactions between different output neurons.
For Beginners: This creates a new readout layer for your neural network using an advanced activation function.
When you create this layer, you specify:
- inputSize: How many features come into the layer
- outputSize: How many outputs you want from the layer
- vectorActivation: How to transform the entire output as a group
A vector activation means all outputs are calculated together, which can capture relationships between outputs. For example, softmax is a vector activation that ensures all outputs sum to 1, making them behave like probabilities.
This is particularly useful for:
- Multi-class classification (using softmax activation)
- Problems where outputs should be interdependent
- Cases where you need to enforce specific constraints across all outputs
The layer starts with small random weights and zero biases that will be refined during training.
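A matching sketch for the vector overload. SoftmaxActivation<double> is an assumed placeholder for an IVectorActivationFunction<T> implementation; substitute the softmax type your build exposes.
using AiDotNet.NeuralNetworks.Layers;

// Map 512 features to a 10-way probability distribution.
// SoftmaxActivation<double> is a placeholder for an
// IVectorActivationFunction<double> implementation in your build.
var classifier = new ReadoutLayer<double>(
    inputSize: 512,
    outputSize: 10,
    vectorActivation: new SoftmaxActivation<double>());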
Properties
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always true for ReadoutLayer, indicating that the layer can be trained through backpropagation.
Remarks
This property indicates that the ReadoutLayer has trainable parameters (weights and biases) that can be optimized during the training process using backpropagation. The gradients of these parameters are calculated during the backward pass and used to update the parameters.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has values (weights and biases) that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process of the neural network
When you train a neural network containing this layer, the weights and biases will automatically adjust to better recognize patterns specific to your data.
Methods
Backward(Tensor<T>)
Performs the backward pass of the readout layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the readout layer, which is used during training to propagate error gradients back through the network. It calculates the gradients of the loss with respect to the weights and biases (to update the layer's parameters) and with respect to the input (to propagate back to previous layers). The method handles both scalar and vector activation functions.
For Beginners: This method is used during training to calculate how the layer should change to reduce errors.
During the backward pass:
- The error gradient from the loss function or next layer is received
- This gradient is adjusted based on the activation function used
- The layer calculates how each weight and bias should change to reduce the error
- The layer calculates how the previous layer's output should change
This is like giving feedback to improve performance:
- "This feature was too important in your decision-making" (weight too high)
- "You're not paying enough attention to this feature" (weight too low)
- "You're consistently scoring too high/low" (bias adjustment needed)
These calculations are at the heart of how neural networks learn from their mistakes.
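A hypothetical training-step fragment showing where Backward fits; input, lossGradient, and the 0.01 learning rate are assumed stand-ins, not part of this page:
// Forward must run first so the layer caches its input.
var output = readout.Forward(input);
// lossGradient is dLoss/dOutput, produced by your loss function.
var inputGradient = readout.Backward(lossGradient);
// inputGradient flows to the previous layer; then apply the updates.
readout.UpdateParameters(0.01);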
Exceptions
- InvalidOperationException
Thrown when backward is called before forward.
BackwardGpu(IGpuTensor<T>)
Performs GPU-accelerated backward pass for the readout layer.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): GPU tensor containing the gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
GPU tensor containing gradient with respect to input.
Remarks
Computes gradients for weights, biases, and input on GPU:
- Weight gradient: activationGrad.T @ input
- Bias gradient: sum(activationGrad, axis=0)
- Input gradient: activationGrad @ weights
For a batch of size B, activationGrad has shape [B, outputSize] and input has shape [B, inputSize], so the weight gradient has shape [outputSize, inputSize], the bias gradient has length outputSize, and the input gradient has shape [B, inputSize].
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
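A sketch of how a caller might combine these members; the compile step itself is left as a comment because its API is not documented on this page:
using System.Collections.Generic;

var inputNodes = new List<ComputationNode<double>>();
if (readout.SupportsJitCompilation)
{
    // Build the graph once, then hand it to the JIT compiler for inference.
    ComputationNode<double> outputNode = readout.ExportComputationGraph(inputNodes);
    // ... pass outputNode and inputNodes to the JIT compiler here.
}
else
{
    // Fall back to the standard Forward() path.
}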
Forward(Tensor<T>)
Performs the forward pass of the readout layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after readout processing.
Remarks
This method implements the forward pass of the readout layer. It converts the input tensor to a vector, applies a linear transformation (weights and bias), and then applies the activation function. The input is cached for use during the backward pass. The method handles both scalar and vector activation functions.
For Beginners: This method processes your data through the readout layer.
During the forward pass:
- Your input data is flattened into a simple list of numbers
- Each output neuron calculates a weighted sum of all inputs plus its bias
- The activation function transforms these sums into the final outputs
The formula for each output is basically: output = activation(weights × inputs + bias)
This is similar to how a teacher might grade an exam:
- Different questions have different weights (more important questions get more points)
- There might be a curve applied to the final scores (activation function)
The layer saves the input for later use during training.
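The same computation written out with plain arrays, purely as an illustration of the formula above; the layer itself operates on Tensor<T>:
using System;

// output[j] = f(sum_i W[j, i] * x[i] + b[j]) for each output neuron j.
static double[] ReadoutForward(double[] x, double[,] W, double[] b, Func<double, double> f)
{
    var y = new double[b.Length];
    for (int j = 0; j < b.Length; j++)
    {
        double sum = b[j];             // start from the bias
        for (int i = 0; i < x.Length; i++)
            sum += W[j, i] * x[i];     // weighted sum of all inputs
        y[j] = f(sum);                 // scalar activation, element-wise
    }
    return y;
}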
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass on GPU using FusedLinearGpu. Supports both scalar and vector (softmax) activations.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The GPU input tensors.
Returns
- IGpuTensor<T>
The GPU output tensor.
GetParameters()
Gets all trainable parameters of the readout layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters (weights and biases).
Remarks
This method retrieves all trainable parameters (weights and biases) of the readout layer as a single vector. The weights are stored first, followed by the biases. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from the readout layer.
The parameters:
- Are the weights and biases that the readout layer learns during training
- Control how the layer processes information
- Are returned as a single list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
The weights are stored first in the vector, followed by all the bias values.
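A small sketch of the documented layout, assuming the readout layer from the constructor examples (inputSize=512, outputSize=10):
// Snapshot all trainable values:
// 512 * 10 weights followed by 10 biases, 5130 values in total.
Vector<double> snapshot = readout.GetParameters();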
ResetState()
Resets the internal state of the readout layer.
public override void ResetState()
Remarks
This method resets the internal state of the readout layer, including the cached input from the forward pass and the gradients from the backward pass. This is useful when starting to process a new sequence or batch of data.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- Stored input from previous calculations is cleared
- Calculated gradients are reset to zero
- The layer forgets any information from previous batches
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Starting a new training episode
The weights and biases (the learned parameters) are not reset, only the temporary state information.
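For example, between two unrelated batches:
// Clears the cached input and gradients; learned weights and biases stay intact.
readout.ResetState();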
SetParameters(Vector<T>)
Sets the trainable parameters of the readout layer.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters (weights and biases) to set.
Remarks
This method sets the trainable parameters (weights and biases) of the readout layer from a single vector. The vector should contain the weight values first, followed by the bias values. This is useful for loading saved model weights or for implementing optimization algorithms that operate on all parameters at once.
For Beginners: This method updates all the weights and biases in the readout layer.
When setting parameters:
- The input must be a vector with the correct total length
- The first part of the vector is used for the weights
- The second part of the vector is used for the biases
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Testing different parameter values
An error is thrown if the input vector doesn't have the expected number of parameters.
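A hedged round-trip sketch using the snapshot captured in the GetParameters example above:
// The vector must hold exactly inputSize * outputSize weights followed by
// outputSize biases (512 * 10 + 10 = 5130 here), or ArgumentException is thrown.
readout.SetParameters(snapshot);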
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
UpdateParameters(T)
Updates the parameters of the readout layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method updates the weights and biases of the readout layer based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates. This method should be called after the backward pass to apply the calculated updates.
For Beginners: This method updates the layer's internal values during training.
When updating parameters:
- The weight values are adjusted based on their gradients
- The bias values are adjusted based on their gradients
- The learning rate controls how big each update step is
This is like making small adjustments based on feedback:
- Weights that contributed to errors are reduced
- Weights that would have helped are increased
- The learning rate determines how quickly the model adapts
Smaller learning rates mean slower but more stable learning, while larger learning rates mean faster but potentially unstable learning.
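A minimal sketch of one update step, assuming Backward has already run for the current batch; the 0.01 learning rate is illustrative:
// Gradient-descent style update, roughly: parameter <- parameter - learningRate * gradient.
// Call after Backward(), which computes the gradients this step consumes.
readout.UpdateParameters(0.01);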