Class ReadoutLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a readout layer that performs the final mapping from features to output in a neural network.
public class ReadoutLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → ReadoutLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
The ReadoutLayer is typically used as the final layer in a neural network to transform features extracted by previous layers into the desired output format. It applies a linear transformation (weights and bias) followed by an activation function. This layer is similar to a dense or fully connected layer but is specifically designed for outputting the final results.
For Beginners: This layer serves as the final "decision maker" in a neural network.
Think of the ReadoutLayer as a panel of judges in a competition:
- Each judge (output neuron) receives information from all contestants (input features)
- Each judge has their own preferences (weights) for different skills
- Judges combine all this information to make their final scores (outputs)
- An activation function then shapes these scores into the desired format
For example, in an image classification network:
- Previous layers extract features like edges, shapes, and patterns
- The ReadoutLayer takes all these features and combines them into class scores
- If there are 10 possible classes, the ReadoutLayer might have 10 outputs
- Each output represents the network's confidence that the image belongs to that class
This layer learns which features are most important for each output category during training.
Constructors
ReadoutLayer(int, int, IActivationFunction<T>)
Initializes a new instance of the ReadoutLayer<T> class with a scalar activation function.
public ReadoutLayer(int inputSize, int outputSize, IActivationFunction<T> scalarActivation)
Parameters
inputSize (int): The size of the input to the layer.
outputSize (int): The size of the output from the layer.
scalarActivation (IActivationFunction<T>): The activation function to apply to individual elements of the output.
Remarks
This constructor creates a new ReadoutLayer with the specified dimensions and a scalar activation function. The weights are initialized with small random values, and the biases are initialized to zero. A scalar activation function is applied element-wise to each output neuron independently.
For Beginners: This creates a new readout layer for your neural network using a simple activation function.
When you create this layer, you specify:
- inputSize: How many features come into the layer
- outputSize: How many outputs you want from the layer
- scalarActivation: How to transform each output (e.g., sigmoid, ReLU, tanh)
A scalar activation means each output is calculated independently. For example, for 10 independent yes/no outputs you might use inputSize=512 (512 features from previous layers), outputSize=10 (one per output), and a sigmoid activation. Note that softmax couples the outputs together, so for mutually exclusive class probabilities use the vector-activation constructor described below instead.
The layer starts with small random weights and zero biases that will be refined during training.
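A minimal construction sketch follows. SigmoidActivation<double> is an assumed placeholder, not a type taken from this page; substitute whatever IActivationFunction<T> implementation your build of AiDotNet provides.
using AiDotNet.NeuralNetworks.Layers;

// Map 512 extracted features to 10 independent scores.
// SigmoidActivation<double> is a placeholder; substitute any
// IActivationFunction<double> implementation available in your build.
var readout = new ReadoutLayer<double>(
    inputSize: 512,
    outputSize: 10,
    scalarActivation: new SigmoidActivation<double>());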
ReadoutLayer(int, int, IVectorActivationFunction<T>)
Initializes a new instance of the ReadoutLayer<T> class with a vector activation function.
public ReadoutLayer(int inputSize, int outputSize, IVectorActivationFunction<T> vectorActivation)
Parameters
inputSize (int): The size of the input to the layer.
outputSize (int): The size of the output from the layer.
vectorActivation (IVectorActivationFunction<T>): The activation function to apply to the entire output vector.
Remarks
This constructor creates a new ReadoutLayer with the specified dimensions and a vector activation function. The weights are initialized with small random values, and the biases are initialized to zero. A vector activation function is applied to the entire output vector at once, which allows for interactions between different output neurons.
For Beginners: This creates a new readout layer for your neural network using an advanced activation function.
When you create this layer, you specify:
- inputSize: How many features come into the layer
- outputSize: How many outputs you want from the layer
- vectorActivation: How to transform the entire output as a group
A vector activation means all outputs are calculated together, which can capture relationships between outputs. For example, softmax is a vector activation that ensures all outputs sum to 1, making them behave like probabilities.
This is particularly useful for:
- Multi-class classification (using softmax activation)
- Problems where outputs should be interdependent
- Cases where you need to enforce specific constraints across all outputs
The layer starts with small random weights and zero biases that will be refined during training.
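A matching sketch for the vector overload. SoftmaxActivation<double> is an assumed placeholder for an IVectorActivationFunction<T> implementation; substitute the softmax type your build exposes.
using AiDotNet.NeuralNetworks.Layers;

// Map 512 features to a 10-way probability distribution.
// SoftmaxActivation<double> is a placeholder for an
// IVectorActivationFunction<double> implementation in your build.
var classifier = new ReadoutLayer<double>(
    inputSize: 512,
    outputSize: 10,
    vectorActivation: new SoftmaxActivation<double>());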
Properties
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always true for ReadoutLayer, indicating that the layer can be trained through backpropagation.
Remarks
This property indicates that the ReadoutLayer has trainable parameters (weights and biases) that can be optimized during the training process using backpropagation. The gradients of these parameters are calculated during the backward pass and used to update the parameters.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has values (weights and biases) that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process of the neural network
When you train a neural network containing this layer, the weights and biases will automatically adjust to better recognize patterns specific to your data.
Methods
Backward(Tensor<T>)
Performs the backward pass of the readout layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the readout layer, which is used during training to propagate error gradients back through the network. It calculates the gradients of the loss with respect to the weights and biases (to update the layer's parameters) and with respect to the input (to propagate back to previous layers). The method handles both scalar and vector activation functions.
For Beginners: This method is used during training to calculate how the layer should change to reduce errors.
During the backward pass:
- The error gradient from the loss function or next layer is received
- This gradient is adjusted based on the activation function used
- The layer calculates how each weight and bias should change to reduce the error
- The layer calculates how the previous layer's output should change
This is like giving feedback to improve performance:
- "This feature was too important in your decision-making" (weight too high)
- "You're not paying enough attention to this feature" (weight too low)
- "You're consistently scoring too high/low" (bias adjustment needed)
These calculations are at the heart of how neural networks learn from their mistakes.
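A hypothetical training-step fragment showing where Backward fits; input, lossGradient, and the 0.01 learning rate are assumed stand-ins, not part of this page:
// Forward must run first so the layer caches its input.
var output = readout.Forward(input);
// lossGradient is dLoss/dOutput, produced by your loss function.
var inputGradient = readout.Backward(lossGradient);
// inputGradient flows to the previous layer; then apply the updates.
readout.UpdateParameters(0.01);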
Exceptions
- InvalidOperationException
Thrown when backward is called before forward.
BackwardGpu(IGpuTensor<T>)
Performs GPU-accelerated backward pass for the readout layer.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): GPU tensor containing the gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
GPU tensor containing gradient with respect to input.
Remarks
Computes gradients for weights, biases, and input on GPU:
- Weight gradient: activationGrad.T @ input
- Bias gradient: sum(activationGrad, axis=0)
- Input gradient: activationGrad @ weights
For a batch of size B, activationGrad has shape [B, outputSize] and input has shape [B, inputSize], so the weight gradient has shape [outputSize, inputSize], the bias gradient has length outputSize, and the input gradient has shape [B, inputSize].
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
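A sketch of how a caller might combine these members; the compile step itself is left as a comment because its API is not documented on this page:
using System.Collections.Generic;

var inputNodes = new List<ComputationNode<double>>();
if (readout.SupportsJitCompilation)
{
    // Build the graph once, then hand it to the JIT compiler for inference.
    ComputationNode<double> outputNode = readout.ExportComputationGraph(inputNodes);
    // ... pass outputNode and inputNodes to the JIT compiler here.
}
else
{
    // Fall back to the standard Forward() path.
}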
Forward(Tensor<T>)
Performs the forward pass of the readout layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after readout processing.
Remarks
This method implements the forward pass of the readout layer. It converts the input tensor to a vector, applies a linear transformation (weights and bias), and then applies the activation function. The input is cached for use during the backward pass. The method handles both scalar and vector activation functions.
For Beginners: This method processes your data through the readout layer.
During the forward pass:
- Your input data is flattened into a simple list of numbers
- Each output neuron calculates a weighted sum of all inputs plus its bias
- The activation function transforms these sums into the final outputs
The formula for each output is basically: output = activation(weights × inputs + bias)
This is similar to how a teacher might grade an exam:
- Different questions have different weights (more important questions get more points)
- There might be a curve applied to the final scores (activation function)
The layer saves the input for later use during training.
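The same computation written out with plain arrays, purely as an illustration of the formula above; the layer itself operates on Tensor<T>:
using System;

// output[j] = f(sum_i W[j, i] * x[i] + b[j]) for each output neuron j.
static double[] ReadoutForward(double[] x, double[,] W, double[] b, Func<double, double> f)
{
    var y = new double[b.Length];
    for (int j = 0; j < b.Length; j++)
    {
        double sum = b[j];             // start from the bias
        for (int i = 0; i < x.Length; i++)
            sum += W[j, i] * x[i];     // weighted sum of all inputs
        y[j] = f(sum);                 // scalar activation, element-wise
    }
    return y;
}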
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass on GPU using FusedLinearGpu. Supports both scalar and vector (softmax) activations.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The GPU input tensors.
Returns
- IGpuTensor<T>
The GPU output tensor.
GetParameters()
Gets all trainable parameters of the readout layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters (weights and biases).
Remarks
This method retrieves all trainable parameters (weights and biases) of the readout layer as a single vector. The weights are stored first, followed by the biases. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from the readout layer.
The parameters:
- Are the weights and biases that the readout layer learns during training
- Control how the layer processes information
- Are returned as a single list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
The weights are stored first in the vector, followed by all the bias values.
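A small sketch of the documented layout, assuming the readout layer from the constructor examples (inputSize=512, outputSize=10):
// Snapshot all trainable values:
// 512 * 10 weights followed by 10 biases, 5130 values in total.
Vector<double> snapshot = readout.GetParameters();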
ResetState()
Resets the internal state of the readout layer.
public override void ResetState()
Remarks
This method resets the internal state of the readout layer, including the cached input from the forward pass and the gradients from the backward pass. This is useful when starting to process a new sequence or batch of data.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- Stored input from previous calculations is cleared
- Calculated gradients are reset to zero
- The layer forgets any information from previous batches
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Starting a new training episode
The weights and biases (the learned parameters) are not reset, only the temporary state information.
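For example, between two unrelated batches:
// Clears the cached input and gradients; learned weights and biases stay intact.
readout.ResetState();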
SetParameters(Vector<T>)
Sets the trainable parameters of the readout layer.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters (weights and biases) to set.
Remarks
This method sets the trainable parameters (weights and biases) of the readout layer from a single vector. The vector should contain the weight values first, followed by the bias values. This is useful for loading saved model weights or for implementing optimization algorithms that operate on all parameters at once.
For Beginners: This method updates all the weights and biases in the readout layer.
When setting parameters:
- The input must be a vector with the correct total length
- The first part of the vector is used for the weights
- The second part of the vector is used for the biases
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Testing different parameter values
An error is thrown if the input vector doesn't have the expected number of parameters.
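A hedged round-trip sketch using the snapshot captured in the GetParameters example above:
// The vector must hold exactly inputSize * outputSize weights followed by
// outputSize biases (512 * 10 + 10 = 5130 here), or ArgumentException is thrown.
readout.SetParameters(snapshot);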
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
UpdateParameters(T)
Updates the parameters of the readout layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method updates the weights and biases of the readout layer based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates. This method should be called after the backward pass to apply the calculated updates.
For Beginners: This method updates the layer's internal values during training.
When updating parameters:
- The weight values are adjusted based on their gradients
- The bias values are adjusted based on their gradients
- The learning rate controls how big each update step is
This is like making small adjustments based on feedback:
- Weights that contributed to errors are reduced
- Weights that would have helped are increased
- The learning rate determines how quickly the model adapts
Smaller learning rates mean slower but more stable learning, while larger learning rates mean faster but potentially unstable learning.
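A minimal sketch of one update step, assuming Backward has already run for the current batch; the 0.01 learning rate is illustrative:
// Gradient-descent style update, roughly: parameter <- parameter - learningRate * gradient.
// Call after Backward(), which computes the gradients this step consumes.
readout.UpdateParameters(0.01);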