Class BidirectionalLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a bidirectional layer that processes input sequences in both forward and backward directions.
public class BidirectionalLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → BidirectionalLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
A bidirectional layer processes input sequences in two directions: forward (from first to last) and backward (from last to first). This approach allows the layer to capture patterns that depend on both past and future context. The outputs from both directions can either be merged (typically added together) or kept separate, depending on the configuration.
For Beginners: This layer looks at input data in two ways at the same time - both forward and backward.
Think of it like reading a sentence:
- Forward reading: "The cat sat on the mat" (left to right)
- Backward reading: "mat the on sat cat The" (right to left)
By processing data in both directions:
- The layer can understand context from both past and future elements
- It can discover patterns that might be missed if only looking in one direction
- It often improves performance on sequence tasks like text processing
For example, in the sentence "The bank is by the river", the meaning of "bank" depends on both previous words ("The") and future words ("by the river"). A bidirectional layer helps capture these relationships.
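Example: a minimal usage sketch. The LSTMLayer<T> constructor and the data-loading helper below are assumptions for illustration; substitute the actual layer and tensor APIs from your AiDotNet version.
// Wrap an inner recurrent layer so the sequence is read in both directions.
LayerBase<float> inner = new LSTMLayer<float>(inputSize: 16, hiddenSize: 32); // hypothetical signature
var bidirectional = new BidirectionalLayer<float>(inner, mergeMode: true);
Tensor<float> sequence = LoadSequenceTensor();            // placeholder for your own data loading
Tensor<float> output = bidirectional.Forward(sequence);   // forward + backward passes, merged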
Constructors
BidirectionalLayer(LayerBase<T>, bool, IActivationFunction<T>?, IEngine?)
Initializes a new instance of the BidirectionalLayer<T> class with the specified inner layer and a ReLU activation function.
public BidirectionalLayer(LayerBase<T> innerLayer, bool mergeMode = true, IActivationFunction<T>? activationFunction = null, IEngine? engine = null)
Parameters
innerLayer LayerBase<T>: The layer to be used for both forward and backward processing.
mergeMode bool: If true, outputs from both directions are added; otherwise, they are kept separate.
activationFunction IActivationFunction<T>?: The activation function to apply after processing. Defaults to ReLU if not specified.
engine IEngine?: The computation engine (CPU or GPU) for vectorized operations.
Remarks
This constructor creates a bidirectional layer using the specified inner layer for both forward and backward processing. A copy of the inner layer is created for backward processing to ensure independent parameters. The mergeMode parameter determines how outputs from both directions are combined.
For Beginners: This constructor creates a new bidirectional layer with a standard activation function.
When you create a bidirectional layer this way:
- The same type of layer is used for both forward and backward processing
- The mergeMode parameter decides how to combine the results from both directions
- The ReLU activation function is used by default, which helps the network learn non-linear patterns
For example, if innerLayer is an LSTM layer, this creates a bidirectional LSTM that processes sequences in both directions.
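A construction sketch. ReLUActivation<T> and the LSTMLayer<T> signature are assumed names for illustration; check your version for the actual types.
var merged = new BidirectionalLayer<float>(
    new LSTMLayer<float>(inputSize: 16, hiddenSize: 32)); // merged outputs, default ReLU
var separate = new BidirectionalLayer<float>(
    new LSTMLayer<float>(inputSize: 16, hiddenSize: 32),
    mergeMode: false,                                     // keep the two directions separate
    activationFunction: new ReLUActivation<float>());     // assumed activation class name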
BidirectionalLayer(LayerBase<T>, bool, IVectorActivationFunction<T>?, IEngine?)
Initializes a new instance of the BidirectionalLayer<T> class with the specified inner layer and a vector activation function.
public BidirectionalLayer(LayerBase<T> innerLayer, bool mergeMode = true, IVectorActivationFunction<T>? vectorActivationFunction = null, IEngine? engine = null)
Parameters
innerLayer LayerBase<T>: The layer to be used for both forward and backward processing.
mergeMode bool: If true, outputs from both directions are added; otherwise, they are kept separate.
vectorActivationFunction IVectorActivationFunction<T>?: The vector activation function to apply after processing. Defaults to Identity if not specified.
engine IEngine?: The computation engine (CPU or GPU) for vectorized operations.
Remarks
This constructor creates a bidirectional layer using the specified inner layer for both forward and backward processing. A copy of the inner layer is created for backward processing to ensure independent parameters. This overload accepts a vector activation function, which operates on entire vectors rather than individual elements.
For Beginners: This constructor creates a new bidirectional layer with a vector-based activation function.
A vector activation function:
- Operates on entire groups of numbers at once, rather than one at a time
- Can capture relationships between different elements in the output
- Defaults to the Identity function, which doesn't change the values
This constructor is useful when you need more complex activation patterns that consider the relationships between different outputs.
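A sketch of this overload. SoftmaxActivation<T> is an assumed IVectorActivationFunction<T> implementation name, and the LSTMLayer<T> signature is hypothetical.
// Softmax operates on the whole output vector, so it must go through this overload.
var bi = new BidirectionalLayer<float>(
    new LSTMLayer<float>(inputSize: 16, hiddenSize: 32),  // hypothetical signature
    mergeMode: true,
    vectorActivationFunction: new SoftmaxActivation<float>()); // assumed implementation name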
Properties
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU-accelerated forward pass.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
true if both inner layers support GPU execution; otherwise, false.
Remarks
The bidirectional layer supports GPU execution when both the forward and backward inner layers support GPU execution. This ensures that the entire bidirectional processing can be done on the GPU.
For Beginners: This property indicates whether this layer can use the GPU for faster processing. Since the bidirectional layer wraps two inner layers, it can only use the GPU if both of those layers support GPU execution.
SupportsGpuTraining
Gets a value indicating whether this layer supports GPU-resident training.
public override bool SupportsGpuTraining { get; }
Property Value
- bool
true if both inner layers support GPU training; otherwise, false.
Remarks
The bidirectional layer supports GPU training when both inner layers support GPU training.
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
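For example, a caller can gate on this flag and fall back to the standard path. This is a sketch; CompileAndRun stands in for whatever JIT pipeline you use.
Tensor<float> result = layer.SupportsJitCompilation
    ? CompileAndRun(layer, input) // placeholder for your JIT pipeline
    : layer.Forward(input);       // standard, non-compiled path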
SupportsTraining
Gets a value indicating whether this layer supports training through backpropagation.
public override bool SupportsTraining { get; }
Property Value
- bool
true if either the forward or backward layer supports training; otherwise, false.
Remarks
This property indicates whether the bidirectional layer can be trained through backpropagation. The layer supports training if either of its internal layers (forward or backward) supports training. This is typically the case for layers that have trainable parameters, such as weights or biases.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer can adjust its internal values during training
- It will improve its performance as it sees more data
- It participates in the learning process
The bidirectional layer supports training if either of its two internal layers (forward or backward) supports training.
Methods
Backward(Tensor<T>)
Performs the backward pass of the bidirectional layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient Tensor<T>: The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the bidirectional layer, which is used during training to propagate error gradients back through the network. It splits the output gradient according to the merge mode, propagates it through both forward and backward inner layers, and combines the resulting input gradients.
For Beginners: This method is used during training to calculate how the layer's input should change to reduce errors.
During the backward pass:
- The error gradient from the next layer is received
- If the outputs were merged, the same gradient is sent to both forward and backward layers
- If the outputs were separate, the gradient is split for each direction
- The gradients from both layers are combined to update the previous layer
This process is part of the "backpropagation" algorithm that helps neural networks learn.
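A minimal training-step sketch. ComputeLossGradient is a placeholder for however your loss function produces the gradient of the loss with respect to the output.
Tensor<float> output = layer.Forward(input);
Tensor<float> outputGradient = ComputeLossGradient(output, target); // placeholder
Tensor<float> inputGradient = layer.Backward(outputGradient);       // flows through both directions
layer.UpdateParameters(0.01f);                                      // apply the accumulated gradients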
BackwardGpu(IGpuTensor<T>)
GPU-accelerated backward pass for the bidirectional layer.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient IGpuTensor<T>: The gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
For Beginners: This is the GPU-optimized backward pass that propagates gradients through both forward and backward inner layers while keeping all data on the GPU.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes List<ComputationNode<T>>: List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the bidirectional layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input Tensor<T>: The input tensor to process.
Returns
- Tensor<T>
The output tensor after bidirectional processing.
Remarks
This method implements the forward pass of the bidirectional layer. It processes the input in both forward and backward directions using the respective inner layers, and then combines the outputs according to the merge mode. The input and outputs are cached for use during the backward pass.
For Beginners: This method processes the input data through both forward and backward layers.
During the forward pass:
- The original input is sent through the forward layer
- A reversed version of the input is sent through the backward layer
- The results from both directions are either combined or kept separate
This method also saves the inputs and outputs for later use during training.
For example, with a text sequence, the forward layer sees "Hello world" while the backward layer sees "world Hello", and then the results are combined.
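The call itself is the same regardless of merge mode; only the output differs. A sketch, with shapes illustrative only:
var mergedOut = mergedLayer.Forward(sequence);     // one tensor: the two directions added together
var separateOut = separateLayer.Forward(sequence); // directions kept separate (layout depends on the version)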
ForwardGpu(params IGpuTensor<T>[])
Performs a GPU-resident forward pass of the bidirectional layer.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs IGpuTensor<T>[]: GPU-resident input tensor(s).
Returns
- IGpuTensor<T>
GPU-resident output tensor after bidirectional processing.
Remarks
For Beginners: This is the GPU-optimized version of the Forward method. All data stays on the GPU throughout the computation, avoiding expensive CPU-GPU transfers. The input sequence is processed in both forward and backward directions using GPU operations, and the results are merged on the GPU.
Exceptions
- ArgumentException
Thrown when no input tensor is provided.
- InvalidOperationException
Thrown when GPU backend is unavailable or inner layers don't support GPU.
GetParameters()
Gets all trainable parameters from both the forward and backward layers as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters.
Remarks
This method retrieves all trainable parameters from both the forward and backward inner layers and combines them into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from both forward and backward layers.
The parameters:
- Are the numbers that the neural network learns during training
- Include weights and biases from both forward and backward layers
- Are combined into a single long list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
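A save/restore round trip using GetParameters() together with SetParameters(Vector<T>), as a sketch:
Vector<float> snapshot = layer.GetParameters(); // forward-layer values followed by backward-layer values
// ... train or experiment ...
layer.SetParameters(snapshot); // throws ArgumentException if the length doesn't match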
ResetState()
Resets the internal state of the bidirectional layer and its inner layers.
public override void ResetState()
Remarks
This method resets the internal state of the bidirectional layer, including the cached inputs and outputs, as well as the states of both forward and backward inner layers. This is useful when starting to process a new sequence or when implementing stateful recurrent networks.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- Stored inputs and outputs are cleared
- Both forward and backward layers are also reset
- The layer forgets any information from previous sequences
This is important for:
- Processing a new, unrelated sequence
- Preventing information from one sequence affecting another
- Starting a new training episode
For example, if you've processed one sentence and want to start with a new sentence, you should reset the state to prevent the new sentence from being influenced by the previous one.
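For example, when processing independent sentences (a sketch; sentences is your own collection of sequence tensors):
foreach (Tensor<float> sentence in sentences)
{
    layer.ResetState();                             // forget the previous sentence entirely
    Tensor<float> result = layer.Forward(sentence);
    // ... consume result ...
}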
SetParameters(Vector<T>)
Sets the trainable parameters for both the forward and backward layers.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters Vector<T>: A vector containing all parameters to set.
Remarks
This method sets the trainable parameters for both the forward and backward inner layers from a single vector. It extracts the appropriate portions of the input vector for each inner layer. This is useful for loading saved model weights or for implementing optimization algorithms that operate on all parameters at once.
For Beginners: This method updates all the learnable values in both forward and backward layers.
When setting parameters:
- The input must be a vector with the correct length
- The first part of the vector is used for the forward layer
- The second part of the vector is used for the backward layer
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Testing different parameter values
An error is thrown if the input vector doesn't have the expected number of parameters.
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
UpdateParameters(T)
Updates the parameters of both forward and backward layers using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate T: The learning rate to use for the parameter updates.
Remarks
This method updates the parameters of both the forward and backward inner layers based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates.
For Beginners: This method updates the layer's internal values during training.
When updating parameters:
- Both forward and backward layers are updated independently
- The learning rate controls how big each update step is
- Smaller learning rates mean slower but more stable learning
- Larger learning rates mean faster but potentially unstable learning
This is how the layer "learns" from data over time.
UpdateParametersGpu(IGpuOptimizerConfig)
Updates the parameters of both inner layers on the GPU using the specified optimizer configuration.
public override void UpdateParametersGpu(IGpuOptimizerConfig config)
Parameters
config IGpuOptimizerConfig: The GPU optimizer configuration.