Class ConcatenateLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Represents a neural network layer that concatenates multiple inputs along a specified axis.
public class ConcatenateLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T
The numeric type used for calculations, typically float or double.
Inheritance
LayerBase<T> → ConcatenateLayer<T>
Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
A concatenate layer combines multiple input tensors into a single output tensor by joining them along a specified axis. For example, if you have two tensors of shape [batch_size, 10] and [batch_size, 15], concatenating them along axis 1 would produce a tensor of shape [batch_size, 25]. This layer doesn't have any trainable parameters and simply passes the gradients back to the appropriate input tensors during backpropagation.
For Beginners: A concatenate layer joins multiple inputs together to make one bigger output.
Think of it like joining arrays or lists:
- If you have two lists [1, 2, 3] and [4, 5], concatenating them gives [1, 2, 3, 4, 5]
In neural networks, we often work with multi-dimensional data, so we need to specify which dimension (axis) to join along:
- Axis 0 would join along the first dimension (like stacking sheets of paper)
- Axis 1 would join along the second dimension (like extending rows sideways)
- Axis 2 would join along the third dimension (like extending columns downward)
For example, if you have:
- One tensor representing features from an image: [batch_size, 100]
- Another tensor representing features from text: [batch_size, 50]
You could use a concatenate layer with axis=1 to create a combined feature tensor of shape [batch_size, 150] that contains both sets of features side by side.
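For illustration, a minimal sketch of that scenario (the Tensor<float> shape-array constructor is an assumption here; substitute whatever tensor-creation API your project actually uses):

using AiDotNet.NeuralNetworks.Layers;

// Image features [32, 100] and text features [32, 50] for a batch of 32.
// NOTE: new Tensor<float>(int[] shape) is assumed for illustration only.
var imageFeatures = new Tensor<float>(new[] { 32, 100 });
var textFeatures = new Tensor<float>(new[] { 32, 50 });

// Join along axis 1, the feature dimension.
var concat = new ConcatenateLayer<float>(
    new[] { new[] { 32, 100 }, new[] { 32, 50 } },
    axis: 1);

// The multi-input Forward overload (documented below) returns a [32, 150] tensor.
Tensor<float> combined = concat.Forward(imageFeatures, textFeatures);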
Constructors
ConcatenateLayer(int[][], int, IActivationFunction<T>?)
Initializes a new instance of the ConcatenateLayer<T> class with a scalar activation function.
public ConcatenateLayer(int[][] inputShapes, int axis, IActivationFunction<T>? activationFunction = null)
Parameters
inputShapes int[][]
The shapes of the input tensors to be concatenated.
axis int
The axis along which to concatenate the inputs.
activationFunction IActivationFunction<T>
The activation function to apply after concatenation. Defaults to identity if not specified.
Remarks
This constructor creates a new concatenate layer using the specified input shapes and concatenation axis. It validates the input shapes to ensure they are compatible for concatenation, and calculates the output shape based on the input shapes and axis. The activation function is applied to the output after concatenation.
For Beginners: This constructor creates a new concatenate layer with a standard activation function.
When creating a concatenate layer, you need to specify:
- The shapes of all the inputs that will be joined together
- Which dimension (axis) to join them along
- Optionally, an activation function to apply after joining
For example, if you have two inputs with shapes [32, 10] and [32, 20], and specify axis=1, the output shape will be [32, 30].
The default activation is the "identity" function, which doesn't change the values at all.
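As a hedged sketch, a layer that applies a ReLU after joining might be constructed like this (ReLUActivation<float> is a stand-in for whichever IActivationFunction<float> implementation you use; the name is illustrative, not confirmed by this page):

// Joining [32, 10] and [32, 20] along axis 1 yields an output shape of [32, 30].
// ReLUActivation<float> is a placeholder for any IActivationFunction<float>.
var layer = new ConcatenateLayer<float>(
    new[] { new[] { 32, 10 }, new[] { 32, 20 } },
    axis: 1,
    activationFunction: new ReLUActivation<float>());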
Exceptions
- ArgumentException
Thrown when fewer than two input shapes are provided or when input shapes have different ranks.
ConcatenateLayer(int[][], int, IVectorActivationFunction<T>?)
Initializes a new instance of the ConcatenateLayer<T> class with a vector activation function.
public ConcatenateLayer(int[][] inputShapes, int axis, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputShapes int[][]
The shapes of the input tensors to be concatenated.
axis int
The axis along which to concatenate the inputs.
vectorActivationFunction IVectorActivationFunction<T>
The vector activation function to apply after concatenation. Defaults to identity if not specified.
Remarks
This constructor creates a new concatenate layer using the specified input shapes and concatenation axis. It validates the input shapes to ensure they are compatible for concatenation, and calculates the output shape based on the input shapes and axis. This overload accepts a vector activation function, which operates on entire vectors rather than individual elements.
For Beginners: This constructor creates a new concatenate layer with a vector-based activation function.
A vector activation function:
- Operates on entire groups of numbers at once, rather than one at a time
- Can capture relationships between different elements in the output
- Defaults to the Identity function, which doesn't change the values
This constructor works the same way as the other one, but it's useful when you need more complex activation patterns that consider the relationships between different outputs.
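A sketch of this overload using a softmax as the vector activation (SoftmaxActivation<float> is an illustrative name, not confirmed by this page; softmax fits here because it normalizes all elements of a vector together):

// SoftmaxActivation<float> stands in for any IVectorActivationFunction<float>.
var layer = new ConcatenateLayer<float>(
    new[] { new[] { 32, 10 }, new[] { 32, 20 } },
    axis: 1,
    vectorActivationFunction: new SoftmaxActivation<float>());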
Exceptions
- ArgumentException
Thrown when fewer than two input shapes are provided or when input shapes have different ranks.
Properties
SupportsGpuExecution
Gets whether this layer has a GPU execution implementation for inference.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
Remarks
Override this to return true when the layer implements ForwardGpu(params IGpuTensor<T>[]). The actual CanExecuteOnGpu property combines this with engine availability.
For Beginners: This flag indicates if the layer has GPU code for the forward pass. Set this to true in derived classes that implement ForwardGpu.
SupportsGpuTraining
Gets whether this layer has full GPU training support (forward, backward, and parameter updates).
public override bool SupportsGpuTraining { get; }
Property Value
- bool
Remarks
This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:
- ForwardGpu is implemented
- BackwardGpu is implemented
- UpdateParametersGpu is implemented (for layers with trainable parameters)
- GPU weight/bias/gradient buffers are properly managed
For Beginners: This tells you if training can happen entirely on GPU.
GPU-resident training is much faster because:
- Data stays on GPU between forward and backward passes
- No expensive CPU-GPU transfers during each training step
- GPU kernels handle all gradient computation
Only layers that return true here can participate in fully GPU-resident training.
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always false, as concatenate layers have no trainable parameters.
Remarks
This property returns false because concatenate layers don't have any trainable parameters. The layer simply combines inputs and passes gradients through during backpropagation without modifications.
For Beginners: This property tells you that this layer cannot learn from data.
A value of false means:
- The layer doesn't contain any values that will change during training
- It performs a fixed operation (concatenation) that doesn't need to be learned
- It still participates in passing information during training, but doesn't change itself
This is different from layers like dense or convolutional layers that do have trainable parameters (weights and biases) that get updated during learning.
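In a generic training loop, this flag can be used to skip pointless update calls; a sketch, assuming a layers collection and a learningRate variable from the surrounding code:

// Skip parameter updates for layers that cannot learn (like this one).
foreach (var layer in layers)
{
    if (layer.SupportsTraining)
    {
        layer.UpdateParameters(learningRate);
    }
}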
Methods
Backward(Tensor<T>)
Performs the backward pass of the concatenate layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient Tensor<T>
The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the concatenate layer, which is used during training to propagate error gradients back through the network. It splits the output gradient along the concatenation axis and distributes the pieces to the corresponding input gradients.
For Beginners: This method routes the error gradients back to the correct inputs during training.
During the backward pass:
- The layer receives error gradients from the next layer
- If an activation function was used, its derivative is applied
- The gradient is split along the same axis used for concatenation
- Each piece of the gradient is sent back to the corresponding input
For example, if you joined three tensors of widths 10, 20, and 15:
- The incoming gradient would have width 45
- This method would split it into pieces of width 10, 20, and 15
- Each piece would be sent back to its original source
This is how the training signal flows backward through the network, allowing each connected layer to learn from the error.
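A sketch of that round trip in code (tensor construction is assumed for illustration; in real training the output gradient comes from the next layer rather than being created by hand):

// Three inputs of widths 10, 20, and 15 for a batch of 8.
var a = new Tensor<float>(new[] { 8, 10 });
var b = new Tensor<float>(new[] { 8, 20 });
var c = new Tensor<float>(new[] { 8, 15 });

var concat = new ConcatenateLayer<float>(
    new[] { new[] { 8, 10 }, new[] { 8, 20 }, new[] { 8, 15 } },
    axis: 1);

Tensor<float> output = concat.Forward(a, b, c); // shape [8, 45]

// The incoming gradient matches the output shape; Backward splits it into
// width-10, width-20, and width-15 pieces and routes each back to its source.
var outputGradient = new Tensor<float>(new[] { 8, 45 });
Tensor<float> inputGradient = concat.Backward(outputGradient);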
Exceptions
- InvalidOperationException
Thrown when backward is called before forward.
BackwardGpu(IGpuTensor<T>)
Performs the backward pass of the layer on GPU.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's input.
Remarks
This method performs the layer's backward computation entirely on GPU, including:
- Computing input gradients to pass to previous layers
- Computing and storing weight gradients on GPU (for layers with trainable parameters)
- Computing and storing bias gradients on GPU
For Beginners: This is like Backward() but runs entirely on GPU.
During GPU training:
- Output gradients come in (on GPU)
- Input gradients are computed (stay on GPU)
- Weight/bias gradients are computed and stored (on GPU)
- Input gradients are returned for the previous layer
All data stays on GPU - no CPU round-trips needed!
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU training.
- InvalidOperationException
Thrown if ForwardGpu was not called first.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes List<ComputationNode<T>>
List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
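A typical call pattern might look like the following sketch (what you then do with the returned nodes depends on the JIT pipeline, which is outside the scope of this page):

using System.Collections.Generic;

// Only export the graph when the layer declares JIT support.
if (layer.SupportsJitCompilation)
{
    var inputNodes = new List<ComputationNode<float>>();
    ComputationNode<float> outputNode = layer.ExportComputationGraph(inputNodes);
    // inputNodes is now populated; hand it and outputNode to the JIT compiler.
}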
Forward(Tensor<T>)
This method is not supported by ConcatenateLayer and will throw an exception.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input Tensor<T>
The input tensor.
Returns
- Tensor<T>
Never returns as it always throws an exception.
Remarks
This method overrides the base Forward method that accepts a single input tensor, but it always throws an exception because concatenate layers require multiple inputs by definition. Use the Forward method that accepts multiple inputs instead.
For Beginners: This method is included because all layers must follow the same interface, but it can't be used with concatenate layers.
A concatenate layer must have at least two inputs to join together, so this method that only takes one input will always throw an error.
Instead, you should use the other Forward method that accepts multiple inputs (params Tensor<T>[] inputs).
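In code, the difference looks like this (reusing the image/text tensors from the earlier sketch):

// Throws NotSupportedException: a single input cannot be concatenated.
// concat.Forward(imageFeatures);

// Correct: pass every input in one call to the params overload.
Tensor<float> combined = concat.Forward(imageFeatures, textFeatures);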
Exceptions
- NotSupportedException
Always thrown as ConcatenateLayer requires multiple inputs.
Forward(params Tensor<T>[])
Performs the forward pass of the concatenate layer with multiple inputs.
public override Tensor<T> Forward(params Tensor<T>[] inputs)
Parameters
inputs Tensor<T>[]
The input tensors to concatenate.
Returns
- Tensor<T>
The output tensor after concatenation and activation.
Remarks
This method implements the forward pass of the concatenate layer. It combines the input tensors along the specified axis, applies the activation function (if any), and returns the result. The inputs and output are cached for use during the backward pass.
For Beginners: This method joins multiple inputs together during the network's forward pass.
The forward pass:
- Takes in all input tensors
- Joins them together along the specified axis
- Applies the activation function (if any)
- Returns the combined result
This method also saves the inputs and output for later use during training.
For example, if you pass in tensors representing image features and text features, this method will join them into a single tensor containing both types of features.
Exceptions
- ArgumentException
Thrown when fewer than two input tensors are provided.
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass of the layer on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs IGpuTensor<T>[]
The GPU-resident input tensor(s).
Returns
- IGpuTensor<T>
The GPU-resident output tensor.
Remarks
This method performs the layer's forward computation entirely on GPU. The input and output tensors remain in GPU memory, avoiding expensive CPU-GPU transfers.
For Beginners: This is like Forward() but runs on the graphics card.
The key difference:
- Forward() uses CPU tensors that may be copied to/from GPU
- ForwardGpu() keeps everything on GPU the whole time
Override this in derived classes that support GPU acceleration.
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU execution.
GetParameters()
Gets all trainable parameters from the layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector as concatenate layers have no parameters.
Remarks
This method returns an empty vector because concatenate layers don't have any trainable parameters.
For Beginners: This method returns an empty vector because concatenate layers don't have any learnable values.
Unlike layers with weights and biases, the concatenate layer doesn't have any parameters that need to be saved or loaded. It's just a fixed operation that joins inputs together.
This method is still required because all layers must follow the same interface, but it simply returns an empty vector in this case.
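A quick sketch of what a caller sees (continuing the earlier example):

Vector<float> parameters = concat.GetParameters();
// For ConcatenateLayer this vector is always empty; there is nothing to
// save, load, or update for this layer.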
ResetState()
Resets the internal state of the concatenate layer.
public override void ResetState()
Remarks
This method resets the internal state of the concatenate layer, including the cached inputs and output. This is useful when starting to process a new sequence or batch after processing a previous one.
For Beginners: This method clears the layer's temporary memory to start fresh.
When resetting the state:
- Stored inputs and outputs are cleared
- The layer forgets any information from previous batches
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Freeing up memory that's no longer needed
Since the concatenate layer doesn't have learnable parameters, this only clears the cached values used during a single forward/backward pass.
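A sketch of where a reset fits between unrelated batches (the batch variables here are hypothetical):

// First batch flows through the layer, caching inputs and output.
concat.Forward(imageBatch1, textBatch1);
// ... backward pass and updates happen elsewhere in the network ...

// Clear the cached tensors before starting an unrelated batch.
concat.ResetState();
concat.Forward(imageBatch2, textBatch2);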
UpdateParameters(T)
Updates the parameters of the layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate T
The learning rate to use for the parameter updates.
Remarks
This method is a no-op for concatenate layers since they have no trainable parameters to update.
For Beginners: This method doesn't do anything for concatenate layers because there are no parameters to update.
Unlike layers with weights and biases that need to be updated during training, the concatenate layer just passes data through without learning any parameters.
This method is still required to be implemented because all layers must follow the same interface, but it doesn't actually do anything for this type of layer.