Class ConcatenateLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Represents a neural network layer that concatenates multiple inputs along a specified axis.
public class ConcatenateLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T
The numeric type used for calculations, typically float or double.
Inheritance
LayerBase<T> → ConcatenateLayer<T>
Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
A concatenate layer combines multiple input tensors into a single output tensor by joining them along a specified axis. For example, if you have two tensors of shape [batch_size, 10] and [batch_size, 15], concatenating them along axis 1 would produce a tensor of shape [batch_size, 25]. This layer doesn't have any trainable parameters and simply passes the gradients back to the appropriate input tensors during backpropagation.
For Beginners: A concatenate layer joins multiple inputs together to make one bigger output.
Think of it like joining arrays or lists:
- If you have two lists [1, 2, 3] and [4, 5], concatenating them gives [1, 2, 3, 4, 5]
In neural networks, we often work with multi-dimensional data, so we need to specify which dimension (axis) to join along:
- Axis 0 would join along the first dimension (like stacking sheets of paper)
- Axis 1 would join along the second dimension (like extending rows sideways)
- Axis 2 would join along the third dimension (like extending columns downward)
For example, if you have:
- One tensor representing features from an image: [batch_size, 100]
- Another tensor representing features from text: [batch_size, 50]
You could use a concatenate layer with axis=1 to create a combined feature tensor of shape [batch_size, 150] that contains both sets of features side by side.
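For illustration, a minimal sketch of that scenario (the Tensor<float> shape-array constructor is an assumption here; substitute whatever tensor-creation API your project actually uses):

using AiDotNet.NeuralNetworks.Layers;

// Image features [32, 100] and text features [32, 50] for a batch of 32.
// NOTE: new Tensor<float>(int[] shape) is assumed for illustration only.
var imageFeatures = new Tensor<float>(new[] { 32, 100 });
var textFeatures = new Tensor<float>(new[] { 32, 50 });

// Join along axis 1, the feature dimension.
var concat = new ConcatenateLayer<float>(
    new[] { new[] { 32, 100 }, new[] { 32, 50 } },
    axis: 1);

// The multi-input Forward overload (documented below) returns a [32, 150] tensor.
Tensor<float> combined = concat.Forward(imageFeatures, textFeatures);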
Constructors
ConcatenateLayer(int[][], int, IActivationFunction<T>?)
Initializes a new instance of the ConcatenateLayer<T> class with a scalar activation function.
public ConcatenateLayer(int[][] inputShapes, int axis, IActivationFunction<T>? activationFunction = null)
Parameters
inputShapes int[][]
The shapes of the input tensors to be concatenated.
axis int
The axis along which to concatenate the inputs.
activationFunction IActivationFunction<T>
The activation function to apply after concatenation. Defaults to identity if not specified.
Remarks
This constructor creates a new concatenate layer using the specified input shapes and concatenation axis. It validates the input shapes to ensure they are compatible for concatenation, and calculates the output shape based on the input shapes and axis. The activation function is applied to the output after concatenation.
For Beginners: This constructor creates a new concatenate layer with a standard activation function.
When creating a concatenate layer, you need to specify:
- The shapes of all the inputs that will be joined together
- Which dimension (axis) to join them along
- Optionally, an activation function to apply after joining
For example, if you have two inputs with shapes [32, 10] and [32, 20], and specify axis=1, the output shape will be [32, 30].
The default activation is the "identity" function, which doesn't change the values at all.
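As a hedged sketch, a layer that applies a ReLU after joining might be constructed like this (ReLUActivation<float> is a stand-in for whichever IActivationFunction<float> implementation you use; the name is illustrative, not confirmed by this page):

// Joining [32, 10] and [32, 20] along axis 1 yields an output shape of [32, 30].
// ReLUActivation<float> is a placeholder for any IActivationFunction<float>.
var layer = new ConcatenateLayer<float>(
    new[] { new[] { 32, 10 }, new[] { 32, 20 } },
    axis: 1,
    activationFunction: new ReLUActivation<float>());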
Exceptions
- ArgumentException
Thrown when fewer than two input shapes are provided or when input shapes have different ranks.
ConcatenateLayer(int[][], int, IVectorActivationFunction<T>?)
Initializes a new instance of the ConcatenateLayer<T> class with a vector activation function.
public ConcatenateLayer(int[][] inputShapes, int axis, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputShapes int[][]
The shapes of the input tensors to be concatenated.
axis int
The axis along which to concatenate the inputs.
vectorActivationFunction IVectorActivationFunction<T>
The vector activation function to apply after concatenation. Defaults to identity if not specified.
Remarks
This constructor creates a new concatenate layer using the specified input shapes and concatenation axis. It validates the input shapes to ensure they are compatible for concatenation, and calculates the output shape based on the input shapes and axis. This overload accepts a vector activation function, which operates on entire vectors rather than individual elements.
For Beginners: This constructor creates a new concatenate layer with a vector-based activation function.
A vector activation function:
- Operates on entire groups of numbers at once, rather than one at a time
- Can capture relationships between different elements in the output
- Defaults to the Identity function, which doesn't change the values
This constructor works the same way as the other one, but it's useful when you need more complex activation patterns that consider the relationships between different outputs.
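A sketch of this overload using a softmax as the vector activation (SoftmaxActivation<float> is an illustrative name, not confirmed by this page; softmax fits here because it normalizes all elements of a vector together):

// SoftmaxActivation<float> stands in for any IVectorActivationFunction<float>.
var layer = new ConcatenateLayer<float>(
    new[] { new[] { 32, 10 }, new[] { 32, 20 } },
    axis: 1,
    vectorActivationFunction: new SoftmaxActivation<float>());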
Exceptions
- ArgumentException
Thrown when fewer than two input shapes are provided or when input shapes have different ranks.
Properties
SupportsGpuExecution
Gets whether this layer has a GPU execution implementation for inference.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
Remarks
Override this to return true when the layer implements ForwardGpu(params IGpuTensor<T>[]). The actual CanExecuteOnGpu property combines this with engine availability.
For Beginners: This flag indicates if the layer has GPU code for the forward pass. Set this to true in derived classes that implement ForwardGpu.
SupportsGpuTraining
Gets whether this layer has full GPU training support (forward, backward, and parameter updates).
public override bool SupportsGpuTraining { get; }
Property Value
- bool
Remarks
This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:
- ForwardGpu is implemented
- BackwardGpu is implemented
- UpdateParametersGpu is implemented (for layers with trainable parameters)
- GPU weight/bias/gradient buffers are properly managed
For Beginners: This tells you if training can happen entirely on GPU.
GPU-resident training is much faster because:
- Data stays on GPU between forward and backward passes
- No expensive CPU-GPU transfers during each training step
- GPU kernels handle all gradient computation
Only layers that return true here can participate in fully GPU-resident training.
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always false, as concatenate layers have no trainable parameters.
Remarks
This property returns false because concatenate layers don't have any trainable parameters. The layer simply combines inputs and passes gradients through during backpropagation without modifications.
For Beginners: This property tells you that this layer cannot learn from data.
A value of false means:
- The layer doesn't contain any values that will change during training
- It performs a fixed operation (concatenation) that doesn't need to be learned
- It still participates in passing information during training, but doesn't change itself
This is different from layers like dense or convolutional layers that do have trainable parameters (weights and biases) that get updated during learning.
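In a generic training loop, this flag can be used to skip pointless update calls; a sketch, assuming a layers collection and a learningRate variable from the surrounding code:

// Skip parameter updates for layers that cannot learn (like this one).
foreach (var layer in layers)
{
    if (layer.SupportsTraining)
    {
        layer.UpdateParameters(learningRate);
    }
}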
Methods
Backward(Tensor<T>)
Performs the backward pass of the concatenate layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient Tensor<T>
The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the concatenate layer, which is used during training to propagate error gradients back through the network. It splits the output gradient along the concatenation axis and distributes the pieces to the corresponding input gradients.
For Beginners: This method routes the error gradients back to the correct inputs during training.
During the backward pass:
- The layer receives error gradients from the next layer
- If an activation function was used, its derivative is applied
- The gradient is split along the same axis used for concatenation
- Each piece of the gradient is sent back to the corresponding input
For example, if you joined three tensors of widths 10, 20, and 15:
- The incoming gradient would have width 45
- This method would split it into pieces of width 10, 20, and 15
- Each piece would be sent back to its original source
This is how the training signal flows backward through the network, allowing each connected layer to learn from the error.
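A sketch of that round trip in code (tensor construction is assumed for illustration; in real training the output gradient comes from the next layer rather than being created by hand):

// Three inputs of widths 10, 20, and 15 for a batch of 8.
var a = new Tensor<float>(new[] { 8, 10 });
var b = new Tensor<float>(new[] { 8, 20 });
var c = new Tensor<float>(new[] { 8, 15 });

var concat = new ConcatenateLayer<float>(
    new[] { new[] { 8, 10 }, new[] { 8, 20 }, new[] { 8, 15 } },
    axis: 1);

Tensor<float> output = concat.Forward(a, b, c); // shape [8, 45]

// The incoming gradient matches the output shape; Backward splits it into
// width-10, width-20, and width-15 pieces and routes each back to its source.
var outputGradient = new Tensor<float>(new[] { 8, 45 });
Tensor<float> inputGradient = concat.Backward(outputGradient);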
Exceptions
- InvalidOperationException
Thrown when backward is called before forward.
BackwardGpu(IGpuTensor<T>)
Performs the backward pass of the layer on GPU.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's input.
Remarks
This method performs the layer's backward computation entirely on GPU, including:
- Computing input gradients to pass to previous layers
- Computing and storing weight gradients on GPU (for layers with trainable parameters)
- Computing and storing bias gradients on GPU
For Beginners: This is like Backward() but runs entirely on GPU.
During GPU training:
- Output gradients come in (on GPU)
- Input gradients are computed (stay on GPU)
- Weight/bias gradients are computed and stored (on GPU)
- Input gradients are returned for the previous layer
All data stays on GPU - no CPU round-trips needed!
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU training.
- InvalidOperationException
Thrown if ForwardGpu was not called first.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes List<ComputationNode<T>>
List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
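A typical call pattern might look like the following sketch (what you then do with the returned nodes depends on the JIT pipeline, which is outside the scope of this page):

using System.Collections.Generic;

// Only export the graph when the layer declares JIT support.
if (layer.SupportsJitCompilation)
{
    var inputNodes = new List<ComputationNode<float>>();
    ComputationNode<float> outputNode = layer.ExportComputationGraph(inputNodes);
    // inputNodes is now populated; hand it and outputNode to the JIT compiler.
}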
Forward(Tensor<T>)
This method is not supported by ConcatenateLayer and will throw an exception.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input Tensor<T>
The input tensor.
Returns
- Tensor<T>
Never returns as it always throws an exception.
Remarks
This method overrides the base Forward method that accepts a single input tensor, but it always throws an exception because concatenate layers require multiple inputs by definition. Use the Forward method that accepts multiple inputs instead.
For Beginners: This method is included because all layers must follow the same interface, but it can't be used with concatenate layers.
A concatenate layer must have at least two inputs to join together, so this method that only takes one input will always throw an error.
Instead, you should use the other Forward method that accepts multiple inputs (params Tensor<T>[] inputs).
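In code, the difference looks like this (reusing the image/text tensors from the earlier sketch):

// Throws NotSupportedException: a single input cannot be concatenated.
// concat.Forward(imageFeatures);

// Correct: pass every input in one call to the params overload.
Tensor<float> combined = concat.Forward(imageFeatures, textFeatures);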
Exceptions
- NotSupportedException
Always thrown as ConcatenateLayer requires multiple inputs.
Forward(params Tensor<T>[])
Performs the forward pass of the concatenate layer with multiple inputs.
public override Tensor<T> Forward(params Tensor<T>[] inputs)
Parameters
inputs Tensor<T>[]
The input tensors to concatenate.
Returns
- Tensor<T>
The output tensor after concatenation and activation.
Remarks
This method implements the forward pass of the concatenate layer. It combines the input tensors along the specified axis, applies the activation function (if any), and returns the result. The inputs and output are cached for use during the backward pass.
For Beginners: This method joins multiple inputs together during the network's forward pass.
The forward pass:
- Takes in all input tensors
- Joins them together along the specified axis
- Applies the activation function (if any)
- Returns the combined result
This method also saves the inputs and output for later use during training.
For example, if you pass in tensors representing image features and text features, this method will join them into a single tensor containing both types of features.
Exceptions
- ArgumentException
Thrown when fewer than two input tensors are provided.
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass of the layer on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs IGpuTensor<T>[]
The GPU-resident input tensor(s).
Returns
- IGpuTensor<T>
The GPU-resident output tensor.
Remarks
This method performs the layer's forward computation entirely on GPU. The input and output tensors remain in GPU memory, avoiding expensive CPU-GPU transfers.
For Beginners: This is like Forward() but runs on the graphics card.
The key difference:
- Forward() uses CPU tensors that may be copied to/from GPU
- ForwardGpu() keeps everything on GPU the whole time
Override this in derived classes that support GPU acceleration.
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU execution.
GetParameters()
Gets all trainable parameters from the layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector as concatenate layers have no parameters.
Remarks
This method returns an empty vector because concatenate layers don't have any trainable parameters.
For Beginners: This method returns an empty vector because concatenate layers don't have any learnable values.
Unlike layers with weights and biases, the concatenate layer doesn't have any parameters that need to be saved or loaded. It's just a fixed operation that joins inputs together.
This method is still required because all layers must follow the same interface, but it simply returns an empty vector in this case.
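A quick sketch of what a caller sees (continuing the earlier example):

Vector<float> parameters = concat.GetParameters();
// For ConcatenateLayer this vector is always empty; there is nothing to
// save, load, or update for this layer.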
ResetState()
Resets the internal state of the concatenate layer.
public override void ResetState()
Remarks
This method resets the internal state of the concatenate layer, including the cached inputs and output. This is useful when starting to process a new sequence or batch after processing a previous one.
For Beginners: This method clears the layer's temporary memory to start fresh.
When resetting the state:
- Stored inputs and outputs are cleared
- The layer forgets any information from previous batches
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Freeing up memory that's no longer needed
Since the concatenate layer doesn't have learnable parameters, this only clears the cached values used during a single forward/backward pass.
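A sketch of where a reset fits between unrelated batches (the batch variables here are hypothetical):

// First batch flows through the layer, caching inputs and output.
concat.Forward(imageBatch1, textBatch1);
// ... backward pass and updates happen elsewhere in the network ...

// Clear the cached tensors before starting an unrelated batch.
concat.ResetState();
concat.Forward(imageBatch2, textBatch2);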
UpdateParameters(T)
Updates the parameters of the layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate T
The learning rate to use for the parameter updates.
Remarks
This method is a no-op for concatenate layers since they have no trainable parameters to update.
For Beginners: This method doesn't do anything for concatenate layers because there are no parameters to update.
Unlike layers with weights and biases that need to be updated during training, the concatenate layer just passes data through without learning any parameters.
This method is still required to be implemented because all layers must follow the same interface, but it doesn't actually do anything for this type of layer.