Class PaddingLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a layer that adds padding to the input tensor.
public class PaddingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → PaddingLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
The PaddingLayer adds a specified amount of padding around the edges of the input tensor. This is commonly used in convolutional neural networks to preserve spatial dimensions after convolution operations or to provide additional context at the boundaries of the input. The padding is added symmetrically on both sides of each dimension of the input tensor.
For Beginners: This layer adds extra space around the edges of your data.
Think of it like adding a frame around a picture:
- You have an image (your input data)
- The padding adds extra space around all sides of the image
- The padding is filled with zeros by default
This is useful for:
- Preserving the size of images when applying convolutions
- Preventing loss of information at the edges of the data
- Giving convolutional filters more context at the boundaries
For example, if you have a 28×28 image and add padding of 2 pixels on all sides, you get a 32×32 image with your original data in the center and zeros around the edges.
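For example, a minimal usage sketch of that scenario (the Tensor<T> shape-based constructor and the [height, width] shape convention are assumed here for illustration and may differ in the actual API):

using AiDotNet.NeuralNetworks.Layers;

// Pad a 28×28 input by 2 on every side; each dimension grows by 4.
var layer = new PaddingLayer<float>(
    inputShape: new[] { 28, 28 },
    padding: new[] { 2, 2 });

var input = new Tensor<float>(new[] { 28, 28 }); // assumed constructor
var output = layer.Forward(input);               // padded output: 32×32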
Constructors
PaddingLayer(int[], int[], IActivationFunction<T>?)
Initializes a new instance of the PaddingLayer<T> class with the specified input shape, padding, and an optional element-wise activation function.
public PaddingLayer(int[] inputShape, int[] padding, IActivationFunction<T>? activationFunction = null)
Parameters
inputShape (int[]): The shape of the input tensor.
padding (int[]): The amount of padding to add to each dimension.
activationFunction (IActivationFunction<T>?): The activation function to apply after processing. Defaults to Identity if not specified.
Remarks
This constructor creates a PaddingLayer with the specified input shape, padding amounts, and element-wise activation function. The output shape is calculated by adding twice the padding amount to each dimension of the input shape.
For Beginners: This creates a new layer with a standard activation function.
In addition to the shapes, this also sets up:
- A scalar activation function that processes each value independently
- The foundation for a layer that transforms data in a specific way
For example, you might create a layer with a ReLU activation function, which turns all negative values to zero while keeping positive values.
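For example, a short sketch of this overload (ReLUActivation<T> is a hypothetical IActivationFunction<T> implementation named here only for illustration; substitute whatever scalar activation your build of AiDotNet provides):

using AiDotNet.NeuralNetworks.Layers;

// Hypothetical scalar activation type; any IActivationFunction<T> works here.
IActivationFunction<float> relu = new ReLUActivation<float>();

var layer = new PaddingLayer<float>(
    inputShape: new[] { 28, 28 },
    padding: new[] { 2, 2 },
    activationFunction: relu);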
PaddingLayer(int[], int[], IVectorActivationFunction<T>?)
Initializes a new instance of the PaddingLayer<T> class with the specified input shape, padding, and a vector activation function.
public PaddingLayer(int[] inputShape, int[] padding, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputShape (int[]): The shape of the input tensor.
padding (int[]): The amount of padding to add to each dimension.
vectorActivationFunction (IVectorActivationFunction<T>?): The vector activation function to apply after processing. Defaults to Identity if not specified.
Remarks
This constructor creates a PaddingLayer with the specified input shape and padding amounts. The output shape is calculated by adding twice the padding amount to each dimension of the input shape. This overload accepts a vector activation function, which operates on entire vectors rather than individual elements.
For Beginners: This constructor sets up the layer with a vector-based activation function.
A vector activation function:
- Operates on entire groups of numbers at once, rather than one at a time
- Can capture relationships between different elements in the output
- Defaults to the Identity function, which doesn't change the values
This constructor is useful when you need more complex activation patterns that consider the relationships between different values after padding.
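For example, a sketch of this overload (SoftmaxActivation<T> is a hypothetical IVectorActivationFunction<T> implementation named here only for illustration):

using AiDotNet.NeuralNetworks.Layers;

// Hypothetical vector activation type; any IVectorActivationFunction<T> works here.
IVectorActivationFunction<float> softmax = new SoftmaxActivation<float>();

var layer = new PaddingLayer<float>(
    inputShape: new[] { 28, 28 },
    padding: new[] { 2, 2 },
    vectorActivationFunction: softmax);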
Properties
SupportsGpuExecution
Gets whether this layer has a GPU execution implementation for inference.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
Remarks
Override this to return true when the layer implements ForwardGpu(params IGpuTensor<T>[]). The actual CanExecuteOnGpu property combines this with engine availability.
For Beginners: This flag indicates if the layer has GPU code for the forward pass. Set this to true in derived classes that implement ForwardGpu.
SupportsGpuTraining
Gets whether this layer has full GPU training support (forward, backward, and parameter updates).
public override bool SupportsGpuTraining { get; }
Property Value
- bool
Remarks
This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:
- ForwardGpu is implemented
- BackwardGpu is implemented
- UpdateParametersGpu is implemented (for layers with trainable parameters)
- GPU weight/bias/gradient buffers are properly managed
For Beginners: This tells you if training can happen entirely on GPU.
GPU-resident training is much faster because:
- Data stays on GPU between forward and backward passes
- No expensive CPU-GPU transfers during each training step
- GPU kernels handle all gradient computation
Only layers that return true here can participate in fully GPU-resident training.
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always true, because the PaddingLayer supports backpropagation even though it has no trainable parameters.
Remarks
This property indicates whether the layer supports backpropagation during training. Although the PaddingLayer has no trainable parameters, it still supports the backward pass to propagate gradients to previous layers.
For Beginners: This property tells you if the layer can participate in the training process.
A value of true means:
- The layer can pass gradient information backward during training
- It's part of the learning process, even though it doesn't have learnable parameters
While this layer doesn't have weights or biases that get updated during training, it still needs to properly handle gradients to ensure that layers before it can learn correctly.
Methods
Backward(Tensor<T>)
Performs the backward pass of the padding layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the padding layer, which is used during training to propagate error gradients back through the network. It extracts the gradients corresponding to the original input positions from the output gradient tensor, ignoring the gradients in the padded regions. The method applies the activation function derivative to the result.
For Beginners: This method calculates how changes in the input would affect the final output.
During the backward pass:
- The layer receives gradients for the entire padded output tensor
- It extracts only the gradients corresponding to the original input area
- The gradients in the padded regions are ignored (since they don't correspond to any input)
This is essentially the reverse of the forward pass:
- Forward: copy input to center of larger padded tensor
- Backward: extract central region of gradient tensor that corresponds to the original input
This allows the network to learn as if the padding wasn't there, while still benefiting from the additional context it provides.
Exceptions
- InvalidOperationException
Thrown when backward is called before forward.
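To make the shape bookkeeping concrete, here is a sketch (Tensor<T> construction is assumed for illustration): Forward grows a 3×3 input to 5×5, and Backward slices the 5×5 gradient back down to 3×3.

using AiDotNet.NeuralNetworks.Layers;

var layer = new PaddingLayer<float>(
    inputShape: new[] { 3, 3 },
    padding: new[] { 1, 1 });

var input = new Tensor<float>(new[] { 3, 3 });   // assumed constructor
var output = layer.Forward(input);               // shape [5, 5]

// The gradient arriving from the next layer has the padded shape.
var outputGradient = new Tensor<float>(new[] { 5, 5 });
var inputGradient = layer.Backward(outputGradient); // shape [3, 3]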
BackwardGpu(IGpuTensor<T>)
Performs the backward pass on GPU tensors.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): The GPU-resident gradient tensor.
Returns
- IGpuTensor<T>
The gradient with respect to the input (center region extracted).
Remarks
The backward pass extracts the center region of the gradient tensor, which corresponds to the original input positions. This is the reverse of the forward pass padding operation. Uses SliceGpu to extract center region for each padded dimension.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): The list to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the padding layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after padding and activation.
Remarks
This method implements the forward pass of the padding layer. It creates a new tensor with the padded dimensions, copies the input data to the appropriate positions in the padded tensor, and applies the activation function to the result. The input tensor is cached for use during the backward pass.
For Beginners: This method performs the actual padding operation.
During the forward pass:
- The method creates a new, larger tensor to hold the padded data
- It copies the original data to the center of this new tensor
- The areas around the edges are implicitly filled with zeros
- Finally, it applies the activation function to the result
For example, with a 3×3 image and padding of 1:
- The output is a 5×5 image
- The original 3×3 data is in the center
- The outer border of width 1 is filled with zeros
The method also saves the input for later use in backpropagation.
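A small sketch of this behavior (element access via an indexer is assumed here for illustration and may differ in the actual Tensor<T> API):

using AiDotNet.NeuralNetworks.Layers;

var layer = new PaddingLayer<float>(
    inputShape: new[] { 3, 3 },
    padding: new[] { 1, 1 });

var input = new Tensor<float>(new[] { 3, 3 });
input[1, 1] = 7f;                  // mark the center of the input

var output = layer.Forward(input); // shape [5, 5]
// output[0, 0] is 0 (padded border), and output[2, 2] is 7
// (the original center, shifted by the padding of 1 in each dimension)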
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass of the layer on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The GPU-resident input tensor(s).
Returns
- IGpuTensor<T>
The GPU-resident output tensor.
Remarks
This method performs the layer's forward computation entirely on GPU. The input and output tensors remain in GPU memory, avoiding expensive CPU-GPU transfers.
For Beginners: This is like Forward() but runs on the graphics card.
The key difference:
- Forward() uses CPU tensors that may be copied to/from GPU
- ForwardGpu() keeps everything on GPU the whole time
Override this in derived classes that support GPU acceleration.
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU execution.
GetParameters()
Gets all trainable parameters from the padding layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector since PaddingLayer has no trainable parameters.
Remarks
This method retrieves all trainable parameters from the layer as a single vector. Since PaddingLayer has no trainable parameters, it returns an empty vector.
For Beginners: This method returns all the learnable values in the layer.
Since PaddingLayer:
- Only performs a fixed operation (adding zeros around the edges)
- Has no weights, biases, or other learnable parameters
this method returns an empty vector.
This is different from layers like Dense layers, which would return their weights and biases.
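For example (Vector<T>.Length is an assumed member name, used only for illustration):

using AiDotNet.NeuralNetworks.Layers;

var layer = new PaddingLayer<float>(
    inputShape: new[] { 3, 3 },
    padding: new[] { 1, 1 });

Vector<float> parameters = layer.GetParameters();
// parameters.Length == 0: there is nothing to learn in a padding layer.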
ResetState()
Resets the internal state of the padding layer.
public override void ResetState()
Remarks
This method resets the internal state of the padding layer, including the cached input tensor. This is useful when starting to process a new sequence or batch of data.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- Stored input from previous processing is cleared
- The layer forgets any information from previous data batches
This is important for:
- Processing a new, unrelated batch of data
- Ensuring clean state before a new training epoch
- Preventing information from one batch affecting another
While the PaddingLayer doesn't maintain long-term state across samples, clearing these cached values helps with memory management and ensuring a clean processing pipeline.
UpdateParameters(T)
Updates the parameters of the padding layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method is part of the training process, but since PaddingLayer has no trainable parameters, this method does nothing.
For Beginners: This method would normally update a layer's internal values during training.
However, since PaddingLayer just performs a fixed operation (adding zeros around the edges) and doesn't have any internal values that can be learned or adjusted, this method is empty.
This is unlike layers such as Dense or Convolutional layers, which have weights and biases that get updated during training.