Class PaddingLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Represents a layer that adds padding to the input tensor.

public class PaddingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance

object → LayerBase<T> → PaddingLayer<T>

Implements

ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Remarks

The PaddingLayer adds a specified amount of padding around the edges of the input tensor. This is commonly used in convolutional neural networks to preserve spatial dimensions after convolution operations or to provide additional context at the boundaries of the input. The padding is added symmetrically on both sides of each dimension of the input tensor.

For Beginners: This layer adds extra space around the edges of your data.

Think of it like adding a frame around a picture:

  • You have an image (your input data)
  • The padding adds extra space around all sides of the image
  • The padding is filled with zeros by default

This is useful for:

  • Preserving the size of images when applying convolutions
  • Preventing loss of information at the edges of the data
  • Giving convolutional filters more context at the boundaries

For example, if you have a 28×28 image and add padding of 2 pixels on all sides, you get a 32×32 image with your original data in the center and zeros around the edges.
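For instance, a minimal construction sketch using the documented constructor (the shape conventions are illustrative; check the library's expected dimension ordering for your tensors):

// Pad a 28x28 input by 2 on each side of both dimensions.
var layer = new PaddingLayer<float>(
    inputShape: new[] { 28, 28 },
    padding:    new[] { 2, 2 });

// Output shape = inputShape + 2 * padding per dimension:
// 28 + 2*2 = 32, so the output is 32x32.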

Constructors

PaddingLayer(int[], int[], IActivationFunction<T>?)

Initializes a new instance of the PaddingLayer<T> class with the specified input shape, padding, and an element-wise activation function.

public PaddingLayer(int[] inputShape, int[] padding, IActivationFunction<T>? activationFunction = null)

Parameters

inputShape int[]

The shape of the input tensor.

padding int[]

The amount of padding to add to each dimension.

activationFunction IActivationFunction<T>

The activation function to apply after processing. Defaults to Identity if not specified.

Remarks

This constructor creates a PaddingLayer with the specified input shape, padding amounts, and element-wise activation function. The output shape is calculated by adding twice the padding amount to each dimension of the input shape.

For Beginners: This creates a new layer with a standard activation function.

In addition to the input shape and padding, this also sets up:

  • The padded output shape, computed from the input shape and padding amounts
  • A scalar activation function that processes each value independently

For example, you might create a layer with a ReLU activation function, which turns all negative values to zero while keeping positive values unchanged.
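A minimal usage sketch is shown below; ReLUActivation<T> is an assumed type name for illustration, not a confirmed type in this library:

// Pads the input, then applies ReLU element-wise to the padded tensor.
// ReLUActivation<float> is a placeholder name; substitute the library's
// actual IActivationFunction<T> implementation.
var layer = new PaddingLayer<float>(
    new[] { 28, 28 },
    new[] { 2, 2 },
    new ReLUActivation<float>());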

PaddingLayer(int[], int[], IVectorActivationFunction<T>?)

Initializes a new instance of the PaddingLayer<T> class with the specified input shape, padding, and a vector activation function.

public PaddingLayer(int[] inputShape, int[] padding, IVectorActivationFunction<T>? vectorActivationFunction = null)

Parameters

inputShape int[]

The shape of the input tensor.

padding int[]

The amount of padding to add to each dimension.

vectorActivationFunction IVectorActivationFunction<T>

The vector activation function to apply after processing. Defaults to Identity if not specified.

Remarks

This constructor creates a PaddingLayer with the specified input shape and padding amounts. The output shape is calculated by adding twice the padding amount to each dimension of the input shape. This overload accepts a vector activation function, which operates on entire vectors rather than individual elements.

For Beginners: This constructor sets up the layer with a vector-based activation function.

A vector activation function:

  • Operates on entire groups of numbers at once, rather than one at a time
  • Can capture relationships between different elements in the output
  • Defaults to the Identity function, which doesn't change the values

This constructor is useful when you need more complex activation patterns that consider the relationships between different values after padding.
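A usage sketch under the same caveat; SoftmaxActivation<T> is assumed here purely as an example of a type implementing IVectorActivationFunction<T>:

// Pads the input, then applies a vector activation to the result.
// SoftmaxActivation<float> is a placeholder name; substitute the
// library's actual IVectorActivationFunction<T> implementation.
var layer = new PaddingLayer<float>(
    new[] { 28, 28 },
    new[] { 2, 2 },
    new SoftmaxActivation<float>());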

Properties

SupportsGpuExecution

Gets whether this layer has a GPU execution implementation for inference.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

Remarks

Override this to return true when the layer implements ForwardGpu(params IGpuTensor<T>[]). The actual CanExecuteOnGpu property combines this with engine availability.

For Beginners: This flag indicates if the layer has GPU code for the forward pass. Set this to true in derived classes that implement ForwardGpu.

SupportsGpuTraining

Gets whether this layer has full GPU training support (forward, backward, and parameter updates).

public override bool SupportsGpuTraining { get; }

Property Value

bool

Remarks

This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:

  • ForwardGpu is implemented
  • BackwardGpu is implemented
  • UpdateParametersGpu is implemented (for layers with trainable parameters)
  • GPU weight/bias/gradient buffers are properly managed

For Beginners: This tells you if training can happen entirely on GPU.

GPU-resident training is much faster because:

  • Data stays on GPU between forward and backward passes
  • No expensive CPU-GPU transfers during each training step
  • GPU kernels handle all gradient computation

Only layers that return true here can participate in fully GPU-resident training.
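As a hedged usage sketch, a trainer might gate GPU-resident training on this flag; network.Layers is a hypothetical collection here, while SupportsGpuTraining is the documented property:

// Requires System.Linq. Only schedule fully GPU-resident training
// when every layer in the (hypothetical) network.Layers collection
// advertises complete GPU training support.
bool fullyGpuResident = network.Layers.All(layer => layer.SupportsGpuTraining);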

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

True if the layer can be JIT compiled, false otherwise.

Remarks

This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.

For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.

Layers should return false if they:

  • Have not yet implemented a working ExportComputationGraph()
  • Use dynamic operations that change based on input data
  • Are too simple to benefit from JIT compilation

When false, the layer will use the standard Forward() method instead.
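A hedged sketch of that fallback pattern, using only members documented on this page (the layer and input variables are assumed to exist):

if (layer.SupportsJitCompilation)
{
    // Export the computation graph for JIT compilation
    // (the compiler machinery itself is not shown here).
    var inputNodes = new List<ComputationNode<float>>();
    ComputationNode<float> graph = layer.ExportComputationGraph(inputNodes);
}
else
{
    // Fall back to the standard CPU forward pass.
    Tensor<float> output = layer.Forward(input);
}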

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Always true because the PaddingLayer supports backpropagation, even though it has no parameters.

Remarks

This property indicates whether the layer supports backpropagation during training. Although the PaddingLayer has no trainable parameters, it still supports the backward pass to propagate gradients to previous layers.

For Beginners: This property tells you if the layer can participate in the training process.

A value of true means:

  • The layer can pass gradient information backward during training
  • It's part of the learning process, even though it doesn't have learnable parameters

While this layer doesn't have weights or biases that get updated during training, it still needs to properly handle gradients to ensure that layers before it can learn correctly.

Methods

Backward(Tensor<T>)

Performs the backward pass of the padding layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the layer's output.

Returns

Tensor<T>

The gradient of the loss with respect to the layer's input.

Remarks

This method implements the backward pass of the padding layer, which is used during training to propagate error gradients back through the network. It extracts the gradients corresponding to the original input positions from the output gradient tensor, ignoring the gradients in the padded regions. The method applies the activation function derivative to the result.

For Beginners: This method calculates how changes in the input would affect the final output.

During the backward pass:

  • The layer receives gradients for the entire padded output tensor
  • It extracts only the gradients corresponding to the original input area
  • The gradients in the padded regions are ignored (since they don't correspond to any input)

This is essentially the reverse of the forward pass:

  • Forward: copy input to center of larger padded tensor
  • Backward: extract central region of gradient tensor that corresponds to the original input

This allows the network to learn as if the padding wasn't there, while still benefiting from the additional context it provides.
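The sketch below illustrates the center-extraction idea on a plain 2D array; the actual implementation operates on Tensor<T> and also applies the activation function derivative:

// Reverse of padding: keep only the gradient entries that map back
// to original input positions, discarding the padded border.
static double[,] Unpad2D(double[,] outputGradient, int pad)
{
    int h = outputGradient.GetLength(0) - 2 * pad;
    int w = outputGradient.GetLength(1) - 2 * pad;
    var inputGradient = new double[h, w];
    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++)
            inputGradient[i, j] = outputGradient[i + pad, j + pad];
    return inputGradient;
}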

Exceptions

InvalidOperationException

Thrown when backward is called before forward.

BackwardGpu(IGpuTensor<T>)

Performs the backward pass on GPU tensors.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The GPU-resident gradient tensor.

Returns

IGpuTensor<T>

The gradient with respect to the input (center region extracted).

Remarks

The backward pass extracts the center region of the gradient tensor, which corresponds to the original input positions. This is the reverse of the forward pass padding operation. SliceGpu is used to extract the center region along each padded dimension.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer's computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes.

Returns

ComputationNode<T>

The output computation node representing the layer's operation.

Remarks

This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.

For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.

To support JIT compilation, a layer must:

  1. Implement this method to export its computation graph
  2. Set SupportsJitCompilation to true
  3. Use ComputationNode and TensorOperations to build the graph

All layers are required to implement this method, even if they set SupportsJitCompilation = false.
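From the caller's side, the documented signature is used as follows (layer is assumed to be any ILayer<T> instance):

// The method populates inputNodes with the graph's input nodes and
// returns the output node representing the layer's operation.
var inputNodes = new List<ComputationNode<float>>();
ComputationNode<float> outputNode = layer.ExportComputationGraph(inputNodes);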

Forward(Tensor<T>)

Performs the forward pass of the padding layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to process.

Returns

Tensor<T>

The output tensor after padding and activation.

Remarks

This method implements the forward pass of the padding layer. It creates a new tensor with the padded dimensions, copies the input data to the appropriate positions in the padded tensor, and applies the activation function to the result. The input tensor is cached for use during the backward pass.

For Beginners: This method performs the actual padding operation.

During the forward pass:

  • The method creates a new, larger tensor to hold the padded data
  • It copies the original data to the center of this new tensor
  • The areas around the edges are implicitly filled with zeros
  • Finally, it applies the activation function to the result

For example, with a 3×3 image and padding of 1:

  • The output is a 5×5 image
  • The original 3×3 data is in the center
  • The outer border of width 1 is filled with zeros

The method also saves the input for later use in backpropagation.
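The sketch below shows the core copy-to-center idea on a plain 2D array; the actual implementation operates on Tensor<T> and applies the activation function afterward:

// Symmetric zero-padding: the new array is zero-initialized, so the
// border is already zeros; the input is copied into the center.
static double[,] Pad2D(double[,] input, int pad)
{
    int h = input.GetLength(0), w = input.GetLength(1);
    var output = new double[h + 2 * pad, w + 2 * pad];
    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++)
            output[i + pad, j + pad] = input[i, j];
    return output;
}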

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass of the layer on GPU.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU-resident input tensor(s).

Returns

IGpuTensor<T>

The GPU-resident output tensor.

Remarks

This method performs the layer's forward computation entirely on GPU. The input and output tensors remain in GPU memory, avoiding expensive CPU-GPU transfers.

For Beginners: This is like Forward() but runs on the graphics card.

The key difference:

  • Forward() uses CPU tensors that may be copied to/from GPU
  • ForwardGpu() keeps everything on GPU the whole time

Override this in derived classes that support GPU acceleration.

Exceptions

NotSupportedException

Thrown when the layer does not support GPU execution.

GetParameters()

Gets all trainable parameters from the padding layer as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

An empty vector since PaddingLayer has no trainable parameters.

Remarks

This method retrieves all trainable parameters from the layer as a single vector. Since PaddingLayer has no trainable parameters, it returns an empty vector.

For Beginners: This method returns all the learnable values in the layer.

Since PaddingLayer:

  • Only performs a fixed operation (adding zeros around the edges)
  • Has no weights, biases, or other learnable parameters

the method returns an empty vector.

This is different from a Dense layer, which would return its weights and biases.
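A brief sketch, assuming layer is a PaddingLayer<float>:

// Documented to return an empty vector for this layer.
Vector<float> parameters = layer.GetParameters();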

ResetState()

Resets the internal state of the padding layer.

public override void ResetState()

Remarks

This method resets the internal state of the padding layer, including the cached input tensor. This is useful when starting to process a new sequence or batch of data.

For Beginners: This method clears the layer's memory to start fresh.

When resetting the state:

  • Stored input from previous processing is cleared
  • The layer forgets any information from previous data batches

This is important for:

  • Processing a new, unrelated batch of data
  • Ensuring clean state before a new training epoch
  • Preventing information from one batch affecting another

While the PaddingLayer doesn't maintain long-term state across samples, clearing these cached values helps with memory management and ensures a clean processing pipeline.

UpdateParameters(T)

Updates the parameters of the padding layer using the calculated gradients.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate to use for the parameter updates.

Remarks

This method is part of the training process, but since PaddingLayer has no trainable parameters, this method does nothing.

For Beginners: This method would normally update a layer's internal values during training.

However, since PaddingLayer just performs a fixed operation (adding zeros around the edges) and doesn't have any internal values that can be learned or adjusted, this method is empty.

This is unlike layers such as Dense or Convolutional layers, which have weights and biases that get updated during training.