Class PoolingLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Represents a layer that performs pooling operations on input tensors.

public class PoolingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
LayerBase<T> → PoolingLayer<T>
Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Remarks

The PoolingLayer reduces the spatial dimensions (height and width) of input tensors by applying either max pooling or average pooling within local regions. This operation is commonly used in convolutional neural networks to shrink feature maps, which reduces computation, provides translation invariance, and helps control overfitting.

For Beginners: This layer helps reduce the size of your data while keeping the important information.

Think of it like creating a thumbnail of an image:

  • The pooling layer divides your input into small regions (e.g., 2×2 squares)
  • For each region, it either:
    • Takes the maximum value (max pooling): good for detecting features like edges
    • Takes the average value (average pooling): good for preserving background information
  • This creates a smaller output with fewer pixels but retains the important features

For example, using 2×2 max pooling with stride 2 on a 4×4 image would give you a 2×2 output, where each value is the maximum from its corresponding 2×2 region in the input (see the sketch after the list below).

Pooling helps make your neural network:

  • More efficient (by reducing the amount of data)
  • More robust (by being less sensitive to exact positions of features)
  • Less prone to overfitting (by reducing the number of parameters)
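
The following sketch (plain C#, independent of the AiDotNet types) applies 2×2 max and average pooling with stride 2 to a 4×4 input, producing a 2×2 output for each:

// Conceptual sketch: 2x2 pooling with stride 2 on a 4x4 input.
float[,] input =
{
    { 1f, 3f, 2f, 1f },
    { 4f, 6f, 5f, 2f },
    { 7f, 8f, 3f, 0f },
    { 1f, 2f, 9f, 4f },
};

var maxOut = new float[2, 2];
var avgOut = new float[2, 2];

for (int oy = 0; oy < 2; oy++)
for (int ox = 0; ox < 2; ox++)
{
    float max = float.MinValue, sum = 0f;
    for (int ky = 0; ky < 2; ky++)
    for (int kx = 0; kx < 2; kx++)
    {
        float v = input[oy * 2 + ky, ox * 2 + kx];
        if (v > max) max = v;
        sum += v;
    }
    maxOut[oy, ox] = max;       // e.g. top-left region {1, 3, 4, 6} -> 6
    avgOut[oy, ox] = sum / 4f;  // e.g. top-left region {1, 3, 4, 6} -> 3.5
}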

Constructors

PoolingLayer(int, int, int, int, int, PoolingType)

Initializes a new instance of the PoolingLayer<T> class with the specified dimensions and pooling parameters.

public PoolingLayer(int inputDepth, int inputHeight, int inputWidth, int poolSize, int stride, PoolingType type = PoolingType.Max)

Parameters

inputDepth int

The depth (number of channels) of the input tensor.

inputHeight int

The height of the input tensor.

inputWidth int

The width of the input tensor.

poolSize int

The size of the pooling window.

stride int

The stride of the pooling operation.

type PoolingType

The type of pooling to perform. Defaults to Max.

Remarks

This constructor creates a PoolingLayer with the specified input dimensions and pooling parameters. The output dimensions are calculated based on the input dimensions, pool size, and stride.

For Beginners: This constructor sets up the layer with the necessary dimensions and pooling options.

When creating a PoolingLayer, you need to specify:

  • inputDepth: The number of channels in your input (e.g., 3 for RGB images)
  • inputHeight: The height of your input
  • inputWidth: The width of your input
  • poolSize: The size of the pooling regions (e.g., 2 for 2×2 regions)
  • stride: How far to move the pooling window each step
  • type: Whether to use max pooling or average pooling (defaults to max)

The constructor automatically calculates what the output dimensions will be based on these parameters. For example, a 28×28 input with pool size 2 and stride 2 would produce a 14×14 output.
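
As a sketch, the layer from this example can be created with the constructor above, and the output size can be predicted with the standard no-padding formula (an assumption that is consistent with the 28×28 → 14×14 example):

// Hedged sketch: construct the layer and predict its output dimensions.
var pool = new PoolingLayer<float>(
    inputDepth: 3, inputHeight: 28, inputWidth: 28,
    poolSize: 2, stride: 2, type: PoolingType.Max);

// Assumed formula (no padding): outputSize = (inputSize - poolSize) / stride + 1
int outputHeight = (28 - 2) / 2 + 1;  // 14
int outputWidth  = (28 - 2) / 2 + 1;  // 14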

Properties

PoolSize

Gets the size of the pooling window.

public int PoolSize { get; }

Property Value

int

The size of the pooling window (both height and width).

Remarks

This property indicates the size of the square window used for pooling operations. For example, a pool size of 2 means that pooling is performed on 2×2 regions of the input.

For Beginners: This property defines how large each pooling region is.

For example:

  • PoolSize = 2 means each pooling region is 2×2 pixels
  • PoolSize = 3 means each pooling region is 3×3 pixels

Larger pool sizes reduce the output dimensions more dramatically but might lose more detail. Common values are 2 and 3.

Stride

Gets the stride of the pooling operation.

public int Stride { get; }

Property Value

int

The number of pixels to move the pooling window for each step.

Remarks

This property indicates how many pixels the pooling window moves for each step. For example, a stride of 2 means the pooling window moves 2 pixels horizontally and vertically for each new pooling operation.

For Beginners: This property defines how far the pooling window moves each step.

For example:

  • Stride = 1 means the window slides just 1 pixel at a time (creates overlapping regions)
  • Stride = 2 means the window jumps 2 pixels each time (typical value, creates non-overlapping regions)

Usually, the stride is set equal to the pool size for non-overlapping pooling regions, but you can use a smaller stride if you want the regions to overlap.
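
As an illustration, using the same no-padding output-size formula assumed above, a 4×4 input with a 2×2 window gives different output sizes depending on the stride:

// Illustrative sketch: stride 2 (non-overlapping) vs. stride 1 (overlapping) windows.
int inputSize = 4, poolSize = 2;

int withStride2 = (inputSize - poolSize) / 2 + 1;  // 2 outputs per dimension
int withStride1 = (inputSize - poolSize) / 1 + 1;  // 3 outputs per dimension (windows overlap)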

SupportsGpuExecution

Gets a value indicating whether this layer has a GPU implementation.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this pooling layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

True if the layer is properly configured.

Remarks

This property indicates whether the layer can be JIT compiled. The layer supports JIT compilation if the input shape is configured.

For Beginners: This tells you if this layer can use JIT compilation for faster inference.

The layer can be JIT compiled if:

  • The layer has been initialized with valid input shape

Pooling has no trainable parameters, so it can be JIT compiled immediately after initialization. It's a purely computational operation that:

  • Selects maximum values (max pooling) or averages values (average pooling)
  • Reduces spatial dimensions for efficiency
  • Provides translation invariance

JIT compilation optimizes:

  • Window sliding and boundary handling
  • Parallel operations across channels
  • Memory access patterns for cache efficiency
  • Special handling for max pooling index tracking

Once initialized, JIT compilation can provide significant speedup (5-10x) especially for large feature maps in CNNs where pooling is applied extensively.

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Always true because the PoolingLayer supports backpropagation, even though it has no parameters.

Remarks

This property indicates whether the layer supports backpropagation during training. Although the PoolingLayer has no trainable parameters, it still supports the backward pass to propagate gradients to previous layers.

For Beginners: This property tells you if the layer can participate in the training process.

A value of true means:

  • The layer can pass gradient information backward during training
  • It's part of the learning process, even though it doesn't have learnable parameters

While this layer doesn't have weights or biases that get updated during training, it still needs to properly handle gradients to ensure that layers before it can learn correctly.

Type

Gets the type of pooling operation to perform.

public PoolingType Type { get; }

Property Value

PoolingType

The pooling type (Max or Average).

Remarks

This property indicates whether to use max pooling (which takes the maximum value within each pooling window) or average pooling (which computes the average of all values within each pooling window).

For Beginners: This property determines which mathematical operation is used for pooling.

There are two main types:

  • Max pooling: Takes the largest value in each region
    • Good for detecting if a feature is present somewhere in the region
    • Commonly used in most CNNs
  • Average pooling: Takes the average of all values in each region
    • Good for preserving background information
    • Sometimes used for the final layer of a network

Max pooling tends to be more common because it's better at preserving important features like edges and patterns.
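
As a sketch, both types are selected through the constructor's type parameter (assuming the PoolingType enum exposes the Max and Average members described above):

// Hedged sketch: the same spatial configuration with each pooling type.
var maxPool = new PoolingLayer<float>(inputDepth: 64, inputHeight: 14, inputWidth: 14,
                                      poolSize: 2, stride: 2, type: PoolingType.Max);
var avgPool = new PoolingLayer<float>(inputDepth: 64, inputHeight: 14, inputWidth: 14,
                                      poolSize: 2, stride: 2, type: PoolingType.Average);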

Methods

Backward(Tensor<T>)

Performs the backward pass of the pooling layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the layer's output.

Returns

Tensor<T>

The gradient of the loss with respect to the layer's input.

Remarks

This method implements the backward pass of the pooling layer, which is used during training to propagate error gradients back through the network. For max pooling, gradients are passed only to the positions that had the maximum values in each pooling region. For average pooling, gradients are distributed equally across all positions in each pooling region.

For Beginners: This method calculates how changes in the input would affect the final output.

During the backward pass:

  • The layer receives gradients for each position in the output tensor
  • It needs to pass these gradients back to the appropriate positions in the input tensor

For max pooling:

  • Only the position that had the maximum value gets the gradient
  • All other positions in the pooling region get zero gradient
  • This is because changing non-maximum values wouldn't affect the output

For average pooling:

  • The gradient is divided equally among all positions in the pooling region
  • Each position gets (output gradient) / (pool size × pool size)
  • This is because each input position contributes equally to the average

This approach follows the chain rule of calculus for the respective pooling operations.
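
The following sketch (plain C#, independent of the AiDotNet types) shows this gradient routing for a single 2×2 region:

// Conceptual sketch: routing the gradient for one 2x2 pooling region.
float[] region = { 1f, 3f, 4f, 6f };   // forward-pass input values for this region
float outputGrad = 0.8f;               // gradient arriving at this region's single output value

// Max pooling: only the position that held the maximum receives the gradient.
var maxGrad = new float[4];
int argmax = 0;
for (int i = 1; i < region.Length; i++)
    if (region[i] > region[argmax]) argmax = i;
maxGrad[argmax] = outputGrad;          // { 0, 0, 0, 0.8 }

// Average pooling: the gradient is split evenly across the region.
var avgGrad = new float[4];
for (int i = 0; i < avgGrad.Length; i++)
    avgGrad[i] = outputGrad / 4f;      // { 0.2, 0.2, 0.2, 0.2 }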

Exceptions

InvalidOperationException

Thrown when backward is called before forward or when _maxIndices is null during max pooling.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the pooling layer as a computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to which the input node will be added.

Returns

ComputationNode<T>

The output computation node representing the pooling operation.

Remarks

This method creates a symbolic computation graph for JIT compilation:

  1. Creates a symbolic input node with shape [batch=1, channels, height, width]
  2. Applies either MaxPool2D or AvgPool2D based on the pooling type
  3. No learnable parameters needed (pooling is parameter-free)

For Beginners: This method builds a symbolic representation of pooling for JIT.

JIT compilation converts the pooling operation into optimized native code. Pooling (max or average):

  • Reduces spatial dimensions by selecting max or averaging values in each window
  • Slides a window across the input with specified stride
  • Provides translation invariance and reduces overfitting
  • Has no trainable parameters (purely computational)

The symbolic graph allows the JIT compiler to:

  • Optimize the sliding window computation
  • Generate SIMD-optimized code for parallel operations
  • Fuse operations with adjacent layers

Pooling is essential in CNNs for dimensionality reduction and feature extraction. JIT compilation provides 5-10x speedup by optimizing window operations.
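
A minimal usage sketch, assuming a layer constructed as in the constructor example and the types documented on this page (how the resulting node is compiled and executed is outside the scope of this sketch):

// Hedged sketch: export the pooling operation as a computation graph node.
var pool = new PoolingLayer<float>(inputDepth: 3, inputHeight: 28, inputWidth: 28,
                                   poolSize: 2, stride: 2, type: PoolingType.Max);

if (pool.SupportsJitCompilation)
{
    var inputNodes = new List<ComputationNode<float>>();
    ComputationNode<float> poolingNode = pool.ExportComputationGraph(inputNodes);
    // inputNodes now holds the symbolic input node; poolingNode represents the pooling operation over it.
}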

Exceptions

ArgumentNullException

Thrown when inputNodes is null.

InvalidOperationException

Thrown when layer shape is not configured.

Forward(Tensor<T>)

Performs the forward pass of the pooling layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to process.

Returns

Tensor<T>

The output tensor after pooling.

Remarks

This method implements the forward pass of the pooling layer. It divides the input tensor into regions according to the pool size and stride, and then applies either max pooling or average pooling to each region. The result is a tensor with reduced spatial dimensions.

For Beginners: This method performs the actual pooling operation.

During the forward pass:

  • The method divides the input into regions based on pool size and stride
  • For each region, it either:
    • Finds the maximum value (for max pooling)
    • Calculates the average value (for average pooling)
  • It saves these values in the output tensor

For max pooling, it also keeps track of which position in each region had the maximum value. This information is needed later during backpropagation.

The method also saves the input for later use in backpropagation.
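
A minimal usage sketch, assuming featureMap is a Tensor<float> produced by a previous layer with a shape matching the constructor arguments (the exact tensor layout follows the library's Tensor<T> conventions):

// Hedged sketch: downsample a 3-channel 28x28 feature map with 2x2 max pooling.
static Tensor<float> Downsample(Tensor<float> featureMap)
{
    var pool = new PoolingLayer<float>(
        inputDepth: 3, inputHeight: 28, inputWidth: 28,
        poolSize: 2, stride: 2, type: PoolingType.Max);

    return pool.Forward(featureMap);   // each channel is reduced from 28x28 to 14x14
}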

GetParameters()

Gets all trainable parameters from the pooling layer as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

An empty vector since PoolingLayer has no trainable parameters.

Remarks

This method retrieves all trainable parameters from the layer as a single vector. Since PoolingLayer has no trainable parameters, it returns an empty vector.

For Beginners: This method returns all the learnable values in the layer.

Since the PoolingLayer only performs fixed mathematical operations (max or average calculation) and has no weights, biases, or other learnable parameters, this method returns an empty vector.

This is different from layers like Dense layers, which would return their weights and biases.

ResetState()

Resets the internal state of the pooling layer.

public override void ResetState()

Remarks

This method resets the internal state of the pooling layer, including the cached input tensor and maximum indices. This is useful when starting to process a new sequence or batch of data.

For Beginners: This method clears the layer's memory to start fresh.

When resetting the state:

  • Stored input from previous processing is cleared
  • For max pooling, the stored positions of maximum values are cleared
  • The layer forgets any information from previous data batches

This is important for:

  • Processing a new, unrelated batch of data
  • Ensuring clean state before a new training epoch
  • Preventing information from one batch affecting another

While the PoolingLayer doesn't maintain long-term state across samples, clearing these cached values helps with memory management and ensuring a clean processing pipeline.

UpdateParameters(T)

Updates the parameters of the pooling layer using the calculated gradients.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate to use for the parameter updates.

Remarks

This method is part of the training process, but since PoolingLayer has no trainable parameters, this method does nothing.

For Beginners: This method would normally update a layer's internal values during training.

However, since PoolingLayer just performs a fixed mathematical operation (pooling) and doesn't have any internal values that can be learned or adjusted, this method is empty.

This is unlike layers such as Dense or Convolutional layers, which have weights and biases that get updated during training.