
Class MaxPoolingLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Implements a max pooling layer for neural networks, which reduces the spatial dimensions of the input by taking the maximum value in each pooling window.

public class MaxPoolingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for computations (typically float or double).

Inheritance
object ← LayerBase<T> ← MaxPoolingLayer<T>

Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Remarks

For Beginners: A max pooling layer helps reduce the size of data flowing through a neural network while keeping the most important information. It works by dividing the input into small windows (determined by the pool size) and keeping only the largest value from each window.

Think of it like summarizing a detailed picture: instead of describing every pixel, you just point out the most noticeable feature in each area of the image.

This helps the network:

  1. Focus on the most important features
  2. Reduce computation needs
  3. Make the model more robust to small changes in input position
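The idea can be shown with a tiny standalone sketch (illustrative Python, not the library's implementation): pooling a 4×4 grid with a 2×2 window and stride 2 keeps one value, the largest, per window.

```python
# Standalone illustration of 2x2 max pooling with stride 2 (not AiDotNet code).
grid = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [4, 8, 3, 1],
]

# For each non-overlapping 2x2 window, keep only the maximum value.
pooled = [
    [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
     for c in range(0, 4, 2)]
    for r in range(0, 4, 2)
]

print(pooled)  # [[6, 4], [8, 9]]
```

The 4×4 input shrinks to 2×2, yet the strongest response in each region survives.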

Constructors

MaxPoolingLayer(int[], int, int)

Creates a new max pooling layer with the specified parameters.

public MaxPoolingLayer(int[] inputShape, int poolSize, int stride)

Parameters

inputShape int[]

The shape of the input data (channels, height, width).

poolSize int

The size of the pooling window.

stride int

The step size when moving the pooling window.

Remarks

For Beginners: This constructor sets up the max pooling layer with your chosen settings. It calculates what the output shape will be based on your input shape, pool size, and stride.
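The output-shape calculation follows the usual pooling arithmetic: each spatial dimension shrinks to floor((size − poolSize) / stride) + 1, with the channel count unchanged. A hedged Python sketch of that formula (no padding and floor division are assumptions here; the library's exact rounding may differ):

```python
def pooled_output_shape(input_shape, pool_size, stride):
    """Compute (channels, height, width) after max pooling.

    Assumes no padding and floor division, the most common convention;
    the library's exact behavior may differ.
    """
    channels, height, width = input_shape
    out_h = (height - pool_size) // stride + 1
    out_w = (width - pool_size) // stride + 1
    return (channels, out_h, out_w)

print(pooled_output_shape((3, 32, 32), 2, 2))  # (3, 16, 16)
```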

Properties

PoolSize

Gets the size of the pooling window.

public int PoolSize { get; }

Property Value

int

Remarks

For Beginners: This determines how large of an area we look at when selecting the maximum value. For example, a pool size of 2 means we look at 2×2 squares of the input.

Stride

Gets the step size when moving the pooling window across the input.

public int Stride { get; }

Property Value

int

Remarks

For Beginners: This controls how much we move our window each time. For example, a stride of 2 means we move the window 2 pixels at a time, which reduces the output size to half of the input size (assuming pool size is also 2).

SupportsGpuExecution

Indicates that this layer supports GPU-accelerated execution.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

True if the layer can be JIT compiled, false otherwise.

Remarks

This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.

For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.

Layers should return false if they:

  • Have not yet implemented a working ExportComputationGraph()
  • Use dynamic operations that change based on input data
  • Are too simple to benefit from JIT compilation

When false, the layer will use the standard Forward() method instead.

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

true if the layer has trainable parameters and supports backpropagation; otherwise, false.

Remarks

This property indicates whether the layer can be trained through backpropagation. Layers with trainable parameters such as weights and biases typically return true, while layers that only perform fixed transformations (like pooling or activation layers) typically return false.

For Beginners: This property tells you if the layer can learn from data.

A value of true means:

  • The layer has parameters that can be adjusted during training
  • It will improve its performance as it sees more data
  • It participates in the learning process

A value of false means:

  • The layer doesn't have any adjustable parameters
  • It performs the same operation regardless of training
  • It doesn't need to learn (but may still be useful)

Methods

Backward(Tensor<T>)

Performs the backward pass of the max pooling operation.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient flowing back from the next layer.

Returns

Tensor<T>

The gradient to pass to the previous layer.

Remarks

For Beginners: During training, neural networks need to adjust their parameters based on how much error they made. This adjustment flows backward through the network.

In max pooling, only the maximum value from each window contributed to the output. So during the backward pass, the gradient only flows back to that maximum value's position. All other positions receive zero gradient because they didn't contribute to the output.

Think of it like giving credit only to the team member who contributed the most to a project.
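A minimal Python sketch of this routing for a single 2×2 window (illustrative only; AiDotNet records the winning positions internally during the forward pass):

```python
# Illustrative gradient routing for one max-pooling window (not AiDotNet code).
inp = [
    [1.0, 3.0],
    [2.0, 0.5],
]
out_grad = 10.0  # gradient arriving for the single pooled output

# Forward: find which position held the maximum value.
flat = [(inp[r][c], (r, c)) for r in range(2) for c in range(2)]
_, winner = max(flat)

# Backward: only the winner receives the gradient; everything else gets zero.
in_grad = [[0.0, 0.0], [0.0, 0.0]]
in_grad[winner[0]][winner[1]] = out_grad

print(in_grad)  # [[0.0, 10.0], [0.0, 0.0]]
```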

Exceptions

ArgumentException

Thrown when the output gradient tensor doesn't have 3 dimensions.

BackwardGpu(IGpuTensor<T>)

Performs GPU-resident backward pass of max pooling.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The gradient of the output on GPU.

Returns

IGpuTensor<T>

The gradient with respect to input as a GPU-resident tensor.

Deserialize(BinaryReader)

Loads the layer's configuration from a binary stream.

public override void Deserialize(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader to read the data from.

Remarks

For Beginners: This method loads previously saved settings for the layer. It's the counterpart to Serialize - if Serialize is like saving your game, Deserialize is like loading that saved game.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer's computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes.

Returns

ComputationNode<T>

The output computation node representing the layer's operation.

Remarks

This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.

For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.

To support JIT compilation, a layer must:

  1. Implement this method to export its computation graph
  2. Set SupportsJitCompilation to true
  3. Use ComputationNode and TensorOperations to build the graph

All layers are required to implement this method, even if they set SupportsJitCompilation = false.

Forward(Tensor<T>)

Performs the forward pass of the max pooling operation.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to apply max pooling to.

Returns

Tensor<T>

The output tensor after max pooling.

Remarks

For Beginners: This is where the actual max pooling happens. For each small window in the input:

  1. We look at all values in that window
  2. We find the largest value
  3. We put that value in the output
  4. We remember where that maximum value was located (for the backward pass)

The method processes the input channel by channel, sliding the pooling window across the height and width dimensions.
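Those steps can be sketched in plain Python for one channel (a conceptual illustration, not the library's implementation):

```python
def max_pool_channel(channel, pool, stride):
    """Max-pool a 2D list, also recording where each maximum came from."""
    h, w = len(channel), len(channel[0])
    out, max_indices = [], []
    for r in range(0, h - pool + 1, stride):
        row_vals, row_idx = [], []
        for c in range(0, w - pool + 1, stride):
            # Steps 1-2: scan the window and find the largest value.
            best, best_pos = channel[r][c], (r, c)
            for i in range(r, r + pool):
                for j in range(c, c + pool):
                    if channel[i][j] > best:
                        best, best_pos = channel[i][j], (i, j)
            # Steps 3-4: emit the value and remember its position.
            row_vals.append(best)
            row_idx.append(best_pos)
        out.append(row_vals)
        max_indices.append(row_idx)
    return out, max_indices

vals, idx = max_pool_channel([[1, 2], [4, 3]], pool=2, stride=2)
print(vals, idx)  # [[4]] [[(1, 0)]]
```

The recorded positions are exactly what the backward pass needs to route gradients to the winning inputs.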

Exceptions

ArgumentException

Thrown when the input tensor doesn't have 3 dimensions.

ForwardGpu(params IGpuTensor<T>[])

Performs GPU-resident forward pass of max pooling, keeping all data on GPU.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU-resident input tensors to apply max pooling to.

Returns

IGpuTensor<T>

The pooled output as a GPU-resident tensor.

GetActivationTypes()

Returns the activation functions used by this layer.

public override IEnumerable<ActivationFunction> GetActivationTypes()

Returns

IEnumerable<ActivationFunction>

An empty collection since max pooling layers don't use activation functions.

Remarks

For Beginners: Activation functions are mathematical operations that determine the output of a neural network node. They introduce non-linearity, which helps neural networks learn complex patterns.

However, max pooling layers don't use activation functions - they simply select the maximum value from each window. That's why this method returns an empty collection.

GetParameters()

Gets all trainable parameters of the layer.

public override Vector<T> GetParameters()

Returns

Vector<T>

An empty vector since max pooling layers have no trainable parameters.

Remarks

For Beginners: This method returns all the values that can be adjusted during training.

Many neural network layers have weights and biases that get updated as the network learns. However, max pooling layers simply select the maximum value from each window - there are no weights or biases to adjust.

This is why the method returns an empty vector (essentially a list with no elements).

GetPoolSize()

Gets the pool size for the pooling operation.

public int[] GetPoolSize()

Returns

int[]

An array containing the pool size for height and width dimensions.

Remarks

For Beginners: This method tells you how large the pooling window is. It returns the window size for both the height and width dimensions; since the layer is configured with a single poolSize value, both entries are the same.

GetStride()

Gets the stride for the pooling operation.

public int[] GetStride()

Returns

int[]

An array containing the stride for height and width dimensions.

ResetState()

Resets the internal state of the layer.

public override void ResetState()

Remarks

For Beginners: This method clears any information the layer has stored from previous calculations.

During the forward pass, the max pooling layer remembers which positions had the maximum values (stored in _maxIndices). This is needed for the backward pass.

Resetting the state clears this memory, which is useful when:

  1. Starting a new training session
  2. Processing a new batch of data
  3. Switching from training to evaluation mode

It's like wiping a whiteboard clean before starting a new calculation.

Serialize(BinaryWriter)

Saves the layer's configuration to a binary stream.

public override void Serialize(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to write the data to.

Remarks

For Beginners: This method saves the layer's settings (pool size and stride) so that you can reload the exact same layer later. It's like saving your game progress so you can continue from where you left off.

UpdateParameters(T)

Updates the layer's parameters during training.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate that controls how much parameters change.

Remarks

For Beginners: This method is part of the neural network training process.

During training, most layers need to update their internal values (parameters) to learn from data. However, max pooling layers don't have any trainable parameters - they just pass through the maximum values from each window.

Think of it like a simple rule that doesn't need to be adjusted: "Always pick the largest number." Since this rule never changes, there's nothing to update in this method.