Class MaxPoolingLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Implements a max pooling layer for neural networks, which reduces the spatial dimensions of the input by taking the maximum value in each pooling window.
public class MaxPoolingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for computations (typically float or double).
Inheritance
LayerBase<T> → MaxPoolingLayer<T>
Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
For Beginners: A max pooling layer helps reduce the size of data flowing through a neural network while keeping the most important information. It works by dividing the input into small windows (determined by the pool size) and keeping only the largest value from each window.
Think of it like summarizing a detailed picture: instead of describing every pixel, you just point out the most noticeable feature in each area of the image.
This helps the network:
- Focus on the most important features
- Reduce computation needs
- Make the model more robust to small changes in input position
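The windowing described above can be sketched in a few lines of plain Python (an illustrative sketch of the mechanics, not the AiDotNet API): a 4×4 input pooled with a 2×2 window and stride 2 yields a 2×2 output holding each window's maximum.

```python
# Plain-Python sketch of max pooling on one channel (not the AiDotNet API).
def max_pool_2d(x, pool_size=2, stride=2):
    h, w = len(x), len(x[0])
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Keep only the largest value in each pool_size x pool_size window.
            window = [x[i * stride + di][j * stride + dj]
                      for di in range(pool_size) for dj in range(pool_size)]
            out[i][j] = max(window)
    return out

x = [[1, 3, 2, 4],
     [5, 6, 1, 2],
     [7, 2, 9, 1],
     [3, 4, 5, 8]]
print(max_pool_2d(x))  # [[6, 4], [7, 9]]
```

Each 2×2 block of the input collapses to a single number, so the spatial size halves in each dimension while the most prominent value in each region survives.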
Constructors
MaxPoolingLayer(int[], int, int)
Creates a new max pooling layer with the specified parameters.
public MaxPoolingLayer(int[] inputShape, int poolSize, int stride)
Parameters
inputShape (int[]): The shape of the input data (channels, height, width).
poolSize (int): The size of the pooling window.
stride (int): The step size when moving the pooling window.
Remarks
For Beginners: This constructor sets up the max pooling layer with your chosen settings. It calculates what the output shape will be based on your input shape, pool size, and stride.
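The shape arithmetic can be sketched with the standard pooling formula (the library's exact rounding behavior is an assumption here): each spatial dimension shrinks to floor((input − poolSize) / stride) + 1, while the channel count is unchanged.

```python
# Standard pooling output-shape formula (floor division assumed).
def pooled_shape(input_shape, pool_size, stride):
    channels, height, width = input_shape
    out_h = (height - pool_size) // stride + 1
    out_w = (width - pool_size) // stride + 1
    return (channels, out_h, out_w)

print(pooled_shape((3, 32, 32), 2, 2))  # (3, 16, 16)
```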
Properties
PoolSize
Gets the size of the pooling window.
public int PoolSize { get; }
Property Value
Remarks
For Beginners: This determines how large of an area we look at when selecting the maximum value. For example, a pool size of 2 means we look at 2×2 squares of the input.
Stride
Gets the step size when moving the pooling window across the input.
public int Stride { get; }
Property Value
Remarks
For Beginners: This controls how much we move our window each time. For example, a stride of 2 means we move the window 2 pixels at a time, which reduces the output size to half of the input size (assuming pool size is also 2).
SupportsGpuExecution
Indicates that this layer supports GPU-accelerated execution.
protected override bool SupportsGpuExecution { get; }
Property Value
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
true if the layer has trainable parameters and supports backpropagation; otherwise, false.
Remarks
This property indicates whether the layer can be trained through backpropagation. Layers with trainable parameters such as weights and biases typically return true, while layers that only perform fixed transformations (like pooling or activation layers) typically return false.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has parameters that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process
A value of false means:
- The layer doesn't have any adjustable parameters
- It performs the same operation regardless of training
- It doesn't need to learn (but may still be useful)
Methods
Backward(Tensor<T>)
Performs the backward pass of the max pooling operation.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient flowing back from the next layer.
Returns
- Tensor<T>
The gradient to pass to the previous layer.
Remarks
For Beginners: During training, neural networks need to adjust their parameters based on how much error they made. This adjustment flows backward through the network.
In max pooling, only the maximum value from each window contributed to the output. So during the backward pass, the gradient only flows back to that maximum value's position. All other positions receive zero gradient because they didn't contribute to the output.
Think of it like giving credit only to the team member who contributed the most to a project.
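The routing rule can be sketched in plain Python (illustrative only, not the library's internals): the forward pass records each window's argmax position, and the backward pass sends the incoming gradient only to those positions, leaving every other position at zero.

```python
# Sketch of max-pooling backward: gradient flows only to each window's argmax.
def forward_with_indices(x, pool_size=2, stride=2):
    h, w = len(x), len(x[0])
    out_h = (h - pool_size) // stride + 1
    out_w = (w - pool_size) // stride + 1
    out, idx = [], []
    for i in range(out_h):
        row, irow = [], []
        for j in range(out_w):
            best, best_pos = None, None
            for di in range(pool_size):
                for dj in range(pool_size):
                    r, c = i * stride + di, j * stride + dj
                    if best is None or x[r][c] > best:
                        best, best_pos = x[r][c], (r, c)
            row.append(best)
            irow.append(best_pos)  # remember where the max came from
        out.append(row)
        idx.append(irow)
    return out, idx

def backward(grad_out, idx, in_h, in_w):
    grad_in = [[0] * in_w for _ in range(in_h)]
    for i, irow in enumerate(idx):
        for j, (r, c) in enumerate(irow):
            grad_in[r][c] += grad_out[i][j]  # all other positions stay zero
    return grad_in

x = [[1, 3], [5, 6]]
out, idx = forward_with_indices(x, 2, 2)
print(out)                            # [[6]]
print(backward([[1.0]], idx, 2, 2))   # [[0, 0], [0, 1.0]]
```

Only the position that held the maximum (here, the 6 at the bottom-right) receives the gradient; the other three inputs had no effect on the output, so their gradient is zero.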
Exceptions
- ArgumentException
Thrown when the output gradient tensor doesn't have 3 dimensions.
BackwardGpu(IGpuTensor<T>)
Performs GPU-resident backward pass of max pooling.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): The gradient of the output on GPU.
Returns
- IGpuTensor<T>
The gradient with respect to input as a GPU-resident tensor.
Deserialize(BinaryReader)
Loads the layer's configuration from a binary stream.
public override void Deserialize(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader to read the data from.
Remarks
For Beginners: This method loads previously saved settings for the layer. It's the counterpart to Serialize - if Serialize is like saving your game, Deserialize is like loading that saved game.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the max pooling operation.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to apply max pooling to.
Returns
- Tensor<T>
The output tensor after max pooling.
Remarks
For Beginners: This is where the actual max pooling happens. For each small window in the input:
- We look at all values in that window
- We find the largest value
- We put that value in the output
- We remember where that maximum value was located (for the backward pass)
The method processes the input channel by channel, sliding the pooling window across the height and width dimensions.
Exceptions
- ArgumentException
Thrown when the input tensor doesn't have 3 dimensions.
ForwardGpu(params IGpuTensor<T>[])
Performs GPU-resident forward pass of max pooling, keeping all data on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The GPU-resident input tensors.
Returns
- IGpuTensor<T>
The pooled output as a GPU-resident tensor.
GetActivationTypes()
Returns the activation functions used by this layer.
public override IEnumerable<ActivationFunction> GetActivationTypes()
Returns
- IEnumerable<ActivationFunction>
An empty collection since max pooling layers don't use activation functions.
Remarks
For Beginners: Activation functions are mathematical operations that determine the output of a neural network node. They introduce non-linearity, which helps neural networks learn complex patterns.
However, max pooling layers don't use activation functions - they simply select the maximum value from each window. That's why this method returns an empty collection.
GetParameters()
Gets all trainable parameters of the layer.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector since max pooling layers have no trainable parameters.
Remarks
For Beginners: This method returns all the values that can be adjusted during training.
Many neural network layers have weights and biases that get updated as the network learns. However, max pooling layers simply select the maximum value from each window - there are no weights or biases to adjust.
This is why the method returns an empty vector (essentially a list with no elements).
GetPoolSize()
Gets the pool size for the pooling operation.
public int[] GetPoolSize()
Returns
- int[]
An array containing the pool size for height and width dimensions.
Remarks
For Beginners: This method returns the pooling window size as an array with one entry per spatial dimension (height and width). Since the layer is configured with a single poolSize value, both entries equal PoolSize.
GetStride()
Gets the stride for the pooling operation.
public int[] GetStride()
Returns
- int[]
An array containing the stride for height and width dimensions.
ResetState()
Resets the internal state of the layer.
public override void ResetState()
Remarks
For Beginners: This method clears any information the layer has stored from previous calculations.
During the forward pass, the max pooling layer remembers which positions had the maximum values (stored in _maxIndices). This is needed for the backward pass.
Resetting the state clears this memory, which is useful when:
- Starting a new training session
- Processing a new batch of data
- Switching from training to evaluation mode
It's like wiping a whiteboard clean before starting a new calculation.
Serialize(BinaryWriter)
Saves the layer's configuration to a binary stream.
public override void Serialize(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to write the data to.
Remarks
For Beginners: This method saves the layer's settings (pool size and stride) so that you can reload the exact same layer later. It's like saving your game progress so you can continue from where you left off.
UpdateParameters(T)
Updates the layer's parameters during training.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate that controls how much parameters change.
Remarks
For Beginners: This method is part of the neural network training process.
During training, most layers need to update their internal values (parameters) to learn from data. However, max pooling layers don't have any trainable parameters - they just pass through the maximum values from each window.
Think of it like a simple rule that doesn't need to be adjusted: "Always pick the largest number." Since this rule never changes, there's nothing to update in this method.