Class GlobalPoolingLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a global pooling layer that reduces spatial dimensions to a single value per channel.
public class GlobalPoolingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → GlobalPoolingLayer<T>
- Implements
- ILayer<T>
- IJitCompilable<T>
- IDiagnosticsProvider
- IWeightLoadable<T>
- IDisposable
Remarks
A global pooling layer reduces the spatial dimensions (height and width) of the input feature maps to a single value per channel. This is achieved by applying a pooling operation (such as max or average) across the entire spatial extent of each channel. Global pooling is often used at the end of convolutional neural networks to reduce the spatial dimensions before connecting to fully connected layers, providing some translation invariance and reducing the number of parameters.
For Beginners: A global pooling layer summarizes each feature map into a single value.
Imagine you have a set of 2D feature maps (like heat maps showing where different features appear):
- Global pooling looks at each entire feature map
- It creates a single number that represents that entire feature map
- This dramatically reduces the amount of data while preserving the most important information
For example, with 64 feature maps of size 7×7:
- Input: 7×7×64 (3,136 values)
- Output: 1×1×64 (64 values, one per feature map)
There are two main types of global pooling:
- Global Max Pooling: Takes the maximum value from each feature map (useful for detecting if a feature appears anywhere in the input)
- Global Average Pooling: Takes the average of all values in each feature map (useful for determining the overall presence of a feature)
Global pooling is often used as the final layer before classification, replacing large fully connected layers and reducing overfitting.
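As a rough usage sketch (the Tensor<float> constructor shown here is assumed to take a shape array; only GlobalPoolingLayer<T> members documented on this page are used):
// Pool 7×7 feature maps with 64 channels down to one value per channel.
var layer = new GlobalPoolingLayer<float>(
    inputShape: new[] { 1, 7, 7, 64 },        // [batchSize, height, width, channels]
    poolingType: PoolingType.Average);

var input = new Tensor<float>(new[] { 1, 7, 7, 64 });  // assumed shape-array constructor
Tensor<float> output = layer.Forward(input);            // shape: [1, 1, 1, 64]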
Constructors
GlobalPoolingLayer(int[], PoolingType, IActivationFunction<T>?)
Initializes a new instance of the GlobalPoolingLayer<T> class with an optional scalar activation function.
public GlobalPoolingLayer(int[] inputShape, PoolingType poolingType, IActivationFunction<T>? activationFunction = null)
Parameters
inputShape (int[]): The shape of the input tensor (typically [batchSize, height, width, channels]).
poolingType (PoolingType): The type of pooling operation to apply (Max or Average).
activationFunction (IActivationFunction<T>, optional): The scalar activation function to apply after pooling.
GlobalPoolingLayer(int[], PoolingType, IVectorActivationFunction<T>)
Initializes a new instance of the GlobalPoolingLayer<T> class with a vector activation function.
public GlobalPoolingLayer(int[] inputShape, PoolingType poolingType, IVectorActivationFunction<T> vectorActivationFunction)
Parameters
inputShape (int[]): The shape of the input tensor (typically [batchSize, height, width, channels]).
poolingType (PoolingType): The type of pooling operation to apply (Max or Average).
vectorActivationFunction (IVectorActivationFunction<T>): The vector activation function to apply after pooling (required to disambiguate from the IActivationFunction<T> overload).
Remarks
This constructor creates a new global pooling layer with the specified input shape, pooling type, and vector activation function. The output shape is calculated to have the same batch size and number of channels as the input, but with spatial dimensions reduced to 1×1. Unlike the other constructor, this one accepts a vector activation function that operates on entire vectors rather than individual scalar values.
For Beginners: This is an alternative setup that uses a different kind of activation function.
This constructor is almost identical to the first one, but with one key difference:
- Regular activation: processes each pooled value separately
- Vector activation: processes groups of pooled values together
Vector activation functions are useful when the relationship between different channels needs to be considered. For example, softmax might be applied across all channels to normalize them into a probability distribution.
For most cases, the standard constructor with regular activation functions is sufficient.
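A minimal sketch of this overload, assuming a SoftmaxActivation<float> class that implements IVectorActivationFunction<float> (the activation type name is illustrative, not confirmed by this page):
// Normalize the 64 pooled channel values together instead of activating each one separately.
IVectorActivationFunction<float> softmax = new SoftmaxActivation<float>();  // assumed implementation
var layer = new GlobalPoolingLayer<float>(
    inputShape: new[] { 1, 7, 7, 64 },
    poolingType: PoolingType.Average,
    vectorActivationFunction: softmax);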
Properties
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
true if GPU execution is supported; otherwise, false.
SupportsGpuTraining
Gets whether this layer has full GPU training support (forward, backward, and parameter updates).
public override bool SupportsGpuTraining { get; }
Property Value
- bool
true if the layer's full training cycle (forward, backward, and parameter updates) runs on GPU; otherwise, false.
Remarks
This property indicates whether the layer can perform its entire training cycle on GPU without downloading data to CPU. A layer has full GPU training support when:
- ForwardGpu is implemented
- BackwardGpu is implemented
- UpdateParametersGpu is implemented (for layers with trainable parameters)
- GPU weight/bias/gradient buffers are properly managed
For Beginners: This tells you if training can happen entirely on GPU.
GPU-resident training is much faster because:
- Data stays on GPU between forward and backward passes
- No expensive CPU-GPU transfers during each training step
- GPU kernels handle all gradient computation
Only layers that return true here can participate in fully GPU-resident training.
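For example, a training-loop fragment might branch on this property; here layer, gpuInput, gpuOutputGrad, input, and outputGrad are assumed to have been prepared elsewhere, and only members documented on this page are called:
if (layer.SupportsGpuTraining)
{
    // GPU-resident path: tensors never leave the device between passes.
    IGpuTensor<float> gpuOutput = layer.ForwardGpu(gpuInput);
    IGpuTensor<float> gpuInputGrad = layer.BackwardGpu(gpuOutputGrad);
}
else
{
    // CPU fallback path.
    Tensor<float> output = layer.Forward(input);
    Tensor<float> inputGrad = layer.Backward(outputGrad);
}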
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true if the layer can be JIT compiled; otherwise, false.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
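A hedged sketch of how a caller might branch on this property (how the exported graph is subsequently compiled depends on the JIT runtime and is not shown; layer and input are assumed to exist):
var inputNodes = new List<ComputationNode<float>>();
if (layer.SupportsJitCompilation)
{
    // Export the graph once; a JIT runtime can then compile and reuse it for inference.
    ComputationNode<float> outputNode = layer.ExportComputationGraph(inputNodes);
}
else
{
    // No JIT support: fall back to the standard eager forward pass.
    Tensor<float> output = layer.Forward(input);
}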
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always false because global pooling layers have no trainable parameters.
Remarks
This property indicates that the global pooling layer doesn't have any trainable parameters. The layer simply performs a pooling operation without any weights or biases that need to be learned. However, it still participates in backpropagation by passing gradients back to the previous layer.
For Beginners: This property tells you that this layer doesn't learn or change during training.
A value of false means:
- The layer has no weights or biases to adjust
- It performs a fixed operation (pooling) that doesn't change
- It's a transformation layer, not a learning layer
Unlike convolutional or fully connected layers which learn patterns from data, the global pooling layer just reduces spatial dimensions using a fixed operation.
It's like a data processing step rather than a learning step in the network.
Methods
Backward(Tensor<T>)
Performs the backward pass of the global pooling layer to compute gradients.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient tensor from the next layer. Shape: [batchSize, 1, 1, channels].
Returns
- Tensor<T>
The gradient tensor to be passed to the previous layer. Shape: [batchSize, height, width, channels].
Remarks
This method implements the backward pass (backpropagation) of the global pooling layer. For average pooling, the gradient is distributed equally among all positions in the input that contributed to the average. For max pooling, the gradient is assigned only to the position that had the maximum value in the forward pass. This reflects how each position in the input contributed to the output during the forward pass.
For Beginners: This is where the layer passes error information back to previous layers.
The backward pass works differently depending on the pooling type:
For average pooling:
- The gradient for each output value is divided equally among all input positions
- Every position in a feature map gets the same small portion of the gradient
- This reflects that each input position contributed equally to the average
For max pooling:
- The gradient for each output value is assigned only to the input position that had the maximum value
- Only the "winning" position gets the gradient; all others get zero
- This reflects that only the maximum value contributed to the output
This process ensures that the network learns appropriately based on how each input position influenced the pooled output.
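The routing rules above can be sketched with plain arrays for a single channel of a single example (illustrative only; the actual layer works on Tensor<T> and handles whole batches):
// Gradient routing for one feature map; outputGrad is the gradient of this channel's pooled value.
static float[,] RouteGradient(float[,] featureMap, float outputGrad, bool maxPooling)
{
    int h = featureMap.GetLength(0), w = featureMap.GetLength(1);
    var inputGrad = new float[h, w];
    if (maxPooling)
    {
        // Max pooling: only the position that held the maximum receives the gradient.
        int maxI = 0, maxJ = 0;
        for (int i = 0; i < h; i++)
            for (int j = 0; j < w; j++)
                if (featureMap[i, j] > featureMap[maxI, maxJ]) { maxI = i; maxJ = j; }
        inputGrad[maxI, maxJ] = outputGrad;
    }
    else
    {
        // Average pooling: every position contributed equally, so each gets an equal share.
        float share = outputGrad / (h * w);
        for (int i = 0; i < h; i++)
            for (int j = 0; j < w; j++)
                inputGrad[i, j] = share;
    }
    return inputGrad;
}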
Exceptions
- InvalidOperationException
Thrown when backward is called before forward.
BackwardGpu(IGpuTensor<T>)
Performs the backward pass of the layer on GPU.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): The GPU-resident gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The GPU-resident gradient of the loss with respect to the layer's input.
Remarks
This method performs the layer's backward computation entirely on GPU, including:
- Computing input gradients to pass to previous layers
- Computing and storing weight gradients on GPU (for layers with trainable parameters)
- Computing and storing bias gradients on GPU
For Beginners: This is like Backward() but runs entirely on GPU.
During GPU training:
- Output gradients come in (on GPU)
- Input gradients are computed (stay on GPU)
- Weight/bias gradients are computed and stored (on GPU)
- Input gradients are returned for the previous layer
All data stays on GPU - no CPU round-trips needed!
Exceptions
- NotSupportedException
Thrown when the layer does not support GPU training.
- InvalidOperationException
Thrown if ForwardGpu was not called first.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): The list to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the global pooling layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process. Shape: [batchSize, height, width, channels].
Returns
- Tensor<T>
The output tensor after global pooling. Shape: [batchSize, 1, 1, channels].
Remarks
This method implements the forward pass of the global pooling layer. For each channel in each example, it applies the specified pooling operation (max or average) across the entire spatial dimensions. For max pooling, it finds the maximum value in each channel. For average pooling, it computes the mean of all values in each channel. The result is a tensor with the same batch size and number of channels, but with spatial dimensions reduced to 1×1.
For Beginners: This is where the layer processes input data by pooling across entire feature maps.
The forward pass works in these steps:
- For each example in the batch and each channel:
- If using average pooling: Calculate the average of all values in the feature map
- If using max pooling: Find the maximum value in the feature map
- Store these pooled values in the output tensor
- Apply the activation function (if specified)
- Save the input and output for use during backpropagation
This global pooling operation efficiently summarizes each feature map into a single value, drastically reducing the data dimensions while preserving the most important information about each feature.
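The per-channel computation can be sketched with a plain 2D array (illustrative only; the layer itself operates on a 4D Tensor<T> and also applies the optional activation afterwards):
using System;

// Reduce one feature map to a single value, following the max/average rules described above.
static float PoolChannel(float[,] featureMap, bool maxPooling)
{
    int h = featureMap.GetLength(0), w = featureMap.GetLength(1);
    float max = float.NegativeInfinity;
    float sum = 0f;
    for (int i = 0; i < h; i++)
    {
        for (int j = 0; j < w; j++)
        {
            max = Math.Max(max, featureMap[i, j]);
            sum += featureMap[i, j];
        }
    }
    return maxPooling ? max : sum / (h * w);
}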
ForwardGpu(params IGpuTensor<T>[])
Performs GPU-accelerated forward pass using GPU-resident tensors.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The GPU-resident input tensors to process.
Returns
- IGpuTensor<T>
The GPU-resident output tensor after global pooling.
GetParameters()
Gets the trainable parameters of the layer.
public override Vector<T> GetParameters()
Returns
- Vector<T>
An empty vector since global pooling layers have no trainable parameters.
Remarks
This method is a required override from the base class, but the global pooling layer has no trainable parameters to retrieve, so it returns an empty vector.
For Beginners: This method returns an empty list because pooling layers have no learnable values.
Unlike layers with weights and biases:
- Global pooling layers don't have any parameters that change during training
- They perform a fixed operation (pooling) that doesn't involve learning
- There are no values to save when storing a trained model
This method returns an empty vector, indicating there are no parameters to collect.
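For instance (the constructor call mirrors the one documented above; no members of Vector<T> beyond its emptiness are assumed):
var layer = new GlobalPoolingLayer<float>(new[] { 1, 7, 7, 64 }, PoolingType.Max);
Vector<float> parameters = layer.GetParameters();
// parameters is an empty vector: there is nothing to serialize or optimize for this layer.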
ResetState()
Resets the internal state of the layer.
public override void ResetState()
Remarks
This method resets the internal state of the layer by clearing the cached input and output tensors from the previous forward pass. This is useful when starting to process a new batch of data or when switching between training and inference modes.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- The saved input and output tensors are cleared
- This frees up memory and prepares for new data
This is typically called:
- Between training batches
- When switching from training to evaluation mode
- When starting to process completely new data
It's like wiping a whiteboard clean before starting a new calculation.
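Typical usage is a single call between batches or before switching modes (the surrounding training loop and the layer variable are assumed):
// Start fresh before processing an unrelated batch or switching to evaluation mode.
layer.ResetState();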
UpdateParameters(T)
Updates the parameters of the layer based on the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for parameter updates.
Remarks
This method is a required override from the base class, but the global pooling layer has no trainable parameters to update, so it performs no operation.
For Beginners: This method does nothing because pooling layers have no adjustable weights.
Unlike layers like convolutional or fully connected layers:
- Global pooling layers don't have weights or biases to learn
- They perform a fixed operation (finding maximum or average values)
- There's nothing to update during training
This method exists only to fulfill the requirements of the base layer class. The pooling layer influences the network by reducing dimensions and providing translation invariance, not by learning parameters.