Class SubpixelConvolutionalLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Represents a subpixel convolutional layer that performs convolution followed by pixel shuffling for upsampling.
public class SubpixelConvolutionalLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
Inheritance: LayerBase<T> → SubpixelConvolutionalLayer<T>
Implements: ILayer<T>
Remarks
A subpixel convolutional layer combines convolution with a pixel shuffling operation to efficiently increase spatial resolution of feature maps. It first applies convolution to produce an output with more channels, then rearranges these channels into a higher resolution output with fewer channels. This approach is particularly useful for super-resolution tasks and generative models where upsampling is required.
For Beginners: This layer helps make images larger and more detailed in neural networks.
Think of it like rearranging a small mosaic to create a larger picture:
- First, the layer creates many detailed patterns from the input (convolution step)
- Then, it rearranges these patterns to form a larger, higher-resolution output (pixel shuffling step)
For example, if you're working with a low-resolution image that's 32×32 pixels, this layer can help transform it into a higher-resolution image of 64×64 or 128×128 pixels by intelligently filling in the details between the original pixels.
This is often used in applications like:
- Making blurry images clearer (super-resolution)
- Generating detailed images from rough sketches
- Converting low-quality videos to higher quality
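The shape bookkeeping behind the two steps described above can be sketched in a few lines. This is an illustrative helper, not part of AiDotNet: the class name `SubpixelShapes` and the method `Compute` are hypothetical, and the sketch assumes the convolution preserves the spatial size ("same" padding), which this page does not state.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the AiDotNet API.
public static class SubpixelShapes
{
    // Returns (channels, height, width) after the convolution step and after
    // pixel shuffling. Assumes the convolution keeps height and width unchanged.
    public static ((int C, int H, int W) AfterConv, (int C, int H, int W) AfterShuffle)
        Compute(int outputDepth, int upscaleFactor, int inputHeight, int inputWidth)
    {
        if (upscaleFactor < 1)
            throw new ArgumentOutOfRangeException(nameof(upscaleFactor));

        int r = upscaleFactor;

        // The convolution produces r * r times as many channels as the final output.
        var afterConv = (outputDepth * r * r, inputHeight, inputWidth);

        // Pixel shuffling trades those extra channels for r times the height and width.
        var afterShuffle = (outputDepth, inputHeight * r, inputWidth * r);

        return (afterConv, afterShuffle);
    }
}
```

For instance, `SubpixelShapes.Compute(64, 2, 32, 32)` yields 256 channels at 32×32 after the convolution and 64 channels at 64×64 after the shuffle.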
Constructors
SubpixelConvolutionalLayer(int, int, int, int, int, int, IActivationFunction<T>?)
Initializes a new instance of the SubpixelConvolutionalLayer<T> class with scalar activation function.
public SubpixelConvolutionalLayer(int inputDepth, int outputDepth, int upscaleFactor, int kernelSize, int inputHeight, int inputWidth, IActivationFunction<T>? activation = null)
Parameters
- inputDepth (int): The number of channels in the input tensor.
- outputDepth (int): The number of channels in the output tensor after upscaling.
- upscaleFactor (int): The factor by which to increase spatial dimensions.
- kernelSize (int): The size of the convolutional kernel.
- inputHeight (int): The height of the input tensor.
- inputWidth (int): The width of the input tensor.
- activation (IActivationFunction<T>): The activation function to apply after processing. Defaults to ReLU if not specified.
Remarks
This constructor creates a subpixel convolutional layer with the specified dimensions and parameters. It initializes the convolutional kernels and biases with appropriate values for training.
For Beginners: This constructor creates a new subpixel convolutional layer.
The parameters you provide determine:
- inputDepth: How many channels the input has (like RGB for images would be 3)
- outputDepth: How many channels the output will have after upscaling
- upscaleFactor: How much larger the output will be (2 means twice as wide and tall)
- kernelSize: How large an area the layer examines for each calculation (3 is common)
- inputHeight/inputWidth: The dimensions of the input data
- activation: What mathematical function to apply to the results (ReLU is default)
These settings help the layer know exactly what kind of data it's working with and how to transform it into a higher-resolution output.
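A minimal construction sketch, using only the constructor signature and the `SupportsTraining` property documented on this page. It assumes the AiDotNet package is referenced and a C# version with top-level statements. The parameter values are illustrative. Because both constructors end in an optional parameter, a call that omits the seventh argument may not resolve to a single overload, so the sketch names the `activation` argument explicitly.

```csharp
using System;
using AiDotNet.NeuralNetworks.Layers;

// Naming the optional argument selects the scalar-activation constructor.
// Passing null relies on the documented default (ReLU).
using var layer = new SubpixelConvolutionalLayer<float>(
    inputDepth: 64,
    outputDepth: 64,
    upscaleFactor: 2,   // output is twice as wide and twice as tall
    kernelSize: 3,
    inputHeight: 32,
    inputWidth: 32,
    activation: null);

// Documented as true for this layer, since it holds kernels and biases.
Console.WriteLine(layer.SupportsTraining);
```

The `using` declaration is there because the class implements IDisposable.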
SubpixelConvolutionalLayer(int, int, int, int, int, int, IVectorActivationFunction<T>?)
Initializes a new instance of the SubpixelConvolutionalLayer<T> class with vector activation function.
public SubpixelConvolutionalLayer(int inputDepth, int outputDepth, int upscaleFactor, int kernelSize, int inputHeight, int inputWidth, IVectorActivationFunction<T>? vectorActivation = null)
Parameters
- inputDepth (int): The number of channels in the input tensor.
- outputDepth (int): The number of channels in the output tensor after upscaling.
- upscaleFactor (int): The factor by which to increase spatial dimensions.
- kernelSize (int): The size of the convolutional kernel.
- inputHeight (int): The height of the input tensor.
- inputWidth (int): The width of the input tensor.
- vectorActivation (IVectorActivationFunction<T>): The vector activation function to apply after processing. Defaults to ReLU if not specified.
Remarks
This constructor creates a subpixel convolutional layer with the specified dimensions and parameters. It uses a vector activation function, which operates on entire vectors rather than individual elements.
For Beginners: This constructor is similar to the previous one, but uses vector activations.
Vector activations:
- Process entire groups of numbers at once, rather than one at a time
- Can capture relationships between different elements
- Allow for more complex transformations
This version is useful when you need more sophisticated processing that considers how different features relate to each other, rather than treating each feature independently.
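The difference between the two kinds of activation can be illustrated without the library. In this sketch, `ActivationKinds` is a hypothetical class on plain arrays: ReLU stands in for a scalar activation and softmax for a vector activation. It does not use the IActivationFunction<T> or IVectorActivationFunction<T> interfaces, whose members are not shown on this page.

```csharp
using System;
using System.Linq;

// Hypothetical illustration on plain arrays; not the AiDotNet activation interfaces.
public static class ActivationKinds
{
    // Scalar activation: each element is transformed independently of the others.
    public static double[] ReLU(double[] x) =>
        x.Select(v => Math.Max(0.0, v)).ToArray();

    // Vector activation: every output depends on the whole input vector.
    public static double[] Softmax(double[] x)
    {
        double max = x.Max(); // subtract the maximum for numerical stability
        double[] exp = x.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exp.Sum();
        return exp.Select(e => e / sum).ToArray();
    }
}
```

Changing one input element changes only one output of `ReLU`, but it changes every output of `Softmax`. That dependence across elements is what a vector activation adds.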
Properties
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true, as all required operations (Conv2D, PixelShuffle) are available.
Remarks
Subpixel convolutional layers support JIT compilation using Conv2D and PixelShuffle operations from TensorOperations. The layer requires both convolution and pixel shuffling operations which are available in the computation graph.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
true for this layer, as it contains trainable parameters (kernels and biases).
Remarks
This property indicates whether the subpixel convolutional layer can be trained through backpropagation. Since this layer has trainable parameters (kernels and biases), it supports training.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has internal values (kernels and biases) that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process
For this layer, the value is always true because it needs to learn which patterns are most important for upscaling the input effectively.
Methods
Backward(Tensor<T>)
Performs the backward pass of the subpixel convolutional layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
- outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the subpixel convolutional layer, which is used during training to propagate error gradients back through the network. It calculates gradients for the input and for all trainable parameters (kernels and biases).
For Beginners: This method is used during training to calculate how the layer's input and parameters should change to reduce errors.
During the backward pass, we reverse the steps from the forward pass:
1. Calculate how the activation function affects the gradient.
2. Reverse the pixel shuffling:
- Convert the gradient from high resolution back to the lower resolution with more channels
- This helps determine how each output channel contributed to the errors
3. Calculate three types of gradients:
- How the input should change (inputGradient)
- How the kernels should change (kernelGradients)
- How the biases should change (biasGradients)
These gradients tell the network how to adjust its parameters during the update step to improve its performance on the next forward pass.
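The "reverse the pixel shuffling" step can be sketched as a space-to-depth rearrangement of the gradient. The class `PixelUnshuffleSketch` is hypothetical, uses plain arrays in [channels, height, width] order, and assumes the common channel ordering `c * r * r + i * r + j`. The library's internal layout may differ.

```csharp
using System;

// Hypothetical illustration; the tensor layout and channel ordering are assumptions.
public static class PixelUnshuffleSketch
{
    // Rearranges a [C, H*r, W*r] gradient into [C*r*r, H, W].
    public static double[,,] SpaceToDepth(double[,,] grad, int r)
    {
        int channels = grad.GetLength(0);
        int bigH = grad.GetLength(1);
        int bigW = grad.GetLength(2);
        if (r < 1 || bigH % r != 0 || bigW % r != 0)
            throw new ArgumentException("Height and width must be divisible by r.");

        int h = bigH / r, w = bigW / r;
        var result = new double[channels * r * r, h, w];

        for (int c = 0; c < channels; c++)
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    for (int i = 0; i < r; i++)
                        for (int j = 0; j < r; j++)
                            // Each r-by-r block of high-resolution pixels goes back
                            // to r * r separate low-resolution channels.
                            result[c * r * r + i * r + j, y, x] = grad[c, y * r + i, x * r + j];

        return result;
    }
}
```

No values are combined or scaled here. The rearrangement is a pure permutation, so the gradient entries are only moved to the positions of the convolution outputs they came from.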
Exceptions
- InvalidOperationException
Thrown when trying to perform a backward pass before a forward pass.
BackwardGpu(IGpuTensor<T>)
Performs the GPU-resident backward pass of the subpixel convolutional layer.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
- outputGradient (IGpuTensor<T>): The GPU tensor containing the gradient of the loss with respect to the layer's output.
Returns
- IGpuTensor<T>
The gradient of the loss with respect to the layer's input.
ExportComputationGraph(List<ComputationNode<T>>)
Exports this layer's computation as a differentiable computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
- inputNodes (List<ComputationNode<T>>): List to which input variable nodes should be added.
Returns
- ComputationNode<T>
The output computation node representing this layer's operation.
Remarks
This method builds a computation graph representation of the subpixel convolution operation. Subpixel convolution combines convolution with pixel shuffling (depth-to-space rearrangement).
For Beginners: This creates an optimized version for faster inference.
For subpixel convolutional layers:
- Creates placeholders for input, convolution kernels, and biases
- Applies convolution operation
- Applies pixel shuffle (depth-to-space) rearrangement
- Applies activation function
- Returns a computation graph for efficient execution
Exceptions
- ArgumentNullException
Thrown when inputNodes is null.
- InvalidOperationException
Thrown when weights/biases are not initialized or activation is not supported.
Forward(Tensor<T>)
Performs the forward pass of the subpixel convolutional layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
- input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after convolution, pixel shuffling, and activation.
Remarks
This method implements the forward pass of the subpixel convolutional layer. It first applies convolution to produce a tensor with more channels, then performs pixel shuffling to rearrange these channels into a higher resolution output with fewer channels. Finally, it applies the activation function.
For Beginners: This method processes the input data through the upscaling steps.
The process works in three main steps:
1. Convolution:
- The input is processed using the learned pattern detectors (kernels)
- This creates a version with many more channels than the final output needs
- These extra channels contain the information needed to create a larger image
2. Pixel Shuffling:
- The many channels are rearranged into a larger spatial grid
- This effectively increases the resolution of the image
- The proper arrangement of pixels creates the higher-resolution output
3. Activation:
- A mathematical function is applied to introduce non-linearity
- This helps the network learn more complex patterns
- The final output has higher resolution but fewer channels
For example, with upscaleFactor=2:
- A 32×32×64 input might become 32×32×256 after convolution
- Then become 64×64×64 after pixel shuffling (4 times more pixels, 1/4 the channels)
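The pixel shuffling step in the example above can be sketched on plain arrays. `PixelShuffleSketch` is a hypothetical class, not the library's implementation. It assumes a [channels, height, width] layout and the common convention that input channel `c * r * r + i * r + j` supplies the pixel at offset (i, j) inside each r×r output block.

```csharp
using System;

// Hypothetical illustration; the tensor layout and channel ordering are assumptions.
public static class PixelShuffleSketch
{
    // Rearranges a [C*r*r, H, W] array into [C, H*r, W*r].
    public static double[,,] DepthToSpace(double[,,] input, int r)
    {
        int inChannels = input.GetLength(0);
        int h = input.GetLength(1);
        int w = input.GetLength(2);
        if (r < 1 || inChannels % (r * r) != 0)
            throw new ArgumentException("Channel count must be divisible by r * r.");

        int outChannels = inChannels / (r * r);
        var output = new double[outChannels, h * r, w * r];

        for (int c = 0; c < outChannels; c++)
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    for (int i = 0; i < r; i++)
                        for (int j = 0; j < r; j++)
                            // r * r low-resolution channels fill one r-by-r block
                            // of the high-resolution output channel.
                            output[c, y * r + i, x * r + j] = input[c * r * r + i * r + j, y, x];

        return output;
    }
}
```

With `r = 2`, four input channels of size 1×1 become one output channel of size 2×2, which is the same trade of channels for resolution as 32×32×256 becoming 64×64×64.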
ForwardGpu(params IGpuTensor<T>[])
Performs the GPU-resident forward pass of the subpixel convolutional layer.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
- inputs (IGpuTensor<T>[]): The GPU input tensors.
Returns
- IGpuTensor<T>
The GPU output tensor after convolution, pixel shuffle, and activation.
Remarks
All computations stay on GPU: Conv2D → PixelShuffle (reshape+permute+reshape) → Activation.
GetParameters()
Gets all trainable parameters of the layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters.
Remarks
This method retrieves all trainable parameters (kernels and biases) of the layer and combines them into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from the layer.
The parameters:
- Are the numbers that the neural network learns during training
- Include all kernels and biases from the layer
- Are combined into a single long list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
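Combining kernels and biases into one vector can be sketched as below. `ParameterFlattening` is a hypothetical class on plain arrays. The kernel layout [outChannels, inChannels, kernelHeight, kernelWidth] and the "kernels first, then biases" ordering are assumptions, since this page does not specify the order used by GetParameters().

```csharp
using System;

// Hypothetical illustration; the actual parameter ordering in AiDotNet is not documented here.
public static class ParameterFlattening
{
    // Copies all kernel values (row-major order) followed by all biases
    // into a single flat array.
    public static double[] Flatten(double[,,,] kernels, double[] biases)
    {
        var flat = new double[kernels.Length + biases.Length];
        int k = 0;

        // foreach over a multidimensional array visits elements in row-major order.
        foreach (double v in kernels)
            flat[k++] = v;

        Array.Copy(biases, 0, flat, k, biases.Length);
        return flat;
    }
}
```

Whatever ordering is used, saving and loading only round-trip correctly if the code that writes the vector and the code that reads it back agree on it.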
ResetState()
Resets the internal state of the layer and reinitializes weights.
public override void ResetState()
Remarks
This method resets the internal state of the layer, clearing cached values from forward and backward passes, resetting momentum, and reinitializing the weights and biases. This is useful when starting new training or when implementing networks that need to reset their state between sequences.
For Beginners: This method clears the layer's memory and starts fresh.
When resetting the state:
- Stored inputs and outputs are cleared
- Calculated gradients are cleared
- Momentum is reset to zero
- Weights and biases are reinitialized to new random values
This is useful for:
- Starting a new training session
- Getting out of a "stuck" state where learning has plateaued
- Testing how the layer performs with different initializations
Think of it like wiping a whiteboard clean and starting over with a fresh approach.
UpdateParameters(T)
Updates the parameters of the layer using calculated gradients and momentum.
public override void UpdateParameters(T learningRate)
Parameters
- learningRate (T): The learning rate to control the size of parameter updates.
Remarks
This method updates the convolutional kernels and biases based on the gradients calculated during the backward pass. It uses momentum to stabilize the updates and applies weight decay to the kernels to prevent overfitting.
For Beginners: This method adjusts the layer's pattern detectors to improve performance.
During parameter updates:
1. Momentum is calculated:
- 90% of the previous update direction (momentum)
- 10% of the current gradient direction
- This helps maintain a steady learning direction
2. Weights are updated using:
- The momentum-adjusted gradient (for direction)
- The learning rate (for step size)
- Weight decay (to prevent overfitting by keeping weights small)
3. Biases are updated using:
- The momentum-adjusted gradient
- The learning rate
- No weight decay (biases typically don't cause overfitting)
Think of it like navigating a mountain: momentum helps you keep moving in a consistent direction despite small bumps, while weight decay prevents you from taking extreme paths.
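One way to read the update described above is the following sketch. `MomentumUpdate` is a hypothetical class on plain arrays. The 0.9/0.1 split comes from the description, but the exact formula, the weight-decay coefficient, and how decay enters the step are assumptions rather than the library's confirmed implementation.

```csharp
using System;

// Hypothetical illustration of a momentum update with weight decay.
public static class MomentumUpdate
{
    public const double Beta = 0.9; // share of the previous update direction

    // Updates values in place. Pass weightDecay = 0 for biases.
    public static void Step(double[] values, double[] gradients, double[] velocity,
                            double learningRate, double weightDecay)
    {
        if (values.Length != gradients.Length || values.Length != velocity.Length)
            throw new ArgumentException("All arrays must have the same length.");

        for (int i = 0; i < values.Length; i++)
        {
            // Blend 90% of the previous direction with 10% of the current gradient.
            velocity[i] = Beta * velocity[i] + (1.0 - Beta) * gradients[i];

            // Step along the blended direction; the decay term pulls the value toward zero.
            values[i] -= learningRate * (velocity[i] + weightDecay * values[i]);
        }
    }
}
```

Calling `Step` with a non-zero `weightDecay` for kernels and with `0` for biases mirrors the distinction drawn in the list above.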
Exceptions
- InvalidOperationException
Thrown when trying to update parameters before calculating gradients.
UpdateParametersGpu(IGpuOptimizerConfig)
Updates parameters using GPU-based optimizer.
public override void UpdateParametersGpu(IGpuOptimizerConfig config)
Parameters
- config (IGpuOptimizerConfig): GPU optimizer configuration specifying the optimizer type and hyperparameters.