Class SubpixelConvolutionalLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Represents a subpixel convolutional layer that performs convolution followed by pixel shuffling for upsampling.

public class SubpixelConvolutionalLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
SubpixelConvolutionalLayer<T>
Implements
ILayer<T>
IJitCompilable<T>
IDiagnosticsProvider
IWeightLoadable<T>
IDisposable

Remarks

A subpixel convolutional layer combines convolution with a pixel shuffling operation to efficiently increase spatial resolution of feature maps. It first applies convolution to produce an output with more channels, then rearranges these channels into a higher resolution output with fewer channels. This approach is particularly useful for super-resolution tasks and generative models where upsampling is required.

For Beginners: This layer helps make images larger and more detailed in neural networks.

Think of it like rearranging a small mosaic to create a larger picture:

  • First, the layer creates many detailed patterns from the input (convolution step)
  • Then, it rearranges these patterns to form a larger, higher-resolution output (pixel shuffling step)

For example, if you're working with a low-resolution image that's 32×32 pixels, this layer can help transform it into a higher-resolution image of 64×64 pixels (upscale factor 2) or 128×128 pixels (upscale factor 4) by intelligently filling in the details between the original pixels.

This is often used in applications like:

  • Making blurry images clearer (super-resolution)
  • Generating detailed images from rough sketches
  • Converting low-quality videos to higher quality
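The pixel shuffling step described in the remarks above can be sketched in a few lines. This is a minimal illustration in plain Python, not the AiDotNet implementation: the function name pixel_shuffle, the channel-first nested-list layout, and the channel ordering (output channel c reads input channels c·r² through c·r² + r² − 1) are assumptions borrowed from the common depth-to-space convention.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Assumed convention: input channel c*r*r + i*r + j supplies the pixel
    at offset (i, j) inside every r x r block of output channel c.
    """
    channels_in = len(x)
    height, width = len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    channels_out = channels_in // (r * r)
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x2 channels become one 2x4 channel when r = 2.
low_res = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
high_res = pixel_shuffle(low_res, 2)
print(high_res)  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

No values are invented or interpolated in this step: every output pixel is copied from exactly one input channel, which is why the convolution before it must produce r² times as many channels as the final output.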

Constructors

SubpixelConvolutionalLayer(int, int, int, int, int, int, IActivationFunction<T>?)

Initializes a new instance of the SubpixelConvolutionalLayer<T> class with a scalar activation function.

public SubpixelConvolutionalLayer(int inputDepth, int outputDepth, int upscaleFactor, int kernelSize, int inputHeight, int inputWidth, IActivationFunction<T>? activation = null)

Parameters

inputDepth int

The number of channels in the input tensor.

outputDepth int

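The number of channels in the output tensor after upscaling. The internal convolution produces outputDepth × upscaleFactor² channels, which pixel shuffling then reduces to outputDepth.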

upscaleFactor int

The factor by which to increase spatial dimensions.

kernelSize int

The size of the convolutional kernel.

inputHeight int

The height of the input tensor.

inputWidth int

The width of the input tensor.

activation IActivationFunction<T>

The activation function to apply after processing. Defaults to ReLU if not specified.

Remarks

This constructor creates a subpixel convolutional layer with the specified dimensions and parameters. It initializes the convolutional kernels and biases with appropriate values for training.

For Beginners: This constructor creates a new subpixel convolutional layer.

The parameters you provide determine:

  • inputDepth: How many channels the input has (like RGB for images would be 3)
  • outputDepth: How many channels the output will have after upscaling
  • upscaleFactor: How much larger the output will be (2 means twice as wide and tall)
  • kernelSize: How large an area the layer examines for each calculation (3 is common)
  • inputHeight/inputWidth: The dimensions of the input data
  • activation: What mathematical function to apply to the results (ReLU is default)

These settings help the layer know exactly what kind of data it's working with and how to transform it into a higher-resolution output.
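To see how the constructor arguments interact, the shape arithmetic can be worked through by hand. The helper below is a hypothetical sketch, not part of AiDotNet; it assumes a stride-1 convolution with "same" padding (so height and width are unchanged before shuffling) and one bias per convolution output channel, neither of which this page states.

```python
def subpixel_shapes(input_depth, output_depth, upscale_factor,
                    kernel_size, input_height, input_width):
    """Return the shapes and parameter counts implied by the arguments.

    Assumes stride 1 and 'same' padding for the convolution.
    """
    r2 = upscale_factor * upscale_factor
    conv_channels = output_depth * r2   # channels before pixel shuffling
    return {
        "conv_output": (conv_channels, input_height, input_width),
        "layer_output": (output_depth,
                         input_height * upscale_factor,
                         input_width * upscale_factor),
        "kernel_weights": conv_channels * input_depth * kernel_size * kernel_size,
        "biases": conv_channels,
    }


# RGB in, RGB out, 2x upscaling, 3x3 kernels, 32x32 input.
shapes = subpixel_shapes(3, 3, 2, 3, 32, 32)
print(shapes["conv_output"])     # (12, 32, 32)
print(shapes["layer_output"])    # (3, 64, 64)
print(shapes["kernel_weights"])  # 324
```

Under these assumptions, doubling upscaleFactor quadruples the number of convolution output channels, and therefore the number of kernel weights.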

SubpixelConvolutionalLayer(int, int, int, int, int, int, IVectorActivationFunction<T>?)

Initializes a new instance of the SubpixelConvolutionalLayer<T> class with a vector activation function.

public SubpixelConvolutionalLayer(int inputDepth, int outputDepth, int upscaleFactor, int kernelSize, int inputHeight, int inputWidth, IVectorActivationFunction<T>? vectorActivation = null)

Parameters

inputDepth int

The number of channels in the input tensor.

outputDepth int

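The number of channels in the output tensor after upscaling. The internal convolution produces outputDepth × upscaleFactor² channels, which pixel shuffling then reduces to outputDepth.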

upscaleFactor int

The factor by which to increase spatial dimensions.

kernelSize int

The size of the convolutional kernel.

inputHeight int

The height of the input tensor.

inputWidth int

The width of the input tensor.

vectorActivation IVectorActivationFunction<T>

The vector activation function to apply after processing. Defaults to ReLU if not specified.

Remarks

This constructor creates a subpixel convolutional layer with the specified dimensions and parameters. It uses a vector activation function, which operates on entire vectors rather than individual elements.

For Beginners: This constructor is similar to the previous one, but uses vector activations.

Vector activations:

  • Process entire groups of numbers at once, rather than one at a time
  • Can capture relationships between different elements
  • Allow for more complex transformations

This version is useful when you need more sophisticated processing that considers how different features relate to each other, rather than treating each feature independently.
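The difference between the two kinds of activation is easiest to see side by side. In the sketch below, ReLU stands in for a scalar activation and softmax for a vector activation; softmax is chosen only because it is a familiar function whose output depends on the whole vector, and it is not taken from this page.

```python
import math


def relu(value):
    """Scalar activation: each element is transformed on its own."""
    return max(0.0, value)


def softmax(values):
    """Vector activation: every output depends on all inputs."""
    peak = max(values)                      # subtract the max for stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, -2.0, 3.0]
elementwise = [relu(v) for v in scores]     # [1.0, 0.0, 3.0]
vectorwise = softmax(scores)                # entries sum to 1
```

Changing one entry of scores changes only one entry of elementwise, but it changes every entry of vectorwise.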

Properties

SupportsGpuExecution

Gets a value indicating whether this layer supports GPU execution.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets a value indicating whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

true, because the required operations (Conv2D and PixelShuffle) are available.

Remarks

Subpixel convolutional layers support JIT compilation because the two operations they need, Conv2D and PixelShuffle from TensorOperations, are both available in the computation graph.

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

true for this layer, as it contains trainable parameters (kernels and biases).

Remarks

This property indicates whether the subpixel convolutional layer can be trained through backpropagation. Since this layer has trainable parameters (kernels and biases), it supports training.

For Beginners: This property tells you if the layer can learn from data.

A value of true means:

  • The layer has internal values (kernels and biases) that can be adjusted during training
  • It will improve its performance as it sees more data
  • It participates in the learning process

For this layer, the value is always true because it needs to learn which patterns are most important for upscaling the input effectively.

Methods

Backward(Tensor<T>)

Performs the backward pass of the subpixel convolutional layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the layer's output.

Returns

Tensor<T>

The gradient of the loss with respect to the layer's input.

Remarks

This method implements the backward pass of the subpixel convolutional layer, which is used during training to propagate error gradients back through the network. It calculates gradients for the input and for all trainable parameters (kernels and biases).

For Beginners: This method is used during training to calculate how the layer's input and parameters should change to reduce errors.

During the backward pass, we reverse the steps from the forward pass:

  1. First, calculate how the activation function affects the gradient

  2. Reverse the pixel shuffling:

    • Convert the gradient from high resolution back to the lower resolution with more channels
    • This helps determine how each output channel contributed to the errors
  3. Calculate three types of gradients:

    • How the input should change (inputGradient)
    • How the kernels should change (kernelGradients)
    • How the biases should change (biasGradients)

These gradients tell the network how to adjust its parameters during the update step to improve its performance on the next forward pass.
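Step 2 of the list above, reversing the pixel shuffle, can be sketched as a space-to-depth rearrangement of the gradient. This is an illustration in plain Python, not the library's code: the name pixel_unshuffle and the channel ordering are assumptions, and the bias gradient shown (the sum of each convolution channel's gradient) is the usual formulation for convolution biases rather than something this page spells out.

```python
def pixel_unshuffle(grad, r):
    """Rearrange a [C][H*r][W*r] gradient into [C*r*r][H][W].

    Assumed convention: the pixel at offset (i, j) inside each r x r block
    of channel c goes back to channel c*r*r + i*r + j.
    """
    channels = len(grad)
    height, width = len(grad[0]) // r, len(grad[0][0]) // r
    out = [[[0] * width for _ in range(height)]
           for _ in range(channels * r * r)]
    for c in range(channels):
        for i in range(r):
            for j in range(r):
                dst = out[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        dst[h][w] = grad[c][h * r + i][w * r + j]
    return out


# A 1-channel 2x4 gradient maps back to four 1x2 channels when r = 2.
output_gradient = [[[1, 3, 2, 4], [5, 7, 6, 8]]]
conv_gradient = pixel_unshuffle(output_gradient, 2)

# Usual bias gradient: sum over all positions of each convolution channel.
bias_gradients = [sum(sum(row) for row in channel) for channel in conv_gradient]
print(conv_gradient)   # [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
print(bias_gradients)  # [3, 7, 11, 15]
```

Because pixel shuffling only moves values around, its backward pass is the same kind of copy in the opposite direction; the gradients are not scaled or mixed.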

Exceptions

InvalidOperationException

Thrown when trying to perform a backward pass before a forward pass.

BackwardGpu(IGpuTensor<T>)

Performs the GPU-resident backward pass of the subpixel convolutional layer.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The GPU tensor containing the gradient of the loss with respect to the layer's output.

Returns

IGpuTensor<T>

The gradient of the loss with respect to the layer's input.

ExportComputationGraph(List<ComputationNode<T>>)

Exports this layer's computation as a differentiable computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to which input variable nodes should be added.

Returns

ComputationNode<T>

The output computation node representing this layer's operation.

Remarks

This method builds a computation graph representation of the subpixel convolution operation. Subpixel convolution combines convolution with pixel shuffling (depth-to-space rearrangement).

For Beginners: This creates an optimized version for faster inference.

For subpixel convolutional layers:

  • Creates placeholders for input, convolution kernels, and biases
  • Applies convolution operation
  • Applies pixel shuffle (depth-to-space) rearrangement
  • Applies activation function
  • Returns a computation graph for efficient execution

Exceptions

ArgumentNullException

Thrown when inputNodes is null.

InvalidOperationException

Thrown when weights/biases are not initialized or activation is not supported.

Forward(Tensor<T>)

Performs the forward pass of the subpixel convolutional layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to process.

Returns

Tensor<T>

The output tensor after convolution, pixel shuffling, and activation.

Remarks

This method implements the forward pass of the subpixel convolutional layer. It first applies convolution to produce a tensor with more channels, then performs pixel shuffling to rearrange these channels into a higher resolution output with fewer channels. Finally, it applies the activation function.

For Beginners: This method processes the input data through the upscaling steps.

The process works in three main steps:

  1. Convolution:

    • The input is processed using the learned pattern detectors (kernels)
    • This creates a version with many more channels than the final output needs
    • These extra channels contain the information needed to create a larger image
  2. Pixel Shuffling:

    • The many channels are rearranged into a larger spatial grid
    • This effectively increases the resolution of the image
    • The proper arrangement of pixels creates the higher-resolution output
  3. Activation:

    • A mathematical function is applied to introduce non-linearity
    • This helps the network learn more complex patterns
    • The final output has higher resolution but fewer channels than the convolution output

For example, with upscaleFactor=2:

  • A 32×32×64 input might become 32×32×256 after convolution
  • Then become 64×64×64 after pixel shuffling (4 times as many pixels, one quarter of the channels)
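The three steps can be traced end to end on a tiny input. The following is a plain-Python sketch under stated assumptions, not the layer's actual code: the convolution is stride 1 with zero "same" padding, tensors are channel-first nested lists, the channel ordering in pixel_shuffle follows the common depth-to-space convention, and ReLU is used as the activation. The example uses 1×1 kernels so the numbers are easy to follow by hand.

```python
def conv2d_same(x, kernels, biases):
    """Stride-1, zero-padded convolution.

    x: [Cin][H][W], kernels: [Cout][Cin][K][K], biases: [Cout].
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(kernels[0][0])
    pad = k // 2
    out = []
    for o, kernel in enumerate(kernels):
        plane = [[biases[o]] * width for _ in range(height)]
        for c in range(c_in):
            for h in range(height):
                for w in range(width):
                    for dy in range(k):
                        for dx in range(k):
                            row, col = h + dy - pad, w + dx - pad
                            if 0 <= row < height and 0 <= col < width:
                                plane[h][w] += kernel[c][dy][dx] * x[c][row][col]
        out.append(plane)
    return out


def pixel_shuffle(x, r):
    """Rearrange [C*r*r][H][W] into [C][H*r][W*r] (assumed channel order)."""
    height, width = len(x[0]), len(x[0][0])
    channels_out = len(x) // (r * r)
    out = [[[0.0] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


def relu_tensor(x):
    """Apply ReLU to every element of a [C][H][W] nested list."""
    return [[[max(0.0, v) for v in row] for row in plane] for plane in x]


# One 2x2 input channel, outputDepth = 1, upscaleFactor = 2,
# so the convolution needs 1 * 2 * 2 = 4 output channels.
image = [[[1.0, 2.0], [3.0, 4.0]]]
kernels = [[[[1.0]]], [[[-1.0]]], [[[2.0]]], [[[0.5]]]]
biases = [0.0, 0.0, 0.0, 0.0]

conv_out = conv2d_same(image, kernels, biases)        # 4 channels of 2x2
output = relu_tensor(pixel_shuffle(conv_out, 2))      # 1 channel of 4x4
for row in output[0]:
    print(row)
# [1.0, 0.0, 2.0, 0.0]
# [2.0, 0.5, 4.0, 1.0]
# [3.0, 0.0, 4.0, 0.0]
# [6.0, 1.5, 8.0, 2.0]
```

Each 2×2 block of the output is built from the four convolution channels at one input position, with negative values clipped by the activation.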

ForwardGpu(params IGpuTensor<T>[])

Performs the GPU-resident forward pass of the subpixel convolutional layer.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU input tensors.

Returns

IGpuTensor<T>

The GPU output tensor after convolution, pixel shuffle, and activation.

Remarks

All computations stay on GPU: Conv2D → PixelShuffle (reshape+permute+reshape) → Activation.
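The reshape+permute+reshape decomposition mentioned above can be checked against the direct index formula on a flat, row-major buffer. This sketch assumes the usual decomposition: view the (C·r², H, W) buffer as (C, r, r, H, W), permute the axes to (C, H, r, W, r), and read the result as (C, H·r, W·r). The page does not give the exact axis order the library uses, so treat the order here as an assumption.

```python
def pixel_shuffle_via_permute(flat, c, r, h, w):
    """View flat as (c, r, r, h, w), permute to (c, h, r, w, r), flatten."""
    out = []
    for ci in range(c):
        for hi in range(h):
            for i in range(r):
                for wi in range(w):
                    for j in range(r):
                        # Row-major offset of element (ci, i, j, hi, wi).
                        src = (((ci * r + i) * r + j) * h + hi) * w + wi
                        out.append(flat[src])
    return out


def pixel_shuffle_direct(flat, c, r, h, w):
    """Direct formula: out[ci][hi*r + i][wi*r + j] = in[ci*r*r + i*r + j][hi][wi]."""
    out = [0] * (c * r * r * h * w)
    for ch in range(c * r * r):
        ci, i, j = ch // (r * r), (ch // r) % r, ch % r
        for hi in range(h):
            for wi in range(w):
                dst = (ci * h * r + hi * r + i) * (w * r) + wi * r + j
                out[dst] = flat[(ch * h + hi) * w + wi]
    return out


small = pixel_shuffle_via_permute(list(range(1, 9)), 1, 2, 1, 2)
print(small)  # [1, 3, 2, 4, 5, 7, 6, 8]

buffer = list(range(2 * 3 * 3 * 2 * 4))
same = (pixel_shuffle_via_permute(buffer, 2, 3, 2, 4)
        == pixel_shuffle_direct(buffer, 2, 3, 2, 4))
print(same)  # True
```

Expressing the shuffle as reshapes and one permute is convenient on a GPU because the reshapes only reinterpret the buffer and the permute is a single data movement, so nothing has to be copied back to the CPU.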

GetParameters()

Gets all trainable parameters of the layer as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all trainable parameters.

Remarks

This method retrieves all trainable parameters (kernels and biases) of the layer and combines them into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.

For Beginners: This method collects all the learnable values from the layer.

The parameters:

  • Are the numbers that the neural network learns during training
  • Include all kernels and biases from the layer
  • Are combined into a single long list (vector)

This is useful for:

  • Saving the model to disk
  • Loading parameters from a previously trained model
  • Advanced optimization techniques that need access to all parameters
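The idea of combining kernels and biases into one vector, and recovering them again, can be sketched as follows. The helper names and the ordering (all kernel weights first, then all biases) are assumptions made for illustration; this page only says the parameters are combined into a single vector.

```python
def flatten_parameters(kernels, biases):
    """Flatten [Cout][Cin][K][K] kernels, then append the biases."""
    flat = [w for out_ch in kernels for in_ch in out_ch
            for row in in_ch for w in row]
    flat.extend(biases)
    return flat


def restore_parameters(flat, c_out, c_in, k):
    """Inverse of flatten_parameters for the same (assumed) ordering."""
    values = iter(flat)
    kernels = [[[[next(values) for _ in range(k)] for _ in range(k)]
                for _ in range(c_in)] for _ in range(c_out)]
    biases = [next(values) for _ in range(c_out)]
    return kernels, biases


kernels = [[[[1.0, 2.0], [3.0, 4.0]]],
           [[[5.0, 6.0], [7.0, 8.0]]]]      # 2 output channels, 1 input, 2x2
biases = [0.1, 0.2]

vector = flatten_parameters(kernels, biases)
print(len(vector))  # 10
round_trip = restore_parameters(vector, c_out=2, c_in=1, k=2)
print(round_trip == (kernels, biases))  # True
```

Whatever ordering a library uses, saving and loading only work if both directions agree on it, which is why the round trip is the property worth checking.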

ResetState()

Resets the internal state of the layer and reinitializes weights.

public override void ResetState()

Remarks

This method resets the internal state of the layer, clearing cached values from forward and backward passes, resetting momentum, and reinitializing the weights and biases. This is useful when starting new training or when implementing networks that need to reset their state between sequences.

For Beginners: This method clears the layer's memory and starts fresh.

When resetting the state:

  • Stored inputs and outputs are cleared
  • Calculated gradients are cleared
  • Momentum is reset to zero
  • Weights and biases are reinitialized to new random values

This is useful for:

  • Starting a new training session
  • Getting out of a "stuck" state where learning has plateaued
  • Testing how the layer performs with different initializations

Think of it like wiping a whiteboard clean and starting over with a fresh approach.

UpdateParameters(T)

Updates the parameters of the layer using calculated gradients and momentum.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate to control the size of parameter updates.

Remarks

This method updates the convolutional kernels and biases based on the gradients calculated during the backward pass. It uses momentum to stabilize the updates and applies weight decay to the kernels to prevent overfitting.

For Beginners: This method adjusts the layer's pattern detectors to improve performance.

During parameter updates:

  1. Momentum is calculated:

    • 90% of the previous update direction (momentum)
    • 10% of the current gradient direction
    • This helps maintain a steady learning direction
  2. Weights are updated using:

    • The momentum-adjusted gradient (for direction)
    • The learning rate (for step size)
    • Weight decay (to prevent overfitting by keeping weights small)
  3. Biases are updated using:

    • The momentum-adjusted gradient
    • The learning rate
    • No weight decay (biases typically don't cause overfitting)

Think of it like navigating a mountain: momentum helps you keep moving in a consistent direction despite small bumps, while weight decay prevents you from taking extreme paths.
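The update described above can be written out for a single parameter. This is a sketch under assumptions, not the library's code: the 0.9 and 0.1 blend comes from the description above, while the weight_decay value and the way decay is added to the step are hypothetical choices made for illustration.

```python
def momentum_step(param, grad, velocity, learning_rate,
                  weight_decay=0.0, beta=0.9):
    """One momentum update for a single scalar parameter.

    velocity blends 90% of the previous direction with 10% of the new
    gradient; weight_decay (hypothetical value) pulls weights toward zero.
    """
    velocity = beta * velocity + (1.0 - beta) * grad
    param = param - learning_rate * (velocity + weight_decay * param)
    return param, velocity


# Kernel weight: momentum plus weight decay.
weight, weight_velocity = momentum_step(
    param=1.0, grad=0.5, velocity=0.0, learning_rate=0.1, weight_decay=0.01)

# Bias: same update, but with no weight decay.
bias, bias_velocity = momentum_step(
    param=0.2, grad=0.5, velocity=0.0, learning_rate=0.1)

print(round(weight, 6), round(weight_velocity, 6))  # 0.994 0.05
print(round(bias, 6), round(bias_velocity, 6))      # 0.195 0.05
```

The returned velocity has to be stored and passed back in on the next call, which is the "momentum" state that ResetState clears.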

Exceptions

InvalidOperationException

Thrown when trying to update parameters before calculating gradients.

UpdateParametersGpu(IGpuOptimizerConfig)

Updates the layer's parameters using a GPU-based optimizer.

public override void UpdateParametersGpu(IGpuOptimizerConfig config)

Parameters

config IGpuOptimizerConfig

GPU optimizer configuration specifying the optimizer type and hyperparameters.