Class PixelShuffleLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Pixel shuffle (sub-pixel convolution) layer for efficient spatial upsampling.

public class PixelShuffleLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
LayerBase<T>
PixelShuffleLayer<T>
Implements
ILayer<T>
IJitCompilable<T>
IDiagnosticsProvider
IWeightLoadable<T>
IDisposable

Remarks

Pixel shuffle rearranges elements from the channel dimension into the spatial dimensions, increasing the spatial resolution. Because the operation only moves existing values and performs no arithmetic of its own, it is typically cheaper than upsampling with a transposed convolution (deconvolution).

For 2x upscaling, the layer takes 4 channel values and arranges them as a 2x2 spatial block. In general, for an upscaling factor r, the shape changes as follows: [batch, channels * r², height, width] -> [batch, channels, height * r, width * r]
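
The shape formula above can be illustrated with a small sketch on plain arrays. PixelShuffle here is a hypothetical helper, not part of AiDotNet, and it assumes the common convention that input channel c * r² + i * r + j supplies the value at offset (i, j) inside each r x r output block. The library's internal ordering may differ.

```csharp
using System;

// Hypothetical helper (not AiDotNet API): pixel shuffle on a [channels, height, width] array.
// Assumed convention: input channel c*r*r + i*r + j feeds offset (i, j) of each r x r block.
float[,,] PixelShuffle(float[,,] input, int r)
{
    int inChannels = input.GetLength(0);
    int height = input.GetLength(1);
    int width = input.GetLength(2);
    if (r < 1 || inChannels % (r * r) != 0)
        throw new ArgumentException("Channel count must be divisible by r * r.");

    int outChannels = inChannels / (r * r);
    var output = new float[outChannels, height * r, width * r];

    for (int c = 0; c < outChannels; c++)
        for (int i = 0; i < r; i++)              // row offset inside the block
            for (int j = 0; j < r; j++)          // column offset inside the block
                for (int h = 0; h < height; h++)
                    for (int w = 0; w < width; w++)
                        output[c, h * r + i, w * r + j] = input[c * r * r + i * r + j, h, w];

    return output;
}

// Four channel values at a single pixel become one 2x2 block.
var sample = new float[4, 1, 1];
sample[0, 0, 0] = 1f;
sample[1, 0, 0] = 2f;
sample[2, 0, 0] = 3f;
sample[3, 0, 0] = 4f;

float[,,] shuffled = PixelShuffle(sample, 2);
Console.WriteLine($"{shuffled[0, 0, 0]} {shuffled[0, 0, 1]}"); // 1 2
Console.WriteLine($"{shuffled[0, 1, 0]} {shuffled[0, 1, 1]}"); // 3 4
```

The output has the same number of elements as the input; only their positions change.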

For Beginners: Imagine you have a small image and want to make it bigger.

Pixel shuffle works by:

  1. Starting with extra channel information (4x more channels for 2x upscaling)
  2. Rearranging those channel values into spatial positions
  3. Creating a larger image with the same amount of total information

For example, with 2x upscaling:

  • Input: 64 channels × 32×32 pixels
  • Output: 16 channels × 64×64 pixels (same total data, different arrangement)

This is commonly used in super-resolution models like Real-ESRGAN and ESPCN.

Constructors

PixelShuffleLayer(int[], int)

Initializes a new instance of the PixelShuffleLayer<T> class.

public PixelShuffleLayer(int[] inputShape, int upscaleFactor)

Parameters

inputShape int[]

The shape of the input tensor. Supports any rank >= 3.

upscaleFactor int

The spatial upscaling factor (e.g., 2 for 2x upscaling).

Remarks

For Beginners: Create a pixel shuffle layer to upscale your feature maps.

The input channels must be divisible by upscaleFactor². For example:

  • 2x upscaling requires input channels divisible by 4
  • 4x upscaling requires input channels divisible by 16

Example usage:

using AiDotNet.NeuralNetworks.Layers;

// Create a 2x upscaling layer for 64-channel 32×32 feature maps
var pixelShuffle = new PixelShuffleLayer<float>(
    inputShape: new[] { 64, 32, 32 },  // [channels, height, width]
    upscaleFactor: 2
);
// Output will be [16, 64, 64] - 4x fewer channels, 4x more pixels
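
The divisibility rule and the resulting output shape can be checked before constructing the layer. The following is a minimal sketch: GetPixelShuffleOutputShape is a hypothetical helper, not part of AiDotNet, and it assumes that the last three dimensions of the shape are [channels, height, width] with any leading dimensions (such as batch) passed through unchanged. How the constructor itself reports an invalid shape is not shown here.

```csharp
using System;

// Hypothetical helper (not AiDotNet API): output shape of a pixel shuffle.
// Assumption: the last three dimensions are [channels, height, width].
int[] GetPixelShuffleOutputShape(int[] inputShape, int upscaleFactor)
{
    if (inputShape.Length < 3)
        throw new ArgumentException("Input shape must have rank >= 3.");
    if (upscaleFactor < 1)
        throw new ArgumentOutOfRangeException(nameof(upscaleFactor));

    int rank = inputShape.Length;
    int channels = inputShape[rank - 3];
    int blockSize = upscaleFactor * upscaleFactor;
    if (channels % blockSize != 0)
        throw new ArgumentException(
            $"Channels ({channels}) must be divisible by upscaleFactor^2 ({blockSize}).");

    var outputShape = (int[])inputShape.Clone();
    outputShape[rank - 3] = channels / blockSize;                   // fewer channels
    outputShape[rank - 2] = inputShape[rank - 2] * upscaleFactor;   // taller
    outputShape[rank - 1] = inputShape[rank - 1] * upscaleFactor;   // wider
    return outputShape;
}

int[] exampleShape = GetPixelShuffleOutputShape(new[] { 64, 32, 32 }, 2);
Console.WriteLine(string.Join(", ", exampleShape)); // 16, 64, 64
```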

Properties

SupportsGpuExecution

Indicates whether this layer supports GPU execution. PixelShuffle uses GPU Reshape and Permute operations for efficient rearrangement.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

true if the layer can be JIT compiled; otherwise, false.

Remarks

This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.

For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.

Layers should return false if they:

  • Have not yet implemented a working ExportComputationGraph()
  • Use dynamic operations that change based on input data
  • Are too simple to benefit from JIT compilation

When false, the layer will use the standard Forward() method instead.

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

true if the layer has trainable parameters and supports backpropagation; otherwise, false.

Remarks

This property indicates whether the layer can be trained through backpropagation. Layers with trainable parameters such as weights and biases typically return true, while layers that only perform fixed transformations (like pooling or activation layers) typically return false.

For Beginners: This property tells you if the layer can learn from data.

A value of true means:

  • The layer has parameters that can be adjusted during training
  • It will improve its performance as it sees more data
  • It participates in the learning process

A value of false means:

  • The layer doesn't have any adjustable parameters
  • It performs the same operation regardless of training
  • It doesn't need to learn (but may still be useful)

UpscaleFactor

Gets the upscaling factor used by this layer.

public int UpscaleFactor { get; }

Property Value

int

Remarks

The upscaling factor determines how much the spatial dimensions are increased. A factor of 2 doubles both width and height; a factor of 4 quadruples them.

Methods

Backward(Tensor<T>)

Performs the backward pass of the layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the layer's output.

Returns

Tensor<T>

The gradient of the loss with respect to the layer's input.

Remarks

This abstract method must be implemented by derived classes to define the backward pass of the layer. The backward pass propagates error gradients from the output of the layer back to its input, and calculates gradients for any trainable parameters.

For Beginners: This method is used during training to calculate how the layer's input should change to reduce errors.

During the backward pass:

  1. The layer receives information about how its output contributed to errors
  2. It calculates how its parameters should change to reduce errors
  3. It calculates how its input should change, which will be used by earlier layers

This is the core of how neural networks learn from their mistakes during training.

BackwardGpu(IGpuTensor<T>)

Performs the backward pass using GPU-resident tensors.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The gradient from the next layer.

Returns

IGpuTensor<T>

The gradient with respect to the input.

Remarks

The backward pass of pixel shuffle (pixel unshuffle) reverses the forward pass:

  1. Reshape: [N, C, H*r, W*r] -> [N, C, H, r, W, r]
  2. Permute (inverse): [N, C, H, r, W, r] -> [N, C, r, r, H, W]
  3. Reshape: [N, C, r, r, H, W] -> [N, C*r², H, W]
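
Because the forward pass only moves values, the backward pass only has to move each gradient value back to the position it came from. The sketch below shows this on plain CPU arrays without a batch dimension. PixelUnshuffle is a hypothetical helper, not part of AiDotNet, and it assumes the same channel ordering as the reshape and permute sequence above (offset (i, j) inside a block maps to channel c * r² + i * r + j).

```csharp
using System;

// Hypothetical helper (not AiDotNet API): routes a [C, H*r, W*r] output gradient
// back to the [C*r*r, H, W] input layout.
float[,,] PixelUnshuffle(float[,,] outputGradient, int r)
{
    int channels = outputGradient.GetLength(0);
    int outHeight = outputGradient.GetLength(1);
    int outWidth = outputGradient.GetLength(2);
    if (r < 1 || outHeight % r != 0 || outWidth % r != 0)
        throw new ArgumentException("Height and width must be divisible by r.");

    var inputGradient = new float[channels * r * r, outHeight / r, outWidth / r];

    for (int c = 0; c < channels; c++)
        for (int y = 0; y < outHeight; y++)
            for (int x = 0; x < outWidth; x++)
            {
                int i = y % r;   // row offset inside the r x r block
                int j = x % r;   // column offset inside the r x r block
                inputGradient[c * r * r + i * r + j, y / r, x / r] = outputGradient[c, y, x];
            }

    return inputGradient;
}

// A 1-channel 2x2 gradient maps back to 4 channels at a single pixel.
var blockGradient = new float[1, 2, 2];
blockGradient[0, 0, 0] = 10f;
blockGradient[0, 0, 1] = 20f;
blockGradient[0, 1, 0] = 30f;
blockGradient[0, 1, 1] = 40f;

float[,,] routed = PixelUnshuffle(blockGradient, 2);
Console.WriteLine($"{routed[0, 0, 0]} {routed[1, 0, 0]} {routed[2, 0, 0]} {routed[3, 0, 0]}"); // 10 20 30 40
```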

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer's computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes.

Returns

ComputationNode<T>

The output computation node representing the layer's operation.

Remarks

This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.

For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.

To support JIT compilation, a layer must:

  1. Implement this method to export its computation graph
  2. Set SupportsJitCompilation to true
  3. Use ComputationNode and TensorOperations to build the graph

All layers are required to implement this method, even if they set SupportsJitCompilation = false.

Forward(Tensor<T>)

Performs the forward pass of the layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to process.

Returns

Tensor<T>

The output tensor after processing.

Remarks

This abstract method must be implemented by derived classes to define the forward pass of the layer. The forward pass transforms the input tensor according to the layer's operation and activation function.

For Beginners: This method processes your data through the layer.

The forward pass:

  • Takes input data from the previous layer or the network input
  • Applies the layer's specific transformation (like convolution or matrix multiplication)
  • Applies any activation function
  • Passes the result to the next layer

This is where the actual data processing happens during both training and prediction.

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass using GPU-resident tensors.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU-resident input tensors.

Returns

IGpuTensor<T>

A GPU-resident output tensor after pixel shuffle.

Remarks

Pixel shuffle is implemented as Reshape -> Permute -> Reshape:

  1. Reshape: [N, C*r², H, W] -> [N, C, r, r, H, W]
  2. Permute: [N, C, r, r, H, W] -> [N, C, H, r, W, r]
  3. Reshape: [N, C, H, r, W, r] -> [N, C, H*r, W*r]

All operations stay GPU-resident.
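
The same Reshape -> Permute -> Reshape sequence can be sketched on the CPU with a flat, row-major buffer. Permute and PixelShuffleFlat are hypothetical helpers written for this illustration and are not the library's GPU operations. The axis order { 0, 1, 4, 2, 5, 3 } is the permutation that turns [N, C, r, r, H, W] into [N, C, H, r, W, r].

```csharp
using System;

// Hypothetical helper (not AiDotNet API): permutes the axes of a row-major flat buffer.
float[] Permute(float[] data, int[] shape, int[] axes)
{
    int rank = shape.Length;
    var strides = new int[rank];                 // row-major strides of the source
    for (int d = rank - 1, s = 1; d >= 0; d--) { strides[d] = s; s *= shape[d]; }

    var newShape = new int[rank];
    for (int d = 0; d < rank; d++) newShape[d] = shape[axes[d]];

    var result = new float[data.Length];
    var index = new int[rank];                   // multi-index into the permuted tensor
    for (int flat = 0; flat < result.Length; flat++)
    {
        int source = 0;
        for (int d = 0; d < rank; d++) source += index[d] * strides[axes[d]];
        result[flat] = data[source];

        // Advance the multi-index in row-major order.
        for (int d = rank - 1; d >= 0; d--)
        {
            if (++index[d] < newShape[d]) break;
            index[d] = 0;
        }
    }
    return result;
}

// Pixel shuffle on a flat [N, C*r*r, H, W] buffer; the result is laid out as [N, C, H*r, W*r].
float[] PixelShuffleFlat(float[] input, int n, int c, int h, int w, int r)
{
    // Reshape is free for a row-major buffer: view the data as [N, C, r, r, H, W].
    int[] sixD = { n, c, r, r, h, w };
    // Permute to [N, C, H, r, W, r]; reading that buffer as [N, C, H*r, W*r] is the final reshape.
    return Permute(input, sixD, new[] { 0, 1, 4, 2, 5, 3 });
}

// N=1, C=1, r=2, H=1, W=2: channel k holds the values 10*k and 10*k + 1.
float[] channelMajor = { 0f, 1f, 10f, 11f, 20f, 21f, 30f, 31f };
float[] flatOutput = PixelShuffleFlat(channelMajor, n: 1, c: 1, h: 1, w: 2, r: 2);
Console.WriteLine(string.Join(" ", flatOutput)); // 0 10 1 11 20 30 21 31
```

Read as a 2x4 image, the output rows are 0 10 1 11 and 20 30 21 31: each input pixel has expanded into its own 2x2 block.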

GetParameters()

Gets all trainable parameters of the layer as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all trainable parameters.

Remarks

This abstract method must be implemented by derived classes to provide access to all trainable parameters of the layer as a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.

For Beginners: This method collects all the learnable values from the layer.

The parameters:

  • Are the numbers that the neural network learns during training
  • Include weights, biases, and other learnable values
  • Are combined into a single long list (vector)

This is useful for:

  • Saving the model to disk
  • Loading parameters from a previously trained model
  • Advanced optimization techniques that need access to all parameters

ResetState()

Resets the internal state of the layer.

public override void ResetState()

Remarks

This abstract method must be implemented by derived classes to reset any internal state the layer maintains between forward and backward passes. This is useful when starting to process a new sequence or when implementing stateful recurrent networks.

For Beginners: This method clears the layer's memory to start fresh.

When resetting the state:

  • Cached inputs and outputs are cleared
  • Any temporary calculations are discarded
  • The layer is ready to process new data without being influenced by previous data

This is important for:

  • Processing a new, unrelated sequence
  • Preventing information from one sequence affecting another
  • Starting a new training episode

UpdateParameters(T)

Updates the parameters of the layer using the calculated gradients.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate to use for the parameter updates.

Remarks

This abstract method must be implemented by derived classes to define how the layer's parameters are updated during training. The learning rate controls the size of the parameter updates.

For Beginners: This method updates the layer's internal values during training.

When updating parameters:

  • The weights, biases, or other parameters are adjusted to reduce prediction errors
  • The learning rate controls how big each update step is
  • Smaller learning rates mean slower but more stable learning
  • Larger learning rates mean faster but potentially unstable learning

This is how the layer "learns" from data over time, gradually improving its ability to extract useful patterns from inputs.