Class TransitionLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Implements a Transition Layer from the DenseNet architecture.

public class TransitionLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable, IChainableComputationGraph<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
object → LayerBase<T> → TransitionLayer<T>

Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable, IChainableComputationGraph<T>

Remarks

A Transition Layer is placed between Dense Blocks to reduce the number of feature maps and spatial dimensions. It performs:

  1. Batch Normalization
  2. 1x1 Convolution (channel reduction by compression factor)
  3. 2x2 Average Pooling with stride 2 (spatial dimension halving)

Architecture:

Input (C channels, H×W)
  ↓
BN → ReLU → Conv1x1 (C × theta channels)
  ↓
AvgPool 2×2, stride 2
  ↓
Output (C × theta channels, H/2 × W/2)
Where theta is the compression factor (default: 0.5).

For Beginners: The transition layer acts as a "bottleneck" between dense blocks.

Its purposes:

  • Reduce feature map channels (compression): Dense blocks produce many channels
  • Reduce spatial size (pooling): Helps control computational cost
  • Improve model compactness without sacrificing accuracy

The compression factor (theta) controls how much to reduce channels. theta=0.5 means halving the channels at each transition.
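
As a rough sketch of the arithmetic (the flooring of a fractional channel count is an assumption; the exact rounding rule is not stated here):

int inputChannels = 256, inputHeight = 32, inputWidth = 32;
double theta = 0.5; // compression factor

// 1x1 convolution compresses the channel count by theta (flooring assumed).
int outputChannels = (int)(inputChannels * theta);  // 128

// 2x2 average pooling with stride 2 halves each spatial dimension.
int outputHeight = inputHeight / 2;                 // 16
int outputWidth = inputWidth / 2;                   // 16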

Constructors

TransitionLayer(int, int, int, double)

Initializes a new instance of the TransitionLayer<T> class.

public TransitionLayer(int inputChannels, int inputHeight, int inputWidth, double compressionFactor = 0.5)

Parameters

inputChannels int

The number of input channels.

inputHeight int

The input feature map height.

inputWidth int

The input feature map width.

compressionFactor double

The channel compression factor (default: 0.5).
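
A minimal construction sketch using the signature above; the concrete dimensions are illustrative.

using AiDotNet.NeuralNetworks.Layers;

// Transition layer for 256-channel, 32x32 feature maps with the default
// compression factor of 0.5.
var transition = new TransitionLayer<float>(
    inputChannels: 256,
    inputHeight: 32,
    inputWidth: 32);

// With compression 0.5 the layer reports 128 output channels.
int outChannels = transition.OutputChannels;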

Properties

OutputChannels

Gets the number of output channels.

public int OutputChannels { get; }

Property Value

int

SupportsGpuExecution

Gets a value indicating whether this layer supports GPU execution. All sub-layers (BatchNorm, Conv, AvgPool) support GPU.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Methods

Backward(Tensor<T>)

Performs the backward pass of the Transition Layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the output.

Returns

Tensor<T>

The gradient of the loss with respect to the input.

BackwardGpu(IGpuTensor<T>)

Computes the gradient of the loss with respect to the input on the GPU.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The gradient of the loss with respect to the layer's output.

Returns

IGpuTensor<T>

The gradient of the loss with respect to the layer's input.

BuildComputationGraph(ComputationNode<T>, string)

Builds the computation graph for this layer using the provided input node.

public ComputationNode<T> BuildComputationGraph(ComputationNode<T> inputNode, string namePrefix)

Parameters

inputNode ComputationNode<T>

The input computation node from the parent layer.

namePrefix string

Prefix for naming internal nodes (for debugging/visualization).

Returns

ComputationNode<T>

The output computation node representing this layer's computation.

Remarks

Unlike ILayer<T>.ExportComputationGraph, this method does NOT create a new input variable. Instead, it uses the provided inputNode as its input, allowing the parent layer to chain multiple sub-layers together in a single computation graph.

The namePrefix parameter should be used to prefix all internal node names to avoid naming conflicts when multiple instances of the same layer type are used.
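
A chaining sketch based on the signature above; parentNode is assumed to be the output ComputationNode<float> produced by the preceding sub-layer's graph construction.

// The prefix keeps internal node names unique across layer instances.
ComputationNode<float> transitionOutput =
    transition.BuildComputationGraph(parentNode, "transition1_");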

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer's computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes.

Returns

ComputationNode<T>

The output computation node representing the layer's operation.

Remarks

This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.

For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.

To support JIT compilation, a layer must:

  1. Implement this method to export its computation graph
  2. Set SupportsJitCompilation to true
  3. Use ComputationNode and TensorOperations to build the graph

All layers are required to implement this method, even if they set SupportsJitCompilation = false.
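
A minimal usage sketch based on the signature above; how the resulting graph is handed to the JIT compiler is outside this page and is not shown.

using System.Collections.Generic;

// The layer populates inputNodes with its graph inputs and returns the output node.
var inputNodes = new List<ComputationNode<float>>();

if (transition.SupportsJitCompilation)
{
    ComputationNode<float> graphOutput = transition.ExportComputationGraph(inputNodes);
}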

Forward(Tensor<T>)

Performs the forward pass of the Transition Layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor [B, C, H, W] or [C, H, W].

Returns

Tensor<T>

The output tensor with reduced channels and spatial dimensions.
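
A forward-pass sketch; the Tensor<float> construction shown here (a constructor taking a shape array) is an assumption and may differ from the actual API.

var transition = new TransitionLayer<float>(256, 32, 32);

// Assumed: Tensor<float> can be created from a shape array [B, C, H, W].
var input = new Tensor<float>(new[] { 1, 256, 32, 32 });

// Output shape becomes [1, 128, 16, 16] with the default compression of 0.5.
Tensor<float> output = transition.Forward(input);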

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass using GPU-resident tensors.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU-resident input tensors.

Returns

IGpuTensor<T>

A GPU-resident output tensor.

Remarks

Chains GPU operations through sub-layers: BN → ReLU → Conv1x1 → AvgPool. All intermediate results stay GPU-resident.

GetParameters()

Gets all trainable parameters from the layer.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all trainable parameters from the layer's sub-layers.

ResetState()

Resets the internal state of the layer.

public override void ResetState()

SetParameters(Vector<T>)

Sets all trainable parameters from the given parameter vector.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameter vector containing all layer parameters.
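
A round-trip sketch using GetParameters and SetParameters; no assumptions are made about the vector's internal layout.

// Snapshot the layer's current trainable parameters.
Vector<float> snapshot = transition.GetParameters();

// ... modify or train the layer ...

// Restore the snapshot; the vector must match the layer's parameter count.
transition.SetParameters(snapshot);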

UpdateParameters(T)

Updates the parameters of all sub-layers.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate for parameter updates.
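
A single manual training step as a sketch, assuming transition and input are set up as in the Forward example and lossGradient is a gradient tensor shaped like the layer's output (its origin is not shown).

// Forward pass through the transition layer.
Tensor<float> output = transition.Forward(input);

// lossGradient is assumed to come from the loss / downstream layers
// and to have the same shape as output.
Tensor<float> inputGradient = transition.Backward(lossGradient);

// Apply the accumulated sub-layer gradients with a small learning rate.
transition.UpdateParameters(0.01f);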