Class TransitionLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Implements a Transition Layer from the DenseNet architecture.

public class TransitionLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable, IChainableComputationGraph<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
object → LayerBase<T> → TransitionLayer<T>

Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable, IChainableComputationGraph<T>

Remarks

A Transition Layer is placed between Dense Blocks to reduce the number of feature maps and spatial dimensions. It performs:

  1. Batch Normalization
  2. 1x1 Convolution (channel reduction by compression factor)
  3. 2x2 Average Pooling with stride 2 (spatial dimension halving)

Architecture:

Input (C channels, H×W)
  ↓
BN → ReLU → Conv1x1 (C × theta channels)
  ↓
AvgPool 2×2, stride 2
  ↓
Output (C × theta channels, H/2 × W/2)
Where theta is the compression factor (default: 0.5).

For Beginners: The transition layer acts as a "bottleneck" between dense blocks.

Its purposes:

  • Reduce feature map channels (compression): Dense blocks produce many channels
  • Reduce spatial size (pooling): Helps control computational cost
  • Improve model compactness without sacrificing accuracy

The compression factor (theta) controls how much to reduce channels. theta=0.5 means halving the channels at each transition.
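
As a rough sketch of the arithmetic (the flooring of a fractional channel count is an assumption; the exact rounding rule is not stated here):

int inputChannels = 256, inputHeight = 32, inputWidth = 32;
double theta = 0.5; // compression factor

// 1x1 convolution compresses the channel count by theta (flooring assumed).
int outputChannels = (int)(inputChannels * theta);  // 128

// 2x2 average pooling with stride 2 halves each spatial dimension.
int outputHeight = inputHeight / 2;                 // 16
int outputWidth = inputWidth / 2;                   // 16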

Constructors

TransitionLayer(int, int, int, double)

Initializes a new instance of the TransitionLayer<T> class.

public TransitionLayer(int inputChannels, int inputHeight, int inputWidth, double compressionFactor = 0.5)

Parameters

inputChannels int

The number of input channels.

inputHeight int

The input feature map height.

inputWidth int

The input feature map width.

compressionFactor double

The channel compression factor (default: 0.5).
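
A minimal construction sketch using the signature above; the concrete dimensions are illustrative.

using AiDotNet.NeuralNetworks.Layers;

// Transition layer for 256-channel, 32x32 feature maps with the default
// compression factor of 0.5.
var transition = new TransitionLayer<float>(
    inputChannels: 256,
    inputHeight: 32,
    inputWidth: 32);

// With compression 0.5 the layer reports 128 output channels.
int outChannels = transition.OutputChannels;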

Properties

OutputChannels

Gets the number of output channels.

public int OutputChannels { get; }

Property Value

int

SupportsGpuExecution

Gets a value indicating whether this layer supports GPU execution. All sub-layers (BatchNorm, Conv, AvgPool) support GPU.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Methods

Backward(Tensor<T>)

Performs the backward pass of the Transition Layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the output.

Returns

Tensor<T>

The gradient of the loss with respect to the input.

BackwardGpu(IGpuTensor<T>)

Computes the gradient of the loss with respect to the input on the GPU.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The gradient of the loss with respect to the layer's output.

Returns

IGpuTensor<T>

The gradient of the loss with respect to the layer's input.

BuildComputationGraph(ComputationNode<T>, string)

Builds the computation graph for this layer using the provided input node.

public ComputationNode<T> BuildComputationGraph(ComputationNode<T> inputNode, string namePrefix)

Parameters

inputNode ComputationNode<T>

The input computation node from the parent layer.

namePrefix string

Prefix for naming internal nodes (for debugging/visualization).

Returns

ComputationNode<T>

The output computation node representing this layer's computation.

Remarks

Unlike ILayer<T>.ExportComputationGraph, this method does NOT create a new input variable. Instead, it uses the provided inputNode as its input, allowing the parent layer to chain multiple sub-layers together in a single computation graph.

The namePrefix parameter should be used to prefix all internal node names to avoid naming conflicts when multiple instances of the same layer type are used.
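
A chaining sketch based on the signature above; parentNode is assumed to be the output ComputationNode<float> produced by the preceding sub-layer's graph construction.

// The prefix keeps internal node names unique across layer instances.
ComputationNode<float> transitionOutput =
    transition.BuildComputationGraph(parentNode, "transition1_");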

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer's computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes.

Returns

ComputationNode<T>

The output computation node representing the layer's operation.

Remarks

This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.

For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.

To support JIT compilation, a layer must:

  1. Implement this method to export its computation graph
  2. Set SupportsJitCompilation to true
  3. Use ComputationNode and TensorOperations to build the graph

All layers are required to implement this method, even if they set SupportsJitCompilation = false.
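
A minimal usage sketch based on the signature above; how the resulting graph is handed to the JIT compiler is outside this page and is not shown.

using System.Collections.Generic;

// The layer populates inputNodes with its graph inputs and returns the output node.
var inputNodes = new List<ComputationNode<float>>();

if (transition.SupportsJitCompilation)
{
    ComputationNode<float> graphOutput = transition.ExportComputationGraph(inputNodes);
}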

Forward(Tensor<T>)

Performs the forward pass of the Transition Layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor [B, C, H, W] or [C, H, W].

Returns

Tensor<T>

The output tensor with reduced channels and spatial dimensions.
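
A forward-pass sketch; the Tensor<float> construction shown here (a constructor taking a shape array) is an assumption and may differ from the actual API.

var transition = new TransitionLayer<float>(256, 32, 32);

// Assumed: Tensor<float> can be created from a shape array [B, C, H, W].
var input = new Tensor<float>(new[] { 1, 256, 32, 32 });

// Output shape becomes [1, 128, 16, 16] with the default compression of 0.5.
Tensor<float> output = transition.Forward(input);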

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass using GPU-resident tensors.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU-resident input tensors.

Returns

IGpuTensor<T>

A GPU-resident output tensor.

Remarks

Chains GPU operations through sub-layers: BN → ReLU → Conv1x1 → AvgPool. All intermediate results stay GPU-resident.

GetParameters()

Gets all trainable parameters from the layer.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all trainable parameters from the layer's sub-layers.

ResetState()

Resets the internal state of the layer.

public override void ResetState()

SetParameters(Vector<T>)

Sets all trainable parameters from the given parameter vector.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameter vector containing all layer parameters.
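
A round-trip sketch using GetParameters and SetParameters; no assumptions are made about the vector's internal layout.

// Snapshot the layer's current trainable parameters.
Vector<float> snapshot = transition.GetParameters();

// ... modify or train the layer ...

// Restore the snapshot; the vector must match the layer's parameter count.
transition.SetParameters(snapshot);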

UpdateParameters(T)

Updates the parameters of all sub-layers.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate for parameter updates.
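
A single manual training step as a sketch, assuming transition and input are set up as in the Forward example and lossGradient is a gradient tensor shaped like the layer's output (its origin is not shown).

// Forward pass through the transition layer.
Tensor<float> output = transition.Forward(input);

// lossGradient is assumed to come from the loss / downstream layers
// and to have the same shape as output.
Tensor<float> inputGradient = transition.Backward(lossGradient);

// Apply the accumulated sub-layer gradients with a small learning rate.
transition.UpdateParameters(0.01f);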