Class BottleneckBlock<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Implements the BottleneckBlock used in ResNet50, ResNet101, and ResNet152 architectures.

public class BottleneckBlock<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
LayerBase<T> ← BottleneckBlock<T>

Implements
ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Remarks

The BottleneckBlock uses a 1x1-3x3-1x1 convolution pattern: the first 1x1 layer reduces the channel count, the 3x3 layer does the main processing at that reduced width (the "bottleneck"), and the final 1x1 layer restores and expands the channels. This design is more computationally efficient than stacking full-width 3x3 convolutions in deep networks.

Architecture:

Input ─┬─ Conv1x1 ─ BN ─ ReLU ─ Conv3x3 ─ BN ─ ReLU ─ Conv1x1 ─ BN ─┬─ (+) ─ ReLU ─ Output
       │                                                             │
       └────────────────────── [Downsample?] ────────────────────────┘

The first 1x1 conv reduces channels, the 3x3 processes at reduced channels, and the final 1x1 expands channels by a factor of 4.

For Beginners: The BottleneckBlock is like a compressed processing pipeline.

Think of it as:

  1. First 1x1 conv: "Compress" - reduce the number of channels (like compressing a file)
  2. 3x3 conv: "Process" - do the heavy computation on the compressed representation
  3. Second 1x1 conv: "Expand" - restore and expand the channels

This is more efficient because:

  • The expensive 3x3 convolution works on fewer channels
  • The overall result has high capacity (4x expansion)
  • Far fewer parameters than three full-width 3x3 convolutions

The expansion factor of 4 means that if the base channel count is 64, the output will have 256 channels.
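To make the efficiency claim concrete, here is a rough parameter-count comparison for the channel sizes used above (bias and batch-norm parameters ignored; the counts follow directly from kernel sizes and channel widths):

```csharp
// Rough parameter counts for inChannels = 256, baseChannels = 64.
int inC = 256, baseC = 64, outC = baseC * 4; // outC = 256
long bottleneck = (long)inC * baseC * 1 * 1    // 1x1 reduce:  16,384
                + (long)baseC * baseC * 3 * 3  // 3x3 process: 36,864
                + (long)baseC * outC * 1 * 1;  // 1x1 expand:  16,384
// Naive alternative: three 3x3 convolutions at full width.
long naive = 3L * inC * inC * 3 * 3;           // 1,769,472
// bottleneck = 69,632 — roughly 25x fewer parameters than the naive stack.
```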

Constructors

BottleneckBlock(int, int, int, int, int, bool)

Initializes a new instance of the BottleneckBlock<T> class.

public BottleneckBlock(int inChannels, int baseChannels, int stride = 1, int inputHeight = 56, int inputWidth = 56, bool zeroInitResidual = true)

Parameters

inChannels int

The number of input channels.

baseChannels int

The base channel count (output will be baseChannels * 4).

stride int

The stride for the 3x3 convolution (default: 1).

inputHeight int

The input spatial height (default: 56).

inputWidth int

The input spatial width (default: 56).

zeroInitResidual bool

If true, initializes the final batch normalization layer's weights to zero for better training stability (default: true).

Remarks

For Beginners: The baseChannels parameter specifies the "bottleneck" width. The actual output channels will be baseChannels * 4 due to the expansion factor. For example, if baseChannels = 64, the output will have 256 channels.
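A minimal construction sketch for the first bottleneck stage of a ResNet-50-style network; the argument values are illustrative, not prescribed by the library:

```csharp
// Assumes the incoming feature map is 256 channels at 56x56.
var block = new BottleneckBlock<float>(
    inChannels: 256,
    baseChannels: 64,       // bottleneck width; output = 64 * 4 = 256 channels
    stride: 1,              // no spatial downsampling
    inputHeight: 56,
    inputWidth: 56,
    zeroInitResidual: true);
```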

Fields

Expansion

The expansion factor for BottleneckBlock. Output channels = base channels * 4.

public const int Expansion = 4

Field Value

int
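The constant can be used to compute a block's output width from its baseChannels argument, for example when sizing the next layer:

```csharp
// Output channel count follows from the fixed expansion factor.
int baseChannels = 64;
int outChannels = baseChannels * BottleneckBlock<float>.Expansion; // 256
```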

Properties

SupportsGpuExecution

Gets a value indicating whether this layer has a GPU implementation.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

Remarks

BottleneckBlock supports JIT compilation when all its sub-layers support JIT. This includes conv1, bn1, conv2, bn2, conv3, bn3, and optionally the downsample layers.

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Methods

Backward(Tensor<T>)

Performs the backward pass through the BottleneckBlock.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the output.

Returns

Tensor<T>

The gradient of the loss with respect to the input.

BackwardGpu(IGpuTensor<T>)

GPU-accelerated backward pass through the BottleneckBlock.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The gradient of the loss with respect to the output.

Returns

IGpuTensor<T>

GPU-resident gradient of the loss with respect to the input.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes.

Returns

ComputationNode<T>

The output computation node representing the BottleneckBlock.

Remarks

This method builds a computation graph representing the BottleneckBlock: Input -> Conv1(1x1) -> BN1 -> ReLU -> Conv2(3x3) -> BN2 -> ReLU -> Conv3(1x1) -> BN3 -> (+Identity) -> ReLU -> Output

For JIT compilation, we chain the sub-layer computation graphs together and add the residual connection using TensorOperations.Add.

Forward(Tensor<T>)

Performs the forward pass through the BottleneckBlock.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor.

Returns

Tensor<T>

The output tensor after the residual connection.
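A sketch of a single forward pass. The Tensor<T> shape-array constructor shown here is an assumption for illustration; the library's actual tensor factory may differ:

```csharp
var block = new BottleneckBlock<float>(inChannels: 256, baseChannels: 64);
var input = new Tensor<float>(new[] { 1, 256, 56, 56 }); // NCHW layout assumed
Tensor<float> output = block.Forward(input);
// Output shape stays [1, 256, 56, 56]: baseChannels * 4 == inChannels
// and stride == 1, so neither channels nor spatial size change.
```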

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass on GPU, keeping data GPU-resident.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The input tensors (expects a single input).

Returns

IGpuTensor<T>

The output tensor on GPU.

GetParameters()

Gets all trainable parameters.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all parameters.

ResetState()

Resets the internal state of the block.

public override void ResetState()

UpdateParameters(T)

Updates the parameters of all internal layers.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate.
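The methods above combine into one training step roughly as follows. This is a hedged sketch: the loss-gradient computation is omitted, and `lossGradient` is assumed to come from your loss function:

```csharp
// One training step over a single block (loss computation not shown).
Tensor<float> prediction = block.Forward(input);
// lossGradient: dLoss/dPrediction, produced by a loss function elsewhere.
Tensor<float> inputGradient = block.Backward(lossGradient);
// Apply the gradients accumulated during Backward with a fixed learning rate.
block.UpdateParameters(0.01f);
```

Forward must be called before Backward so the block has cached the activations the backward pass needs; ResetState() clears that cached state.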