Class BottleneckBlock<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Implements the BottleneckBlock used in ResNet50, ResNet101, and ResNet152 architectures.
public class BottleneckBlock<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
Inheritance: LayerBase&lt;T&gt; → BottleneckBlock&lt;T&gt;
Implements: ILayer&lt;T&gt;
Remarks
The BottleneckBlock uses a 1x1-3x3-1x1 convolution pattern: the 1x1 layers first reduce and then restore dimensions (with expansion), while the 3x3 layer, the bottleneck, operates at the reduced channel count. This design is more computationally efficient than stacking 3x3 convolutions in deep networks.
Architecture:
Input ─┬─ Conv1x1 ─ BN ─ ReLU ─ Conv3x3 ─ BN ─ ReLU ─ Conv1x1 ─ BN ─┬─ (+) ─ ReLU ─ Output
       │                                                            │
       └─────────────────────── [Downsample?] ──────────────────────┘
The first 1x1 conv reduces channels, the 3x3 processes at reduced channels, and the final 1x1 expands channels by a factor of 4.
For Beginners: The BottleneckBlock is like a compressed processing pipeline.
Think of it as:
- First 1x1 conv: "Compress" - reduce the number of channels (like compressing a file)
- 3x3 conv: "Process" - do the heavy computation on the compressed representation
- Second 1x1 conv: "Expand" - restore and expand the channels
This is more efficient because:
- The expensive 3x3 convolution works on fewer channels
- The overall result has high capacity (4x expansion)
- Far fewer parameters than three 3x3 convolutions
The expansion factor of 4 means that if baseChannels is 64, the output will have 256 channels.
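To make the efficiency claim concrete, here is a small framework-agnostic Python sketch (not AiDotNet code) that counts convolution weights in a bottleneck block versus three stacked 3x3 convolutions at the full width; the channel numbers are illustrative, matching the 64-base / 256-output example above.

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a k x k convolution (bias terms ignored)."""
    return in_ch * out_ch * k * k

in_ch, base = 256, 64          # bottleneck width 64, expansion factor 4
out_ch = base * 4              # 256 output channels

# 1x1 reduce -> 3x3 process at reduced width -> 1x1 expand
bottleneck = (conv_params(in_ch, base, 1)
              + conv_params(base, base, 3)
              + conv_params(base, out_ch, 1))

# three 3x3 convolutions at the full 256-channel width
stacked = 3 * conv_params(in_ch, out_ch, 3)

print(bottleneck)  # 69632
print(stacked)     # 1769472
```

The bottleneck design uses roughly 25x fewer convolution weights in this configuration, because the expensive 3x3 kernel only ever sees the reduced channel count.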
Constructors
BottleneckBlock(int, int, int, int, int, bool)
Initializes a new instance of the BottleneckBlock<T> class.
public BottleneckBlock(int inChannels, int baseChannels, int stride = 1, int inputHeight = 56, int inputWidth = 56, bool zeroInitResidual = true)
Parameters
inChannels (int): The number of input channels.
baseChannels (int): The base channel count (output channels will be baseChannels * 4).
stride (int): The stride for the 3x3 convolution (default: 1).
inputHeight (int): The input spatial height.
inputWidth (int): The input spatial width.
zeroInitResidual (bool): If true, zero-initializes the final batch normalization's scale so the block initially acts as an identity mapping, which improves training stability.
Remarks
For Beginners: The baseChannels parameter specifies the "bottleneck" width. The actual output channels will be baseChannels * 4 due to the expansion factor. For example, if baseChannels = 64, the output will have 256 channels.
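As a hedged illustration of how the constructor parameters determine the output shape (a plain Python sketch, not AiDotNet code; it assumes the standard ResNet convention that the 3x3 convolution uses padding 1, so spatial size changes only through the stride):

```python
def bottleneck_output_shape(in_channels, base_channels, stride=1,
                            height=56, width=56, expansion=4):
    """Output (channels, height, width) of a bottleneck block.

    Channels expand by the fixed factor of 4; the stride on the
    3x3 convolution divides the spatial dimensions.
    """
    out_channels = base_channels * expansion
    return out_channels, height // stride, width // stride

print(bottleneck_output_shape(256, 64))             # (256, 56, 56)
print(bottleneck_output_shape(256, 128, stride=2))  # (512, 28, 28)
```

The second call mirrors a stage transition in ResNet50, where the stride of 2 halves the spatial resolution while the channel count doubles.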
Fields
Expansion
The expansion factor for BottleneckBlock. Output channels = base channels * 4.
public const int Expansion = 4
Field Value: int
Properties
SupportsGpuExecution
Gets a value indicating whether this layer has a GPU implementation.
protected override bool SupportsGpuExecution { get; }
Property Value: bool
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value: bool
Remarks
BottleneckBlock supports JIT compilation when all its sub-layers support JIT. This includes conv1, bn1, conv2, bn2, conv3, bn3, and optionally the downsample layers.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value: bool
Methods
Backward(Tensor<T>)
Performs the backward pass through the BottleneckBlock.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor&lt;T&gt;): The gradient of the loss with respect to the output.
Returns
- Tensor<T>
The gradient of the loss with respect to the input.
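Conceptually (a NumPy sketch, not the AiDotNet implementation), the residual connection means the incoming gradient splits at the final addition: it flows back through the convolutional branch and, unchanged, along the identity path, and the two contributions sum at the input:

```python
import numpy as np

def residual_backward(output_grad, branch_backward):
    """Backward through y = branch(x) + x.

    branch_backward maps dL/d(branch output) to dL/dx through the
    conv/BN stack; the identity path passes the gradient through
    unchanged, so the two contributions add.
    """
    return branch_backward(output_grad) + output_grad

# With a toy linear branch y = W x, the branch gradient is W^T g:
W = np.array([[2.0, 0.0],
              [0.0, 3.0]])
g = np.array([1.0, 1.0])
dx = residual_backward(g, lambda grad: W.T @ grad)
print(dx)  # [3. 4.]
```

The unchanged identity term is why residual blocks resist vanishing gradients: even if the branch gradient shrinks, the skip path delivers the output gradient to the input intact.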
BackwardGpu(IGpuTensor<T>)
GPU-accelerated backward pass through the BottleneckBlock.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor&lt;T&gt;): The gradient of the loss with respect to the output.
Returns
- IGpuTensor<T>
GPU-resident gradient of the loss with respect to the input.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List&lt;ComputationNode&lt;T&gt;&gt;): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the BottleneckBlock.
Remarks
This method builds a computation graph representing the BottleneckBlock: Input -> Conv1(1x1) -> BN1 -> ReLU -> Conv2(3x3) -> BN2 -> ReLU -> Conv3(1x1) -> BN3 -> (+Identity) -> ReLU -> Output
For JIT compilation, we chain the sub-layer computation graphs together and add the residual connection using TensorOperations.Add.
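The chained graph described above can be sketched in plain NumPy (an illustration only: 1x1 convolutions reduce to per-pixel matrix multiplies, the 3x3 convolution is stood in for by another per-pixel transform, and BN and the downsample path are omitted for brevity):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv1x1(x, w):
    """1x1 convolution on an (H, W, C_in) tensor: a per-pixel matmul."""
    return x @ w  # w has shape (C_in, C_out)

def bottleneck_forward(x, w_reduce, w_mid, w_expand):
    """Simplified chain: reduce -> process -> expand -> add identity -> ReLU."""
    out = relu(conv1x1(x, w_reduce))   # 1x1 reduce + ReLU
    out = relu(conv1x1(out, w_mid))    # stand-in for the 3x3 conv + ReLU
    out = conv1x1(out, w_expand)       # 1x1 expand (no activation yet)
    return relu(out + x)               # residual add, then final ReLU

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
y = bottleneck_forward(x,
                       rng.standard_normal((16, 4)) * 0.1,   # 16 -> 4
                       rng.standard_normal((4, 4)) * 0.1,    # 4 -> 4
                       rng.standard_normal((4, 16)) * 0.1)   # 4 -> 16
print(y.shape)  # (8, 8, 16)
```

Note that the residual add requires the expanded output and the identity input to share the same shape, which is exactly when the real block can skip its downsample path.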
Forward(Tensor<T>)
Performs the forward pass through the BottleneckBlock.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor&lt;T&gt;): The input tensor.
Returns
- Tensor<T>
The output tensor after the residual connection.
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass on GPU, keeping data GPU-resident.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor&lt;T&gt;[]): The input tensors (a single input is expected).
Returns
- IGpuTensor<T>
The output tensor on GPU.
GetParameters()
Gets all trainable parameters.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all parameters.
ResetState()
Resets the internal state of the block.
public override void ResetState()
UpdateParameters(T)
Updates the parameters of all internal layers.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate.