Class BasicBlock<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Implements the BasicBlock used in ResNet18 and ResNet34 architectures.
public class BasicBlock<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → BasicBlock<T>
- Implements
- ILayer<T>
Remarks
The BasicBlock contains two 3x3 convolutional layers with batch normalization and ReLU activation. A skip connection adds the input directly to the output, enabling gradient flow through very deep networks.
Architecture:
Input ─┬─ Conv3x3 ─ BN ─ ReLU ─ Conv3x3 ─ BN ─┬─ (+) ─ ReLU ─ Output
       │                                      │
       └──────────── [Downsample?] ───────────┘
For Beginners: The BasicBlock is like a "learning module" with a shortcut.
The key insight is:
- The two conv layers learn to predict what needs to be ADDED to the input (the "residual")
- The skip connection adds the original input back to this learned residual
- This makes it easier to train very deep networks because gradients can flow directly through the skip connection
When the input and output have different dimensions (due to stride or channel changes), a downsample layer (1x1 conv + BN) is used to match the dimensions before adding.
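The residual idea described above can be sketched without any library types. This is a minimal illustration, not AiDotNet's implementation: `ResidualSketch`, `Relu`, and `Apply` are hypothetical names, and the `residual` delegate stands in for the Conv3x3 ─ BN ─ ReLU ─ Conv3x3 ─ BN branch. It assumes the branch preserves the input shape, so no downsample layer is needed.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: illustrates y = ReLU(F(x) + x).
public static class ResidualSketch
{
    // Element-wise ReLU: negative values become zero.
    public static double[] Relu(double[] values)
    {
        var result = new double[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            result[i] = Math.Max(0.0, values[i]);
        }
        return result;
    }

    // Runs the residual branch, adds the original input back (the skip
    // connection), then applies the final ReLU.
    public static double[] Apply(double[] input, Func<double[], double[]> residual)
    {
        double[] branch = residual(input);
        if (branch.Length != input.Length)
        {
            throw new ArgumentException("Residual branch must preserve the input shape.");
        }

        var sum = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            sum[i] = branch[i] + input[i];
        }
        return Relu(sum);
    }
}
```

If the residual branch outputs all zeros, `Apply` returns `Relu(input)`, so the block only has to learn what to add on top of its input.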
Constructors
BasicBlock(int, int, int, int, int, bool)
Initializes a new instance of the BasicBlock<T> class.
public BasicBlock(int inChannels, int outChannels, int stride = 1, int inputHeight = 56, int inputWidth = 56, bool zeroInitResidual = true)
Parameters
inChannels (int): The number of input channels.
outChannels (int): The number of output channels.
stride (int): The stride for the first convolution (default: 1).
inputHeight (int): The input spatial height (default: 56).
inputWidth (int): The input spatial width (default: 56).
zeroInitResidual (bool): If true, zero-initializes the last batch normalization layer in the block for better training stability (default: true).
Remarks
For Beginners: When stride > 1, the block will downsample the spatial dimensions. When inChannels != outChannels, a projection shortcut is used to match dimensions.
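The two conditions above can be checked with a small stand-alone sketch. The names `BasicBlockShapeSketch`, `ConvOutputSize`, and `NeedsDownsample` are hypothetical and not part of AiDotNet. The padding of 1 for the 3x3 convolution is an assumption based on the usual ResNet design; this page does not state the padding the library uses.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class BasicBlockShapeSketch
{
    // Output size along one spatial axis for a 3x3 convolution.
    // Assumption: padding is 1, as in the usual ResNet design.
    public static int ConvOutputSize(int inputSize, int stride)
    {
        const int kernel = 3;
        const int padding = 1;
        return (inputSize + 2 * padding - kernel) / stride + 1;
    }

    // The shortcut needs a projection (1x1 conv + BN) whenever the stride or
    // the channel count changes the shape of the residual branch output.
    public static bool NeedsDownsample(int inChannels, int outChannels, int stride)
    {
        return stride != 1 || inChannels != outChannels;
    }
}
```

Under these assumptions, a 56x56 input keeps its size with stride 1 and shrinks to 28x28 with stride 2, and a block going from 64 to 128 channels needs the projection shortcut.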
Fields
Expansion
The expansion factor for BasicBlock. BasicBlock does not expand channels.
public const int Expansion = 1
Field Value
- int
Properties
SupportsGpuExecution
Gets a value indicating whether this layer has a GPU implementation.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
Remarks
BasicBlock supports JIT compilation when all its sub-layers support JIT. This includes conv1, bn1, conv2, bn2, and optionally the downsample layers.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Methods
Backward(Tensor<T>)
Performs the backward pass through the BasicBlock.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the output.
Returns
- Tensor<T>
The gradient of the loss with respect to the input.
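How a gradient passes through a residual block can be sketched on plain arrays. This is an illustration under stated assumptions, not the library's code: `ResidualBackwardSketch` and its parameters are hypothetical names, the shortcut is assumed to be the identity (no downsample layer), and `branchBackward` stands in for backpropagation through the two conv/BN layers.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class ResidualBackwardSketch
{
    // preActivation holds F(x) + x, the values fed into the final ReLU.
    public static double[] Backward(
        double[] outputGradient,
        double[] preActivation,
        Func<double[], double[]> branchBackward)
    {
        int n = outputGradient.Length;
        if (preActivation.Length != n)
        {
            throw new ArgumentException("Gradient and pre-activation must have the same length.");
        }

        // Gradient through the final ReLU: passes only where the input was positive.
        var afterRelu = new double[n];
        for (int i = 0; i < n; i++)
        {
            afterRelu[i] = preActivation[i] > 0.0 ? outputGradient[i] : 0.0;
        }

        // Gradient through the residual branch (conv/BN layers).
        double[] branchGradient = branchBackward(afterRelu);
        if (branchGradient.Length != n)
        {
            throw new ArgumentException("Branch gradient must match the input shape.");
        }

        // The identity shortcut contributes the gradient unchanged, so the two
        // paths are summed to give the gradient with respect to the input.
        var inputGradient = new double[n];
        for (int i = 0; i < n; i++)
        {
            inputGradient[i] = branchGradient[i] + afterRelu[i];
        }
        return inputGradient;
    }
}
```

Even when the branch gradient is zero, the shortcut term still reaches the input, which is the direct gradient path the remarks describe.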
BackwardGpu(IGpuTensor<T>)
GPU-accelerated backward pass through the BasicBlock.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): The gradient of the loss with respect to the output.
Returns
- IGpuTensor<T>
GPU-resident gradient of the loss with respect to the input.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the BasicBlock.
Remarks
This method builds a computation graph representing the BasicBlock: Input -> Conv1 -> BN1 -> ReLU -> Conv2 -> BN2 -> (+Identity) -> ReLU -> Output
For JIT compilation, we chain the sub-layer computation graphs together and add the residual connection using TensorOperations.Add.
Forward(Tensor<T>)
Performs the forward pass through the BasicBlock.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor.
Returns
- Tensor<T>
The output tensor after the residual connection.
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass on GPU, keeping data GPU-resident.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The input tensors (expects a single input).
Returns
- IGpuTensor<T>
The output tensor on GPU.
GetParameters()
Gets all trainable parameters.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all parameters.
ResetState()
Resets the internal state of the block.
public override void ResetState()
UpdateParameters(T)
Updates the parameters of all internal layers.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate.