Class BasicBlock<T>

Namespace: AiDotNet.NeuralNetworks.Layers

Assembly: AiDotNet.dll

Implements the BasicBlock used in ResNet18 and ResNet34 architectures.

public class BasicBlock<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T: The numeric type used for calculations, typically float or double.

Inheritance: object -> LayerBase<T> -> BasicBlock<T>

Implements: ILayer<T>

IJitCompilable<T>

IDiagnosticsProvider

IWeightLoadable<T>

IDisposable

Inherited Members: LayerBase<T>.Engine

LayerBase<T>.ScalarActivation

LayerBase<T>.VectorActivation

LayerBase<T>.UsingVectorActivation

LayerBase<T>.NumOps

LayerBase<T>.Random

LayerBase<T>.Parameters

LayerBase<T>.ParameterGradients

LayerBase<T>.InputShape

LayerBase<T>.InputShapes

LayerBase<T>.UpdateInputShape(int[])

LayerBase<T>.OutputShape

LayerBase<T>.IsTrainingMode

LayerBase<T>.InitializationStrategy

LayerBase<T>.IsInitialized

LayerBase<T>.InitializationLock

LayerBase<T>.EnsureInitialized()

LayerBase<T>.UseAutodiff

LayerBase<T>.SetTrainingMode(bool)

LayerBase<T>.GetParameterGradients()

LayerBase<T>.ClearGradients()

LayerBase<T>.GetInputShape()

LayerBase<T>.GetInputShapes()

LayerBase<T>.GetOutputShape()

LayerBase<T>.GetWeights()

LayerBase<T>.GetBiases()

LayerBase<T>.MapActivationToFused()

LayerBase<T>.SupportsGpuTraining

LayerBase<T>.CanExecuteOnGpu

LayerBase<T>.CanTrainOnGpu

LayerBase<T>.UpdateParametersGpu(IGpuOptimizerConfig)

LayerBase<T>.UploadWeightsToGpu()

LayerBase<T>.DownloadWeightsFromGpu()

LayerBase<T>.ZeroGradientsGpu()

LayerBase<T>.GetActivationTypes()

LayerBase<T>.Forward(params Tensor<T>[])

LayerBase<T>.ApplyActivation(Tensor<T>)

LayerBase<T>.ApplyActivation(Vector<T>)

LayerBase<T>.ActivateTensor(IActivationFunction<T>, Tensor<T>)

LayerBase<T>.ActivateTensor(IVectorActivationFunction<T>, Tensor<T>)

LayerBase<T>.CalculateInputShape(int, int, int)

LayerBase<T>.CalculateOutputShape(int, int, int)

LayerBase<T>.Clone()

LayerBase<T>.DerivativeTensor(IActivationFunction<T>, Tensor<T>)

LayerBase<T>.ApplyActivationDerivative(T, T)

LayerBase<T>.ApplyActivationDerivative(Tensor<T>, Tensor<T>)

LayerBase<T>.ComputeActivationJacobian(Vector<T>)

LayerBase<T>.ApplyActivationDerivative(Vector<T>, Vector<T>)

LayerBase<T>.UpdateParameters(Vector<T>)

LayerBase<T>.ParameterCount

LayerBase<T>.Serialize(BinaryWriter)

LayerBase<T>.Deserialize(BinaryReader)

LayerBase<T>.SetParameters(Vector<T>)

LayerBase<T>.GetDiagnostics()

LayerBase<T>.ApplyActivationToGraph(ComputationNode<T>)

LayerBase<T>.CanActivationBeJitted()

LayerBase<T>.RegisterTrainableParameter(Tensor<T>, PersistentTensorRole)

LayerBase<T>.InvalidateTrainableParameter(Tensor<T>)

LayerBase<T>.HasGpuActivation()

LayerBase<T>.ApplyActivationForwardGpu(IDirectGpuBackend, IGpuBuffer, IGpuBuffer, int)

LayerBase<T>.ApplyActivationBackwardGpu(IDirectGpuBackend, IGpuBuffer, IGpuBuffer, IGpuBuffer, IGpuBuffer, int)

LayerBase<T>.GetFusedActivationType()

LayerBase<T>.ApplyGpuActivation(IDirectGpuBackend, IGpuBuffer, IGpuBuffer, int, FusedActivationType)

LayerBase<T>.ApplyGpuActivationBackward(IDirectGpuBackend, IGpuBuffer, IGpuBuffer, IGpuBuffer, IGpuBuffer, int, FusedActivationType, float)

LayerBase<T>.Dispose()

LayerBase<T>.Dispose(bool)

LayerBase<T>.WeightParameterName

LayerBase<T>.BiasParameterName

LayerBase<T>.SetWeights(Tensor<T>)

LayerBase<T>.SetBiases(Tensor<T>)

LayerBase<T>.GetParameterNames()

LayerBase<T>.TryGetParameter(string, out Tensor<T>)

LayerBase<T>.SetParameter(string, Tensor<T>)

LayerBase<T>.GetParameterShape(string)

LayerBase<T>.NamedParameterCount

LayerBase<T>.ValidateWeights(IEnumerable<string>, Func<string, string>)

LayerBase<T>.LoadWeights(Dictionary<string, Tensor<T>>, Func<string, string>, bool)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

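The BasicBlock contains two 3x3 convolutional layers, each followed by batch normalization, with a ReLU activation after the first batch normalization and again after the residual addition. A skip connection adds the input (passed through a downsample layer when dimensions differ) to the output of the second batch normalization, which lets gradients flow directly through very deep networks.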

Architecture:

Input ─┬─ Conv3x3 ─ BN ─ ReLU ─ Conv3x3 ─ BN ─┬─ (+) ─ ReLU ─ Output
       │                                       │
       └───────────── [Downsample?] ───────────┘

For Beginners: The BasicBlock is like a "learning module" with a shortcut.

The key insight is:

The two conv layers learn to predict what needs to be ADDED to the input (the "residual")
The skip connection adds the original input back to this learned residual
This makes it easier to train very deep networks because gradients can flow directly through the skip connection

When the input and output have different dimensions (due to stride or channel changes), a downsample layer (1x1 conv + BN) is used to match the dimensions before adding.
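The residual computation described above can be sketched in a few lines of plain C#. This is a minimal illustration on flat arrays, not AiDotNet's implementation: `ResidualSketch`, `Relu`, and `Forward` are hypothetical names, and the two branches are passed in as delegates standing in for the Conv/BN stack and the optional downsample layer.

```csharp
using System;

public static class ResidualSketch
{
    // Element-wise ReLU: max(0, x).
    public static double[] Relu(double[] x)
    {
        var y = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            y[i] = Math.Max(0.0, x[i]);
        }
        return y;
    }

    // Computes ReLU(residualBranch(input) + shortcut(input)).
    // residualBranch stands in for Conv3x3 -> BN -> ReLU -> Conv3x3 -> BN;
    // shortcut is the identity (x => x) or a stand-in for the downsample layer.
    public static double[] Forward(
        double[] input,
        Func<double[], double[]> residualBranch,
        Func<double[], double[]> shortcut)
    {
        double[] residual = residualBranch(input);
        double[] skipped = shortcut(input);
        if (residual.Length != skipped.Length)
        {
            throw new ArgumentException("Both branches must produce the same number of elements.");
        }

        var sum = new double[residual.Length];
        for (int i = 0; i < sum.Length; i++)
        {
            sum[i] = residual[i] + skipped[i];
        }
        return Relu(sum);
    }
}
```

For example, `ResidualSketch.Forward(x, r => new double[r.Length], s => s)` uses a residual branch that outputs zeros, so the result is simply `Relu(x)`: the input passes through the shortcut unchanged apart from the final activation.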

Constructors

BasicBlock(int, int, int, int, int, bool)

Initializes a new instance of the BasicBlock<T> class.

public BasicBlock(int inChannels, int outChannels, int stride = 1, int inputHeight = 56, int inputWidth = 56, bool zeroInitResidual = true)

Parameters

inChannels int: The number of input channels.
outChannels int: The number of output channels.
stride int: The stride for the first convolution (default: 1).
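inputHeight int: The input spatial height (default: 56).
inputWidth int: The input spatial width (default: 56).
zeroInitResidual bool: If true (the default), initializes the last batch normalization layer in the residual branch to zero, so the block initially behaves like its shortcut, for better training stability.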
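The effect of zeroInitResidual can be illustrated with a small batch normalization sketch. It assumes the flag sets the scale (gamma) of the last BN to zero, as in the common ResNet zero-initialization trick; this page does not confirm that detail. The helper `BatchNorm` is hypothetical and normalizes a single flat array rather than per-channel tensors.

```csharp
using System;

public static class ZeroInitSketch
{
    // y[i] = gamma * (x[i] - mean) / sqrt(variance + eps) + beta
    public static double[] BatchNorm(double[] x, double gamma, double beta, double eps = 1e-5)
    {
        if (x.Length == 0)
        {
            throw new ArgumentException("Input must not be empty.", nameof(x));
        }

        double mean = 0.0;
        foreach (double v in x) mean += v;
        mean /= x.Length;

        double variance = 0.0;
        foreach (double v in x) variance += (v - mean) * (v - mean);
        variance /= x.Length;

        double invStd = 1.0 / Math.Sqrt(variance + eps);
        var y = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
        {
            // With gamma == 0 and beta == 0, every output is zero,
            // so the residual branch contributes nothing to the sum.
            y[i] = gamma * (x[i] - mean) * invStd + beta;
        }
        return y;
    }
}
```

Under that assumption, calling `ZeroInitSketch.BatchNorm(values, gamma: 0.0, beta: 0.0)` returns all zeros whatever the input, so a freshly constructed block starts out computing only ReLU of its shortcut.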

Remarks

For Beginners: When stride > 1, the block downsamples the spatial dimensions. When stride > 1 or inChannels != outChannels, a projection shortcut (1x1 conv + BN) is used so that the shortcut matches the output dimensions.
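The shape arithmetic behind these rules can be sketched as follows. This is an illustration only: it assumes 3x3 kernels with padding 1 (the usual ResNet convention, not stated on this page) and a channels-first [C, H, W] layout. `BasicBlockShapeSketch` and its methods are hypothetical helpers, not part of AiDotNet.

```csharp
using System;

public static class BasicBlockShapeSketch
{
    // Standard convolution output size: floor((in + 2*padding - kernel) / stride) + 1.
    // Defaults assume a 3x3 kernel with padding 1 (an assumption, see above).
    public static int ConvOutputSize(int inputSize, int stride, int kernel = 3, int padding = 1)
    {
        if (stride < 1) throw new ArgumentOutOfRangeException(nameof(stride));
        return (inputSize + 2 * padding - kernel) / stride + 1;
    }

    // A projection shortcut is needed when the plain identity would not match
    // the residual branch's output: a spatial change (stride) or a channel change.
    public static bool NeedsProjection(int inChannels, int outChannels, int stride)
    {
        return stride != 1 || inChannels != outChannels;
    }

    // Output shape as [channels, height, width]. Only the first convolution is
    // strided, so the second one leaves the spatial size unchanged.
    public static int[] OutputShape(int outChannels, int stride, int inputHeight, int inputWidth)
    {
        return new[]
        {
            outChannels,
            ConvOutputSize(inputHeight, stride),
            ConvOutputSize(inputWidth, stride)
        };
    }
}
```

With the constructor defaults (56 x 56 input), `OutputShape(128, 2, 56, 56)` gives [128, 28, 28] and `NeedsProjection(64, 128, 2)` is true, while a 64-to-64 block with stride 1 keeps the 56 x 56 size and needs no projection.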

Fields

Expansion

The expansion factor for BasicBlock. BasicBlock does not expand channels.

public const int Expansion = 1

Field Value

int

Properties

SupportsGpuExecution

Gets a value indicating whether this layer has a GPU implementation.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

Remarks

BasicBlock supports JIT compilation when all its sub-layers support JIT. This includes conv1, bn1, conv2, bn2, and optionally the downsample layers.

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Methods

Backward(Tensor<T>)

Performs the backward pass through the BasicBlock.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>: The gradient of the loss with respect to the output.

Returns

Tensor<T>: The gradient of the loss with respect to the input.
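How the gradient splits at the skip connection can be sketched as below. This is a simplified illustration with an identity shortcut on flat arrays, not the library's code: `ResidualBackwardSketch` is a hypothetical name, and `residualBranchBackward` stands in for backpropagation through BN2, Conv2, ReLU, BN1, and Conv1.

```csharp
using System;

public static class ResidualBackwardSketch
{
    // Backward pass for y = ReLU(s), s = F(x) + x (identity shortcut).
    // preActivationSum is s, the value fed into the final ReLU during the forward pass.
    public static double[] Backward(
        double[] outputGradient,
        double[] preActivationSum,
        Func<double[], double[]> residualBranchBackward)
    {
        if (outputGradient.Length != preActivationSum.Length)
        {
            throw new ArgumentException("Gradient and pre-activation must have the same length.");
        }

        int n = outputGradient.Length;

        // 1. Final ReLU: gradient passes only where the pre-activation was positive.
        var gradSum = new double[n];
        for (int i = 0; i < n; i++)
        {
            gradSum[i] = preActivationSum[i] > 0.0 ? outputGradient[i] : 0.0;
        }

        // 2. The addition sends the same gradient into both branches.
        double[] gradViaResidual = residualBranchBackward(gradSum);
        if (gradViaResidual.Length != n)
        {
            throw new ArgumentException("Residual branch must return a gradient of matching length.");
        }

        // 3. Input gradient = residual-branch gradient + shortcut gradient.
        //    The shortcut term is why gradients reach earlier layers directly.
        var inputGradient = new double[n];
        for (int i = 0; i < n; i++)
        {
            inputGradient[i] = gradViaResidual[i] + gradSum[i];
        }
        return inputGradient;
    }
}
```

Even if the residual branch returned a zero gradient, the shortcut term `gradSum[i]` would still reach the input, which is the gradient-flow property the skip connection provides.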

BackwardGpu(IGpuTensor<T>)

GPU-accelerated backward pass through the BasicBlock.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>: The gradient of the loss with respect to the output.

Returns

IGpuTensor<T>: GPU-resident gradient of the loss with respect to the input.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>: List to populate with input computation nodes.

Returns

ComputationNode<T>: The output computation node representing the BasicBlock.

Remarks

This method builds a computation graph representing the BasicBlock:

Input -> Conv1 -> BN1 -> ReLU -> Conv2 -> BN2 -> (+Identity) -> ReLU -> Output

For JIT compilation, we chain the sub-layer computation graphs together and add the residual connection using TensorOperations.Add.

Forward(Tensor<T>)

Performs the forward pass through the BasicBlock.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>: The input tensor.

Returns

Tensor<T>: The output tensor after the residual connection.

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass on GPU, keeping data GPU-resident.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]: The input tensors (expects a single input).

Returns

IGpuTensor<T>: The output tensor on GPU.

GetParameters()

Gets all trainable parameters.

public override Vector<T> GetParameters()

Returns

Vector<T>: A vector containing all parameters.
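As a rough guide to the size of the returned vector, the trainable parameter count of a basic block can be estimated as below. The sketch assumes bias-free convolutions and two trainable values (scale and shift) per batch-normalized channel, as in the standard ResNet design; whether AiDotNet lays out or counts parameters this way is not confirmed here. `BasicBlockParameterSketch` is a hypothetical helper.

```csharp
using System;

public static class BasicBlockParameterSketch
{
    // Estimated trainable parameters under the assumptions stated above.
    public static long Count(int inChannels, int outChannels, int stride)
    {
        long conv1 = 3L * 3 * inChannels * outChannels;   // first 3x3 convolution
        long conv2 = 3L * 3 * outChannels * outChannels;  // second 3x3 convolution
        long bn = 2L * outChannels;                       // scale + shift per channel

        long total = conv1 + bn + conv2 + bn;

        // Downsample path (1x1 conv + BN) exists only when dimensions change.
        bool hasDownsample = stride != 1 || inChannels != outChannels;
        if (hasDownsample)
        {
            total += 1L * 1 * inChannels * outChannels + bn;
        }
        return total;
    }
}
```

Under these assumptions `Count(64, 64, 1)` is 73984 and `Count(64, 128, 2)` is 230144; the second figure includes 8448 parameters for the downsample path.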

ResetState()

Resets the internal state of the block.

public override void ResetState()

UpdateParameters(T)

Updates the parameters of all internal layers.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T: The learning rate.
