Class Conv3DLayer<T>

Namespace
AiDotNet.NeuralNetworks.Layers
Assembly
AiDotNet.dll

Represents a 3D convolutional layer for processing volumetric data like voxel grids.

public class Conv3DLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
Conv3DLayer<T>

Remarks

A 3D convolutional layer applies learnable filters to volumetric input data to extract spatial features across all three dimensions. This is essential for processing 3D data such as voxelized point clouds, medical imaging (CT/MRI), or video sequences.

For Beginners: A 3D convolutional layer is like a 2D convolution but extended to work with volumetric data.

Think of it like examining a 3D cube of data:

  • A 2D convolution slides a filter across height and width
  • A 3D convolution slides a filter across depth, height, and width

This is useful for:

  • Recognizing 3D shapes from voxel grids (like ModelNet40)
  • Analyzing medical scans (CT, MRI)
  • Processing video frames as a 3D volume

The layer learns to detect 3D patterns like edges, surfaces, and volumes.
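To make the sliding-filter idea concrete, here is a minimal, self-contained sketch of a single-channel 3D convolution with no padding and a stride of 1. The helper names `Convolve3D` and `Fill` are hypothetical and are not part of AiDotNet; the sketch also assumes the usual deep-learning convention of not flipping the kernel. It is meant only to illustrate the idea, not to describe how this layer is implemented.

```csharp
using System;

// A 4x4x4 volume and a 3x3x3 filter, both filled with ones.
var volume = new double[4, 4, 4];
var filter = new double[3, 3, 3];
Fill(volume, 1.0);
Fill(filter, 1.0);

var result = Convolve3D(volume, filter);
Console.WriteLine($"Output shape: {result.GetLength(0)}x{result.GetLength(1)}x{result.GetLength(2)}"); // 2x2x2
Console.WriteLine($"Each output voxel: {result[0, 0, 0]}"); // 27 (sum of 27 ones)

// Hypothetical helper: sets every element of a 3D array to the given value.
static void Fill(double[,,] array, double value)
{
    for (int i = 0; i < array.GetLength(0); i++)
        for (int j = 0; j < array.GetLength(1); j++)
            for (int l = 0; l < array.GetLength(2); l++)
                array[i, j, l] = value;
}

// Hypothetical helper: naive 3D convolution (no padding, stride 1).
// input is [depth, height, width]; kernel is a cube [k, k, k].
static double[,,] Convolve3D(double[,,] input, double[,,] kernel)
{
    int k = kernel.GetLength(0);
    int outD = input.GetLength(0) - k + 1;
    int outH = input.GetLength(1) - k + 1;
    int outW = input.GetLength(2) - k + 1;
    var output = new double[outD, outH, outW];

    // Slide the filter across depth, height, and width.
    for (int d = 0; d < outD; d++)
        for (int h = 0; h < outH; h++)
            for (int w = 0; w < outW; w++)
            {
                double sum = 0.0;
                // Multiply the filter with the sub-volume it currently covers.
                for (int kd = 0; kd < k; kd++)
                    for (int kh = 0; kh < k; kh++)
                        for (int kw = 0; kw < k; kw++)
                            sum += input[d + kd, h + kh, w + kw] * kernel[kd, kh, kw];
                output[d, h, w] = sum;
            }

    return output;
}
```

A real layer repeats this for every input channel and every output filter, then adds a bias per output channel.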

Constructors

Conv3DLayer(int, int, int, int, int, int, int, int, IActivationFunction<T>?)

Initializes a new instance of the Conv3DLayer<T> class with specified parameters.

public Conv3DLayer(int inputChannels, int outputChannels, int kernelSize, int inputDepth, int inputHeight, int inputWidth, int stride = 1, int padding = 0, IActivationFunction<T>? activationFunction = null)

Parameters

inputChannels int

Number of input channels.

outputChannels int

Number of output channels (filters).

kernelSize int

Size of the 3D convolution kernel.

inputDepth int

Depth of the input volume.

inputHeight int

Height of the input volume.

inputWidth int

Width of the input volume.

stride int

Stride of the convolution. Defaults to 1.

padding int

Zero-padding added to all sides. Defaults to 0.

activationFunction IActivationFunction<T>

The activation function to apply. Defaults to ReLU.

Remarks

For Beginners: This creates a 3D convolutional layer that processes volumetric data.

The layer will:

  1. Apply 3D convolution with the specified kernel size
  2. Add learned biases
  3. Apply the activation function (ReLU by default)

Exceptions

ArgumentOutOfRangeException

Thrown when any dimension parameter is non-positive.

Conv3DLayer(int, int, int, int, int, int, int, int, IVectorActivationFunction<T>?)

Initializes a new instance of the Conv3DLayer<T> class with a vector activation function.

public Conv3DLayer(int inputChannels, int outputChannels, int kernelSize, int inputDepth, int inputHeight, int inputWidth, int stride = 1, int padding = 0, IVectorActivationFunction<T>? vectorActivationFunction = null)

Parameters

inputChannels int

Number of input channels.

outputChannels int

Number of output channels (filters).

kernelSize int

Size of the 3D convolution kernel.

inputDepth int

Depth of the input volume.

inputHeight int

Height of the input volume.

inputWidth int

Width of the input volume.

stride int

Stride of the convolution. Defaults to 1.

padding int

Zero-padding added to all sides. Defaults to 0.

vectorActivationFunction IVectorActivationFunction<T>

The vector activation function to apply. Defaults to ReLU.

Remarks

Vector activation functions operate on entire vectors at once, which can be more efficient for certain operations like Softmax that need to consider all elements together.

Properties

InputChannels

Gets the number of input channels expected by this layer.

public int InputChannels { get; }

Property Value

int

Remarks

Input channels are the number of feature values stored per voxel, that is, the size of the channel dimension (not the spatial depth of the volume). For raw voxel data, this is typically 1 (occupancy). For multi-feature voxels, this could be higher (e.g., density, color, normals).

KernelSize

Gets the size of the 3D convolution kernel (same for depth, height, width).

public int KernelSize { get; }

Property Value

int

Remarks

The kernel size determines the receptive field of each convolution operation. Typical values are 3 (most common), 5, or 7. Larger kernels capture more context but are more computationally expensive.

OutputChannels

Gets the number of output channels (filters) produced by this layer.

public int OutputChannels { get; }

Property Value

int

Remarks

Each output channel corresponds to one learned 3D filter that detects a specific volumetric pattern. More output channels allow the layer to learn more diverse features but increase computational cost.

Padding

Gets the zero-padding applied to all sides of the input volume.

public int Padding { get; }

Property Value

int

Remarks

Padding adds zeros around the input volume to control the output size. With padding = (kernel_size - 1) / 2 and an odd kernel size, the output has the same spatial dimensions as the input (when stride = 1).
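The relationship between input size, kernel size, stride, and padding can be sketched with the standard convolution output-size formula, applied independently to depth, height, and width. This page does not state the exact formula the layer uses, so treat the following as an assumption; `OutputSize` is a hypothetical helper, not an AiDotNet method.

```csharp
using System;

Console.WriteLine(OutputSize(32, 3, 1, 1)); // 32: padding (3 - 1) / 2 keeps the size at stride 1
Console.WriteLine(OutputSize(32, 3, 2, 1)); // 16: stride 2 halves the size
Console.WriteLine(OutputSize(32, 3, 1, 0)); // 30: no padding shrinks the size by kernelSize - 1

// Hypothetical helper: output length along one spatial axis, assuming the
// standard formula floor((in + 2 * padding - kernel) / stride) + 1.
static int OutputSize(int inputSize, int kernelSize, int stride, int padding)
{
    if (inputSize <= 0 || kernelSize <= 0 || stride <= 0 || padding < 0)
        throw new ArgumentOutOfRangeException(nameof(inputSize), "Sizes and stride must be positive; padding must be non-negative.");
    if (inputSize + 2 * padding < kernelSize)
        throw new ArgumentException("The kernel does not fit inside the padded input.");

    // Integer division floors the result because the numerator is non-negative.
    return (inputSize + 2 * padding - kernelSize) / stride + 1;
}
```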

ParameterCount

Gets the total number of trainable parameters in the layer.

public override int ParameterCount { get; }

Property Value

int

The sum of the number of kernel weights and biases.

Remarks

This equals: OutputChannels * InputChannels * KernelSize^3 + OutputChannels
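As a worked example of the formula above, the following self-contained sketch computes the count for two common configurations. `CountParameters` is a hypothetical helper used for illustration; it returns a long to avoid overflow in the intermediate product, whereas the property itself returns an int.

```csharp
using System;

Console.WriteLine(CountParameters(1, 16, 3));  // 16 * 1 * 27 + 16 = 448
Console.WriteLine(CountParameters(16, 32, 3)); // 32 * 16 * 27 + 32 = 13856

// Hypothetical helper mirroring the documented formula:
// OutputChannels * InputChannels * KernelSize^3 + OutputChannels
static long CountParameters(int inputChannels, int outputChannels, int kernelSize)
{
    long kernelVolume = (long)kernelSize * kernelSize * kernelSize;   // weights per (filter, input channel) pair
    long kernelWeights = outputChannels * inputChannels * kernelVolume;
    long biases = outputChannels;                                     // one bias per output channel
    return kernelWeights + biases;
}
```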

Stride

Gets the stride of the convolution (step size when sliding the kernel).

public int Stride { get; }

Property Value

int

Remarks

Stride controls how far the kernel moves between positions. A stride of 1 produces the largest output. A stride of 2 roughly halves each spatial dimension (downsampling).

SupportsGpuExecution

Gets a value indicating whether this layer supports GPU execution.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

SupportsGpuTraining

Gets a value indicating whether this layer supports GPU-resident training.

public override bool SupportsGpuTraining { get; }

Property Value

bool

SupportsJitCompilation

Gets a value indicating whether this layer supports JIT compilation for accelerated execution.

public override bool SupportsJitCompilation { get; }

Property Value

bool

true if the kernels and biases are initialized and the activation function can be JIT compiled.

SupportsTraining

Gets a value indicating whether this layer supports training (backpropagation).

public override bool SupportsTraining { get; }

Property Value

bool

Always true for Conv3DLayer as it has learnable parameters.

Methods

Backward(Tensor<T>)

Performs the backward pass to compute gradients for training.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to this layer's output.

Returns

Tensor<T>

The gradient of the loss with respect to this layer's input.

Remarks

The backward pass dispatches to either the manual or the autodiff implementation, depending on the UseAutodiff property.

Exceptions

InvalidOperationException

Thrown when Forward has not been called.

BackwardGpu(IGpuTensor<T>)

Performs the backward pass on GPU tensors.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

GPU tensor containing the gradient of the loss with respect to the output.

Returns

IGpuTensor<T>

GPU tensor containing the gradient of the loss with respect to the input.

Clone()

Creates a deep copy of the layer with the same configuration and parameters.

public override LayerBase<T> Clone()

Returns

LayerBase<T>

A new instance of the Conv3DLayer<T> with identical configuration and parameters.

Remarks

The clone is completely independent from the original layer. Changes to one will not affect the other.

Deserialize(BinaryReader)

Deserializes the layer from a binary stream.

public override void Deserialize(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader to deserialize from.

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer as a computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input nodes.

Returns

ComputationNode<T>

The output computation node.

Exceptions

ArgumentNullException

Thrown when inputNodes is null.

InvalidOperationException

Thrown when the layer is not properly initialized.

Forward(Tensor<T>)

Performs the forward pass of the 3D convolution operation.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor with shape [batch, channels, depth, height, width] or [channels, depth, height, width].

Returns

Tensor<T>

The output tensor after convolution, bias addition, and activation. Shape: [batch, OutputChannels, outD, outH, outW] or [OutputChannels, outD, outH, outW].

Remarks

This method uses the vectorized IEngine.Conv3D operation for CPU/GPU acceleration. The computation flow is:

  1. Reshape input to 5D if needed (add batch dimension)
  2. Perform 3D convolution using Engine.Conv3D
  3. Add biases using Engine.TensorBroadcastAdd
  4. Apply activation function
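Steps 3 and 4 of the flow above (bias addition and activation) can be sketched on a flat buffer as follows. This is not the engine code: it assumes the convolution output is stored contiguously in [batch, channels, depth, height, width] order and that the activation is ReLU, and `AddBiasAndRelu` is a hypothetical helper.

```csharp
using System;

// One batch item, two output channels, two voxels per channel.
var conv = new double[] { 1.0, -2.0, 3.0, 0.5 };
var channelBiases = new double[] { 0.5, -1.0 };

AddBiasAndRelu(conv, channelBiases, batch: 1, channels: 2, spatial: 2);
Console.WriteLine(string.Join(", ", conv)); // 1.5, 0, 2, 0

// Hypothetical helper: adds bias[c] to every voxel of channel c, then applies ReLU in place.
// spatial is the number of voxels per channel (outD * outH * outW).
static void AddBiasAndRelu(double[] data, double[] bias, int batch, int channels, int spatial)
{
    if (data.Length != batch * channels * spatial)
        throw new ArgumentException("Buffer length does not match the given shape.", nameof(data));
    if (bias.Length != channels)
        throw new ArgumentException("Expected one bias per channel.", nameof(bias));

    for (int n = 0; n < batch; n++)
        for (int c = 0; c < channels; c++)
        {
            // Start of channel c of batch item n in the flat buffer.
            int offset = (n * channels + c) * spatial;
            for (int i = 0; i < spatial; i++)
            {
                double biased = data[offset + i] + bias[c]; // broadcast the bias over the volume
                data[offset + i] = Math.Max(0.0, biased);   // ReLU
            }
        }
}
```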

Exceptions

ArgumentException

Thrown when input tensor has invalid rank or dimensions.

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass using GPU-resident tensors, keeping all data on GPU.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

GPU-resident input tensor [batch, inChannels, inDepth, inHeight, inWidth] in NCDHW format.

Returns

IGpuTensor<T>

GPU-resident output tensor [batch, outChannels, outDepth, outHeight, outWidth] in NCDHW format.

Remarks

For Beginners: This is the GPU-optimized version of the Forward method. All data stays on the GPU throughout the computation, avoiding expensive CPU-GPU transfers.

GetBiases()

Gets the bias tensor.

public override Tensor<T> GetBiases()

Returns

Tensor<T>

The bias tensor with shape [OutputChannels].

GetFilters()

Gets the convolution filter kernels.

public Tensor<T> GetFilters()

Returns

Tensor<T>

The kernel tensor.

GetParameters()

Gets all trainable parameters as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all kernel and bias parameters.

GetWeights()

Gets the kernel weights tensor.

public override Tensor<T> GetWeights()

Returns

Tensor<T>

The kernel tensor with shape [OutputChannels, InputChannels, KernelSize, KernelSize, KernelSize].

ResetState()

Resets the cached state from forward/backward passes.

public override void ResetState()

Remarks

Call this method to free memory after training is complete or when switching between training and inference modes.

Serialize(BinaryWriter)

Serializes the layer to a binary stream.

public override void Serialize(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to serialize to.

SetParameters(Vector<T>)

Sets all trainable parameters from a single vector.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

Vector containing all parameters (kernels followed by biases).

Exceptions

ArgumentException

Thrown when the length of parameters does not match the expected parameter count.
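The "kernels followed by biases" layout can be illustrated with plain arrays. The sketch below assumes the kernel tensor has already been flattened (the flattening order is not specified on this page), and `PackParameters` and `UnpackParameters` are hypothetical helpers rather than AiDotNet methods.

```csharp
using System;

// Three kernel weights followed by one bias.
var flat = PackParameters(new[] { 0.1, 0.2, 0.3 }, new[] { 1.0 });
var (unpackedKernels, unpackedBiases) = UnpackParameters(flat, kernelCount: 3, biasCount: 1);
Console.WriteLine($"{flat.Length} parameters: {unpackedKernels.Length} kernel weights, {unpackedBiases.Length} bias");

// Hypothetical helper: concatenates flattened kernels and biases into one vector.
static double[] PackParameters(double[] kernels, double[] biases)
{
    var packed = new double[kernels.Length + biases.Length];
    Array.Copy(kernels, 0, packed, 0, kernels.Length);
    Array.Copy(biases, 0, packed, kernels.Length, biases.Length);
    return packed;
}

// Hypothetical helper: splits the vector back, rejecting a wrong length
// in the same spirit as the ArgumentException documented above.
static (double[] Kernels, double[] Biases) UnpackParameters(double[] parameters, int kernelCount, int biasCount)
{
    if (parameters.Length != kernelCount + biasCount)
        throw new ArgumentException(
            $"Expected {kernelCount + biasCount} parameters, got {parameters.Length}.", nameof(parameters));

    var kernels = new double[kernelCount];
    var biases = new double[biasCount];
    Array.Copy(parameters, 0, kernels, 0, kernelCount);
    Array.Copy(parameters, kernelCount, biases, 0, biasCount);
    return (kernels, biases);
}
```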

UpdateParameters(T)

Updates the layer parameters using the computed gradients and learning rate.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate for gradient descent.

Exceptions

InvalidOperationException

Thrown when Backward has not been called.
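For intuition, a plain gradient descent step subtracts the gradient, scaled by the learning rate, from each parameter. The sketch below assumes this simple rule; the page does not say whether the layer applies anything beyond it, and `GradientDescentStep` is a hypothetical helper.

```csharp
using System;

var weights = new[] { 1.0, -0.5, 0.25 };
var grads = new[] { 0.5, -1.0, 0.0 };

GradientDescentStep(weights, grads, learningRate: 0.1);
Console.WriteLine(string.Join(", ", weights)); // approximately 0.95, -0.4, 0.25

// Hypothetical helper: in-place update parameter[i] -= learningRate * gradient[i].
static void GradientDescentStep(double[] parameters, double[] gradients, double learningRate)
{
    if (parameters.Length != gradients.Length)
        throw new ArgumentException("Parameters and gradients must have the same length.", nameof(gradients));

    for (int i = 0; i < parameters.Length; i++)
        parameters[i] -= learningRate * gradients[i];
}
```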

UpdateParametersGpu(IGpuOptimizerConfig)

Updates parameters on GPU using the configured optimizer.

public override void UpdateParametersGpu(IGpuOptimizerConfig config)

Parameters

config IGpuOptimizerConfig

The GPU optimizer configuration.