Class Conv3DLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a 3D convolutional layer for processing volumetric data like voxel grids.
public class Conv3DLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → Conv3DLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
A 3D convolutional layer applies learnable filters to volumetric input data to extract spatial features across all three dimensions. This is essential for processing 3D data such as voxelized point clouds, medical imaging (CT/MRI), or video sequences.
For Beginners: A 3D convolutional layer is like a 2D convolution but extended to work with volumetric data.
Think of it like examining a 3D cube of data:
- A 2D convolution slides a filter across height and width
- A 3D convolution slides a filter across depth, height, and width
This is useful for:
- Recognizing 3D shapes from voxel grids (like ModelNet40)
- Analyzing medical scans (CT, MRI)
- Processing video frames as a 3D volume
The layer learns to detect 3D patterns like edges, surfaces, and volumes.
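Example (illustrative sketch): the snippet below runs a single-channel 32x32x32 voxel grid through a Conv3DLayer with 16 filters. The shape-based Tensor<float> constructor is an assumption used only for illustration; substitute the library's actual tensor creation API.

using AiDotNet.NeuralNetworks.Layers;

var layer = new Conv3DLayer<float>(
    inputChannels: 1, outputChannels: 16, kernelSize: 3,
    inputDepth: 32, inputHeight: 32, inputWidth: 32,
    stride: 1, padding: 1,              // padding 1 keeps the 32x32x32 spatial size
    activationFunction: null);          // null selects the default ReLU

var voxels = new Tensor<float>(new[] { 1, 1, 32, 32, 32 }); // assumed shape-based constructor
var features = layer.Forward(voxels);                       // expected shape: [1, 16, 32, 32, 32]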
Constructors
Conv3DLayer(int, int, int, int, int, int, int, int, IActivationFunction<T>?)
Initializes a new instance of the Conv3DLayer<T> class with specified parameters.
public Conv3DLayer(int inputChannels, int outputChannels, int kernelSize, int inputDepth, int inputHeight, int inputWidth, int stride = 1, int padding = 0, IActivationFunction<T>? activationFunction = null)
Parameters
inputChannels (int): Number of input channels.
outputChannels (int): Number of output channels (filters).
kernelSize (int): Size of the 3D convolution kernel.
inputDepth (int): Depth of the input volume.
inputHeight (int): Height of the input volume.
inputWidth (int): Width of the input volume.
stride (int): Stride of the convolution. Defaults to 1.
padding (int): Zero-padding added to all sides. Defaults to 0.
activationFunction (IActivationFunction<T>): The activation function to apply. Defaults to ReLU.
Remarks
For Beginners: This creates a 3D convolutional layer that processes volumetric data.
The layer will:
1. Apply the 3D convolution with the specified kernel size
2. Add learned biases
3. Apply the activation function (ReLU by default)
Exceptions
- ArgumentOutOfRangeException
Thrown when any dimension parameter is non-positive.
Conv3DLayer(int, int, int, int, int, int, int, int, IVectorActivationFunction<T>?)
Initializes a new instance of the Conv3DLayer<T> class with a vector activation function.
public Conv3DLayer(int inputChannels, int outputChannels, int kernelSize, int inputDepth, int inputHeight, int inputWidth, int stride = 1, int padding = 0, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputChannels (int): Number of input channels.
outputChannels (int): Number of output channels (filters).
kernelSize (int): Size of the 3D convolution kernel.
inputDepth (int): Depth of the input volume.
inputHeight (int): Height of the input volume.
inputWidth (int): Width of the input volume.
stride (int): Stride of the convolution. Defaults to 1.
padding (int): Zero-padding added to all sides. Defaults to 0.
vectorActivationFunction (IVectorActivationFunction<T>): The vector activation function to apply. Defaults to ReLU.
Remarks
Vector activation functions operate on entire vectors at once, which can be more efficient for certain operations like Softmax that need to consider all elements together.
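Example (illustrative sketch): SoftmaxActivation<T> below is a hypothetical IVectorActivationFunction<T> implementation used only to show how this overload is selected; substitute any vector activation provided by the library.

using AiDotNet.NeuralNetworks.Layers;

IVectorActivationFunction<float> softmax = new SoftmaxActivation<float>(); // hypothetical type

var layer = new Conv3DLayer<float>(
    inputChannels: 1, outputChannels: 8, kernelSize: 3,
    inputDepth: 16, inputHeight: 16, inputWidth: 16,
    stride: 1, padding: 1,
    vectorActivationFunction: softmax); // naming the argument selects this overload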
Properties
InputChannels
Gets the number of input channels expected by this layer.
public int InputChannels { get; }
Property Value
- int
Remarks
Input channels represent the size of the channel dimension of the input volume. For raw voxel data, this is typically 1 (occupancy); for multi-feature voxels it can be higher (e.g., density, color, normals).
KernelSize
Gets the size of the 3D convolution kernel (same for depth, height, width).
public int KernelSize { get; }
Property Value
- int
Remarks
The kernel size determines the receptive field of each convolution operation. Typical values are 3 (most common), 5, or 7. Larger kernels capture more context but are more computationally expensive.
OutputChannels
Gets the number of output channels (filters) produced by this layer.
public int OutputChannels { get; }
Property Value
- int
Remarks
Each output channel corresponds to one learned 3D filter that detects a specific volumetric pattern. More output channels allow the layer to learn more diverse features but increase computational cost.
Padding
Gets the zero-padding applied to all sides of the input volume.
public int Padding { get; }
Property Value
- int
Remarks
Padding adds zeros around the input volume to control the output size. With padding = (kernel_size - 1) / 2, the output has the same spatial dimensions as the input (when stride = 1).
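The standard output-size relationship behind this remark can be checked directly (plain arithmetic, not library code):

// Per spatial dimension: out = (in + 2 * padding - kernelSize) / stride + 1 (integer division).
static int OutputSize(int input, int kernelSize, int stride, int padding)
    => (input + 2 * padding - kernelSize) / stride + 1;

int same  = OutputSize(input: 32, kernelSize: 3, stride: 1, padding: 1); // 32 ("same" padding)
int valid = OutputSize(input: 32, kernelSize: 3, stride: 1, padding: 0); // 30 (no padding)
int down  = OutputSize(input: 32, kernelSize: 3, stride: 2, padding: 1); // 16 (stride-2 downsampling)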
ParameterCount
Gets the total number of trainable parameters in the layer.
public override int ParameterCount { get; }
Property Value
- int
The number of kernel weights plus the number of biases.
Remarks
This equals: OutputChannels * InputChannels * KernelSize^3 + OutputChannels
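Worked example of the formula:

// OutputChannels * InputChannels * KernelSize^3 + OutputChannels
int inputChannels = 1, outputChannels = 32, kernelSize = 3;
int kernelWeights  = outputChannels * inputChannels * kernelSize * kernelSize * kernelSize; // 864
int biases         = outputChannels;                                                        // 32
int parameterCount = kernelWeights + biases;                                                // 896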
Stride
Gets the stride of the convolution (step size when sliding the kernel).
public int Stride { get; }
Property Value
- int
Remarks
Stride controls how far the kernel moves between positions. A stride of 1 produces the largest output; a stride of 2 roughly halves each spatial dimension (downsampling).
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsGpuTraining
Gets a value indicating whether this layer supports GPU-resident training.
public override bool SupportsGpuTraining { get; }
Property Value
- bool
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation for accelerated execution.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true if kernels and biases are initialized and the activation can be JIT compiled.
SupportsTraining
Gets a value indicating whether this layer supports training (backpropagation).
public override bool SupportsTraining { get; }
Property Value
- bool
Always true for Conv3DLayer<T>, as it has learnable parameters.
Methods
Backward(Tensor<T>)
Performs the backward pass to compute gradients for training.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to this layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to this layer's input.
Remarks
The backward pass routes to either the manual or the autodiff implementation, depending on the UseAutodiff property.
Exceptions
- InvalidOperationException
Thrown when Forward has not been called.
BackwardGpu(IGpuTensor<T>)
Performs the backward pass on GPU tensors.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): GPU tensor containing the gradient of the loss with respect to the output.
Returns
- IGpuTensor<T>
GPU tensor containing the gradient of the loss with respect to the input.
Clone()
Creates a deep copy of the layer with the same configuration and parameters.
public override LayerBase<T> Clone()
Returns
- LayerBase<T>
A new instance of the Conv3DLayer<T> with identical configuration and parameters.
Remarks
The clone is completely independent from the original layer. Changes to one will not affect the other.
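Example (illustrative sketch): the length-based Vector<float> constructor below is an assumption used only for illustration.

using AiDotNet.NeuralNetworks.Layers;

var original = new Conv3DLayer<float>(1, 8, 3, 16, 16, 16, activationFunction: null);
var copy = (Conv3DLayer<float>)original.Clone();

var zeros = new Vector<float>(original.ParameterCount); // assumed length-based constructor
original.SetParameters(zeros);                          // overwrite the original's parameters

// copy.GetParameters() still returns the clone's own, unchanged copy of the parameters.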
Deserialize(BinaryReader)
Deserializes the layer from a binary stream.
public override void Deserialize(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader to deserialize from.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer as a computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input nodes.
Returns
- ComputationNode<T>
The output computation node.
Exceptions
- ArgumentNullException
Thrown when inputNodes is null.
- InvalidOperationException
Thrown when the layer is not properly initialized.
Forward(Tensor<T>)
Performs the forward pass of the 3D convolution operation.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor with shape [batch, channels, depth, height, width] or [channels, depth, height, width].
Returns
- Tensor<T>
The output tensor after convolution, bias addition, and activation. Shape: [batch, OutputChannels, outD, outH, outW] or [OutputChannels, outD, outH, outW].
Remarks
This method uses the vectorized IEngine.Conv3D operation for CPU/GPU acceleration. The computation flow is:
1. Reshape the input to 5D if needed (add a batch dimension)
2. Perform the 3D convolution using Engine.Conv3D
3. Add biases using Engine.TensorBroadcastAdd
4. Apply the activation function
Exceptions
- ArgumentException
Thrown when input tensor has invalid rank or dimensions.
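Example (illustrative sketch) of the two accepted input ranks; the shape-based Tensor<float> constructor is an assumption.

using AiDotNet.NeuralNetworks.Layers;

var layer = new Conv3DLayer<float>(
    inputChannels: 2, outputChannels: 4, kernelSize: 3,
    inputDepth: 8, inputHeight: 8, inputWidth: 8,
    stride: 2, padding: 1, activationFunction: null);

// Batched 5D input: [batch, channels, depth, height, width]
var batched = new Tensor<float>(new[] { 4, 2, 8, 8, 8 });   // assumed constructor
var outBatched = layer.Forward(batched);                    // expected shape: [4, 4, 4, 4, 4]

// Unbatched 4D input: [channels, depth, height, width]
var single = new Tensor<float>(new[] { 2, 8, 8, 8 });
var outSingle = layer.Forward(single);                      // expected shape: [4, 4, 4, 4]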
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass using GPU-resident tensors, keeping all data on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensor [batch, inChannels, inDepth, inHeight, inWidth] in NCDHW format.
Returns
- IGpuTensor<T>
GPU-resident output tensor [batch, outChannels, outDepth, outHeight, outWidth] in NCDHW format.
Remarks
For Beginners: This is the GPU-optimized version of the Forward method. All data stays on the GPU throughout the computation, avoiding expensive CPU-GPU transfers.
GetBiases()
Gets the bias tensor.
public override Tensor<T> GetBiases()
Returns
- Tensor<T>
The bias tensor with shape [OutputChannels].
GetFilters()
Gets the convolution filter kernels.
public Tensor<T> GetFilters()
Returns
- Tensor<T>
The kernel tensor.
GetParameters()
Gets all trainable parameters as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all kernel and bias parameters.
GetWeights()
Gets the kernel weights tensor.
public override Tensor<T> GetWeights()
Returns
- Tensor<T>
The kernel tensor with shape [OutputChannels, InputChannels, KernelSize, KernelSize, KernelSize].
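Example (illustrative sketch) of the tensor shapes returned by GetWeights and GetBiases:

using AiDotNet.NeuralNetworks.Layers;

var layer = new Conv3DLayer<float>(
    inputChannels: 3, outputChannels: 16, kernelSize: 5,
    inputDepth: 32, inputHeight: 32, inputWidth: 32, activationFunction: null);

var kernels = layer.GetWeights(); // shape [16, 3, 5, 5, 5]
var biases  = layer.GetBiases();  // shape [16]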
ResetState()
Resets the cached state from forward/backward passes.
public override void ResetState()
Remarks
Call this method to free memory after training is complete or when switching between training and inference modes.
Serialize(BinaryWriter)
Serializes the layer to a binary stream.
public override void Serialize(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to serialize to.
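Example (illustrative sketch) of a save/load round trip; it assumes Deserialize is called on a layer constructed with a matching configuration.

using System.IO;
using AiDotNet.NeuralNetworks.Layers;

var layer = new Conv3DLayer<float>(1, 16, 3, 32, 32, 32, padding: 1, activationFunction: null);

using (var writer = new BinaryWriter(File.Create("conv3d.bin")))
    layer.Serialize(writer);

var restored = new Conv3DLayer<float>(1, 16, 3, 32, 32, 32, padding: 1, activationFunction: null);
using (var reader = new BinaryReader(File.OpenRead("conv3d.bin")))
    restored.Deserialize(reader);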
SetParameters(Vector<T>)
Sets all trainable parameters from a single vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Vector containing all parameters (kernels followed by biases).
Exceptions
- ArgumentException
Thrown when the parameter count does not match the expected count.
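Example (illustrative sketch): copying parameters between two identically configured layers.

using AiDotNet.NeuralNetworks.Layers;

var source = new Conv3DLayer<float>(1, 8, 3, 16, 16, 16, activationFunction: null);
var target = new Conv3DLayer<float>(1, 8, 3, 16, 16, 16, activationFunction: null);

Vector<float> parameters = source.GetParameters(); // kernels followed by biases
target.SetParameters(parameters);                  // same ParameterCount, so no exception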
UpdateParameters(T)
Updates the layer parameters using the computed gradients and learning rate.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate for gradient descent.
Exceptions
- InvalidOperationException
Thrown when Backward has not been called.
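Example (illustrative sketch) of one manual training step; the loss gradient below is a placeholder that would normally come from a loss function, and the shape-based Tensor<float> constructor is an assumption.

using AiDotNet.NeuralNetworks.Layers;

var layer = new Conv3DLayer<float>(1, 8, 3, 16, 16, 16, padding: 1, activationFunction: null);

var input = new Tensor<float>(new[] { 1, 1, 16, 16, 16 }); // assumed constructor
var output = layer.Forward(input);                         // Forward must precede Backward

var lossGradient = output;                                 // placeholder with the correct shape
var inputGradient = layer.Backward(lossGradient);

layer.UpdateParameters(0.01f);                             // must be called after Backward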
UpdateParametersGpu(IGpuOptimizerConfig)
Updates parameters on GPU using the configured optimizer.
public override void UpdateParametersGpu(IGpuOptimizerConfig config)
Parameters
config (IGpuOptimizerConfig): The GPU optimizer configuration.