Class VoxelCNN<T>

Namespace
AiDotNet.NeuralNetworks
Assembly
AiDotNet.dll

Represents a Voxel-based 3D Convolutional Neural Network for processing volumetric data.

public class VoxelCNN<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations (typically float or double).

Inheritance
VoxelCNN<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

A Voxel CNN processes 3D volumetric data using 3D convolutions. This is useful for:

  • 3D shape recognition from voxelized point clouds (e.g., ModelNet40)
  • Medical image analysis (CT, MRI scans)
  • Spatial occupancy prediction

For Beginners: Think of a VoxelCNN as a 3D version of a regular image classifier. Instead of looking at a 2D image, it examines a 3D grid of "blocks" (voxels) to understand 3D shapes. This is similar to how Minecraft represents the world: each block is either filled or empty, and the pattern of blocks creates recognizable objects.

Applications include:

  • Recognizing 3D objects from point cloud scans
  • Detecting tumors in 3D medical scans
  • Understanding room layouts from depth sensors

Constructors

VoxelCNN(NeuralNetworkArchitecture<T>, int, int, int, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, ILossFunction<T>?, double)

Initializes a new instance of the VoxelCNN<T> class.

public VoxelCNN(NeuralNetworkArchitecture<T> architecture, int voxelResolution = 32, int numConvBlocks = 3, int baseFilters = 32, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null, ILossFunction<T>? lossFunction = null, double maxGradNorm = 1)

Parameters

architecture NeuralNetworkArchitecture<T>

The architecture defining the structure of the neural network.

voxelResolution int

The resolution of the voxel grid (e.g., 32 for 32x32x32). Default is 32.

numConvBlocks int

Number of convolutional blocks. Default is 3.

baseFilters int

Base number of filters in the first convolutional layer. Default is 32.

optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

The optimizer for training. Defaults to Adam if not specified.

lossFunction ILossFunction<T>

The loss function. If not specified, a default is chosen based on the task type.

maxGradNorm double

Maximum gradient norm for clipping. Defaults to 1.0.

Remarks

For Beginners: This constructor creates a VoxelCNN with the specified configuration.

Key parameters explained:

  • voxelResolution: The size of the 3D input grid (32 = 32x32x32 voxels)
  • numConvBlocks: How many convolution + pooling blocks to stack (more blocks = deeper network)
  • baseFilters: Starting number of feature detectors (32 is a good default)

Exceptions

ArgumentNullException

Thrown when architecture is null.

ArgumentException

Thrown when voxelResolution or numConvBlocks is not positive.

Properties

BaseFilters

Gets the base number of filters in the first convolutional layer.

public int BaseFilters { get; }

Property Value

int

Remarks

The filter count starts at this value and doubles with each convolutional block. For example, with baseFilters = 32 and 3 blocks, the filter counts are 32, 64, and 128.
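The doubling rule can be checked with a few lines of standalone C#. This is only a sketch of the arithmetic described above: `FilterSchedule` and `FilterCounts` are hypothetical names introduced for illustration and are not part of AiDotNet.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type): lists the filter count of each
// convolutional block when the count starts at baseFilters and doubles per block.
public static class FilterSchedule
{
    public static int[] FilterCounts(int baseFilters, int numConvBlocks)
    {
        if (baseFilters <= 0)
            throw new ArgumentException("baseFilters must be positive.", nameof(baseFilters));
        if (numConvBlocks <= 0)
            throw new ArgumentException("numConvBlocks must be positive.", nameof(numConvBlocks));

        var counts = new int[numConvBlocks];
        counts[0] = baseFilters;
        for (int i = 1; i < numConvBlocks; i++)
        {
            // checked: fail loudly instead of silently overflowing for very deep stacks.
            counts[i] = checked(counts[i - 1] * 2);
        }
        return counts;
    }
}
```

Calling `FilterSchedule.FilterCounts(32, 3)` yields the three counts from the example above: 32, 64, and 128.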

NumConvBlocks

Gets the number of convolutional blocks in the network.

public int NumConvBlocks { get; }

Property Value

int

Remarks

Each convolutional block consists of a Conv3D layer followed by a MaxPool3D layer. More blocks allow the network to learn more hierarchical features but increase computational cost and risk of overfitting.
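To get a feel for how many blocks a given resolution can support, the sketch below estimates the spatial size left after the pooling layers. It rests on two assumptions that this page does not state: each MaxPool3D halves every spatial dimension (pool size 2, stride 2), and each Conv3D preserves spatial size. `PoolingMath` is a hypothetical helper, not part of AiDotNet.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type).
// ASSUMPTION: every block's pooling layer halves each spatial dimension
// (integer division by 2) and the convolution keeps the size unchanged.
public static class PoolingMath
{
    public static int SpatialSizeAfterBlocks(int voxelResolution, int numConvBlocks)
    {
        if (voxelResolution <= 0)
            throw new ArgumentException("voxelResolution must be positive.", nameof(voxelResolution));
        if (numConvBlocks <= 0)
            throw new ArgumentException("numConvBlocks must be positive.", nameof(numConvBlocks));

        int size = voxelResolution;
        for (int i = 0; i < numConvBlocks; i++)
        {
            size /= 2; // one pooling step per block
        }
        return size;
    }
}
```

Under these assumptions, a resolution of 32 with 3 blocks leaves a 4x4x4 feature volume. A result of 0 would mean the requested depth is too large for the chosen resolution.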

VoxelResolution

Gets the voxel grid resolution used by this network.

public int VoxelResolution { get; }

Property Value

int

Remarks

The voxel resolution determines the spatial dimensions of the input 3D grid. A resolution of 32 means the network expects 32x32x32 voxel grids. Higher resolutions capture more detail but require more computation.
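The cost of a higher resolution grows cubically, because a grid of resolution R holds R x R x R voxels. The standalone sketch below makes that concrete. `VoxelGridSize` is a hypothetical helper introduced for illustration and is not part of AiDotNet.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type): counts the voxels in a cubic grid.
public static class VoxelGridSize
{
    public static long VoxelCount(int voxelResolution)
    {
        if (voxelResolution <= 0)
            throw new ArgumentException("voxelResolution must be positive.", nameof(voxelResolution));

        long r = voxelResolution;
        // checked: guard against overflow for unrealistically large resolutions.
        return checked(r * r * r);
    }
}
```

`VoxelGridSize.VoxelCount(32)` is 32,768, while `VoxelGridSize.VoxelCount(64)` is 262,144, so doubling the resolution multiplies the input size by eight.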

Methods

Backward(Tensor<T>)

Performs a backward pass through the network to compute gradients.

public Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the output.

Returns

Tensor<T>

The gradient of the loss with respect to the input.

Remarks

The backward pass propagates gradients from the output back through each layer, computing gradients for all trainable parameters.

CreateNewInstance()

Creates a new instance of this model type for cloning purposes.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new VoxelCNN<T> instance with the same configuration.

Remarks

For Beginners: This creates a blank version of the same type of neural network.

It's used internally by methods like DeepCopy and Clone to create the right type of network before copying the data into it.

DeserializeNetworkSpecificData(BinaryReader)

Deserializes network-specific data from a binary stream.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader to deserialize from.

Remarks

This method is called at the end of the general deserialization process to allow derived classes to read any additional data specific to their implementation.

For Beginners: If serialization is like packing a suitcase, this method is like unpacking its special compartment. After the main deserialization method has unpacked the common items (layers, parameters), this method allows each specific type of neural network to unpack its own unique items that were stored during serialization.

Forward(Tensor<T>)

Performs a forward pass through the network.

public Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input voxel grid tensor with shape [batch, channels, depth, height, width] or [channels, depth, height, width] for single samples.
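The two accepted layouts differ only by a leading batch dimension. The sketch below shows one way to normalize a shape to the batched form before further checks. It operates on plain `int[]` shapes rather than `Tensor<T>`, and `VoxelShape` is a hypothetical helper. How the library itself handles unbatched input is not documented here.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type): converts a 4D single-sample shape
// [channels, depth, height, width] into the 5D batched shape
// [batch, channels, depth, height, width] with batch = 1.
public static class VoxelShape
{
    public static int[] ToBatchedShape(int[] shape)
    {
        if (shape == null) throw new ArgumentNullException(nameof(shape));

        if (shape.Length == 5)
        {
            return (int[])shape.Clone(); // already batched; return a copy
        }
        if (shape.Length == 4)
        {
            return new[] { 1, shape[0], shape[1], shape[2], shape[3] };
        }
        throw new ArgumentException(
            "Expected a 4D or 5D shape, got " + shape.Length + " dimensions.", nameof(shape));
    }
}
```

For example, a single-channel 32x32x32 grid with shape `{ 1, 32, 32, 32 }` becomes `{ 1, 1, 32, 32, 32 }`.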

Returns

Tensor<T>

The output predictions with shape [batch, numClasses] or [numClasses].

Remarks

The forward pass sequentially applies each layer's transformation to the input, producing class probabilities or scores for 3D shape classification.

GetModelMetadata()

Gets metadata about this model for serialization and inspection.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata<T> object containing model information.

InitializeLayers()

Initializes the layers of the VoxelCNN.

protected override void InitializeLayers()

Remarks

If the architecture provides custom layers, those are used. Otherwise, default layers are created using CreateDefaultVoxelCNNLayers(NeuralNetworkArchitecture<T>, int, int, int).

Predict(Tensor<T>)

Generates predictions for the given input.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

The input voxel grid tensor.

Returns

Tensor<T>

The predicted class probabilities or scores.

Remarks

For Beginners: This is the main method you'll use to get results from your trained neural network. You provide input data (for this network, a voxel grid), and the network processes it through all its layers to produce an output (like a classification or prediction).

SerializeNetworkSpecificData(BinaryWriter)

Serializes network-specific data to a binary stream.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to serialize to.

Remarks

This method is called at the end of the general serialization process to allow derived classes to write any additional data specific to their implementation.

For Beginners: Think of this as packing a special compartment in your suitcase. While the main serialization method packs the common items (layers, parameters), this method allows each specific type of neural network to pack its own unique items that other networks might not have.

Train(Tensor<T>, Tensor<T>)

Trains the network on a single batch of input-output pairs.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

The input voxel grid tensor.

expectedOutput Tensor<T>

The expected output (ground truth labels).

Remarks

Training involves:

  1. Forward pass to compute predictions
  2. Loss calculation between predictions and expected output
  3. Backward pass to compute gradients
  4. Gradient clipping to prevent exploding gradients
  5. Parameter update using the optimizer
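The gradient clipping step can be illustrated in isolation. The sketch below clips by global L2 norm, which is a common interpretation of a `maxGradNorm` parameter. That the library uses exactly this scheme is an assumption, and `GradientClipping` is a hypothetical helper working on a plain `double[]` rather than library types.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type).
// ASSUMPTION: clipping rescales the whole gradient vector so that its
// L2 norm does not exceed maxGradNorm.
public static class GradientClipping
{
    // Scales gradients in place and returns the norm measured before clipping.
    public static double ClipByGlobalNorm(double[] gradients, double maxGradNorm)
    {
        if (gradients == null) throw new ArgumentNullException(nameof(gradients));
        if (maxGradNorm <= 0)
            throw new ArgumentException("maxGradNorm must be positive.", nameof(maxGradNorm));

        double sumSquares = 0.0;
        foreach (double g in gradients)
        {
            sumSquares += g * g;
        }
        double norm = Math.Sqrt(sumSquares);

        if (norm > maxGradNorm)
        {
            double scale = maxGradNorm / norm;
            for (int i = 0; i < gradients.Length; i++)
            {
                gradients[i] *= scale; // direction is preserved, magnitude is capped
            }
        }
        return norm;
    }
}
```

With gradients `{ 3.0, 4.0 }` (norm 5) and `maxGradNorm` of 1, the vector is rescaled to approximately `{ 0.6, 0.8 }`. Gradients whose norm is already within the limit are left unchanged.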

UpdateParameters(Vector<T>)

Updates the network parameters using a flat parameter vector.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

Vector containing all parameters to set.

Remarks

This method distributes parameters from a flat vector to each layer based on their parameter counts.
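The distribution step amounts to slicing the flat vector at offsets given by each layer's parameter count. The sketch below shows that slicing on plain arrays. `ParameterSlicing` is a hypothetical helper, and the library's actual `Vector<T>` and layer types are not used here.

```csharp
using System;

// Hypothetical helper (not an AiDotNet type): splits a flat parameter array
// into consecutive per-layer segments, in layer order.
public static class ParameterSlicing
{
    public static double[][] SplitByLayer(double[] parameters, int[] layerParameterCounts)
    {
        if (parameters == null) throw new ArgumentNullException(nameof(parameters));
        if (layerParameterCounts == null) throw new ArgumentNullException(nameof(layerParameterCounts));

        long total = 0;
        foreach (int count in layerParameterCounts)
        {
            if (count < 0)
                throw new ArgumentException("Parameter counts must be non-negative.", nameof(layerParameterCounts));
            total += count;
        }
        if (total != parameters.Length)
            throw new ArgumentException("Flat vector length does not match the sum of layer parameter counts.");

        var result = new double[layerParameterCounts.Length][];
        int offset = 0;
        for (int i = 0; i < layerParameterCounts.Length; i++)
        {
            int count = layerParameterCounts[i];
            result[i] = new double[count];
            Array.Copy(parameters, offset, result[i], 0, count);
            offset += count; // the next layer starts where this one ended
        }
        return result;
    }
}
```

For a flat vector `{ 1, 2, 3, 4, 5 }` and counts `{ 2, 3 }`, the first layer receives `{ 1, 2 }` and the second receives `{ 3, 4, 5 }`. A length mismatch is rejected up front so that no layer ends up partially updated.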