
Class UNet3D<T>

Namespace
AiDotNet.NeuralNetworks
Assembly
AiDotNet.dll

Represents a 3D U-Net neural network for volumetric semantic segmentation.

public class UNet3D<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations (typically float or double).

Inheritance
NeuralNetworkBase<T>
UNet3D<T>
Implements
INeuralNetworkModel<T>
INeuralNetwork<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>
IInterpretableModel<T>
IInputGradientComputable<T>
IDisposable

Remarks

A 3D U-Net extends the classic U-Net architecture to three dimensions for processing volumetric data. It uses an encoder-decoder structure with skip connections to produce dense, per-voxel predictions while preserving both local details and global context.

For Beginners: A 3D U-Net is like an intelligent 3D scanner that can identify and label every single voxel (3D pixel) in a 3D volume.

Think of it like this:

  • The encoder (left side of "U") looks at the big picture by progressively zooming out
  • The decoder (right side of "U") zooms back in to produce detailed predictions
  • Skip connections (horizontal lines in "U") preserve fine details from encoder to decoder

This is useful for:

  • Medical imaging: Finding organs or tumors in CT/MRI scans
  • 3D scene understanding: Segmenting objects in point clouds
  • Part segmentation: Identifying different parts of 3D shapes

The "U" shape comes from the symmetric encoder-decoder design with skip connections.

Constructors

UNet3D(NeuralNetworkArchitecture<T>, int, int, int, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, ILossFunction<T>?, double)

Initializes a new instance of the UNet3D<T> class.

public UNet3D(NeuralNetworkArchitecture<T> architecture, int voxelResolution = 32, int numEncoderBlocks = 4, int baseFilters = 32, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null, ILossFunction<T>? lossFunction = null, double maxGradNorm = 1)

Parameters

architecture NeuralNetworkArchitecture<T>

The architecture defining the structure of the neural network.

voxelResolution int

The resolution of the voxel grid (e.g., 32 for 32×32×32). Default is 32.

numEncoderBlocks int

Number of encoder blocks. Default is 4.

baseFilters int

Base number of filters in first encoder block. Default is 32.

optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

The optimizer for training. Defaults to Adam if not specified.

lossFunction ILossFunction<T>

The loss function. Defaults based on task type if not specified.

maxGradNorm double

Maximum gradient norm for clipping. Defaults to 1.0.

Remarks

For Beginners: This constructor creates a 3D U-Net with the specified configuration.

Key parameters explained:

  • voxelResolution: The size of the 3D input grid (32 = 32×32×32 voxels)
  • numEncoderBlocks: How many downsampling stages (more = deeper network)
  • baseFilters: Starting number of feature detectors (32 is a good default)
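
Examples

A minimal construction sketch. How a NeuralNetworkArchitecture<float> is configured is not covered on this page, so the architecture variable below is only a placeholder.

// Placeholder: architecture construction is documented elsewhere.
NeuralNetworkArchitecture<float> architecture = /* configured elsewhere */ null!;

var unet = new UNet3D<float>(
    architecture,
    voxelResolution: 32,    // the network processes 32×32×32 voxel grids
    numEncoderBlocks: 4,    // four downsampling stages
    baseFilters: 32);       // per-block filter counts: 32, 64, 128, 256
// optimizer, lossFunction and maxGradNorm keep their defaults
// (Adam, a task-based loss, and 1.0 respectively).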

Exceptions

ArgumentNullException

Thrown when architecture is null.

ArgumentException

Thrown when voxelResolution or numEncoderBlocks is not positive.

Properties

BaseFilters

Gets the base number of filters in the first encoder block.

public int BaseFilters { get; }

Property Value

int

Remarks

This value doubles with each encoder block. For example, with baseFilters=32 and 4 blocks, the filter counts will be 32, 64, 128, 256.

NumClasses

Gets the number of output classes (segmentation categories).

public int NumClasses { get; }

Property Value

int

Remarks

For binary segmentation (foreground/background), this is 1. For multi-class segmentation, this equals the number of categories.

NumEncoderBlocks

Gets the number of encoder blocks in the network.

public int NumEncoderBlocks { get; }

Property Value

int

Remarks

Each encoder block consists of two Conv3D layers followed by a MaxPool3D layer (except the last encoder block). More blocks allow deeper feature extraction but require higher input resolution and more computation.
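
Examples

A quick sanity check for a candidate configuration, under the assumption that the resolution is halved once per pooling stage (every block except the last): the voxel resolution should divide evenly by 2 raised to the number of pooling stages. This constraint is inferred from the halving behaviour described above, not stated explicitly in the documentation.

using System;

int voxelResolution = 32, numEncoderBlocks = 4;

// Assumption: the grid is halved (numEncoderBlocks - 1) times, so the
// resolution should be divisible by 2^(numEncoderBlocks - 1).
int poolStages = numEncoderBlocks - 1;
bool divisible = voxelResolution % (1 << poolStages) == 0;
Console.WriteLine(divisible
    ? $"{voxelResolution} can be halved {poolStages} times cleanly."
    : $"{voxelResolution} cannot be halved {poolStages} times cleanly.");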

VoxelResolution

Gets the voxel grid resolution used by this network.

public int VoxelResolution { get; }

Property Value

int

Remarks

The voxel resolution determines the spatial dimensions of the input and output 3D grids. A resolution of 32 means the network processes 32×32×32 voxel grids. Input and output have the same spatial resolution (dense prediction).

Methods

Backward(Tensor<T>)

Performs a backward pass through the network to compute gradients.

public Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the output.

Returns

Tensor<T>

The gradient of the loss with respect to the input.

Remarks

The backward pass propagates gradients from the output back through each layer, computing gradients for all trainable parameters.
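
Examples

A minimal sketch of the manual gradient path; in practice Train(Tensor<T>, Tensor<T>) performs these steps for you. The unet and input variables are assumed to exist, and how the output gradient is produced from a loss function is not shown on this page.

// Manual forward/backward sketch. Train(...) normally performs these steps,
// plus loss computation, gradient clipping and the optimizer update.
Tensor<float> output = unet.Forward(input);

// dLoss/dOutput must have the same shape as `output`; how it is obtained from
// an ILossFunction<T> is not documented here and is assumed.
Tensor<float> outputGradient = /* gradient of the loss w.r.t. the output */ null!;

Tensor<float> inputGradient = unet.Backward(outputGradient);
// After this call the layers hold parameter gradients ready for an optimizer step.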

CreateNewInstance()

Creates a new instance of this model type for cloning purposes.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new UNet3D<T> instance with the same configuration.

Remarks

For Beginners: This creates a blank version of the same type of neural network.

It's used internally by methods like DeepCopy and Clone to create the right type of network before copying the data into it.

DeserializeNetworkSpecificData(BinaryReader)

Deserializes network-specific data from a binary stream.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader to deserialize from.

Remarks

This method is called at the end of the general deserialization process to allow derived classes to read any additional data specific to their implementation.

For Beginners: If serialization is like packing a suitcase (see SerializeNetworkSpecificData(BinaryWriter) below), this is like unpacking its special compartment. After the main deserialization method has unpacked the common items (layers, parameters), this method lets each specific type of neural network unpack its own unique items that were stored during serialization.

Forward(Tensor<T>)

Performs a forward pass through the network.

public Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

The input voxel grid tensor with shape [batch, channels, depth, height, width] or [channels, depth, height, width] for single samples.

Returns

Tensor<T>

The output segmentation map with shape [batch, numClasses, depth, height, width] or [numClasses, depth, height, width] for single samples.

Remarks

The forward pass sequentially applies each layer's transformation to the input, producing per-voxel class predictions for 3D semantic segmentation.
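
Examples

A shape-oriented sketch for a single-channel 32×32×32 volume with batch size 1. Constructing Tensor<float> directly from a shape array is an assumption; the actual factory method in AiDotNet may differ.

// Assumption: Tensor<float> can be created from a shape array.
var input = new Tensor<float>(new[] { 1, 1, 32, 32, 32 });   // [batch, channels, depth, height, width]

Tensor<float> logits = unet.Forward(input);
// logits has shape [1, NumClasses, 32, 32, 32]: one score per class per voxel.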

GetModelMetadata()

Gets metadata about this model for serialization and inspection.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata<T> object containing model information.

InitializeLayers()

Initializes the layers of the 3D U-Net.

protected override void InitializeLayers()

Remarks

If the architecture provides custom layers, those are used. Otherwise, default layers are created using CreateDefaultUNet3DLayers(NeuralNetworkArchitecture<T>, int, int, int).

Predict(Tensor<T>)

Generates predictions for the given input.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

The input voxel grid tensor.

Returns

Tensor<T>

The predicted segmentation map.

Remarks

For Beginners: This is the main method you'll use to get results from your trained neural network. You provide an input voxel grid (for example, a CT or MRI volume), and the network processes it through all its layers to produce a per-voxel segmentation map.
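
Examples

A typical inference sketch, assuming a trained unet and an input grid scan shaped like the Forward input; how to read individual voxel scores out of the returned tensor is not covered on this page.

// `scan` is a voxel grid shaped like the Forward input, e.g. [1, 1, 32, 32, 32].
Tensor<float> segmentation = unet.Predict(scan);

// For multi-class segmentation, each voxel's label is the class channel with the
// highest score (arg-max over the class dimension). For binary segmentation
// (NumClasses == 1), a threshold such as 0.5 separates foreground from background.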

SerializeNetworkSpecificData(BinaryWriter)

Serializes network-specific data to a binary stream.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to serialize to.

Remarks

This method is called at the end of the general serialization process to allow derived classes to write any additional data specific to their implementation.

For Beginners: Think of this as packing a special compartment in your suitcase. While the main serialization method packs the common items (layers, parameters), this method allows each specific type of neural network to pack its own unique items that other networks might not have.

Train(Tensor<T>, Tensor<T>)

Trains the network on a single batch of input-output pairs.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

The input voxel grid tensor.

expectedOutput Tensor<T>

The expected segmentation map (ground truth labels).

Remarks

Training involves:

  1. Forward pass to compute predictions
  2. Loss calculation between predictions and expected output
  3. Backward pass to compute gradients
  4. Gradient clipping to prevent exploding gradients
  5. Parameter update using the optimizer
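
Examples

A minimal training-loop sketch over pre-batched data. The batches collection is hypothetical; each Train call carries out the five steps above on one batch.

// `batches` is a hypothetical, pre-prepared collection of
// (input voxel grid, ground-truth segmentation map) tensor pairs.
for (int epoch = 0; epoch < 50; epoch++)
{
    foreach (var (voxels, labels) in batches)
    {
        // One call performs the forward pass, loss, backward pass,
        // gradient clipping, and optimizer update for this batch.
        unet.Train(voxels, labels);
    }
}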

UpdateParameters(Vector<T>)

Updates the network parameters using a flat parameter vector.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

Vector containing all parameters to set.

Remarks

This method distributes parameters from a flat vector to each layer based on their parameter counts.
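
Examples

A hedged round-trip sketch. The GetParameters() accessor is assumed from the IParameterizable<T, Tensor<T>, Tensor<T>> contract and is not documented on this page; the actual member for reading the current flat parameter vector may be named differently.

// Hypothetical round trip: read the flat parameter vector (member name assumed),
// modify it, then distribute it back across the layers.
Vector<float> parameters = unet.GetParameters();   // assumed accessor, not documented here

// ... modify the vector, e.g. load values restored from a checkpoint ...

unet.UpdateParameters(parameters);   // the vector length must match the network's total parameter count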