Class NeRF<T>
Namespace: AiDotNet.NeuralRadianceFields.Models
Assembly: AiDotNet.dll
Implements Neural Radiance Fields (NeRF) for novel view synthesis.
public class NeRF<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IRadianceField<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
Inheritance: NeuralNetworkBase<T> → NeRF<T>
Remarks
For Beginners: NeRF is a groundbreaking method for creating photorealistic 3D scenes from 2D images.
What NeRF does:
- Input: Collection of photos of a scene from different angles
- Training: Learn a neural network that represents the 3D scene
- Output: Ability to render the scene from any new viewpoint
Key innovation:
- Represents the entire 3D scene as a continuous 5D function
- Input: (x, y, z, θ, φ) - position and viewing direction
- Output: (r, g, b, σ) - color and volume density
Architecture:
1. Positional encoding: Transform (x, y, z) to a higher-dimensional space
- Why: Helps the network learn high-frequency details
- Example: (x, y, z) → [sin(x), cos(x), sin(2x), cos(2x), ..., sin(2^(L-1)·x), cos(2^(L-1)·x)]
- A similar encoding is applied to the direction (θ, φ)
2. Coarse network (8 layers, 256 units):
- Density branch takes the encoded position and outputs density + intermediate features
- Color branch takes the intermediate features + encoded direction and outputs RGB color
3. Fine network (same structure):
- Resamples based on coarse network predictions
- Focuses samples where density is high
- Produces final high-quality output
Why positional encoding matters:
- Neural networks naturally learn low-frequency functions (smooth, blurry)
- Real scenes have high-frequency details (sharp edges, textures)
- Positional encoding enables learning high-frequency details
- Without it: blurry reconstructions; with it: sharp, detailed reconstructions
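A minimal sketch of this encoding in plain C# (illustrative only; it follows the 3 · 2 · L layout described here, not the library's internal implementation):

using System;

// Encode a 3D position into a 3 * 2 * L dimensional feature vector.
// With levels = 10 this produces the 60-dimensional encoding above.
static float[] EncodePosition(float x, float y, float z, int levels)
{
    var input = new[] { x, y, z };
    var encoded = new float[input.Length * 2 * levels];
    int k = 0;
    foreach (float v in input)
    {
        for (int l = 0; l < levels; l++)
        {
            float freq = (float)Math.Pow(2, l); // frequencies 2^0 ... 2^(L-1)
            encoded[k++] = (float)Math.Sin(freq * v);
            encoded[k++] = (float)Math.Cos(freq * v);
        }
    }
    return encoded;
}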
Training process:
1. Sample random rays from training images
2. Sample points along each ray
3. Query the network at each sample point
4. Render the ray using volume rendering
5. Compare the rendered color to the actual pixel color
6. Backpropagate the error and update network weights
7. Repeat for thousands of iterations
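Given a constructed NeRF<float> named nerf, the forward half of one iteration can be sketched with the documented RenderRays API; SampleRandomRays and trainingImages are hypothetical stand-ins for steps 1-2:

// Steps 1-4: sample a batch of rays (hypothetical helper) and render them.
var (rayOrigins, rayDirections, trueColors) = SampleRandomRays(trainingImages);
Tensor<float> rendered = nerf.RenderRays(rayOrigins, rayDirections,
    numSamples: 64, nearBound: 2f, farBound: 6f);
// Steps 5-6: compare rendered to trueColors (e.g., MSE) and update weights;
// the Train(input, expectedOutput) method below bundles these steps.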
Hierarchical sampling:
- Coarse sampling: Uniform samples along the ray
- Analyze coarse results: Where is density high?
- Fine sampling: More samples where density is high (near surfaces)
- Final rendering: Use both coarse and fine samples
- Result: Better quality with fewer total samples
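The fine-sampling step can be sketched as inverse-CDF sampling over the coarse weights (plain C#, illustrative only; not the library's internal routine):

using System;

// Draw extra sample depths where the coarse weights are high.
// binEdges has weights.Length + 1 entries (segment boundaries along the ray).
static float[] SampleFine(float[] binEdges, float[] weights, int count, Random rng)
{
    // Build the (unnormalized) CDF of the weights.
    var cdf = new float[weights.Length];
    float total = 0f;
    for (int i = 0; i < weights.Length; i++) { total += weights[i]; cdf[i] = total; }

    var samples = new float[count];
    for (int s = 0; s < count; s++)
    {
        float u = (float)rng.NextDouble() * total; // uniform in [0, total)
        int bin = Array.BinarySearch(cdf, u);
        if (bin < 0) bin = ~bin;                   // first segment with cdf >= u
        float t = (float)rng.NextDouble();         // uniform within the segment
        samples[s] = binEdges[bin] + t * (binEdges[bin + 1] - binEdges[bin]);
    }
    return samples;
}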
Rendering equation (volume rendering):
C(r) = Σ_i T(t_i) · (1 − exp(−σ_i · δ_i)) · c_i, where T(t_i) = exp(−Σ_{j<i} σ_j · δ_j)
- C(r): Final color of ray r
- T(t_i): Transmittance (how much light reaches sample i)
- σ_i: Density at sample point i
- δ_i: Distance between adjacent sample points
- c_i: Color at sample point i
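The same equation evaluated numerically for a single ray (plain C#, illustrative only):

using System;

// Accumulate C(r) from per-sample densities, colors, and segment lengths.
static float[] RenderRayColor(float[] sigma, float[][] color, float[] delta)
{
    var result = new float[3];
    float transmittance = 1f;                 // T(t_0) = 1
    for (int i = 0; i < sigma.Length; i++)
    {
        float alpha = 1f - (float)Math.Exp(-sigma[i] * delta[i]);
        float weight = transmittance * alpha; // T(t_i) * (1 - exp(-σ_i * δ_i))
        for (int c = 0; c < 3; c++)
            result[c] += weight * color[i][c];
        transmittance *= 1f - alpha;          // T(t_{i+1}) = T(t_i) * exp(-σ_i * δ_i)
    }
    return result;
}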
Applications:
- Virtual reality: Create immersive 3D environments from photos
- Film industry: Digitize real locations for CGI
- Real estate: Virtual property tours
- Cultural heritage: Preserve historical sites digitally
- Robotics: Build 3D maps for navigation
- Medical imaging: Reconstruct 3D anatomy from scans
Limitations of the original NeRF:
- Slow training: Hours to days per scene
- Slow rendering: Seconds per image
- Scene-specific: Must retrain for each new scene
- Static only: Can't handle moving objects
These limitations led to many improved variants:
- Instant-NGP: 100x faster training and rendering
- Plenoxels: No neural network, faster optimization
- TensoRF: Tensor decomposition for efficiency
- Dynamic NeRF: Handle time-varying scenes
- Mip-NeRF: Better handling of scale/blur
Reference: "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis" by Mildenhall et al., ECCV 2020
Constructors
NeRF(int, int, int, int, int, int, bool, int, int, double, double, double, ILossFunction<T>?)
Creates a new NeRF model for 3D scene representation and novel view synthesis.
public NeRF(int positionEncodingLevels = 10, int directionEncodingLevels = 4, int hiddenDim = 256, int numLayers = 8, int colorHiddenDim = 128, int colorNumLayers = 1, bool useHierarchicalSampling = true, int renderSamples = 64, int hierarchicalSamples = 128, double renderNearBound = 2, double renderFarBound = 6, double learningRate = 0.0005, ILossFunction<T>? lossFunction = null)
Parameters
positionEncodingLevels (int): Number of frequency levels for position encoding. Higher values enable more high-frequency details but are harder to optimize. Default is 10.
directionEncodingLevels (int): Number of frequency levels for direction encoding. Lower than position (view dependence is smoother than geometry). Default is 4.
hiddenDim (int): Size of hidden layers. Larger values have more capacity but are slower. Default is 256.
numLayers (int): Depth of the network. More layers can learn more complex functions. Default is 8.
colorHiddenDim (int): Hidden dimension for the color prediction network. Default is 128.
colorNumLayers (int): Number of layers in the color prediction network. Default is 1.
useHierarchicalSampling (bool): Whether to use two-stage rendering (coarse + fine). True gives better quality but is slower. Default is true.
renderSamples (int): Number of samples per ray for rendering. Default is 64.
hierarchicalSamples (int): Additional samples for hierarchical sampling. Default is 128.
renderNearBound (double): Near bound for ray sampling. Default is 2.0.
renderFarBound (double): Far bound for ray sampling. Default is 6.0.
learningRate (double): Learning rate for training. Default is 5e-4.
lossFunction (ILossFunction<T>?): Loss function for training. If null, MSE loss is used.
Remarks
For Beginners: Creates a NeRF model for 3D scene representation.
Parameters explained:
positionEncodingLevels: How many frequencies for position encoding
- Higher = more high-frequency details (but harder to optimize)
- Typical: 10 (produces 60-dimensional encoding from 3D position)
- Formula: 3 * 2 * L = 60 for L=10
directionEncodingLevels: Frequencies for viewing direction encoding
- Lower than position (view dependence is smoother than geometry)
- Typical: 4 (the (θ, φ) direction is converted to a 3D unit vector before encoding, producing a 24-dimensional result)
- Formula: 3 * 2 * L' = 24 for L'=4
hiddenDim: Size of hidden layers
- Larger = more capacity (can represent more complex scenes)
- Larger = slower and needs more memory
- Typical: 256
numLayers: Depth of network
- More layers = can learn more complex functions
- More layers = slower and harder to train
- Typical: 8
useHierarchicalSampling: Two-stage rendering (coarse + fine)
- True: Better quality, slower (recommended)
- False: Faster, lower quality
Standard NeRF configuration:
var nerf = new NeRF<float>(
    positionEncodingLevels: 10,
    directionEncodingLevels: 4,
    hiddenDim: 256,
    numLayers: 8,
    useHierarchicalSampling: true);
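For fast previews, the same parameters can trade quality for speed (illustrative values, not tuned defaults):

var previewNerf = new NeRF<float>(
    positionEncodingLevels: 6,
    hiddenDim: 128,
    numLayers: 4,
    useHierarchicalSampling: false,
    renderSamples: 32);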
Properties
SupportsTraining
Gets whether this network supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Methods
Backpropagate(Tensor<T>)
Performs backpropagation to compute gradients.
public override Tensor<T> Backpropagate(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>)
Returns
- Tensor<T>
CreateNewInstance()
Creates a new instance of this model for cloning.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
DeserializeNetworkSpecificData(BinaryReader)
Deserializes network-specific data.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader)
ForwardWithMemory(Tensor<T>)
Performs forward pass with memory for backpropagation.
public override Tensor<T> ForwardWithMemory(Tensor<T> input)
Parameters
input (Tensor<T>)
Returns
- Tensor<T>
GetModelMetadata()
Gets metadata about the model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
InitializeLayers()
Initializes the neural network layers.
protected override void InitializeLayers()
Predict(Tensor<T>)
Makes a prediction using the model.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>)
Returns
- Tensor<T>
QueryField(Tensor<T>, Tensor<T>)
Queries the radiance field at given positions and viewing directions.
public (Tensor<T> rgb, Tensor<T> density) QueryField(Tensor<T> positions, Tensor<T> viewingDirections)
Parameters
positions (Tensor<T>): 3D positions tensor of shape [N, 3].
viewingDirections (Tensor<T>): Viewing direction vectors of shape [N, 3].
Returns
- (Tensor<T> rgb, Tensor<T> density)
RGB colors and volume densities for each queried point.
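A short usage sketch; the shape-based Tensor<float> constructor shown is an assumption about the library's tensor API, so adapt it as needed:

// Query the field at N = 2 points.
var positions = new Tensor<float>(new[] { 2, 3 });  // [N, 3] world positions (assumed ctor)
var directions = new Tensor<float>(new[] { 2, 3 }); // [N, 3] unit view directions
// ... fill both tensors with the points and directions of interest ...
var (rgb, density) = nerf.QueryField(positions, directions);
// rgb holds a color per point; density holds a volume density σ per point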
RenderImage(Vector<T>, Matrix<T>, int, int, T)
Renders an image from a camera viewpoint.
public Tensor<T> RenderImage(Vector<T> cameraPosition, Matrix<T> cameraRotation, int imageWidth, int imageHeight, T focalLength)
Parameters
cameraPosition (Vector<T>): Camera position in world coordinates.
cameraRotation (Matrix<T>): Camera rotation matrix (3x3).
imageWidth (int): Output image width in pixels.
imageHeight (int): Output image height in pixels.
focalLength (T): Camera focal length.
Returns
- Tensor<T>
Rendered image tensor of shape [height, width, 3].
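A usage sketch; the Vector<float> constructor and identity-matrix helper shown are assumptions about the library's API:

// Render a 400 x 300 view from a camera 4 units down the z-axis.
var cameraPosition = new Vector<float>(new float[] { 0f, 0f, 4f }); // assumed ctor
var cameraRotation = Matrix<float>.CreateIdentity(3);               // assumed helper
Tensor<float> image = nerf.RenderImage(cameraPosition, cameraRotation,
    imageWidth: 400, imageHeight: 300, focalLength: 500f);
// image has shape [300, 400, 3]: one RGB triple per pixel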
RenderRays(Tensor<T>, Tensor<T>, int, T, T)
Renders colors for a batch of rays.
public Tensor<T> RenderRays(Tensor<T> rayOrigins, Tensor<T> rayDirections, int numSamples, T nearBound, T farBound)
Parameters
rayOrigins (Tensor<T>): Ray origin positions [N, 3].
rayDirections (Tensor<T>): Ray direction vectors [N, 3].
numSamples (int): Number of samples per ray.
nearBound (T): Near clipping plane.
farBound (T): Far clipping plane.
Returns
- Tensor<T>
Rendered colors for each ray [N, 3].
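A usage sketch (tensor construction assumed, as in QueryField above):

// Render 1024 rays with 64 samples each between the depth bounds.
var origins = new Tensor<float>(new[] { 1024, 3 }); // [N, 3] ray origins (assumed ctor)
var dirs = new Tensor<float>(new[] { 1024, 3 });    // [N, 3] ray directions
Tensor<float> colors = nerf.RenderRays(origins, dirs,
    numSamples: 64, nearBound: 2f, farBound: 6f);   // colors: [N, 3]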
SerializeNetworkSpecificData(BinaryWriter)
Serializes network-specific data.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter)
Train(Tensor<T>, Tensor<T>)
Trains the model on input data.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>)
expectedOutput (Tensor<T>)
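A hedged usage sketch; the exact tensor layout Train expects is not documented here, so the per-ray layout below is an assumption:

// One gradient step on a batch of rays (assumed layout: origin + direction per row).
var rayBatch = new Tensor<float>(new[] { 1024, 6 });    // assumed per-ray encoding
var pixelColors = new Tensor<float>(new[] { 1024, 3 }); // ground-truth RGB per ray
nerf.Train(rayBatch, pixelColors);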
UpdateParameters(Vector<T>)
Updates model parameters using gradient descent.
public override void UpdateParameters(Vector<T> gradients)
Parameters
gradients (Vector<T>)