Class ThreeDDiffusionModelBase<T>

Namespace
AiDotNet.Diffusion
Assembly
AiDotNet.dll

Base class for 3D diffusion models that generate 3D content like point clouds, meshes, and scenes.

public abstract class ThreeDDiffusionModelBase<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, I3DDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
LatentDiffusionModelBase<T>
ThreeDDiffusionModelBase<T>
Implements
ILatentDiffusionModel<T>
I3DDiffusionModel<T>
IDiffusionModel<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Remarks

This abstract base class provides common functionality for all 3D diffusion models, including point cloud generation, mesh generation, image-to-3D, novel view synthesis, and score distillation sampling.

For Beginners: This is the foundation for 3D generation models like Point-E and Shap-E. It extends diffusion to create 3D objects instead of 2D images.

Types of 3D generation:

- Point Clouds: Sets of 3D points that form a shape
- Meshes: Surfaces made of triangles (like in video games)
- Textured Models: Meshes with colors and materials
- Novel Views: New angles of an object from one image

How 3D diffusion works:

1. Text-to-3D: Describe what you want and get a 3D model
2. Image-to-3D: Turn a single photo into a full 3D model
3. Score Distillation: Use 2D diffusion knowledge to guide 3D optimization
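
Example

A minimal text-to-3D sketch. ThreeDDiffusionModelBase<T> is abstract, so "model" below stands in for any concrete implementation (e.g. a Point-E- or Shap-E-style model); the namespaces for Tensor<T> and Mesh3D<T> are assumed to be imported.

using AiDotNet.Diffusion;

static Mesh3D<float> TextTo3D(ThreeDDiffusionModelBase<float> model, string prompt)
{
    if (model.SupportsMesh)
    {
        // Direct text-to-mesh path.
        return model.GenerateMesh(prompt, seed: 42);
    }

    // Fall back to point cloud generation plus surface reconstruction.
    Tensor<float> points = model.GeneratePointCloud(prompt, seed: 42);
    return model.PointCloudToMesh(points, SurfaceReconstructionMethod.Poisson);
}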

Constructors

ThreeDDiffusionModelBase(DiffusionModelOptions<T>?, INoiseScheduler<T>?, int)

Initializes a new instance of the ThreeDDiffusionModelBase class.

protected ThreeDDiffusionModelBase(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, int defaultPointCount = 4096)

Parameters

options DiffusionModelOptions<T>

Configuration options for the diffusion model.

scheduler INoiseScheduler<T>

Optional custom scheduler.

defaultPointCount int

Default number of points in point clouds.

Properties

CoordinateScale

Gets the coordinate scale for normalizing 3D positions.

public virtual double CoordinateScale { get; protected set; }

Property Value

double

Remarks

Points are typically normalized to the [-1, 1] or [0, 1] range.

DefaultPointCount

Gets the default number of points in generated point clouds.

public virtual int DefaultPointCount { get; }

Property Value

int

SupportsMesh

Gets whether this model supports mesh generation.

public abstract bool SupportsMesh { get; }

Property Value

bool

SupportsNovelView

Gets whether this model supports novel view synthesis.

public abstract bool SupportsNovelView { get; }

Property Value

bool

SupportsPointCloud

Gets whether this model supports point cloud generation.

public abstract bool SupportsPointCloud { get; }

Property Value

bool

SupportsScoreDistillation

Gets whether this model supports score distillation sampling (SDS).

public abstract bool SupportsScoreDistillation { get; }

Property Value

bool

Remarks

SDS uses gradients from a 2D diffusion model to optimize a 3D representation. This is the technique behind DreamFusion and similar text-to-3D methods.

SupportsTexture

Gets whether this model supports texture generation.

public abstract bool SupportsTexture { get; }

Property Value

bool

Methods

ColorizePointCloud(Tensor<T>, string, int, int?)

Adds per-point colors to a point cloud.

public virtual Tensor<T> ColorizePointCloud(Tensor<T> pointCloud, string prompt, int numInferenceSteps = 50, int? seed = null)

Parameters

pointCloud Tensor<T>

Point cloud positions [batch, numPoints, 3].

prompt string

Text prompt for coloring.

numInferenceSteps int

Number of denoising steps.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud with colors [batch, numPoints, 6] (XYZ + RGB).
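
Example

A usage sketch, assuming "model" is a concrete implementation that supports point clouds:

// Generate a shape, then add per-point colors guided by a prompt.
Tensor<float> cloud = model.GeneratePointCloud("a wooden chair", seed: 7);       // [1, N, 3]
Tensor<float> colored = model.ColorizePointCloud(cloud, "oak wood, warm tones"); // [1, N, 6]
// Each point row is XYZ followed by RGB.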

CombineImageAndViewConditioning(Tensor<T>, Tensor<T>)

Combines image latent with view embedding for conditioning.

protected virtual Tensor<T> CombineImageAndViewConditioning(Tensor<T> imageLatent, Tensor<T> viewEmbedding)

Parameters

imageLatent Tensor<T>
viewEmbedding Tensor<T>

Returns

Tensor<T>

ComputeScoreDistillationGradients(Tensor<T>, string, int, double)

Computes score distillation gradients for 3D optimization.

public virtual Tensor<T> ComputeScoreDistillationGradients(Tensor<T> renderedViews, string prompt, int timestep, double guidanceScale = 100)

Parameters

renderedViews Tensor<T>

2D renders from the current 3D representation.

prompt string

Text prompt guiding the optimization.

timestep int

Diffusion timestep for noise level.

guidanceScale double

Classifier-free guidance scale.

Returns

Tensor<T>

Gradients to apply to the 3D representation.

Remarks

For Beginners: This helps create 3D from text using 2D knowledge:

1. Render your 3D object from multiple angles
2. Ask a 2D diffusion model "does this look like [prompt]?"
3. Get gradients that tell you how to improve the 3D
4. Repeat until the 3D looks right from all angles

This is how DreamFusion works: it uses 2D diffusion to guide 3D creation.
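
Example

A hedged sketch of the DreamFusion-style outer loop. The 3D representation, Render3D, and ApplyGradients are hypothetical placeholders for your own differentiable renderer and optimizer; only ComputeScoreDistillationGradients comes from this class.

var rng = new Random(0);
for (int iter = 0; iter < 1000; iter++)
{
    // 1. Render the current 3D representation from a random viewpoint.
    //    Render3D is a hypothetical helper, not part of this API.
    Tensor<float> views = Render3D(representation, rng);

    // 2. Ask the 2D prior how the renders should change to match the prompt.
    int t = rng.Next(20, 980); // random diffusion timestep (noise level)
    Tensor<float> grads = model.ComputeScoreDistillationGradients(
        views, "a DSLR photo of a hamburger", t, guidanceScale: 100);

    // 3. Push the gradients back into the 3D representation.
    //    ApplyGradients is likewise a hypothetical optimizer step.
    ApplyGradients(representation, grads);
}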

ConcatenatePointsAndColors(Tensor<T>, Tensor<T>)

Concatenates point positions with RGB colors.

protected virtual Tensor<T> ConcatenatePointsAndColors(Tensor<T> points, Tensor<T> colors)

Parameters

points Tensor<T>
colors Tensor<T>

Returns

Tensor<T>

CreateSimpleMeshFromPoints(Tensor<T>)

Creates a simple triangulated mesh from a point cloud using nearest neighbors.

protected virtual Mesh3D<T> CreateSimpleMeshFromPoints(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

CreateViewEmbedding(double, double)

Creates a view embedding from azimuth and elevation angles.

protected virtual Tensor<T> CreateViewEmbedding(double azimuth, double elevation)

Parameters

azimuth double
elevation double

Returns

Tensor<T>

GenerateMesh(string, string?, int, int, double, int?)

Generates a mesh from a text description.

public virtual Mesh3D<T> GenerateMesh(string prompt, string? negativePrompt = null, int resolution = 128, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)

Parameters

prompt string

Text description of the desired 3D object.

negativePrompt string

Optional description of what to avoid in the output.

resolution int

Mesh resolution (higher = more detail).

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Mesh3D<T>

Mesh data containing vertices and faces.

Remarks

For Beginners: This creates a 3D surface you can render:

- Vertices: 3D points that define corners
- Faces: Triangles connecting vertices to form surfaces
- Can be exported to common 3D formats (OBJ, STL, etc.)
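
Example

A usage sketch, assuming "model" is a concrete implementation with SupportsMesh == true:

Mesh3D<float> mesh = model.GenerateMesh(
    prompt: "a low-poly fox",
    negativePrompt: "blurry, deformed",
    resolution: 256,        // higher resolution = more detail, slower
    numInferenceSteps: 64,
    guidanceScale: 3.0,
    seed: 123);
// The returned Mesh3D<T> holds the vertices and triangular faces,
// ready for export to OBJ, STL, etc.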

GeneratePointCloud(string, string?, int?, int, double, int?)

Generates a point cloud from a text description.

public virtual Tensor<T> GeneratePointCloud(string prompt, string? negativePrompt = null, int? numPoints = null, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)

Parameters

prompt string

Text description of the desired 3D object.

negativePrompt string

Optional description of what to avoid in the output.

numPoints int?

Number of points in the cloud; defaults to DefaultPointCount when null.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud tensor [batch, numPoints, 3] for XYZ coordinates.

Remarks

For Beginners: This creates a cloud of 3D points that form a shape:

- prompt: "A chair" → 4096 points arranged in a chair shape
- The points define the surface of the object
- Can be converted to a mesh for rendering
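
Example

A usage sketch, assuming "model" is a concrete implementation with SupportsPointCloud == true:

// Text to point cloud; omit numPoints to use DefaultPointCount (4096).
Tensor<float> chair = model.GeneratePointCloud(
    prompt: "a chair",
    numPoints: 2048,
    numInferenceSteps: 64,
    guidanceScale: 3.0,
    seed: 0);
// chair has shape [batch, 2048, 3]: one XYZ coordinate per point.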

GenerateViewAngles(int)

Generates evenly distributed view angles around an object.

protected virtual (double azimuth, double elevation)[] GenerateViewAngles(int numViews)

Parameters

numViews int

Returns

(double azimuth, double elevation)[]

ImageTo3D(Tensor<T>, int, int, double, int?)

Generates a 3D model from a single input image.

public virtual Mesh3D<T> ImageTo3D(Tensor<T> inputImage, int numViews = 4, int numInferenceSteps = 50, double guidanceScale = 3, int? seed = null)

Parameters

inputImage Tensor<T>

The input image showing the object.

numViews int

Number of views to generate for reconstruction.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Mesh3D<T>

Reconstructed mesh with optional texture.

Remarks

For Beginners: This turns a flat picture into a 3D model:

- Input: Photo of a shoe from the front
- Output: Full 3D model you can view from any angle
- The model infers what the hidden parts look like
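
Example

A hedged sketch; LoadImageTensor is a hypothetical helper that decodes a photo into the tensor layout the model expects:

Tensor<float> photo = LoadImageTensor("shoe_front.png"); // hypothetical loader
Mesh3D<float> shoe = model.ImageTo3D(photo, numViews: 8, seed: 1);
// More views generally improve reconstruction of occluded surfaces
// at the cost of extra denoising passes.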

NormalizeColors(Tensor<T>)

Normalizes color values to the [0, 1] range.

protected virtual Tensor<T> NormalizeColors(Tensor<T> colors)

Parameters

colors Tensor<T>

Returns

Tensor<T>

NormalizePointCloud(Tensor<T>)

Normalizes point cloud coordinates to the specified range.

protected virtual Tensor<T> NormalizePointCloud(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Tensor<T>

PointCloudToMesh(Tensor<T>, SurfaceReconstructionMethod)

Converts a point cloud to a mesh.

public virtual Mesh3D<T> PointCloudToMesh(Tensor<T> pointCloud, SurfaceReconstructionMethod method = SurfaceReconstructionMethod.Poisson)

Parameters

pointCloud Tensor<T>

Point cloud [batch, numPoints, 3].

method SurfaceReconstructionMethod

Surface reconstruction method.

Returns

Mesh3D<T>

Reconstructed mesh.
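
Example

A usage sketch. Poisson is the documented default; the other enum members are assumed to mirror the protected reconstruction helpers (ball pivoting, alpha shapes, marching cubes):

Tensor<float> points = model.GeneratePointCloud("a coffee mug", seed: 3);
Mesh3D<float> mug = model.PointCloudToMesh(points, SurfaceReconstructionMethod.Poisson);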

PointCloudToMeshAlphaShape(Tensor<T>)

Converts a point cloud to a mesh using the alpha shape algorithm.

protected virtual Mesh3D<T> PointCloudToMeshAlphaShape(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PointCloudToMeshBallPivoting(Tensor<T>)

Converts a point cloud to a mesh using the ball pivoting algorithm.

protected virtual Mesh3D<T> PointCloudToMeshBallPivoting(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PointCloudToMeshMarchingCubes(Tensor<T>)

Converts a point cloud to a mesh using marching cubes on a voxel grid.

protected virtual Mesh3D<T> PointCloudToMeshMarchingCubes(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PointCloudToMeshPoisson(Tensor<T>)

Converts a point cloud to a mesh using Poisson surface reconstruction.

protected virtual Mesh3D<T> PointCloudToMeshPoisson(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PredictColorNoise(Tensor<T>, int, Tensor<T>, Tensor<T>)

Predicts noise for point cloud colorization.

protected virtual Tensor<T> PredictColorNoise(Tensor<T> colors, int timestep, Tensor<T> pointCloud, Tensor<T> promptEmbedding)

Parameters

colors Tensor<T>
timestep int
pointCloud Tensor<T>
promptEmbedding Tensor<T>

Returns

Tensor<T>

PredictNovelViewNoise(Tensor<T>, int, Tensor<T>, Tensor<T>, double)

Predicts noise for novel view synthesis.

protected virtual Tensor<T> PredictNovelViewNoise(Tensor<T> latents, int timestep, Tensor<T> imageLatent, Tensor<T> viewEmbedding, double guidanceScale)

Parameters

latents Tensor<T>
timestep int
imageLatent Tensor<T>
viewEmbedding Tensor<T>
guidanceScale double

Returns

Tensor<T>

PredictPointCloudNoise(Tensor<T>, int, Tensor<T>)

Predicts noise for point cloud denoising.

protected virtual Tensor<T> PredictPointCloudNoise(Tensor<T> points, int timestep, Tensor<T> conditioning)

Parameters

points Tensor<T>
timestep int
conditioning Tensor<T>

Returns

Tensor<T>

ReconstructFromViews(Tensor<T>[], (double azimuth, double elevation)[])

Reconstructs a 3D mesh from multiple view images.

protected virtual Mesh3D<T> ReconstructFromViews(Tensor<T>[] views, (double azimuth, double elevation)[] angles)

Parameters

views Tensor<T>[]
angles (double azimuth, double elevation)[]

Returns

Mesh3D<T>

SynthesizeNovelViews(Tensor<T>, (double azimuth, double elevation)[], int, double, int?)

Synthesizes novel views of an object from a reference image.

public virtual Tensor<T>[] SynthesizeNovelViews(Tensor<T> inputImage, (double azimuth, double elevation)[] targetAngles, int numInferenceSteps = 50, double guidanceScale = 3, int? seed = null)

Parameters

inputImage Tensor<T>

The reference image.

targetAngles (double azimuth, double elevation)[]

Target viewing angles (azimuth, elevation) in radians.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Tensor<T>[]

Array of images from the requested viewpoints.

Remarks

For Beginners: This shows an object from different angles:

- Input: Front view of a car
- Target: 45°, 90°, 135° rotations
- Output: Images showing the car from those angles
- Useful for product visualization
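
Example

A usage sketch matching the angles above; LoadImageTensor is a hypothetical image loader:

Tensor<float> front = LoadImageTensor("car_front.png");
var angles = new (double azimuth, double elevation)[]
{
    (Math.PI / 4, 0.0),     // 45°
    (Math.PI / 2, 0.0),     // 90°
    (3 * Math.PI / 4, 0.0), // 135°
};
Tensor<float>[] views = model.SynthesizeNovelViews(front, angles, seed: 5);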