Class ThreeDDiffusionModelBase<T>

Namespace
AiDotNet.Diffusion
Assembly
AiDotNet.dll

Base class for 3D diffusion models that generate 3D content like point clouds, meshes, and scenes.

public abstract class ThreeDDiffusionModelBase<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, I3DDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
LatentDiffusionModelBase<T>
ThreeDDiffusionModelBase<T>
Implements
ILatentDiffusionModel<T>
I3DDiffusionModel<T>
IDiffusionModel<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Remarks

This abstract base class provides common functionality for all 3D diffusion models, including point cloud generation, mesh generation, image-to-3D, novel view synthesis, and score distillation sampling.

For Beginners: This is the foundation for 3D generation models like Point-E and Shap-E. It extends diffusion to create 3D objects instead of 2D images.

Types of 3D generation:

- Point Clouds: Sets of 3D points that form a shape
- Meshes: Surfaces made of triangles (like in video games)
- Textured Models: Meshes with colors and materials
- Novel Views: New angles of an object from one image

How 3D diffusion works:

1. Text-to-3D: Describe what you want and get a 3D model
2. Image-to-3D: Turn a single photo into a full 3D model
3. Score Distillation: Use 2D diffusion knowledge to guide 3D optimization
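
Example

A minimal text-to-3D sketch. ThreeDDiffusionModelBase<T> is abstract, so "model" below stands in for any concrete implementation (e.g. a Point-E- or Shap-E-style model); the namespaces for Tensor<T> and Mesh3D<T> are assumed to be imported.

using AiDotNet.Diffusion;

static Mesh3D<float> TextTo3D(ThreeDDiffusionModelBase<float> model, string prompt)
{
    if (model.SupportsMesh)
    {
        // Direct text-to-mesh path.
        return model.GenerateMesh(prompt, seed: 42);
    }

    // Fall back to point cloud generation plus surface reconstruction.
    Tensor<float> points = model.GeneratePointCloud(prompt, seed: 42);
    return model.PointCloudToMesh(points, SurfaceReconstructionMethod.Poisson);
}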

Constructors

ThreeDDiffusionModelBase(DiffusionModelOptions<T>?, INoiseScheduler<T>?, int)

Initializes a new instance of the ThreeDDiffusionModelBase class.

protected ThreeDDiffusionModelBase(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, int defaultPointCount = 4096)

Parameters

options DiffusionModelOptions<T>

Configuration options for the diffusion model.

scheduler INoiseScheduler<T>

Optional custom scheduler.

defaultPointCount int

Default number of points in point clouds.

Properties

CoordinateScale

Gets the coordinate scale for normalizing 3D positions.

public virtual double CoordinateScale { get; protected set; }

Property Value

double

Remarks

Points are typically normalized to the [-1, 1] or [0, 1] range.

DefaultPointCount

Gets the default number of points in generated point clouds.

public virtual int DefaultPointCount { get; }

Property Value

int

SupportsMesh

Gets whether this model supports mesh generation.

public abstract bool SupportsMesh { get; }

Property Value

bool

SupportsNovelView

Gets whether this model supports novel view synthesis.

public abstract bool SupportsNovelView { get; }

Property Value

bool

SupportsPointCloud

Gets whether this model supports point cloud generation.

public abstract bool SupportsPointCloud { get; }

Property Value

bool

SupportsScoreDistillation

Gets whether this model supports score distillation sampling (SDS).

public abstract bool SupportsScoreDistillation { get; }

Property Value

bool

Remarks

SDS uses gradients from a 2D diffusion model to optimize a 3D representation. This is the technique behind DreamFusion and similar text-to-3D methods.

SupportsTexture

Gets whether this model supports texture generation.

public abstract bool SupportsTexture { get; }

Property Value

bool

Methods

ColorizePointCloud(Tensor<T>, string, int, int?)

Adds per-point colors to a point cloud.

public virtual Tensor<T> ColorizePointCloud(Tensor<T> pointCloud, string prompt, int numInferenceSteps = 50, int? seed = null)

Parameters

pointCloud Tensor<T>

Point cloud positions [batch, numPoints, 3].

prompt string

Text prompt for coloring.

numInferenceSteps int

Number of denoising steps.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud with colors [batch, numPoints, 6] (XYZ + RGB).
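
Example

A usage sketch, assuming "model" is a concrete implementation that supports point clouds:

// Generate a shape, then add per-point colors guided by a prompt.
Tensor<float> cloud = model.GeneratePointCloud("a wooden chair", seed: 7);       // [1, N, 3]
Tensor<float> colored = model.ColorizePointCloud(cloud, "oak wood, warm tones"); // [1, N, 6]
// Each point row is XYZ followed by RGB.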

CombineImageAndViewConditioning(Tensor<T>, Tensor<T>)

Combines image latent with view embedding for conditioning.

protected virtual Tensor<T> CombineImageAndViewConditioning(Tensor<T> imageLatent, Tensor<T> viewEmbedding)

Parameters

imageLatent Tensor<T>
viewEmbedding Tensor<T>

Returns

Tensor<T>

ComputeScoreDistillationGradients(Tensor<T>, string, int, double)

Computes score distillation gradients for 3D optimization.

public virtual Tensor<T> ComputeScoreDistillationGradients(Tensor<T> renderedViews, string prompt, int timestep, double guidanceScale = 100)

Parameters

renderedViews Tensor<T>

2D renders from the current 3D representation.

prompt string

Text prompt guiding the optimization.

timestep int

Diffusion timestep for noise level.

guidanceScale double

Classifier-free guidance scale.

Returns

Tensor<T>

Gradients to apply to the 3D representation.

Remarks

For Beginners: This helps create 3D from text using 2D knowledge:

1. Render your 3D object from multiple angles
2. Ask a 2D diffusion model "does this look like [prompt]?"
3. Get gradients that tell you how to improve the 3D
4. Repeat until the 3D looks right from all angles

This is how DreamFusion works: it uses 2D diffusion to guide 3D creation.
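
Example

A hedged sketch of the DreamFusion-style outer loop. The 3D representation, Render3D, and ApplyGradients are hypothetical placeholders for your own differentiable renderer and optimizer; only ComputeScoreDistillationGradients comes from this class.

var rng = new Random(0);
for (int iter = 0; iter < 1000; iter++)
{
    // 1. Render the current 3D representation from a random viewpoint.
    //    Render3D is a hypothetical helper, not part of this API.
    Tensor<float> views = Render3D(representation, rng);

    // 2. Ask the 2D prior how the renders should change to match the prompt.
    int t = rng.Next(20, 980); // random diffusion timestep (noise level)
    Tensor<float> grads = model.ComputeScoreDistillationGradients(
        views, "a DSLR photo of a hamburger", t, guidanceScale: 100);

    // 3. Push the gradients back into the 3D representation.
    //    ApplyGradients is likewise a hypothetical optimizer step.
    ApplyGradients(representation, grads);
}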

ConcatenatePointsAndColors(Tensor<T>, Tensor<T>)

Concatenates point positions with RGB colors.

protected virtual Tensor<T> ConcatenatePointsAndColors(Tensor<T> points, Tensor<T> colors)

Parameters

points Tensor<T>
colors Tensor<T>

Returns

Tensor<T>

CreateSimpleMeshFromPoints(Tensor<T>)

Creates a simple triangulated mesh from a point cloud using nearest neighbors.

protected virtual Mesh3D<T> CreateSimpleMeshFromPoints(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

CreateViewEmbedding(double, double)

Creates a view embedding from azimuth and elevation angles.

protected virtual Tensor<T> CreateViewEmbedding(double azimuth, double elevation)

Parameters

azimuth double
elevation double

Returns

Tensor<T>

GenerateMesh(string, string?, int, int, double, int?)

Generates a mesh from a text description.

public virtual Mesh3D<T> GenerateMesh(string prompt, string? negativePrompt = null, int resolution = 128, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)

Parameters

prompt string

Text description of the desired 3D object.

negativePrompt string

Optional description of what to avoid in the output.

resolution int

Mesh resolution (higher = more detail).

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Mesh3D<T>

Mesh data containing vertices and faces.

Remarks

For Beginners: This creates a 3D surface you can render:

- Vertices: 3D points that define corners
- Faces: Triangles connecting vertices to form surfaces
- Can be exported to common 3D formats (OBJ, STL, etc.)
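
Example

A usage sketch, assuming "model" is a concrete implementation with SupportsMesh == true:

Mesh3D<float> mesh = model.GenerateMesh(
    prompt: "a low-poly fox",
    negativePrompt: "blurry, deformed",
    resolution: 256,        // higher resolution = more detail, slower
    numInferenceSteps: 64,
    guidanceScale: 3.0,
    seed: 123);
// The returned Mesh3D<T> holds the vertices and triangular faces,
// ready for export to OBJ, STL, etc.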

GeneratePointCloud(string, string?, int?, int, double, int?)

Generates a point cloud from a text description.

public virtual Tensor<T> GeneratePointCloud(string prompt, string? negativePrompt = null, int? numPoints = null, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)

Parameters

prompt string

Text description of the desired 3D object.

negativePrompt string

Optional description of what to avoid in the output.

numPoints int?

Number of points in the cloud; defaults to DefaultPointCount when null.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud tensor [batch, numPoints, 3] for XYZ coordinates.

Remarks

For Beginners: This creates a cloud of 3D points that form a shape:

- prompt: "A chair" → 4096 points arranged in a chair shape
- The points define the surface of the object
- Can be converted to a mesh for rendering
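
Example

A usage sketch, assuming "model" is a concrete implementation with SupportsPointCloud == true:

// Text to point cloud; omit numPoints to use DefaultPointCount (4096).
Tensor<float> chair = model.GeneratePointCloud(
    prompt: "a chair",
    numPoints: 2048,
    numInferenceSteps: 64,
    guidanceScale: 3.0,
    seed: 0);
// chair has shape [batch, 2048, 3]: one XYZ coordinate per point.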

GenerateViewAngles(int)

Generates evenly distributed view angles around an object.

protected virtual (double azimuth, double elevation)[] GenerateViewAngles(int numViews)

Parameters

numViews int

Returns

(double azimuth, double elevation)[]

ImageTo3D(Tensor<T>, int, int, double, int?)

Generates a 3D model from a single input image.

public virtual Mesh3D<T> ImageTo3D(Tensor<T> inputImage, int numViews = 4, int numInferenceSteps = 50, double guidanceScale = 3, int? seed = null)

Parameters

inputImage Tensor<T>

The input image showing the object.

numViews int

Number of views to generate for reconstruction.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Mesh3D<T>

Reconstructed mesh with optional texture.

Remarks

For Beginners: This turns a flat picture into a 3D model:

- Input: Photo of a shoe from the front
- Output: Full 3D model you can view from any angle
- The model infers what the hidden parts look like
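
Example

A hedged sketch; LoadImageTensor is a hypothetical helper that decodes a photo into the tensor layout the model expects:

Tensor<float> photo = LoadImageTensor("shoe_front.png"); // hypothetical loader
Mesh3D<float> shoe = model.ImageTo3D(photo, numViews: 8, seed: 1);
// More views generally improve reconstruction of occluded surfaces
// at the cost of extra denoising passes.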

NormalizeColors(Tensor<T>)

Normalizes color values to the [0, 1] range.

protected virtual Tensor<T> NormalizeColors(Tensor<T> colors)

Parameters

colors Tensor<T>

Returns

Tensor<T>

NormalizePointCloud(Tensor<T>)

Normalizes point cloud coordinates to the specified range.

protected virtual Tensor<T> NormalizePointCloud(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Tensor<T>

PointCloudToMesh(Tensor<T>, SurfaceReconstructionMethod)

Converts a point cloud to a mesh.

public virtual Mesh3D<T> PointCloudToMesh(Tensor<T> pointCloud, SurfaceReconstructionMethod method = SurfaceReconstructionMethod.Poisson)

Parameters

pointCloud Tensor<T>

Point cloud [batch, numPoints, 3].

method SurfaceReconstructionMethod

Surface reconstruction method.

Returns

Mesh3D<T>

Reconstructed mesh.
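
Example

A usage sketch. Poisson is the documented default; the other enum members are assumed to mirror the protected reconstruction helpers (ball pivoting, alpha shapes, marching cubes):

Tensor<float> points = model.GeneratePointCloud("a coffee mug", seed: 3);
Mesh3D<float> mug = model.PointCloudToMesh(points, SurfaceReconstructionMethod.Poisson);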

PointCloudToMeshAlphaShape(Tensor<T>)

Converts a point cloud to a mesh using the alpha shape algorithm.

protected virtual Mesh3D<T> PointCloudToMeshAlphaShape(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PointCloudToMeshBallPivoting(Tensor<T>)

Converts a point cloud to a mesh using the ball pivoting algorithm.

protected virtual Mesh3D<T> PointCloudToMeshBallPivoting(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PointCloudToMeshMarchingCubes(Tensor<T>)

Converts a point cloud to a mesh using marching cubes on a voxel grid.

protected virtual Mesh3D<T> PointCloudToMeshMarchingCubes(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PointCloudToMeshPoisson(Tensor<T>)

Converts a point cloud to a mesh using Poisson surface reconstruction.

protected virtual Mesh3D<T> PointCloudToMeshPoisson(Tensor<T> pointCloud)

Parameters

pointCloud Tensor<T>

Returns

Mesh3D<T>

PredictColorNoise(Tensor<T>, int, Tensor<T>, Tensor<T>)

Predicts noise for point cloud colorization.

protected virtual Tensor<T> PredictColorNoise(Tensor<T> colors, int timestep, Tensor<T> pointCloud, Tensor<T> promptEmbedding)

Parameters

colors Tensor<T>
timestep int
pointCloud Tensor<T>
promptEmbedding Tensor<T>

Returns

Tensor<T>

PredictNovelViewNoise(Tensor<T>, int, Tensor<T>, Tensor<T>, double)

Predicts noise for novel view synthesis.

protected virtual Tensor<T> PredictNovelViewNoise(Tensor<T> latents, int timestep, Tensor<T> imageLatent, Tensor<T> viewEmbedding, double guidanceScale)

Parameters

latents Tensor<T>
timestep int
imageLatent Tensor<T>
viewEmbedding Tensor<T>
guidanceScale double

Returns

Tensor<T>

PredictPointCloudNoise(Tensor<T>, int, Tensor<T>)

Predicts noise for point cloud denoising.

protected virtual Tensor<T> PredictPointCloudNoise(Tensor<T> points, int timestep, Tensor<T> conditioning)

Parameters

points Tensor<T>
timestep int
conditioning Tensor<T>

Returns

Tensor<T>

ReconstructFromViews(Tensor<T>[], (double azimuth, double elevation)[])

Reconstructs a 3D mesh from multiple view images.

protected virtual Mesh3D<T> ReconstructFromViews(Tensor<T>[] views, (double azimuth, double elevation)[] angles)

Parameters

views Tensor<T>[]
angles (double azimuth, double elevation)[]

Returns

Mesh3D<T>

SynthesizeNovelViews(Tensor<T>, (double azimuth, double elevation)[], int, double, int?)

Synthesizes novel views of an object from a reference image.

public virtual Tensor<T>[] SynthesizeNovelViews(Tensor<T> inputImage, (double azimuth, double elevation)[] targetAngles, int numInferenceSteps = 50, double guidanceScale = 3, int? seed = null)

Parameters

inputImage Tensor<T>

The reference image.

targetAngles (double azimuth, double elevation)[]

Target viewing angles (azimuth, elevation) in radians.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Tensor<T>[]

Array of images from the requested viewpoints.

Remarks

For Beginners: This shows an object from different angles:

- Input: Front view of a car
- Target: 45°, 90°, 135° rotations
- Output: Images showing the car from those angles
- Useful for product visualization
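
Example

A usage sketch matching the angles above; LoadImageTensor is a hypothetical image loader:

Tensor<float> front = LoadImageTensor("car_front.png");
var angles = new (double azimuth, double elevation)[]
{
    (Math.PI / 4, 0.0),     // 45°
    (Math.PI / 2, 0.0),     // 90°
    (3 * Math.PI / 4, 0.0), // 135°
};
Tensor<float>[] views = model.SynthesizeNovelViews(front, angles, seed: 5);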