Class ThreeDDiffusionModelBase<T>
Base class for 3D diffusion models that generate 3D content like point clouds, meshes, and scenes.
public abstract class ThreeDDiffusionModelBase<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, I3DDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>
Type Parameters
T: The numeric type used for calculations.
Remarks
This abstract base class provides common functionality for all 3D diffusion models, including point cloud generation, mesh generation, image-to-3D, novel view synthesis, and score distillation sampling.
For Beginners: This is the foundation for 3D generation models like Point-E and Shap-E. It extends diffusion to create 3D objects instead of 2D images.
Types of 3D generation:
- Point Clouds: Sets of 3D points that form a shape
- Meshes: Surfaces made of triangles (like in video games)
- Textured Models: Meshes with colors and materials
- Novel Views: New angles of an object from one image
How 3D diffusion works:
1. Text-to-3D: Describe what you want and get a 3D model
2. Image-to-3D: Turn a single photo into a full 3D model
3. Score Distillation: Use 2D diffusion knowledge to guide 3D optimization
Constructors
ThreeDDiffusionModelBase(DiffusionModelOptions<T>?, INoiseScheduler<T>?, int)
Initializes a new instance of the ThreeDDiffusionModelBase class.
protected ThreeDDiffusionModelBase(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, int defaultPointCount = 4096)
Parameters
options (DiffusionModelOptions<T>?): Configuration options for the diffusion model.
scheduler (INoiseScheduler<T>?): Optional custom scheduler.
defaultPointCount (int): Default number of points in point clouds.
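Since the class is abstract, this constructor is only reachable from a derived type. As a sketch, a concrete model might chain to it and declare its capabilities; the subclass name is hypothetical, and the remaining abstract members inherited from the base classes are omitted:

```csharp
// Hypothetical minimal subclass; PointCloudModel is illustrative only.
public class PointCloudModel<T> : ThreeDDiffusionModelBase<T>
{
    public PointCloudModel()
        : base(options: null, scheduler: null, defaultPointCount: 2048)
    {
    }

    public override bool SupportsPointCloud => true;
    public override bool SupportsMesh => false;
    public override bool SupportsNovelView => false;
    public override bool SupportsScoreDistillation => false;
    public override bool SupportsTexture => false;

    // The abstract members inherited from LatentDiffusionModelBase<T>
    // would also need to be implemented here.
}
```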
Properties
CoordinateScale
Gets the coordinate scale for normalizing 3D positions.
public virtual double CoordinateScale { get; protected set; }
Property Value
Remarks
Points are typically normalized to [-1, 1] or [0, 1] range.
DefaultPointCount
Gets the default number of points in generated point clouds.
public virtual int DefaultPointCount { get; }
Property Value
SupportsMesh
Gets whether this model supports mesh generation.
public abstract bool SupportsMesh { get; }
Property Value
SupportsNovelView
Gets whether this model supports novel view synthesis.
public abstract bool SupportsNovelView { get; }
Property Value
SupportsPointCloud
Gets whether this model supports point cloud generation.
public abstract bool SupportsPointCloud { get; }
Property Value
SupportsScoreDistillation
Gets whether this model supports score distillation sampling (SDS).
public abstract bool SupportsScoreDistillation { get; }
Property Value
Remarks
SDS uses gradients from a 2D diffusion model to optimize a 3D representation. This is the technique behind DreamFusion and similar text-to-3D methods.
SupportsTexture
Gets whether this model supports texture generation.
public abstract bool SupportsTexture { get; }
Property Value
Methods
ColorizePointCloud(Tensor<T>, string, int, int?)
Adds colors/normals to a point cloud.
public virtual Tensor<T> ColorizePointCloud(Tensor<T> pointCloud, string prompt, int numInferenceSteps = 50, int? seed = null)
Parameters
pointCloud (Tensor<T>): Point cloud positions [batch, numPoints, 3].
prompt (string): Text prompt for coloring.
numInferenceSteps (int): Number of denoising steps.
seed (int?): Optional random seed.
Returns
- Tensor<T>
Point cloud with colors [batch, numPoints, 6] (XYZ + RGB).
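A hypothetical usage sketch, assuming `model` is an instance of a concrete subclass:

```csharp
// Generate geometry first, then add color in a second pass.
Tensor<float> points = model.GeneratePointCloud("a red sports car", seed: 42);
Tensor<float> colored = model.ColorizePointCloud(
    points, "a red sports car", numInferenceSteps: 50, seed: 42);
// colored has shape [batch, numPoints, 6]: XYZ positions followed by RGB.
```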
CombineImageAndViewConditioning(Tensor<T>, Tensor<T>)
Combines image latent with view embedding for conditioning.
protected virtual Tensor<T> CombineImageAndViewConditioning(Tensor<T> imageLatent, Tensor<T> viewEmbedding)
Parameters
imageLatent (Tensor<T>)
viewEmbedding (Tensor<T>)
Returns
- Tensor<T>
ComputeScoreDistillationGradients(Tensor<T>, string, int, double)
Computes score distillation gradients for 3D optimization.
public virtual Tensor<T> ComputeScoreDistillationGradients(Tensor<T> renderedViews, string prompt, int timestep, double guidanceScale = 100)
Parameters
renderedViews (Tensor<T>): 2D renders from the current 3D representation.
prompt (string): Text prompt guiding the optimization.
timestep (int): Diffusion timestep for noise level.
guidanceScale (double): Classifier-free guidance scale.
Returns
- Tensor<T>
Gradients to apply to the 3D representation.
Remarks
For Beginners: This helps create 3D from text using 2D knowledge:
1. Render your 3D object from multiple angles
2. Ask a 2D diffusion model "does this look like [prompt]?"
3. Get gradients that tell you how to improve the 3D
4. Repeat until the 3D looks right from all angles
This is how DreamFusion works - it uses 2D diffusion to guide 3D creation.
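The loop above might be sketched as follows; `renderer`, `sceneParameters`, `random`, and `ApplyGradients` are hypothetical stand-ins for a differentiable renderer and an optimizable 3D representation:

```csharp
// Hypothetical SDS optimization loop.
for (int iter = 0; iter < 1000; iter++)
{
    // 1. Render the current 3D representation from a random viewpoint.
    Tensor<float> views = renderer.Render(sceneParameters);

    // 2. Pick a random diffusion timestep (noise level).
    int timestep = random.Next(20, 980);

    // 3. Ask the 2D diffusion model how the renders should change.
    Tensor<float> grads = model.ComputeScoreDistillationGradients(
        views, "a ceramic mug", timestep, guidanceScale: 100);

    // 4. Apply the gradients to the 3D representation.
    ApplyGradients(sceneParameters, grads);
}
```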
ConcatenatePointsAndColors(Tensor<T>, Tensor<T>)
Concatenates point positions with RGB colors.
protected virtual Tensor<T> ConcatenatePointsAndColors(Tensor<T> points, Tensor<T> colors)
Parameters
points (Tensor<T>)
colors (Tensor<T>)
Returns
- Tensor<T>
CreateSimpleMeshFromPoints(Tensor<T>)
Creates a simple triangulated mesh from point cloud using nearest neighbors.
protected virtual Mesh3D<T> CreateSimpleMeshFromPoints(Tensor<T> pointCloud)
Parameters
pointCloud (Tensor<T>)
Returns
- Mesh3D<T>
CreateViewEmbedding(double, double)
Creates a view embedding from azimuth and elevation angles.
protected virtual Tensor<T> CreateViewEmbedding(double azimuth, double elevation)
Parameters
azimuth (double)
elevation (double)
Returns
- Tensor<T>
GenerateMesh(string, string?, int, int, double, int?)
Generates a mesh from a text description.
public virtual Mesh3D<T> GenerateMesh(string prompt, string? negativePrompt = null, int resolution = 128, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)
Parameters
prompt (string): Text description of the desired 3D object.
negativePrompt (string?): What to avoid.
resolution (int): Mesh resolution (higher = more detail).
numInferenceSteps (int): Number of denoising steps.
guidanceScale (double): Classifier-free guidance scale.
seed (int?): Optional random seed.
Returns
- Mesh3D<T>
Mesh data containing vertices and faces.
Remarks
For Beginners: This creates a 3D surface you can render:
- Vertices: 3D points that define corners
- Faces: Triangles connecting vertices to form surfaces
- Can be exported to common 3D formats (OBJ, STL, etc.)
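A hypothetical usage sketch, assuming `model` is a concrete subclass with SupportsMesh returning true:

```csharp
Mesh3D<float> mesh = model.GenerateMesh(
    prompt: "a wooden chair",
    negativePrompt: "low quality, broken",
    resolution: 256,          // higher = more detail, slower generation
    numInferenceSteps: 64,
    guidanceScale: 3,
    seed: 123);
// The resulting vertices and faces can then be exported to OBJ/STL.
```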
GeneratePointCloud(string, string?, int?, int, double, int?)
Generates a point cloud from a text description.
public virtual Tensor<T> GeneratePointCloud(string prompt, string? negativePrompt = null, int? numPoints = null, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)
Parameters
prompt (string): Text description of the desired 3D object.
negativePrompt (string?): What to avoid.
numPoints (int?): Number of points in the cloud.
numInferenceSteps (int): Number of denoising steps.
guidanceScale (double): Classifier-free guidance scale.
seed (int?): Optional random seed.
Returns
- Tensor<T>
Point cloud tensor [batch, numPoints, 3] for XYZ coordinates.
Remarks
For Beginners: This creates a cloud of 3D points that form a shape:
- prompt: "A chair" → 4096 points arranged in a chair shape
- The points define the surface of the object
- Can be converted to a mesh for rendering
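A hypothetical usage sketch, assuming `model` is a concrete subclass instance:

```csharp
Tensor<float> cloud = model.GeneratePointCloud(
    prompt: "a chair",
    numPoints: 4096,         // null falls back to DefaultPointCount
    numInferenceSteps: 64,
    guidanceScale: 3,
    seed: 7);
// cloud has shape [1, 4096, 3]: one batch of 4096 XYZ points.
```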
GenerateViewAngles(int)
Generates evenly distributed view angles around an object.
protected virtual (double azimuth, double elevation)[] GenerateViewAngles(int numViews)
Parameters
numViews (int)
Returns
- (double azimuth, double elevation)[]
ImageTo3D(Tensor<T>, int, int, double, int?)
Generates a 3D model from a single input image.
public virtual Mesh3D<T> ImageTo3D(Tensor<T> inputImage, int numViews = 4, int numInferenceSteps = 50, double guidanceScale = 3, int? seed = null)
Parameters
inputImage (Tensor<T>): The input image showing the object.
numViews (int): Number of views to generate for reconstruction.
numInferenceSteps (int): Number of denoising steps.
guidanceScale (double): Classifier-free guidance scale.
seed (int?): Optional random seed.
Returns
- Mesh3D<T>
Reconstructed mesh with optional texture.
Remarks
For Beginners: This turns a flat picture into a 3D model:
- Input: Photo of a shoe from the front
- Output: Full 3D model you can view from any angle
- The model infers what the hidden parts look like
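A hypothetical usage sketch; loading the image into a tensor is outside this API, so `LoadImageAsTensor` is an assumed helper:

```csharp
Tensor<float> photo = LoadImageAsTensor("shoe_front.png"); // hypothetical helper
Mesh3D<float> shoe = model.ImageTo3D(
    photo,
    numViews: 8,             // more views = better reconstruction, slower
    numInferenceSteps: 50,
    guidanceScale: 3,
    seed: 0);
```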
NormalizeColors(Tensor<T>)
Normalizes color values to [0, 1] range.
protected virtual Tensor<T> NormalizeColors(Tensor<T> colors)
Parameters
colors (Tensor<T>)
Returns
- Tensor<T>
NormalizePointCloud(Tensor<T>)
Normalizes point cloud coordinates to specified range.
protected virtual Tensor<T> NormalizePointCloud(Tensor<T> pointCloud)
Parameters
pointCloud (Tensor<T>)
Returns
- Tensor<T>
PointCloudToMesh(Tensor<T>, SurfaceReconstructionMethod)
Converts a point cloud to a mesh.
public virtual Mesh3D<T> PointCloudToMesh(Tensor<T> pointCloud, SurfaceReconstructionMethod method = SurfaceReconstructionMethod.Poisson)
Parameters
pointCloud (Tensor<T>): Point cloud [batch, numPoints, 3].
method (SurfaceReconstructionMethod): Surface reconstruction method.
Returns
- Mesh3D<T>
Reconstructed mesh.
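A hypothetical usage sketch; `SurfaceReconstructionMethod.Poisson` is the documented default, and any other enum members would correspond to the protected conversion methods below:

```csharp
Tensor<float> cloud = model.GeneratePointCloud("a chair", seed: 7);
Mesh3D<float> mesh = model.PointCloudToMesh(
    cloud, SurfaceReconstructionMethod.Poisson);
```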
PointCloudToMeshAlphaShape(Tensor<T>)
Converts point cloud to mesh using alpha shape algorithm.
protected virtual Mesh3D<T> PointCloudToMeshAlphaShape(Tensor<T> pointCloud)
Parameters
pointCloud (Tensor<T>)
Returns
- Mesh3D<T>
PointCloudToMeshBallPivoting(Tensor<T>)
Converts point cloud to mesh using ball pivoting algorithm.
protected virtual Mesh3D<T> PointCloudToMeshBallPivoting(Tensor<T> pointCloud)
Parameters
pointCloud (Tensor<T>)
Returns
- Mesh3D<T>
PointCloudToMeshMarchingCubes(Tensor<T>)
Converts point cloud to mesh using marching cubes on a voxel grid.
protected virtual Mesh3D<T> PointCloudToMeshMarchingCubes(Tensor<T> pointCloud)
Parameters
pointCloud (Tensor<T>)
Returns
- Mesh3D<T>
PointCloudToMeshPoisson(Tensor<T>)
Converts point cloud to mesh using Poisson surface reconstruction.
protected virtual Mesh3D<T> PointCloudToMeshPoisson(Tensor<T> pointCloud)
Parameters
pointCloud (Tensor<T>)
Returns
- Mesh3D<T>
PredictColorNoise(Tensor<T>, int, Tensor<T>, Tensor<T>)
Predicts noise for point cloud colorization.
protected virtual Tensor<T> PredictColorNoise(Tensor<T> colors, int timestep, Tensor<T> pointCloud, Tensor<T> promptEmbedding)
Parameters
colors (Tensor<T>)
timestep (int)
pointCloud (Tensor<T>)
promptEmbedding (Tensor<T>)
Returns
- Tensor<T>
PredictNovelViewNoise(Tensor<T>, int, Tensor<T>, Tensor<T>, double)
Predicts noise for novel view synthesis.
protected virtual Tensor<T> PredictNovelViewNoise(Tensor<T> latents, int timestep, Tensor<T> imageLatent, Tensor<T> viewEmbedding, double guidanceScale)
Parameters
latents (Tensor<T>)
timestep (int)
imageLatent (Tensor<T>)
viewEmbedding (Tensor<T>)
guidanceScale (double)
Returns
- Tensor<T>
PredictPointCloudNoise(Tensor<T>, int, Tensor<T>)
Predicts noise for point cloud denoising.
protected virtual Tensor<T> PredictPointCloudNoise(Tensor<T> points, int timestep, Tensor<T> conditioning)
Parameters
points (Tensor<T>)
timestep (int)
conditioning (Tensor<T>)
Returns
- Tensor<T>
ReconstructFromViews(Tensor<T>[], (double azimuth, double elevation)[])
Reconstructs a 3D mesh from multiple view images.
protected virtual Mesh3D<T> ReconstructFromViews(Tensor<T>[] views, (double azimuth, double elevation)[] angles)
Parameters
views (Tensor<T>[])
angles ((double azimuth, double elevation)[])
Returns
- Mesh3D<T>
SynthesizeNovelViews(Tensor<T>, (double azimuth, double elevation)[], int, double, int?)
Synthesizes novel views of an object from a reference image.
public virtual Tensor<T>[] SynthesizeNovelViews(Tensor<T> inputImage, (double azimuth, double elevation)[] targetAngles, int numInferenceSteps = 50, double guidanceScale = 3, int? seed = null)
Parameters
inputImage (Tensor<T>): The reference image.
targetAngles ((double azimuth, double elevation)[]): Target viewing angles (azimuth, elevation) in radians.
numInferenceSteps (int): Number of denoising steps.
guidanceScale (double): Classifier-free guidance scale.
seed (int?): Optional random seed.
Returns
- Tensor<T>[]
Array of images from the requested viewpoints.
Remarks
For Beginners: This shows an object from different angles:
- Input: Front view of a car
- Target: 45°, 90°, 135° rotations
- Output: Images showing the car from those angles
- Useful for product visualization
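A hypothetical usage sketch; note the angles are in radians, so the degree values above must be converted (`carImage` is an assumed input tensor):

```csharp
var angles = new (double azimuth, double elevation)[]
{
    (45.0  * Math.PI / 180.0, 0.0),
    (90.0  * Math.PI / 180.0, 0.0),
    (135.0 * Math.PI / 180.0, 0.0),
};
Tensor<float>[] views = model.SynthesizeNovelViews(
    carImage, angles, numInferenceSteps: 50, guidanceScale: 3, seed: 1);
// views.Length == 3, one image per requested angle.
```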