Class PointEModel<T>

Namespace
AiDotNet.Diffusion.Models
Assembly
AiDotNet.dll

Point-E model for text-to-3D point cloud generation.

public class PointEModel<T> : ThreeDDiffusionModelBase<T>, ILatentDiffusionModel<T>, I3DDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
ThreeDDiffusionModelBase<T> ← PointEModel<T>
Implements
ILatentDiffusionModel<T>
I3DDiffusionModel<T>
IDiffusionModel<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Examples

// Create a Point-E model
var pointE = new PointEModel<float>();

// Generate a point cloud from text
var pointCloud = pointE.GeneratePointCloud(
    prompt: "A wooden chair",
    numPoints: 4096,
    numInferenceSteps: 64,
    guidanceScale: 3.0);

// pointCloud shape: [1, 4096, 6] - XYZ + RGB per point

// Generate from an image
var image = LoadImage("chair_photo.jpg");
var fromImage = pointE.GenerateFromImage(image, numPoints: 4096);

// Export to PLY file for viewing in 3D software
ExportToPLY(pointCloud, "chair.ply");
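
Note that LoadImage and ExportToPLY above are illustrative helpers, not part of AiDotNet. A minimal ASCII-PLY writer for a [1, numPoints, 6] cloud might look like the sketch below; it assumes Tensor<float> exposes a Shape array and a [batch, point, channel] indexer, which is an assumption about the API.

```csharp
using System.IO;

// Hypothetical helper: writes a [1, numPoints, 6] point cloud (XYZ + RGB) to ASCII PLY.
// Assumes Tensor<float> exposes Shape and a [b, i, c] indexer; adjust for the real API.
static void ExportToPLY(Tensor<float> pointCloud, string path)
{
    int numPoints = pointCloud.Shape[1];
    using var writer = new StreamWriter(path);
    writer.WriteLine("ply");
    writer.WriteLine("format ascii 1.0");
    writer.WriteLine($"element vertex {numPoints}");
    writer.WriteLine("property float x");
    writer.WriteLine("property float y");
    writer.WriteLine("property float z");
    writer.WriteLine("property uchar red");
    writer.WriteLine("property uchar green");
    writer.WriteLine("property uchar blue");
    writer.WriteLine("end_header");

    for (int i = 0; i < numPoints; i++)
    {
        float x = pointCloud[0, i, 0], y = pointCloud[0, i, 1], z = pointCloud[0, i, 2];
        // Colors are generated in [0, 1]; PLY viewers expect 8-bit channels.
        byte r = (byte)(pointCloud[0, i, 3] * 255);
        byte g = (byte)(pointCloud[0, i, 4] * 255);
        byte b = (byte)(pointCloud[0, i, 5] * 255);
        writer.WriteLine($"{x} {y} {z} {r} {g} {b}");
    }
}
```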

Remarks

Point-E is OpenAI's model for generating 3D point clouds from text descriptions. It uses a two-stage pipeline: first generating a synthetic view of the object, then generating a point cloud conditioned on that view.

For Beginners: Point-E creates 3D objects as "point clouds" - collections of colored points in 3D space that form a shape:

What is a point cloud?

  • Thousands of 3D points (X, Y, Z coordinates)
  • Each point can have a color (R, G, B)
  • Together they form the surface of an object
  • Like a very detailed dot-to-dot drawing in 3D

Example: "A red chair"

  1. Point-E first imagines what the chair looks like (synthetic image)
  2. Then generates 4096 points forming the chair shape
  3. Points are colored red where appropriate
  4. Result: A 3D point cloud you can view from any angle

Use cases:

  • 3D modeling: Quick prototypes for games, VR, AR
  • Visualization: Create 3D representations from descriptions
  • Dataset creation: Generate synthetic 3D training data

Technical specifications:

  • Default point count: 4096 (can generate 1024, 4096, or 16384)
  • Coordinate range: [-1, 1] normalized
  • Color: RGB values [0, 1]
  • Two-stage: Image generation + point cloud diffusion
  • Inference: ~40 steps for image, ~64 for point cloud
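
Given the normalized ranges above, a quick sanity check on a generated cloud can catch scheduler or conditioning misconfiguration early. This is a sketch assuming Tensor<float> exposes a Shape array and a [b, i, c] indexer:

```csharp
using System;

// Hypothetical validation: XYZ should lie in [-1, 1], RGB in [0, 1],
// per the normalized ranges in the specifications above.
static bool IsWellFormed(Tensor<float> cloud)
{
    int numPoints = cloud.Shape[1];
    for (int i = 0; i < numPoints; i++)
    {
        for (int c = 0; c < 3; c++)
        {
            if (Math.Abs(cloud[0, i, c]) > 1f)
                return false; // coordinate out of range
            if (cloud[0, i, c + 3] < 0f || cloud[0, i, c + 3] > 1f)
                return false; // color out of range
        }
    }
    return true;
}
```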

Constructors

PointEModel()

Initializes a new Point-E model with default parameters.

public PointEModel()

PointEModel(DiffusionModelOptions<T>?, INoiseScheduler<T>?, DiTNoisePredictor<T>?, ILatentDiffusionModel<T>?, IConditioningModule<T>?, int, bool, int?)

Initializes a new Point-E model with custom parameters.

public PointEModel(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, DiTNoisePredictor<T>? pointCloudPredictor = null, ILatentDiffusionModel<T>? imageGenerator = null, IConditioningModule<T>? conditioner = null, int defaultPointCount = 4096, bool useTwoStage = true, int? seed = null)

Parameters

options DiffusionModelOptions<T>

Configuration options.

scheduler INoiseScheduler<T>

Optional custom scheduler.

pointCloudPredictor DiTNoisePredictor<T>

Optional custom point cloud predictor.

imageGenerator ILatentDiffusionModel<T>

Optional image generator for two-stage pipeline.

conditioner IConditioningModule<T>

Optional conditioning module.

defaultPointCount int

Default number of points.

useTwoStage bool

Whether to use two-stage generation.

seed int?

Optional random seed.
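
As an example, a single-stage image-conditioned setup (skipping the synthetic-view stage) could be configured as below; all null arguments fall back to library defaults, and the specific values shown are illustrative:

```csharp
// Single-stage model: condition directly on a caller-supplied image,
// skipping the internal text-to-image stage.
var model = new PointEModel<float>(
    options: null,              // default configuration
    scheduler: null,            // default noise scheduler
    pointCloudPredictor: null,  // default DiT predictor
    imageGenerator: null,       // not needed when useTwoStage is false
    conditioner: null,
    defaultPointCount: 1024,    // coarser cloud, faster sampling
    useTwoStage: false,
    seed: 42);                  // reproducible generation
```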

Properties

Conditioner

Gets the conditioning module (optional, for conditioned generation).

public override IConditioningModule<T>? Conditioner { get; }

Property Value

IConditioningModule<T>

LatentChannels

Gets the number of latent channels.

public override int LatentChannels { get; }

Property Value

int

Remarks

Typically 4 for Stable Diffusion models.

NoisePredictor

Gets the noise predictor model (U-Net, DiT, etc.).

public override INoisePredictor<T> NoisePredictor { get; }

Property Value

INoisePredictor<T>

ParameterCount

Gets the number of parameters in the model.

public override int ParameterCount { get; }

Property Value

int

Remarks

This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.

PointCloudPredictor

Gets the point cloud predictor.

public DiTNoisePredictor<T> PointCloudPredictor { get; }

Property Value

DiTNoisePredictor<T>

SupportsMesh

Gets whether this model supports mesh generation.

public override bool SupportsMesh { get; }

Property Value

bool

SupportsNovelView

Gets whether this model supports novel view synthesis.

public override bool SupportsNovelView { get; }

Property Value

bool

SupportsPointCloud

Gets whether this model supports point cloud generation.

public override bool SupportsPointCloud { get; }

Property Value

bool

SupportsScoreDistillation

Gets whether this model supports score distillation sampling (SDS).

public override bool SupportsScoreDistillation { get; }

Property Value

bool

Remarks

SDS uses gradients from a 2D diffusion model to optimize a 3D representation. This is the technique behind DreamFusion and similar text-to-3D methods.

SupportsTexture

Gets whether this model supports texture generation.

public override bool SupportsTexture { get; }

Property Value

bool

UsesTwoStage

Gets whether this model uses two-stage generation.

public bool UsesTwoStage { get; }

Property Value

bool

VAE

Gets the VAE model used for encoding and decoding.

public override IVAEModel<T> VAE { get; }

Property Value

IVAEModel<T>

Methods

Clone()

Creates a deep copy of the model.

public override IDiffusionModel<T> Clone()

Returns

IDiffusionModel<T>

A new instance with the same parameters.

ConvertToMesh(Tensor<T>, int)

Converts point cloud to mesh using marching cubes (simplified).

public virtual (Tensor<T> Vertices, Tensor<T> Faces) ConvertToMesh(Tensor<T> pointCloud, int resolution = 64)

Parameters

pointCloud Tensor<T>

Point cloud tensor [1, numPoints, 6].

resolution int

Grid resolution for marching cubes.

Returns

(Tensor<T> Vertices, Tensor<T> Faces)

Mesh data as (vertices, faces) tuple.

Remarks

This is a simplified placeholder. Real mesh conversion would use proper point cloud to mesh algorithms like Poisson reconstruction or ball pivoting.
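
A typical flow is to generate a cloud and then mesh it at a modest grid resolution. Since this implementation is a simplified placeholder, treat the output as a preview rather than production geometry:

```csharp
var cloud = pointE.GeneratePointCloud("a ceramic mug", numPoints: 4096);

// Higher resolution gives finer surfaces but the voxel grid
// grows as resolution^3, so increase it cautiously.
var (vertices, faces) = pointE.ConvertToMesh(cloud, resolution: 64);
```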

DeepCopy()

Creates a deep copy of this object.

public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

GenerateFromImage(Tensor<T>, int?, int, int?)

Generates a point cloud from an image.

public virtual Tensor<T> GenerateFromImage(Tensor<T> image, int? numPoints = null, int numInferenceSteps = 64, int? seed = null)

Parameters

image Tensor<T>

Input image tensor [batch, channels, height, width].

numPoints int?

Number of points to generate.

numInferenceSteps int

Number of denoising steps.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud tensor [1, numPoints, 6].

Remarks

For Beginners: This creates a 3D model from a single photo:

Input: A photo of a mug from the front
Output: A 3D point cloud of the mug, viewable from all angles

How it works:

  1. The image is encoded to understand what's in it
  2. Point-E uses this encoding to guide 3D generation
  3. The diffusion process creates points that match the image

Limitations:

  • Only sees one angle, so the back of an object is "imagined"
  • Works best with centered, simple objects
  • May not capture fine details
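
Putting the steps above together (the LoadImage helper and its output layout are assumptions; check the conditioner's input requirements for your build):

```csharp
// Hypothetical LoadImage returns a [1, 3, H, W] tensor with values in [0, 1].
var photo = LoadImage("mug_front.jpg");

var mugCloud = pointE.GenerateFromImage(
    photo,
    numPoints: 4096,
    numInferenceSteps: 64,
    seed: 7);  // fix the seed so the "imagined" back is reproducible
```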

GeneratePointCloud(string, string?, int?, int, double, int?)

Generates a colored point cloud from a text prompt.

public override Tensor<T> GeneratePointCloud(string prompt, string? negativePrompt = null, int? numPoints = null, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)

Parameters

prompt string

Text description of the 3D object.

negativePrompt string

Optional negative prompt.

numPoints int?

Number of points to generate.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud tensor [1, numPoints, 6] with XYZ + RGB per point.
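
A negative prompt can steer generation away from unwanted attributes; the default guidance scale of 3 keeps outputs diverse, while higher values follow the prompt more literally:

```csharp
var chair = pointE.GeneratePointCloud(
    prompt: "a red office chair",
    negativePrompt: "wheels, armrests",  // features to avoid
    numPoints: 4096,
    numInferenceSteps: 64,
    guidanceScale: 3.0,
    seed: 123);

// chair shape: [1, 4096, 6] - XYZ + RGB per point
```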

GetParameters()

Gets the parameters that can be optimized.

public override Vector<T> GetParameters()

Returns

Vector<T>

SetParameters(Vector<T>)

Sets the model parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameter vector to set.

Remarks

This method allows direct modification of the model's internal parameters, which is useful for optimization algorithms that update parameters iteratively. If the length of parameters does not match ParameterCount, an ArgumentException is thrown.

Exceptions

ArgumentException

Thrown when the length of parameters does not match ParameterCount.
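
A common pattern for external optimizers is a get-modify-set round trip; because of the length check, the vector passed back must match ParameterCount exactly. The Length property on Vector<T> is an assumption about the API:

```csharp
var parameters = model.GetParameters();

// Example: uniform perturbation, e.g. as one step of a finite-difference check.
for (int i = 0; i < parameters.Length; i++)
    parameters[i] += 1e-4f;

model.SetParameters(parameters);  // throws ArgumentException on length mismatch
```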