Class PointEModel<T>

Namespace
AiDotNet.Diffusion.Models
Assembly
AiDotNet.dll

Point-E model for text-to-3D point cloud generation.

public class PointEModel<T> : ThreeDDiffusionModelBase<T>, ILatentDiffusionModel<T>, I3DDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
ThreeDDiffusionModelBase<T> ← PointEModel<T>
Implements
ILatentDiffusionModel<T>
I3DDiffusionModel<T>
IDiffusionModel<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Examples

// Create a Point-E model
var pointE = new PointEModel<float>();

// Generate a point cloud from text
var pointCloud = pointE.GeneratePointCloud(
    prompt: "A wooden chair",
    numPoints: 4096,
    numInferenceSteps: 64,
    guidanceScale: 3.0);

// pointCloud shape: [1, 4096, 6] - XYZ + RGB per point

// Generate from an image
var image = LoadImage("chair_photo.jpg");
var fromImage = pointE.GenerateFromImage(image, numPoints: 4096);

// Export to PLY file for viewing in 3D software
ExportToPLY(pointCloud, "chair.ply");
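
Note that LoadImage and ExportToPLY above are illustrative helpers, not part of AiDotNet. A minimal ASCII-PLY writer for a [1, numPoints, 6] cloud might look like the sketch below; it assumes Tensor<float> exposes a Shape array and a [batch, point, channel] indexer, which is an assumption about the API.

```csharp
using System.IO;

// Hypothetical helper: writes a [1, numPoints, 6] point cloud (XYZ + RGB) to ASCII PLY.
// Assumes Tensor<float> exposes Shape and a [b, i, c] indexer; adjust for the real API.
static void ExportToPLY(Tensor<float> pointCloud, string path)
{
    int numPoints = pointCloud.Shape[1];
    using var writer = new StreamWriter(path);
    writer.WriteLine("ply");
    writer.WriteLine("format ascii 1.0");
    writer.WriteLine($"element vertex {numPoints}");
    writer.WriteLine("property float x");
    writer.WriteLine("property float y");
    writer.WriteLine("property float z");
    writer.WriteLine("property uchar red");
    writer.WriteLine("property uchar green");
    writer.WriteLine("property uchar blue");
    writer.WriteLine("end_header");

    for (int i = 0; i < numPoints; i++)
    {
        float x = pointCloud[0, i, 0], y = pointCloud[0, i, 1], z = pointCloud[0, i, 2];
        // Colors are generated in [0, 1]; PLY viewers expect 8-bit channels.
        byte r = (byte)(pointCloud[0, i, 3] * 255);
        byte g = (byte)(pointCloud[0, i, 4] * 255);
        byte b = (byte)(pointCloud[0, i, 5] * 255);
        writer.WriteLine($"{x} {y} {z} {r} {g} {b}");
    }
}
```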

Remarks

Point-E is OpenAI's model for generating 3D point clouds from text descriptions. It uses a two-stage pipeline: first generating a synthetic view of the object, then generating a point cloud conditioned on that view.

For Beginners: Point-E creates 3D objects as "point clouds" - collections of colored points in 3D space that form a shape:

What is a point cloud?

  • Thousands of 3D points (X, Y, Z coordinates)
  • Each point can have a color (R, G, B)
  • Together they form the surface of an object
  • Like a very detailed dot-to-dot drawing in 3D

Example: "A red chair"

  1. Point-E first imagines what the chair looks like (synthetic image)
  2. Then generates 4096 points forming the chair shape
  3. Points are colored red where appropriate
  4. Result: A 3D point cloud you can view from any angle

Use cases:

  • 3D modeling: Quick prototypes for games, VR, AR
  • Visualization: Create 3D representations from descriptions
  • Dataset creation: Generate synthetic 3D training data

Technical specifications:

  • Default point count: 4096 (can generate 1024, 4096, or 16384)
  • Coordinate range: [-1, 1] normalized
  • Color: RGB values [0, 1]
  • Two-stage: Image generation + point cloud diffusion
  • Inference: ~40 steps for image, ~64 for point cloud
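
Given the normalized ranges above, a quick sanity check on a generated cloud can catch scheduler or conditioning misconfiguration early. This is a sketch assuming Tensor<float> exposes a Shape array and a [b, i, c] indexer:

```csharp
using System;

// Hypothetical validation: XYZ should lie in [-1, 1], RGB in [0, 1],
// per the normalized ranges in the specifications above.
static bool IsWellFormed(Tensor<float> cloud)
{
    int numPoints = cloud.Shape[1];
    for (int i = 0; i < numPoints; i++)
    {
        for (int c = 0; c < 3; c++)
        {
            if (Math.Abs(cloud[0, i, c]) > 1f)
                return false; // coordinate out of range
            if (cloud[0, i, c + 3] < 0f || cloud[0, i, c + 3] > 1f)
                return false; // color out of range
        }
    }
    return true;
}
```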

Constructors

PointEModel()

Initializes a new Point-E model with default parameters.

public PointEModel()

PointEModel(DiffusionModelOptions<T>?, INoiseScheduler<T>?, DiTNoisePredictor<T>?, ILatentDiffusionModel<T>?, IConditioningModule<T>?, int, bool, int?)

Initializes a new Point-E model with custom parameters.

public PointEModel(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, DiTNoisePredictor<T>? pointCloudPredictor = null, ILatentDiffusionModel<T>? imageGenerator = null, IConditioningModule<T>? conditioner = null, int defaultPointCount = 4096, bool useTwoStage = true, int? seed = null)

Parameters

options DiffusionModelOptions<T>

Configuration options.

scheduler INoiseScheduler<T>

Optional custom scheduler.

pointCloudPredictor DiTNoisePredictor<T>

Optional custom point cloud predictor.

imageGenerator ILatentDiffusionModel<T>

Optional image generator for two-stage pipeline.

conditioner IConditioningModule<T>

Optional conditioning module.

defaultPointCount int

Default number of points.

useTwoStage bool

Whether to use two-stage generation.

seed int?

Optional random seed.
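
As an example, a single-stage image-conditioned setup (skipping the synthetic-view stage) could be configured as below; all null arguments fall back to library defaults, and the specific values shown are illustrative:

```csharp
// Single-stage model: condition directly on a caller-supplied image,
// skipping the internal text-to-image stage.
var model = new PointEModel<float>(
    options: null,              // default configuration
    scheduler: null,            // default noise scheduler
    pointCloudPredictor: null,  // default DiT predictor
    imageGenerator: null,       // not needed when useTwoStage is false
    conditioner: null,
    defaultPointCount: 1024,    // coarser cloud, faster sampling
    useTwoStage: false,
    seed: 42);                  // reproducible generation
```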

Properties

Conditioner

Gets the conditioning module (optional, for conditioned generation).

public override IConditioningModule<T>? Conditioner { get; }

Property Value

IConditioningModule<T>

LatentChannels

Gets the number of latent channels.

public override int LatentChannels { get; }

Property Value

int

Remarks

Typically 4 for Stable Diffusion models.

NoisePredictor

Gets the noise predictor model (U-Net, DiT, etc.).

public override INoisePredictor<T> NoisePredictor { get; }

Property Value

INoisePredictor<T>

ParameterCount

Gets the number of parameters in the model.

public override int ParameterCount { get; }

Property Value

int

Remarks

This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.

PointCloudPredictor

Gets the point cloud predictor.

public DiTNoisePredictor<T> PointCloudPredictor { get; }

Property Value

DiTNoisePredictor<T>

SupportsMesh

Gets whether this model supports mesh generation.

public override bool SupportsMesh { get; }

Property Value

bool

SupportsNovelView

Gets whether this model supports novel view synthesis.

public override bool SupportsNovelView { get; }

Property Value

bool

SupportsPointCloud

Gets whether this model supports point cloud generation.

public override bool SupportsPointCloud { get; }

Property Value

bool

SupportsScoreDistillation

Gets whether this model supports score distillation sampling (SDS).

public override bool SupportsScoreDistillation { get; }

Property Value

bool

Remarks

SDS uses gradients from a 2D diffusion model to optimize a 3D representation. This is the technique behind DreamFusion and similar text-to-3D methods.

SupportsTexture

Gets whether this model supports texture generation.

public override bool SupportsTexture { get; }

Property Value

bool

UsesTwoStage

Gets whether this model uses two-stage generation.

public bool UsesTwoStage { get; }

Property Value

bool

VAE

Gets the VAE model used for encoding and decoding.

public override IVAEModel<T> VAE { get; }

Property Value

IVAEModel<T>

Methods

Clone()

Creates a deep copy of the model.

public override IDiffusionModel<T> Clone()

Returns

IDiffusionModel<T>

A new instance with the same parameters.

ConvertToMesh(Tensor<T>, int)

Converts point cloud to mesh using marching cubes (simplified).

public virtual (Tensor<T> Vertices, Tensor<T> Faces) ConvertToMesh(Tensor<T> pointCloud, int resolution = 64)

Parameters

pointCloud Tensor<T>

Point cloud tensor [1, numPoints, 6].

resolution int

Grid resolution for marching cubes.

Returns

(Tensor<T> Vertices, Tensor<T> Faces)

Mesh data as (vertices, faces) tuple.

Remarks

This is a simplified placeholder. Real mesh conversion would use proper point cloud to mesh algorithms like Poisson reconstruction or ball pivoting.
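
A typical flow is to generate a cloud and then mesh it at a modest grid resolution. Since this implementation is a simplified placeholder, treat the output as a preview rather than production geometry:

```csharp
var cloud = pointE.GeneratePointCloud("a ceramic mug", numPoints: 4096);

// Higher resolution gives finer surfaces but the voxel grid
// grows as resolution^3, so increase it cautiously.
var (vertices, faces) = pointE.ConvertToMesh(cloud, resolution: 64);
```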

DeepCopy()

Creates a deep copy of this object.

public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

GenerateFromImage(Tensor<T>, int?, int, int?)

Generates a point cloud from an image.

public virtual Tensor<T> GenerateFromImage(Tensor<T> image, int? numPoints = null, int numInferenceSteps = 64, int? seed = null)

Parameters

image Tensor<T>

Input image tensor [batch, channels, height, width].

numPoints int?

Number of points to generate.

numInferenceSteps int

Number of denoising steps.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud tensor [1, numPoints, 6].

Remarks

For Beginners: This creates a 3D model from a single photo:

Input: A photo of a mug from the front
Output: A 3D point cloud of the mug, viewable from all angles

How it works:

  1. The image is encoded to understand what's in it
  2. Point-E uses this encoding to guide 3D generation
  3. The diffusion process creates points that match the image

Limitations:

  • Only sees one angle, so the back of an object is "imagined"
  • Works best with centered, simple objects
  • May not capture fine details
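
Putting the steps above together (the LoadImage helper and its output layout are assumptions; check the conditioner's input requirements for your build):

```csharp
// Hypothetical LoadImage returns a [1, 3, H, W] tensor with values in [0, 1].
var photo = LoadImage("mug_front.jpg");

var mugCloud = pointE.GenerateFromImage(
    photo,
    numPoints: 4096,
    numInferenceSteps: 64,
    seed: 7);  // fix the seed so the "imagined" back is reproducible
```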

GeneratePointCloud(string, string?, int?, int, double, int?)

Generates a colored point cloud from a text prompt.

public override Tensor<T> GeneratePointCloud(string prompt, string? negativePrompt = null, int? numPoints = null, int numInferenceSteps = 64, double guidanceScale = 3, int? seed = null)

Parameters

prompt string

Text description of the 3D object.

negativePrompt string

Optional negative prompt.

numPoints int?

Number of points to generate.

numInferenceSteps int

Number of denoising steps.

guidanceScale double

Classifier-free guidance scale.

seed int?

Optional random seed.

Returns

Tensor<T>

Point cloud tensor [1, numPoints, 6] with XYZ + RGB per point.
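
A negative prompt can steer generation away from unwanted attributes; the default guidance scale of 3 keeps outputs diverse, while higher values follow the prompt more literally:

```csharp
var chair = pointE.GeneratePointCloud(
    prompt: "a red office chair",
    negativePrompt: "wheels, armrests",  // features to avoid
    numPoints: 4096,
    numInferenceSteps: 64,
    guidanceScale: 3.0,
    seed: 123);

// chair shape: [1, 4096, 6] - XYZ + RGB per point
```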

GetParameters()

Gets the parameters that can be optimized.

public override Vector<T> GetParameters()

Returns

Vector<T>

SetParameters(Vector<T>)

Sets the model parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameter vector to set.

Remarks

This method allows direct modification of the model's internal parameters, which is useful for optimization algorithms that update parameters iteratively. If the length of parameters does not match ParameterCount, an ArgumentException is thrown.

Exceptions

ArgumentException

Thrown when the length of parameters does not match ParameterCount.
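
A common pattern for external optimizers is a get-modify-set round trip; because of the length check, the vector passed back must match ParameterCount exactly. The Length property on Vector<T> is an assumption about the API:

```csharp
var parameters = model.GetParameters();

// Example: uniform perturbation, e.g. as one step of a finite-difference check.
for (int i = 0; i < parameters.Length; i++)
    parameters[i] += 1e-4f;

model.SetParameters(parameters);  // throws ArgumentException on length mismatch
```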