Class Zero123Model<T>

Namespace: AiDotNet.Diffusion.Models

Assembly: AiDotNet.dll

Zero-1-to-3 model for novel view synthesis from a single image.

public class Zero123Model<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T: The numeric type used for calculations.

Inheritance: object

DiffusionModelBase<T>

LatentDiffusionModelBase<T>

Zero123Model<T>

Implements: ILatentDiffusionModel<T>

IDiffusionModel<T>

IFullModel<T, Tensor<T>, Tensor<T>>

IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>

IModelSerializer

ICheckpointableModel

IParameterizable<T, Tensor<T>, Tensor<T>>

IFeatureAware

IFeatureImportance<T>

ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>

IGradientComputable<T, Tensor<T>, Tensor<T>>

IJitCompilable<T>

Inherited Members: LatentDiffusionModelBase<T>.GuidanceScale

LatentDiffusionModelBase<T>.SupportsNegativePrompt

LatentDiffusionModelBase<T>.SupportsInpainting

LatentDiffusionModelBase<T>.EncodeToLatent(Tensor<T>, bool)

LatentDiffusionModelBase<T>.DecodeFromLatent(Tensor<T>)

LatentDiffusionModelBase<T>.GenerateFromText(string, string, int, int, int, double?, int?)

LatentDiffusionModelBase<T>.ImageToImage(Tensor<T>, string, string, double, int, double?, int?)

LatentDiffusionModelBase<T>.Inpaint(Tensor<T>, Tensor<T>, string, string, int, double?, int?)

LatentDiffusionModelBase<T>.SetGuidanceScale(double)

LatentDiffusionModelBase<T>.PredictNoise(Tensor<T>, int)

LatentDiffusionModelBase<T>.Generate(int[], int, int?)

LatentDiffusionModelBase<T>.ApplyGuidance(Tensor<T>, Tensor<T>, double)

LatentDiffusionModelBase<T>.SampleNoiseTensor(int[], Random)

LatentDiffusionModelBase<T>.ResizeMaskToLatent(Tensor<T>, int[])

LatentDiffusionModelBase<T>.BlendLatentsWithMask(Tensor<T>, Tensor<T>, Tensor<T>, int)

DiffusionModelBase<T>.NumOps

DiffusionModelBase<T>.RandomGenerator

DiffusionModelBase<T>.LossFunction

DiffusionModelBase<T>.LearningRate

DiffusionModelBase<T>.Scheduler

DiffusionModelBase<T>.DefaultLossFunction

DiffusionModelBase<T>.SupportsJitCompilation

DiffusionModelBase<T>.ComputeLoss(Tensor<T>, Tensor<T>, int[])

DiffusionModelBase<T>.Train(Tensor<T>, Tensor<T>)

DiffusionModelBase<T>.Predict(Tensor<T>)

DiffusionModelBase<T>.GetModelMetadata()

DiffusionModelBase<T>.WithParameters(Vector<T>)

DiffusionModelBase<T>.Serialize()

DiffusionModelBase<T>.Deserialize(byte[])

DiffusionModelBase<T>.SaveModel(string)

DiffusionModelBase<T>.LoadModel(string)

DiffusionModelBase<T>.SaveState(Stream)

DiffusionModelBase<T>.LoadState(Stream)

DiffusionModelBase<T>.GetActiveFeatureIndices()

DiffusionModelBase<T>.SetActiveFeatureIndices(IEnumerable<int>)

DiffusionModelBase<T>.IsFeatureUsed(int)

DiffusionModelBase<T>.GetFeatureImportance()

DiffusionModelBase<T>.ComputeGradients(Tensor<T>, Tensor<T>, ILossFunction<T>)

DiffusionModelBase<T>.ApplyGradients(Vector<T>, T)

DiffusionModelBase<T>.ExportComputationGraph(List<ComputationNode<T>>)

DiffusionModelBase<T>.SampleNoise(int, Random)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Extension Methods: DistributedExtensions.AsDistributedForHighBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributedForLowBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, IShardingConfiguration<T>)

Examples

// Create a Zero123 model
var zero123 = new Zero123Model<float>();

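// Generate a novel view.
// LoadImage is not a member of this class; it stands for your own helper that returns a Tensor<float>.
var inputImage = LoadImage("object.png");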
var novelView = zero123.GenerateNovelView(
    inputImage: inputImage,
    polarAngle: 30.0,    // Rotate up 30 degrees
    azimuthAngle: 45.0,  // Rotate right 45 degrees
    radius: 1.0);        // Keep same distance

// Generate multiple views for 3D reconstruction
var views = zero123.Generate360Views(inputImage, numViews: 8);

Remarks

Zero-1-to-3 (Zero123) generates new viewpoints of an object from just a single input image. It uses camera pose conditioning to control the viewpoint change, enabling 3D-aware image generation without explicit 3D reconstruction.

For Beginners: Zero123 is like having a magical camera that can show you what an object looks like from different angles, even though you only gave it one photograph.

What it does:

Takes a single image of an object
Generates images of that same object from different viewpoints
Works with any object: cars, furniture, animals, etc.

Input parameters:

Image: The original photo of the object
Camera rotation: How much to rotate the view (polar/azimuth angles)
Distance change: How much closer to or farther from the object the camera moves (radius)

Use cases:

E-commerce: Show products from multiple angles
3D reconstruction: Generate training data for 3D models
AR/VR: Create object previews from any angle
Game development: Generate sprite variations

Technical details:

Fine-tuned from Stable Diffusion
Uses CLIP image encoder for conditioning
Camera pose embedding via sinusoidal encoding
Supports arbitrary viewpoint changes
Can be used iteratively for 360° reconstruction
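The "camera pose embedding via sinusoidal encoding" item can be illustrated with a small standalone sketch. This page does not document the embedding layout that Zero123Model<T> actually uses, so the class name CameraPoseEncoding, the numFrequencies parameter, and the power-of-two frequency schedule below are all assumptions made for illustration, not the library's implementation.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class CameraPoseEncoding
{
    // Encodes (polar, azimuth, radius) as sin/cos pairs at frequencies 1, 2, 4, ...
    // Angles are given in degrees and converted to radians before encoding.
    public static double[] Encode(double polarDeg, double azimuthDeg, double radius, int numFrequencies = 4)
    {
        double[] components =
        {
            polarDeg * Math.PI / 180.0,
            azimuthDeg * Math.PI / 180.0,
            radius
        };

        // Each component contributes one sin and one cos value per frequency.
        var embedding = new double[components.Length * numFrequencies * 2];
        int k = 0;
        foreach (double value in components)
        {
            for (int f = 0; f < numFrequencies; f++)
            {
                double frequency = Math.Pow(2.0, f); // assumed schedule: 1, 2, 4, 8, ...
                embedding[k++] = Math.Sin(frequency * value);
                embedding[k++] = Math.Cos(frequency * value);
            }
        }
        return embedding;
    }
}
```

With the default of four frequencies, the three pose components produce a 24-element vector. Nearby poses map to nearby vectors, which is the usual motivation for this style of encoding.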

Reference: Liu et al., "Zero-1-to-3: Zero-shot One Image to 3D Object", 2023

Constructors

Zero123Model()

Initializes a new instance of Zero123Model with default parameters.

public Zero123Model()

Zero123Model(DiffusionModelOptions<T>?, INoiseScheduler<T>?, UNetNoisePredictor<T>?, StandardVAE<T>?, int, int?)

Initializes a new instance of Zero123Model with custom parameters.

public Zero123Model(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, UNetNoisePredictor<T>? unet = null, StandardVAE<T>? vae = null, int imageSize = 256, int? seed = null)

Parameters

options DiffusionModelOptions<T>: Configuration options.
scheduler INoiseScheduler<T>: Optional custom scheduler.
unet UNetNoisePredictor<T>: Optional custom U-Net.
vae StandardVAE<T>: Optional custom VAE.
imageSize int: Image size for generation.
seed int?: Optional random seed.

Properties

Conditioner

Gets the conditioning module (optional, for conditioned generation).

public override IConditioningModule<T>? Conditioner { get; }

Property Value

IConditioningModule<T>

LatentChannels

Gets the number of latent channels.

public override int LatentChannels { get; }

Property Value

int

Remarks

Typically 4 for Stable Diffusion models.

NoisePredictor

Gets the noise predictor model (U-Net, DiT, etc.).

public override INoisePredictor<T> NoisePredictor { get; }

Property Value

INoisePredictor<T>

ParameterCount

Gets the number of parameters in the model.

public override int ParameterCount { get; }

Property Value

int

Remarks

This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.

VAE

Gets the VAE model used for encoding and decoding.

public override IVAEModel<T> VAE { get; }

Property Value

IVAEModel<T>

Methods

Clone()

Creates a deep copy of the model.

public override IDiffusionModel<T> Clone()

Returns

IDiffusionModel<T>: A new instance with the same parameters.

DeepCopy()

Creates a deep copy of this object.

public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

Generate360Views(Tensor<T>, int, double, int, double?, int?)

Generates multiple views around an object (360° views).

public virtual List<Tensor<T>> Generate360Views(Tensor<T> inputImage, int numViews = 8, double polarAngle = 0, int numInferenceSteps = 50, double? guidanceScale = null, int? seed = null)

Parameters

inputImage Tensor<T>: The input image.
numViews int: Number of views to generate.
polarAngle double: Fixed polar angle for all views.
numInferenceSteps int: Denoising steps per view.
guidanceScale double?: Guidance scale.
seed int?: Random seed.

Returns

List<Tensor<T>>: List of generated views.
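This page does not state how the views are distributed around the object. A natural reading is that the azimuth angles are spaced evenly over a full turn while polarAngle stays fixed. The sketch below computes such angles under that assumption; ViewAngles and EvenlySpacedAzimuths are hypothetical names, not AiDotNet members.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class ViewAngles
{
    // Returns numViews azimuth angles in degrees, evenly spaced over [0, 360).
    public static double[] EvenlySpacedAzimuths(int numViews)
    {
        if (numViews <= 0)
            throw new ArgumentOutOfRangeException(nameof(numViews), "numViews must be positive.");

        var angles = new double[numViews];
        double step = 360.0 / numViews;
        for (int i = 0; i < numViews; i++)
            angles[i] = i * step; // 360 itself is excluded because it repeats the 0 degree view
        return angles;
    }
}
```

For numViews = 8, the step is 45 degrees and the last angle is 315 degrees.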

GenerateMultipleViews(Tensor<T>, double[], double[], int, double?, int?)

Generates views at the given azimuth and polar (elevation) angles.

public virtual List<Tensor<T>> GenerateMultipleViews(Tensor<T> inputImage, double[] azimuthAngles, double[] polarAngles, int numInferenceSteps = 50, double? guidanceScale = null, int? seed = null)

Parameters

inputImage Tensor<T>: The input image.
azimuthAngles double[]: List of azimuth angles.
polarAngles double[]: List of polar angles.
numInferenceSteps int: Denoising steps.
guidanceScale double?: Guidance scale.
seed int?: Random seed.

Returns

List<Tensor<T>>: List of generated views.

GenerateNovelView(Tensor<T>, double, double, double, int, double?, int?)

Generates a novel view of an object.

public virtual Tensor<T> GenerateNovelView(Tensor<T> inputImage, double polarAngle, double azimuthAngle, double radius = 1, int numInferenceSteps = 50, double? guidanceScale = null, int? seed = null)

Parameters

inputImage Tensor<T>: The input image of the object.
polarAngle double: Polar angle change in degrees (vertical rotation).
azimuthAngle double: Azimuth angle change in degrees (horizontal rotation).
radius double: Relative radius change (1.0 = same distance).
numInferenceSteps int: Number of denoising steps.
guidanceScale double?: Classifier-free guidance scale.
seed int?: Optional random seed.

Returns

Tensor<T>: The generated novel view image.

GetParameters()

Gets the parameters that can be optimized.

public override Vector<T> GetParameters()

Returns

Vector<T>

SetParameters(Vector<T>)

Sets the model parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>: The parameter vector to set.

Remarks

This method directly replaces the model's internal parameters, which is useful for optimization algorithms that update parameters iteratively. If the length of parameters does not match ParameterCount, an ArgumentException is thrown.

Exceptions

ArgumentException: Thrown when the length of parameters does not match ParameterCount.
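The length contract described above can be sketched with a small stand-in class. ParameterStore is hypothetical and uses double[] in place of Vector<T>. It only illustrates the documented check against ParameterCount and is not the AiDotNet implementation.

```csharp
using System;

// Hypothetical stand-in that mirrors the documented GetParameters/SetParameters contract.
public sealed class ParameterStore
{
    private double[] _parameters;

    public ParameterStore(int parameterCount)
    {
        _parameters = new double[parameterCount];
    }

    public int ParameterCount => _parameters.Length;

    // Returns a copy so callers cannot mutate internal state by accident.
    public double[] GetParameters() => (double[])_parameters.Clone();

    public void SetParameters(double[] parameters)
    {
        if (parameters is null)
            throw new ArgumentNullException(nameof(parameters));

        // Reject vectors whose length does not match ParameterCount.
        if (parameters.Length != ParameterCount)
            throw new ArgumentException(
                $"Expected {ParameterCount} parameters but received {parameters.Length}.",
                nameof(parameters));

        _parameters = (double[])parameters.Clone();
    }
}
```

An optimizer written against this contract would typically call GetParameters(), compute an updated vector of the same length, and pass it back through SetParameters().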
