Class DallE3Model<T>

Namespace: AiDotNet.Diffusion.Models

Assembly: AiDotNet.dll

DALL-E 3 style text-to-image generation model with advanced prompt understanding and high-fidelity image generation capabilities.

public class DallE3Model<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IDallE3Model<T>

Type Parameters

T: The numeric type used for calculations.

Inheritance: object

DiffusionModelBase<T>

LatentDiffusionModelBase<T>

DallE3Model<T>

Implements: ILatentDiffusionModel<T>

IDiffusionModel<T>

IFullModel<T, Tensor<T>, Tensor<T>>

IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>

IModelSerializer

ICheckpointableModel

IParameterizable<T, Tensor<T>, Tensor<T>>

IFeatureAware

IFeatureImportance<T>

ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>

IGradientComputable<T, Tensor<T>, Tensor<T>>

IJitCompilable<T>

IDallE3Model<T>

Inherited Members: LatentDiffusionModelBase<T>.GuidanceScale

LatentDiffusionModelBase<T>.SupportsNegativePrompt

LatentDiffusionModelBase<T>.SupportsInpainting

LatentDiffusionModelBase<T>.EncodeToLatent(Tensor<T>, bool)

LatentDiffusionModelBase<T>.DecodeFromLatent(Tensor<T>)

LatentDiffusionModelBase<T>.GenerateFromText(string, string, int, int, int, double?, int?)

LatentDiffusionModelBase<T>.ImageToImage(Tensor<T>, string, string, double, int, double?, int?)

LatentDiffusionModelBase<T>.Inpaint(Tensor<T>, Tensor<T>, string, string, int, double?, int?)

LatentDiffusionModelBase<T>.SetGuidanceScale(double)

LatentDiffusionModelBase<T>.PredictNoise(Tensor<T>, int)

LatentDiffusionModelBase<T>.Generate(int[], int, int?)

LatentDiffusionModelBase<T>.ApplyGuidance(Tensor<T>, Tensor<T>, double)

LatentDiffusionModelBase<T>.SampleNoiseTensor(int[], Random)

LatentDiffusionModelBase<T>.ResizeMaskToLatent(Tensor<T>, int[])

LatentDiffusionModelBase<T>.BlendLatentsWithMask(Tensor<T>, Tensor<T>, Tensor<T>, int)

DiffusionModelBase<T>.NumOps

DiffusionModelBase<T>.RandomGenerator

DiffusionModelBase<T>.LossFunction

DiffusionModelBase<T>.LearningRate

DiffusionModelBase<T>.Scheduler

DiffusionModelBase<T>.DefaultLossFunction

DiffusionModelBase<T>.SupportsJitCompilation

DiffusionModelBase<T>.ComputeLoss(Tensor<T>, Tensor<T>, int[])

DiffusionModelBase<T>.Train(Tensor<T>, Tensor<T>)

DiffusionModelBase<T>.Predict(Tensor<T>)

DiffusionModelBase<T>.WithParameters(Vector<T>)

DiffusionModelBase<T>.Serialize()

DiffusionModelBase<T>.Deserialize(byte[])

DiffusionModelBase<T>.SaveModel(string)

DiffusionModelBase<T>.LoadModel(string)

DiffusionModelBase<T>.SaveState(Stream)

DiffusionModelBase<T>.LoadState(Stream)

DiffusionModelBase<T>.GetActiveFeatureIndices()

DiffusionModelBase<T>.SetActiveFeatureIndices(IEnumerable<int>)

DiffusionModelBase<T>.IsFeatureUsed(int)

DiffusionModelBase<T>.GetFeatureImportance()

DiffusionModelBase<T>.ComputeGradients(Tensor<T>, Tensor<T>, ILossFunction<T>)

DiffusionModelBase<T>.ApplyGradients(Vector<T>, T)

DiffusionModelBase<T>.ExportComputationGraph(List<ComputationNode<T>>)

DiffusionModelBase<T>.SampleNoise(int, Random)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Extension Methods: DistributedExtensions.AsDistributedForHighBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributedForLowBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, IShardingConfiguration<T>)

Remarks

This implementation provides DALL-E 3 style capabilities including prompt expansion, text rendering, style control, and high-quality image generation at multiple sizes.

Constructors

DallE3Model(DiffusionModelOptions<T>?, INoiseScheduler<T>?, IConditioningModule<T>?, int?)

Initializes a new instance of the DallE3Model class.

public DallE3Model(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, IConditioningModule<T>? conditioner = null, int? seed = null)

Parameters

options DiffusionModelOptions<T>?: Optional configuration options.
scheduler INoiseScheduler<T>?: Optional noise scheduler.
conditioner IConditioningModule<T>?: Optional conditioning module for text encoding.
seed int?: Optional seed for reproducibility.

Properties

Conditioner

Gets the conditioning module (optional, for conditioned generation).

public override IConditioningModule<T>? Conditioner { get; }

Property Value

IConditioningModule<T>

LatentChannels

Gets the number of latent channels.

public override int LatentChannels { get; }

Property Value

int

Remarks

Typically 4 for Stable Diffusion models.

MaxPromptLength

Gets the maximum prompt length in characters.

public int MaxPromptLength { get; }

Property Value

int

NoisePredictor

Gets the noise predictor model (U-Net, DiT, etc.).

public override INoisePredictor<T> NoisePredictor { get; }

Property Value

INoisePredictor<T>

ParameterCount

Gets the number of parameters in the model.

public override int ParameterCount { get; }

Property Value

int

Remarks

This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.

SupportedSizes

Gets the supported image sizes.

public IReadOnlyList<DallE3ImageSize> SupportedSizes { get; }

Property Value

IReadOnlyList<DallE3ImageSize>

SupportsEditing

Gets whether the model supports image editing (inpainting).

public bool SupportsEditing { get; }

Property Value

bool

SupportsVariations

Gets whether the model supports image variations.

public bool SupportsVariations { get; }

Property Value

bool

VAE

Gets the VAE model used for encoding and decoding.

public override IVAEModel<T> VAE { get; }

Property Value

IVAEModel<T>

Methods

CheckPromptSafety(string)

Checks if a prompt is likely to be rejected for safety reasons.

public (bool IsSafe, IEnumerable<string> FlaggedCategories) CheckPromptSafety(string prompt)

Parameters

prompt string: Prompt to check.

Returns

(bool IsSafe, IEnumerable<string> FlaggedCategories): Whether the prompt is safe and any flagged categories.

Clone()

Creates a deep copy of the model.

public override IDiffusionModel<T> Clone()

Returns

IDiffusionModel<T>: A new instance with the same parameters.

CreateVariations(Tensor<T>, int, double, DallE3ImageSize)

Generates variations of an existing image.

public IEnumerable<Tensor<T>> CreateVariations(Tensor<T> image, int count = 4, double variationStrength = 0.5, DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

image Tensor<T>: Source image to create variations of.
count int: Number of variations to generate.
variationStrength double: How different from original (0-1).
size DallE3ImageSize: Output image size.

Returns

IEnumerable<Tensor<T>>: Collection of image variations.

DeepCopy()

Creates a deep copy of this object.

public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

Edit(Tensor<T>, Tensor<T>, string, DallE3ImageSize)

Edits an existing image based on a prompt and mask.

public Tensor<T> Edit(Tensor<T> image, Tensor<T> mask, string prompt, DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

image Tensor<T>: Original image to edit.
mask Tensor<T>: Mask indicating areas to edit (white = edit, black = keep).
prompt string: Description of what to generate in masked areas.
size DallE3ImageSize: Output image size.

Returns

Tensor<T>: Edited image.

Remarks

For Beginners: Change specific parts of an image!

The mask tells the model where to make changes:

White (255) areas will be regenerated based on prompt
Black (0) areas will be preserved from the original
Gray areas blend between original and generated
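The mask rule above can be read as a per-pixel weighted average. The following is a minimal sketch of that reading, assuming a linear blend on 8-bit mask values. The class and method names are hypothetical and are not part of AiDotNet. This page does not say how Edit blends internally; the inherited ResizeMaskToLatent and BlendLatentsWithMask members suggest the library may blend in latent space rather than on pixels.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: illustrates the white/black/gray mask rule.
public static class MaskBlendSketch
{
    // Blends one channel value using an 8-bit mask value.
    // mask = 0 keeps the original, mask = 255 uses the generated value,
    // and values in between mix the two linearly (an assumption of this sketch).
    public static double BlendPixel(double original, double generated, byte mask)
    {
        double weight = mask / 255.0;
        return weight * generated + (1.0 - weight) * original;
    }

    // Applies BlendPixel element by element to flat buffers of equal length.
    public static double[] Blend(double[] original, double[] generated, byte[] mask)
    {
        if (original.Length != generated.Length || original.Length != mask.Length)
            throw new ArgumentException("All buffers must have the same length.");

        var result = new double[original.Length];
        for (int i = 0; i < result.Length; i++)
            result[i] = BlendPixel(original[i], generated[i], mask[i]);
        return result;
    }
}
```

With this rule, a mask value of 51 (20% of 255) would give a pixel that is 20% generated content and 80% original.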

EstimateQuality(string)

Estimates the generation quality before actually generating.

public (T PredictedQuality, IEnumerable<string> Suggestions) EstimateQuality(string prompt)

Parameters

prompt string: Prompt to evaluate.

Returns

(T PredictedQuality, IEnumerable<string> Suggestions): Predicted quality score and improvement suggestions.

ExpandPrompt(string, DallE3Style)

Expands a simple prompt into a more detailed description.

public string ExpandPrompt(string simplePrompt, DallE3Style style = DallE3Style.Vivid)

Parameters

simplePrompt string: Brief description.
style DallE3Style: Desired style for expansion.

Returns

string: Expanded, detailed prompt.

Generate(string, DallE3ImageSize, DallE3Quality, DallE3Style, int?)

Generates an image from a text prompt.

public Tensor<T> Generate(string prompt, DallE3ImageSize size = DallE3ImageSize.Square1024, DallE3Quality quality = DallE3Quality.Standard, DallE3Style style = DallE3Style.Vivid, int? seed = null)

Parameters

prompt string: Text description of the desired image.
size DallE3ImageSize: Output image size.
quality DallE3Quality: Quality setting.
style DallE3Style: Style setting.
seed int?: Optional seed for reproducibility.

Returns

Tensor<T>: Generated image tensor [channels, height, width].

Remarks

For Beginners: The main function - describe what you want!

Tips for good prompts:

Be specific about subject, style, and composition
Include lighting and mood descriptions
Mention artistic style if desired (e.g., "oil painting", "digital art")
Describe spatial relationships clearly
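One way to apply these tips consistently is to assemble the prompt from named parts before passing it to Generate. The sketch below assumes a simple comma-separated layout. PromptSketch and BuildPrompt are hypothetical names used only for illustration, and nothing on this page says that the model prefers this particular format.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet: joins prompt parts into one description.
public static class PromptSketch
{
    // Builds "subject, style, lighting, composition", skipping any part that is missing.
    public static string BuildPrompt(
        string subject,
        string? artisticStyle = null,
        string? lighting = null,
        string? composition = null)
    {
        if (string.IsNullOrWhiteSpace(subject))
            throw new ArgumentException("A subject is required.", nameof(subject));

        var parts = new List<string> { subject.Trim() };
        foreach (var part in new[] { artisticStyle, lighting, composition })
        {
            if (!string.IsNullOrWhiteSpace(part))
                parts.Add(part.Trim());
        }
        return string.Join(", ", parts);
    }
}
```

The resulting string would then be supplied as the prompt argument of Generate, with size, quality, style, and seed left at their defaults or set explicitly.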

GenerateConsistentSet(string, IEnumerable<string>, int, DallE3ImageSize)

Generates a consistent set of images (same character/scene, different poses/angles).

public IEnumerable<Tensor<T>> GenerateConsistentSet(string basePrompt, IEnumerable<string> variations, int consistencySeed, DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

basePrompt string: Base description of the subject.
variations IEnumerable<string>: List of variation descriptions (poses, angles, etc.).
consistencySeed int: Seed for maintaining consistency.
size DallE3ImageSize: Output image size.

Returns

IEnumerable<Tensor<T>>: Collection of consistent images.

GenerateForUseCase(string, string, DallE3ImageSize)

Generates an image optimized for a specific use case.

public Tensor<T> GenerateForUseCase(string prompt, string useCase, DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

prompt string: Image description.
useCase string: Use case: "social_media", "product_photo", "illustration", "concept_art", "stock_photo".
size DallE3ImageSize: Output image size.

Returns

Tensor<T>: Generated image optimized for use case.

GenerateMultiple(string, int, DallE3ImageSize, DallE3Quality, DallE3Style)

Generates multiple images from a text prompt.

public IEnumerable<Tensor<T>> GenerateMultiple(string prompt, int count = 4, DallE3ImageSize size = DallE3ImageSize.Square1024, DallE3Quality quality = DallE3Quality.Standard, DallE3Style style = DallE3Style.Vivid)

Parameters

prompt string: Text description of the desired images.
count int: Number of images to generate (1-4).
size DallE3ImageSize: Output image size.
quality DallE3Quality: Quality setting.
style DallE3Style: Style setting.

Returns

IEnumerable<Tensor<T>>: Collection of generated image tensors.

GenerateTileable(string, DallE3ImageSize)

Generates a seamlessly tileable image.

public Tensor<T> GenerateTileable(string prompt, DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

prompt string: Pattern or texture description.
size DallE3ImageSize: Output image size.

Returns

Tensor<T>: Tileable image.

Remarks

Useful for creating textures, wallpapers, and backgrounds that can be repeated without visible seams.

GenerateWithComposition(string, IEnumerable<(string Element, string Position, double Prominence)>, DallE3ImageSize)

Generates an image with controlled composition.

public Tensor<T> GenerateWithComposition(string prompt, IEnumerable<(string Element, string Position, double Prominence)> compositionGuide, DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

prompt string: Overall description.
compositionGuide IEnumerable<(string Element, string Position, double Prominence)>: Composition elements with positions.
size DallE3ImageSize: Output image size.

Returns

Tensor<T>: Generated image following composition guide.

Remarks

For Beginners: Control where things appear in your image!

Composition guide format: [("subject", "center", 0.5), ("background", "back", 0.2), ("accent", "bottom-right", 0.3)]
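In C#, the guide format shown above maps onto a collection of named tuples that matches the compositionGuide parameter type. The sketch below builds that collection and prints a readable summary. CompositionSketch and its methods are hypothetical helpers, not library members. The prominence values in the example happen to sum to 1.0, but this page does not state whether that is required.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of AiDotNet: builds and summarizes a composition guide.
public static class CompositionSketch
{
    // Returns the three-element guide from the remarks as a list of named tuples.
    public static List<(string Element, string Position, double Prominence)> BuildGuide()
    {
        return new List<(string Element, string Position, double Prominence)>
        {
            ("subject", "center", 0.5),
            ("background", "back", 0.2),
            ("accent", "bottom-right", 0.3),
        };
    }

    // Lists the elements from most to least prominent, for logging or debugging.
    public static string Describe(
        IEnumerable<(string Element, string Position, double Prominence)> guide)
    {
        return string.Join("; ", guide
            .OrderByDescending(g => g.Prominence)
            .Select(g => g.Element + " at " + g.Position + " ("
                + g.Prominence.ToString("0.0", CultureInfo.InvariantCulture) + ")"));
    }
}
```

The list returned by BuildGuide can be passed directly as the compositionGuide argument, because List<T> implements IEnumerable<T>.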

GenerateWithPrompt(string, DallE3ImageSize, DallE3Quality, DallE3Style)

Generates an image with the revised/expanded prompt returned.

public (Tensor<T> Image, string RevisedPrompt) GenerateWithPrompt(string prompt, DallE3ImageSize size = DallE3ImageSize.Square1024, DallE3Quality quality = DallE3Quality.Standard, DallE3Style style = DallE3Style.Vivid)

Parameters

prompt string: Original text prompt.
size DallE3ImageSize: Output image size.
quality DallE3Quality: Quality setting.
style DallE3Style: Style setting.

Returns

(Tensor<T> Image, string RevisedPrompt): Generated image and the expanded prompt used.

Remarks

DALL-E 3 internally expands prompts for better results. This method returns both the image and the expanded prompt so you can see how your prompt was interpreted.

GenerateWithStyle(string, string, DallE3ImageSize, DallE3Quality)

Generates an image in a specific artistic style.

public Tensor<T> GenerateWithStyle(string prompt, string artisticStyle, DallE3ImageSize size = DallE3ImageSize.Square1024, DallE3Quality quality = DallE3Quality.Standard)

Parameters

prompt string: Content description.
artisticStyle string: Style: "photorealistic", "oil_painting", "watercolor", "digital_art", "anime", "sketch", "3d_render".
size DallE3ImageSize: Output image size.
quality DallE3Quality: Quality setting.

Returns

Tensor<T>: Generated image in specified style.

GenerateWithText(string, string, string, DallE3ImageSize)

Generates an image with specific text rendered in it.

public Tensor<T> GenerateWithText(string prompt, string textToRender, string textPlacement = "center", DallE3ImageSize size = DallE3ImageSize.Square1024)

Parameters

prompt string: Overall image description.
textToRender string: Exact text to appear in the image.
textPlacement string: Where to place text: "top", "center", "bottom", "overlay".
size DallE3ImageSize: Output image size.

Returns

Tensor<T>: Generated image with text.

Remarks

DALL-E 3 has improved text rendering capabilities compared to earlier models. Use this method when you need specific text to appear in the image.

GetModelMetadata()

Retrieves metadata and performance metrics about the trained model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>: An object containing metadata and performance metrics about the trained model.

Remarks

This method provides information about the model's structure, parameters, and performance metrics.

For Beginners: Model metadata is like a report card for your machine learning model.

Just as a report card shows how well a student is performing in different subjects, model metadata shows how well your model is performing and provides details about its structure.

This information typically includes:

Accuracy measures: How well do the model's predictions match actual values?
Error metrics: How far off are the model's predictions on average?
Model parameters: What patterns did the model learn from the data?
Training information: How long did training take? How many iterations were needed?

For example, in a house price prediction model, metadata might include:

Average prediction error (e.g., off by $15,000 on average)
How strongly each feature (bedrooms, location) influences the prediction
How well the model fits the training data

This information helps you understand your model's strengths and weaknesses, and decide if it's ready to use or needs more training.

GetParameters()

Gets the parameters that can be optimized.

public override Vector<T> GetParameters()

Returns

Vector<T>

Outpaint(Tensor<T>, string, int, string?)

Outpaints an image, extending it beyond its original boundaries.

public Tensor<T> Outpaint(Tensor<T> image, string direction, int extensionPixels, string? prompt = null)

Parameters

image Tensor<T>: Original image.
direction string: Direction to extend: "left", "right", "top", "bottom", "all".
extensionPixels int: How many pixels to extend.
prompt string?: Optional prompt to guide the extension.

Returns

Tensor<T>: Extended image.

SetParameters(Vector<T>)

Sets the model parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>: The parameter vector to set.

Remarks

This method allows direct modification of the model's internal parameters, which is useful for optimization algorithms that update parameters iteratively. If the length of parameters does not match ParameterCount, an ArgumentException is thrown.

Exceptions

ArgumentException: Thrown when the length of parameters does not match ParameterCount.
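The length check described here follows a common pattern, sketched below with a plain double[] standing in for Vector<T>. ParameterStoreSketch is a hypothetical class written for illustration. It is not the library's implementation, and the defensive copies are a design choice of this sketch only.

```csharp
using System;

// Hypothetical stand-in, not part of AiDotNet: shows the documented length check.
public sealed class ParameterStoreSketch
{
    private double[] _parameters;

    public ParameterStoreSketch(int parameterCount)
    {
        _parameters = new double[parameterCount];
    }

    // Total number of stored parameters.
    public int ParameterCount => _parameters.Length;

    // Returns a copy so callers cannot mutate internal state by accident.
    public double[] GetParameters() => (double[])_parameters.Clone();

    // Rejects vectors whose length differs from ParameterCount.
    public void SetParameters(double[] parameters)
    {
        if (parameters is null)
            throw new ArgumentNullException(nameof(parameters));
        if (parameters.Length != ParameterCount)
            throw new ArgumentException(
                $"Expected {ParameterCount} parameters but received {parameters.Length}.",
                nameof(parameters));

        _parameters = (double[])parameters.Clone();
    }
}
```

An optimizer built on this pattern would typically read the vector with GetParameters, compute an update of the same length, and write it back with SetParameters.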

Upscale(Tensor<T>, int, bool)

Upscales an image to higher resolution.

public Tensor<T> Upscale(Tensor<T> image, int scaleFactor = 2, bool enhanceDetails = true)

Parameters

image Tensor<T>: Image to upscale.
scaleFactor int: Upscale factor (2 or 4).
enhanceDetails bool: Whether to enhance details during upscaling.

Returns

Tensor<T>: Upscaled image.
