Class IPAdapterModel<T>

Namespace
AiDotNet.Diffusion.Models
Assembly
AiDotNet.dll

IP-Adapter model for image-based prompt conditioning in diffusion models.

public class IPAdapterModel<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
LatentDiffusionModelBase<T>
IPAdapterModel<T>
Implements
ILatentDiffusionModel<T>
IDiffusionModel<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Examples

// Create an IP-Adapter model
var ipAdapter = new IPAdapterModel<float>();

// Generate with image reference
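// (LoadImage stands in for your own image-loading helper that returns a Tensor<T>)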
var referenceImage = LoadImage("style_reference.png");
var image = ipAdapter.GenerateWithImagePrompt(
    textPrompt: "A beautiful landscape",
    imagePrompt: referenceImage,
    imagePromptWeight: 0.7);

// Multi-image reference
var faceImage = LoadImage("face.png");
var styleImage = LoadImage("art_style.png");
var composed = ipAdapter.GenerateWithMultiImagePrompt(
    textPrompt: "Portrait painting",
    imagePrompts: new[] { faceImage, styleImage },
    imageWeights: new[] { 0.8, 0.5 });

Remarks

IP-Adapter (Image Prompt Adapter) enables using reference images as prompts to guide image generation. It decouples cross-attention for text and image features, allowing fine-grained control over image style, composition, and content transfer.

For Beginners: IP-Adapter lets you use pictures as instructions for the AI instead of just text.

Think of it like:

  • Showing someone a photo and saying "make something like this"
  • The AI extracts the style, composition, and content from your image
  • It then applies those elements to create new images

Use cases:

  • Style transfer: "Generate in the style of this artwork"
  • Face preservation: Keep a person's likeness in different scenes
  • Object consistency: Maintain the same object across images
  • Scene composition: Use reference for layout/arrangement

Key advantage: Combines with text prompts for precise control

Technical details:

  • Uses a pretrained image encoder (such as CLIP ViT)
  • Projects image features into the text embedding space
  • Injects them via a decoupled cross-attention mechanism (sketched below)
  • Supports multiple reference images (multi-IP)
  • Adjustable image prompt weight (0-1)

Reference: Ye et al., "IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models", 2023
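
The decoupled combination can be illustrated with a minimal sketch (CombineBranches and the flattened float[] attention outputs are hypothetical, for illustration only, and not the library's internal API):

// Decoupled cross-attention: the text branch is computed as usual and the
// image branch is added on top, scaled by the image prompt weight (λ):
//   output = Attention(Q, K_text, V_text) + λ * Attention(Q, K_image, V_image)
static float[] CombineBranches(float[] textAttnOut, float[] imageAttnOut, float imagePromptWeight)
{
    var combined = new float[textAttnOut.Length];
    for (int i = 0; i < combined.Length; i++)
        combined[i] = textAttnOut[i] + imagePromptWeight * imageAttnOut[i];
    return combined;
}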

Constructors

IPAdapterModel()

Initializes a new instance of IPAdapterModel with default parameters.

public IPAdapterModel()

IPAdapterModel(DiffusionModelOptions<T>?, INoiseScheduler<T>?, UNetNoisePredictor<T>?, StandardVAE<T>?, IConditioningModule<T>?, int, int?)

Initializes a new instance of IPAdapterModel with custom parameters.

public IPAdapterModel(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, UNetNoisePredictor<T>? baseUNet = null, StandardVAE<T>? vae = null, IConditioningModule<T>? conditioner = null, int embedDim = 768, int? seed = null)

Parameters

options DiffusionModelOptions<T>

Configuration options for the diffusion model.

scheduler INoiseScheduler<T>

Optional custom scheduler.

baseUNet UNetNoisePredictor<T>

Optional base U-Net noise predictor.

vae StandardVAE<T>

Optional custom VAE.

conditioner IConditioningModule<T>

Optional conditioning module for text encoding.

embedDim int

Image embedding dimension.

seed int?

Optional random seed for reproducibility.
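
For example (argument values are illustrative; 768 matches the CLIP ViT-L/14 embedding width):

// Construct with an explicit embedding dimension and a fixed seed
var ipAdapter = new IPAdapterModel<float>(
    embedDim: 768,
    seed: 42);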

Properties

Conditioner

Gets the conditioning module (optional, for conditioned generation).

public override IConditioningModule<T>? Conditioner { get; }

Property Value

IConditioningModule<T>

ImagePromptWeight

Gets or sets the default image prompt weight (0-1).

public double ImagePromptWeight { get; set; }

Property Value

double
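
A short usage sketch (assuming, as the nullable imagePromptWeight parameters of the generation methods suggest, that calls omitting a weight fall back to this default):

ipAdapter.ImagePromptWeight = 0.6;
// Subsequent generation calls that pass no imagePromptWeight would use 0.6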

LatentChannels

Gets the number of latent channels.

public override int LatentChannels { get; }

Property Value

int

Remarks

Typically 4 for Stable Diffusion models.

NoisePredictor

Gets the noise predictor model (U-Net, DiT, etc.).

public override INoisePredictor<T> NoisePredictor { get; }

Property Value

INoisePredictor<T>

ParameterCount

Gets the number of parameters in the model.

public override int ParameterCount { get; }

Property Value

int

Remarks

This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.
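
For example, a rough memory estimate for a float-parameterized model (4 bytes per parameter):

long approxBytes = (long)ipAdapter.ParameterCount * sizeof(float);
Console.WriteLine($"{ipAdapter.ParameterCount:N0} parameters (~{approxBytes / (1024.0 * 1024.0):F1} MB)");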

VAE

Gets the VAE model used for encoding and decoding.

public override IVAEModel<T> VAE { get; }

Property Value

IVAEModel<T>

Methods

Clone()

Creates a deep copy of the model.

public override IDiffusionModel<T> Clone()

Returns

IDiffusionModel<T>

A new instance with the same parameters.

DeepCopy()

Creates a deep copy of this object.

public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

GenerateWithImagePrompt(string, Tensor<T>, string?, int, int, int, double?, double?, int?)

Generates an image with image prompt conditioning.

public virtual Tensor<T> GenerateWithImagePrompt(string textPrompt, Tensor<T> imagePrompt, string? negativePrompt = null, int width = 512, int height = 512, int numInferenceSteps = 50, double? guidanceScale = null, double? imagePromptWeight = null, int? seed = null)

Parameters

textPrompt string

The text prompt describing the desired image.

imagePrompt Tensor<T>

The reference image for conditioning.

negativePrompt string

Optional negative prompt.

width int

Output image width.

height int

Output image height.

numInferenceSteps int

Number of denoising steps.

guidanceScale double?

Classifier-free guidance scale.

imagePromptWeight double?

Weight for image prompt (0-1).

seed int?

Optional random seed.

Returns

Tensor<T>

The generated image tensor.
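
A fuller call might look like this (prompt strings and values are illustrative; referenceImage is a Tensor<T> as in the class examples above):

var styled = ipAdapter.GenerateWithImagePrompt(
    textPrompt: "A castle at sunset",
    imagePrompt: referenceImage,
    negativePrompt: "blurry, low quality",
    width: 768,
    height: 512,
    numInferenceSteps: 30,
    imagePromptWeight: 0.5,   // lean toward the text prompt
    seed: 42);                // reproducible output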

GenerateWithMultiImagePrompt(string, Tensor<T>[], double[]?, string?, int, int, int, double?, int?)

Generates an image with multiple image prompts.

public virtual Tensor<T> GenerateWithMultiImagePrompt(string textPrompt, Tensor<T>[] imagePrompts, double[]? imageWeights = null, string? negativePrompt = null, int width = 512, int height = 512, int numInferenceSteps = 50, double? guidanceScale = null, int? seed = null)

Parameters

textPrompt string

The text prompt.

imagePrompts Tensor<T>[]

Array of reference images.

imageWeights double[]

Optional weights for each image.

negativePrompt string

Optional negative prompt.

width int

Output width.

height int

Output height.

numInferenceSteps int

Number of steps.

guidanceScale double?

Guidance scale.

seed int?

Random seed.

Returns

Tensor<T>

The generated image.

GetParameters()

Gets the parameters that can be optimized.

public override Vector<T> GetParameters()

Returns

Vector<T>

SetParameters(Vector<T>)

Sets the model parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameter vector to set.

Remarks

This method allows direct modification of the model's internal parameters, which is useful for optimization algorithms that update parameters iteratively. If the length of parameters does not match ParameterCount, an ArgumentException is thrown.
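
A minimal round-trip sketch:

var parameters = ipAdapter.GetParameters();   // length equals ParameterCount
// ... let an optimizer adjust the vector in place ...
ipAdapter.SetParameters(parameters);          // a length mismatch throws ArgumentException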

Exceptions

ArgumentException

Thrown when the length of parameters does not match ParameterCount.