Class IPAdapterModel<T>
IP-Adapter model for image-based prompt conditioning in diffusion models.
public class IPAdapterModel<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>
Type Parameters
T: The numeric type used for calculations.
Inheritance
LatentDiffusionModelBase<T> → IPAdapterModel<T>
Examples
// Create an IP-Adapter model
var ipAdapter = new IPAdapterModel<float>();
// Generate with image reference
var referenceImage = LoadImage("style_reference.png");
var image = ipAdapter.GenerateWithImagePrompt(
textPrompt: "A beautiful landscape",
imagePrompt: referenceImage,
imagePromptWeight: 0.7);
// Multi-image reference
var faceImage = LoadImage("face.png");
var styleImage = LoadImage("art_style.png");
var composed = ipAdapter.GenerateWithMultiImagePrompt(
textPrompt: "Portrait painting",
imagePrompts: new[] { faceImage, styleImage },
imageWeights: new[] { 0.8, 0.5 });
Remarks
IP-Adapter (Image Prompt Adapter) enables using reference images as prompts to guide image generation. It decouples cross-attention for text and image features, allowing fine-grained control over image style, composition, and content transfer.
For Beginners: IP-Adapter lets you use pictures as instructions for the AI instead of just text.
Think of it like:
- Showing someone a photo and saying "make something like this"
- The AI extracts the style, composition, and content from your image
- It then applies those elements to create new images
Use cases:
- Style transfer: "Generate in the style of this artwork"
- Face preservation: Keep a person's likeness in different scenes
- Object consistency: Maintain the same object across images
- Scene composition: Use reference for layout/arrangement
Key advantage: Combines with text prompts for precise control
Technical details:
- Uses a pretrained image encoder (like CLIP ViT)
- Projects image features to text embedding space
- Injects via decoupled cross-attention mechanism
- Supports multiple reference images (multi-IP)
- Adjustable image prompt weight (0-1)
Reference: Ye et al., "IP-Adapter: Text Compatible Image Prompt Adapter", 2023
Constructors
IPAdapterModel()
Initializes a new instance of IPAdapterModel with default parameters.
public IPAdapterModel()
IPAdapterModel(DiffusionModelOptions<T>?, INoiseScheduler<T>?, UNetNoisePredictor<T>?, StandardVAE<T>?, IConditioningModule<T>?, int, int?)
Initializes a new instance of IPAdapterModel with custom parameters.
public IPAdapterModel(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, UNetNoisePredictor<T>? baseUNet = null, StandardVAE<T>? vae = null, IConditioningModule<T>? conditioner = null, int embedDim = 768, int? seed = null)
Parameters
options (DiffusionModelOptions<T>?): Configuration options for the diffusion model.
scheduler (INoiseScheduler<T>?): Optional custom scheduler.
baseUNet (UNetNoisePredictor<T>?): Optional base U-Net noise predictor.
vae (StandardVAE<T>?): Optional custom VAE.
conditioner (IConditioningModule<T>?): Optional conditioning module for text encoding.
embedDim (int): Image embedding dimension.
seed (int?): Optional random seed for reproducibility.
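A minimal sketch of custom construction, assuming default values for the omitted parameters; the specific values shown (embedding dimension and seed) are illustrative, not required defaults:
// Sketch: construct an IP-Adapter with a custom image embedding
// dimension and a fixed seed for reproducible generation.
var ipAdapter = new IPAdapterModel<float>(
    embedDim: 1024,  // e.g. to match a larger CLIP image encoder
    seed: 42);       // fixed seed for reproducibility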
Properties
Conditioner
Gets the conditioning module (optional, for conditioned generation).
public override IConditioningModule<T>? Conditioner { get; }
Property Value
- IConditioningModule<T>?
ImagePromptWeight
Gets or sets the default image prompt weight (0-1).
public double ImagePromptWeight { get; set; }
Property Value
- double
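A short usage sketch; given that the generation methods take a nullable per-call imagePromptWeight, this property presumably supplies the default when that argument is null:
// Sketch: set a model-wide default image prompt weight.
ipAdapter.ImagePromptWeight = 0.6;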
LatentChannels
Gets the number of latent channels.
public override int LatentChannels { get; }
Property Value
- int
Remarks
Typically 4 for Stable Diffusion models.
NoisePredictor
Gets the noise predictor model (U-Net, DiT, etc.).
public override INoisePredictor<T> NoisePredictor { get; }
Property Value
- INoisePredictor<T>
ParameterCount
Gets the number of parameters in the model.
public override int ParameterCount { get; }
Property Value
- int
Remarks
This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.
VAE
Gets the VAE model used for encoding and decoding.
public override IVAEModel<T> VAE { get; }
Property Value
- IVAEModel<T>
Methods
Clone()
Creates a deep copy of the model.
public override IDiffusionModel<T> Clone()
Returns
- IDiffusionModel<T>
A new instance with the same parameters.
DeepCopy()
Creates a deep copy of this object.
public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
GenerateWithImagePrompt(string, Tensor<T>, string?, int, int, int, double?, double?, int?)
Generates an image with image prompt conditioning.
public virtual Tensor<T> GenerateWithImagePrompt(string textPrompt, Tensor<T> imagePrompt, string? negativePrompt = null, int width = 512, int height = 512, int numInferenceSteps = 50, double? guidanceScale = null, double? imagePromptWeight = null, int? seed = null)
Parameters
textPrompt (string): The text prompt describing the desired image.
imagePrompt (Tensor<T>): The reference image for conditioning.
negativePrompt (string?): Optional negative prompt.
width (int): Output image width.
height (int): Output image height.
numInferenceSteps (int): Number of denoising steps.
guidanceScale (double?): Classifier-free guidance scale.
imagePromptWeight (double?): Weight for image prompt (0-1).
seed (int?): Optional random seed.
Returns
- Tensor<T>
The generated image tensor.
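A hedged sketch combining the optional arguments; LoadImage and the file name are placeholders, and the chosen values are illustrative:
// Sketch: single-image prompt with a negative prompt and a
// per-call weight override.
var reference = LoadImage("style_reference.png");
var result = ipAdapter.GenerateWithImagePrompt(
    textPrompt: "A watercolor seascape",
    imagePrompt: reference,
    negativePrompt: "blurry, low quality",
    numInferenceSteps: 30,
    imagePromptWeight: 0.5,
    seed: 123);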
GenerateWithMultiImagePrompt(string, Tensor<T>[], double[]?, string?, int, int, int, double?, int?)
Generates an image with multiple image prompts.
public virtual Tensor<T> GenerateWithMultiImagePrompt(string textPrompt, Tensor<T>[] imagePrompts, double[]? imageWeights = null, string? negativePrompt = null, int width = 512, int height = 512, int numInferenceSteps = 50, double? guidanceScale = null, int? seed = null)
Parameters
textPrompt (string): The text prompt.
imagePrompts (Tensor<T>[]): Array of reference images.
imageWeights (double[]?): Optional weights for each image.
negativePrompt (string?): Optional negative prompt.
width (int): Output width.
height (int): Output height.
numInferenceSteps (int): Number of steps.
guidanceScale (double?): Guidance scale.
seed (int?): Random seed.
Returns
- Tensor<T>
The generated image.
GetParameters()
Gets the parameters that can be optimized.
public override Vector<T> GetParameters()
Returns
- Vector<T>
SetParameters(Vector<T>)
Sets the model parameters.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The parameter vector to set.
Remarks
This method allows direct modification of the model's internal parameters.
This is useful for optimization algorithms that need to update parameters iteratively.
If the length of parameters does not match ParameterCount,
an ArgumentException should be thrown.
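A minimal sketch of the read-modify-write round trip this method supports; the optimizer step is elided:
// Sketch: read the parameter vector, let an optimizer update it,
// then write it back. The vector length must equal ParameterCount,
// otherwise SetParameters throws an ArgumentException.
var parameters = ipAdapter.GetParameters();
// ... an optimizer updates `parameters` ...
ipAdapter.SetParameters(parameters);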
Exceptions
- ArgumentException
Thrown when the length of parameters does not match ParameterCount.