Class ControlNetModel<T>

Namespace
AiDotNet.Diffusion.Models
Assembly
AiDotNet.dll

ControlNet model for adding spatial conditioning to diffusion models.

public class ControlNetModel<T> : LatentDiffusionModelBase<T>, ILatentDiffusionModel<T>, IDiffusionModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
object
LatentDiffusionModelBase<T>
ControlNetModel<T>
Implements
ILatentDiffusionModel<T>
IDiffusionModel<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Examples

// Create a ControlNet model
var controlNet = new ControlNetModel<float>(
    controlType: ControlType.Canny);

// Generate with edge control
var edgeMap = LoadCannyEdges("house_edges.png");
var image = controlNet.GenerateWithControl(
    prompt: "A beautiful Victorian house",
    controlImage: edgeMap,
    conditioningStrength: 1.0);

// Multi-control generation
var depthMap = LoadDepthMap("scene_depth.png");
var imageMulti = controlNet.GenerateWithMultiControl(
    prompt: "Forest landscape",
    controlImages: new[] { edgeMap, depthMap },
    controlTypes: new[] { ControlType.Canny, ControlType.Depth },
    conditioningStrengths: new[] { 0.8, 0.6 });

Remarks

ControlNet enables fine-grained spatial control over image generation by adding conditioning signals such as edge maps, depth maps, pose keypoints, segmentation masks, and more. It works by creating a trainable copy of the base model's encoder blocks; this copy processes the control signal and feeds the result back into the generation path.

For Beginners: ControlNet is like giving the AI artist a reference sketch or blueprint to follow while creating an image.

Supported control types:

  • Canny edges: Outline/edge detection of shapes
  • Depth maps: 3D depth information
  • Pose keypoints: Human body positions (OpenPose)
  • Segmentation: Region/object boundaries
  • Normal maps: Surface orientation
  • Scribbles: Simple user drawings
  • Line art: Clean line drawings

How it works:

  1. You provide a control image (e.g., an edge map of a house; one way to produce such a map is sketched after this list)
  2. ControlNet encodes this control signal
  3. The encoded control guides the diffusion process
  4. Result: Generated image follows the control structure
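
One way to produce the edge map in step 1, sketched with the OpenCvSharp package (an assumption for illustration; AiDotNet itself does not ship an edge detector):

// Hypothetical Canny preprocessing using OpenCvSharp (not part of AiDotNet)
using OpenCvSharp;

using var source = Cv2.ImRead("house.png", ImreadModes.Grayscale);
using var edges = new Mat();
Cv2.Canny(source, edges, threshold1: 100, threshold2: 200);
Cv2.ImWrite("house_edges.png", edges);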

Example use cases:

  • "Draw a Victorian house" + edge map = house in exact shape
  • "Dancing woman" + pose skeleton = person in exact pose
  • "Forest scene" + depth map = correct 3D perspective

Technical details:

  • Uses "zero convolution" layers: 1×1 convolutions initialized to zero that connect the control branch to the base model
  • Copies encoder weights from the base model into a trainable branch
  • Adds the control signal via residual connections
  • Multiple ControlNets can be stacked for multi-control generation
  • Supports conditioning strength adjustment (0-1)
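
A minimal sketch of the zero-convolution idea using plain C# arrays rather than the library's Tensor<T> type: because the 1×1 convolution connecting the control branch starts with zero weights, the residual it adds is zero at initialization, so training begins from exactly the pretrained model's behavior.

// Zero convolution reduced to a per-channel scale whose weight starts at 0
double[] baseFeatures = { 0.8, -0.2, 1.5, 0.3 };    // from the frozen base encoder
double[] controlFeatures = { 1.0, 1.0, 1.0, 1.0 };  // from the trainable control branch
double zeroConvWeight = 0.0;                        // learned during training, zero at start

for (int i = 0; i < baseFeatures.Length; i++)
{
    // Residual connection: at initialization the output equals the base features,
    // so the control branch cannot disrupt the pretrained model.
    double output = baseFeatures[i] + zeroConvWeight * controlFeatures[i];
    Console.WriteLine($"output[{i}] = {output}");
}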

Reference: Zhang et al., "Adding Conditional Control to Text-to-Image Diffusion Models", 2023

Constructors

ControlNetModel(ControlType)

Initializes a new instance of ControlNetModel with default parameters.

public ControlNetModel(ControlType controlType = ControlType.Canny)

Parameters

controlType ControlType

The type of control signal (default: Canny edges).
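
For example, to condition on depth maps instead of the default Canny edges:

var depthControlNet = new ControlNetModel<float>(ControlType.Depth);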

ControlNetModel(DiffusionModelOptions<T>?, INoiseScheduler<T>?, UNetNoisePredictor<T>?, StandardVAE<T>?, IConditioningModule<T>?, ControlType, int?)

Initializes a new instance of ControlNetModel with custom parameters.

public ControlNetModel(DiffusionModelOptions<T>? options = null, INoiseScheduler<T>? scheduler = null, UNetNoisePredictor<T>? baseUNet = null, StandardVAE<T>? vae = null, IConditioningModule<T>? conditioner = null, ControlType controlType = ControlType.Canny, int? seed = null)

Parameters

options DiffusionModelOptions<T>

Configuration options for the diffusion model.

scheduler INoiseScheduler<T>

Optional custom scheduler.

baseUNet UNetNoisePredictor<T>

Optional base U-Net noise predictor.

vae StandardVAE<T>

Optional custom VAE.

conditioner IConditioningModule<T>

Optional conditioning module for text encoding.

controlType ControlType

The type of control signal.

seed int?

Optional random seed for reproducibility.
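
A sketch of calling this constructor with only the arguments of interest, leaving the remaining components at their library defaults (the specific values are illustrative):

var controlNet = new ControlNetModel<float>(
    controlType: ControlType.Depth,
    seed: 42);  // fixed seed for reproducible generations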

Properties

Conditioner

Gets the conditioning module (optional, for conditioned generation).

public override IConditioningModule<T>? Conditioner { get; }

Property Value

IConditioningModule<T>

ConditioningStrength

Gets or sets the default conditioning strength (0-1).

public double ConditioningStrength { get; set; }

Property Value

double
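
For example, lowering the default so that subsequent generations follow the control image more loosely:

// Used as the default when a generation call does not supply its own strength
controlNet.ConditioningStrength = 0.75;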

ControlType

Gets the type of control signal this model uses.

public ControlType ControlType { get; }

Property Value

ControlType

LatentChannels

Gets the number of latent channels.

public override int LatentChannels { get; }

Property Value

int

Remarks

Typically 4 for Stable Diffusion models.

NoisePredictor

Gets the noise predictor model (U-Net, DiT, etc.).

public override INoisePredictor<T> NoisePredictor { get; }

Property Value

INoisePredictor<T>

ParameterCount

Gets the number of parameters in the model.

public override int ParameterCount { get; }

Property Value

int

Remarks

This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.
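
For instance, a rough memory estimate for a float-parameterized model (4 bytes per parameter; the controlNet variable is assumed from the earlier examples):

long approxBytes = (long)controlNet.ParameterCount * sizeof(float);
Console.WriteLine($"{controlNet.ParameterCount:N0} parameters, ~{approxBytes / (1024.0 * 1024.0):F1} MB");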

VAE

Gets the VAE model used for encoding and decoding.

public override IVAEModel<T> VAE { get; }

Property Value

IVAEModel<T>

Methods

Clone()

Creates a deep copy of the model.

public override IDiffusionModel<T> Clone()

Returns

IDiffusionModel<T>

A new instance with the same parameters.

DeepCopy()

Creates a deep copy of this object.

public override IFullModel<T, Tensor<T>, Tensor<T>> DeepCopy()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new instance that is a deep copy of this model.

GenerateWithControl(string, Tensor<T>, string?, int, int, int, double?, double?, int?)

Generates an image with control signal.

public virtual Tensor<T> GenerateWithControl(string prompt, Tensor<T> controlImage, string? negativePrompt = null, int width = 512, int height = 512, int numInferenceSteps = 50, double? guidanceScale = null, double? conditioningStrength = null, int? seed = null)

Parameters

prompt string

The text prompt describing the desired image.

controlImage Tensor<T>

The control image (e.g., edge map, depth map).

negativePrompt string

Optional negative prompt.

width int

Output image width.

height int

Output image height.

numInferenceSteps int

Number of denoising steps.

guidanceScale double?

Classifier-free guidance scale.

conditioningStrength double?

How strongly to apply the control (0-1).

seed int?

Optional random seed.

Returns

Tensor<T>

The generated image tensor.
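
A variant of the call shown in the Examples section that also supplies a negative prompt and a fixed seed (edgeMap as loaded there):

var image = controlNet.GenerateWithControl(
    prompt: "A beautiful Victorian house",
    controlImage: edgeMap,
    negativePrompt: "blurry, low quality",
    numInferenceSteps: 30,
    conditioningStrength: 0.9,
    seed: 42);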

GenerateWithMultiControl(string, Tensor<T>[], ControlType[], double[]?, string?, int, int, int, double?, int?)

Generates an image with multiple control signals.

public virtual Tensor<T> GenerateWithMultiControl(string prompt, Tensor<T>[] controlImages, ControlType[] controlTypes, double[]? conditioningStrengths = null, string? negativePrompt = null, int width = 512, int height = 512, int numInferenceSteps = 50, double? guidanceScale = null, int? seed = null)

Parameters

prompt string

The text prompt.

controlImages Tensor<T>[]

Array of control images.

controlTypes ControlType[]

Array of control types.

conditioningStrengths double[]

Optional array of conditioning strengths, one per control image.

negativePrompt string

Optional negative prompt.

width int

Output width.

height int

Output height.

numInferenceSteps int

Number of steps.

guidanceScale double?

Guidance scale.

seed int?

Random seed.

Returns

Tensor<T>

The generated image.

GetParameters()

Gets the parameters that can be optimized.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing the model's trainable parameters.

SetParameters(Vector<T>)

Sets the model parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameter vector to set.

Remarks

This method allows direct modification of the model's internal parameters, which is useful for optimization algorithms that update parameters iteratively. If the length of parameters does not match ParameterCount, an ArgumentException is thrown.

Exceptions

ArgumentException

Thrown when the length of parameters does not match ParameterCount.
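
A minimal round-trip sketch using only the two documented methods: reading the parameters and writing them back unchanged, with the documented exception guarded against:

var parameters = controlNet.GetParameters();
try
{
    controlNet.SetParameters(parameters);  // same length as ParameterCount, so this succeeds
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Parameter length mismatch: {ex.Message}");
}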