Class ModelShard<T>
Enables model sharding across multiple devices for large model inference.
public class ModelShard<T>
Type Parameters
T: The numeric type used for calculations (for example, float or double).
Remarks
Model sharding (also known as model parallelism or pipeline parallelism) splits a large model across multiple devices (GPUs/CPUs) when it's too large to fit on a single device.
For Beginners: Large AI models such as Stable Diffusion XL may not fit in a single GPU's memory.
Model sharding solves this by:
- Splitting the model layers across devices
- Running each device's layers in sequence
- Passing intermediate results between devices
Example: UNet with 24 layers on 4 GPUs:
- GPU 0: Layers 1-6
- GPU 1: Layers 7-12
- GPU 2: Layers 13-18
- GPU 3: Layers 19-24
Data flows: Input -> GPU0 -> GPU1 -> GPU2 -> GPU3 -> Output
Usage:
var shard = new ModelShard<float>(layers, numDevices: 4);
var output = shard.Forward(input);
Constructors
ModelShard(IEnumerable<ILayer<T>>, int, ShardingConfig?)
Initializes model sharding across specified number of devices.
public ModelShard(IEnumerable<ILayer<T>> layers, int numDevices, ShardingConfig? config = null)
Parameters
layers (IEnumerable<ILayer<T>>): The layers to shard.
numDevices (int): The number of devices to use.
config (ShardingConfig?): Optional sharding configuration.
Remarks
Layers are distributed evenly by default. Use ShardingConfig for custom distribution based on memory requirements or compute costs.
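For illustration, a sketch of constructing a shard with a custom distribution. The ShardingConfig member shown here (DeviceLayerCounts) is an assumption, not confirmed API; consult the ShardingConfig documentation for the actual options.

```csharp
// Hypothetical sketch: give more layers to devices with more memory.
// DeviceLayerCounts is an assumed property name, not confirmed API.
var config = new ShardingConfig
{
    DeviceLayerCounts = new[] { 8, 8, 4, 4 } // 24 layers, front-loaded onto larger GPUs
};
var shard = new ModelShard<float>(layers, numDevices: 4, config);
```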
Properties
Config
Sharding configuration.
public ShardingConfig Config { get; }
Property Value
- ShardingConfig
Methods
Backward(Tensor<T>)
Performs backward pass through all sharded layers.
public Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient from the subsequent layer.
Returns
- Tensor<T>
Gradient with respect to input.
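A minimal sketch of a forward/backward round trip. The loss-gradient helper is a placeholder, not part of this API, and tensor shapes are assumed compatible:

```csharp
// Sketch only: ComputeLossGradient stands in for your loss function's
// gradient and is a hypothetical helper, not part of ModelShard<T>.
var output = shard.Forward(input);
Tensor<float> lossGrad = ComputeLossGradient(output, target); // hypothetical
Tensor<float> inputGrad = shard.Backward(lossGrad);
// inputGrad holds the gradient with respect to the input, propagated
// back through every device's layers in reverse order.
```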
Forward(Tensor<T>)
Performs forward pass through all sharded layers.
public Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor.
Returns
- Tensor<T>
Output tensor after all layers.
Remarks
Data flows sequentially through devices. Each device processes its assigned layers before passing results to the next device.
Forward(Tensor<T>, Tensor<T>?)
Performs forward pass with context (for conditional generation).
public Tensor<T> Forward(Tensor<T> input, Tensor<T>? context)
Parameters
input (Tensor<T>): Input tensor.
context (Tensor<T>?): Context tensor (e.g., timestep embedding or conditioning).
Returns
- Tensor<T>
Output tensor.
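A hedged sketch of a conditioned forward pass, as used in diffusion-style models. The way the conditioning tensor is built (TimestepEmbedding) is hypothetical:

```csharp
// Sketch: pass a conditioning tensor alongside the input.
// TimestepEmbedding is a placeholder helper, not part of this API;
// noisyLatents is assumed to be a Tensor<float> prepared by the caller.
Tensor<float> context = TimestepEmbedding(step: 500); // hypothetical
var denoised = shard.Forward(noisyLatents, context);
```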
GetDeviceLayers(int)
Gets layers assigned to a specific device.
public IReadOnlyList<ILayer<T>> GetDeviceLayers(int device)
Parameters
device (int): The device index.
Returns
- IReadOnlyList<ILayer<T>>
GetDeviceMemoryUsage()
Gets memory usage per device.
public IReadOnlyDictionary<int, long> GetDeviceMemoryUsage()
Returns
- IReadOnlyDictionary<int, long>
Memory usage per device, keyed by device index.
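A short sketch of inspecting the shard layout with GetDeviceMemoryUsage and GetDeviceLayers; interpreting the long values as bytes is an assumption:

```csharp
// Sketch: print how layers and memory are distributed across devices.
// Assumes the long values are byte counts.
foreach (var (device, bytes) in shard.GetDeviceMemoryUsage())
{
    int layerCount = shard.GetDeviceLayers(device).Count;
    Console.WriteLine($"Device {device}: {layerCount} layers, ~{bytes / (1024.0 * 1024.0):F1} MB");
}
```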
GetLayerDevice(ILayer<T>)
Gets the device assignment for a layer.
public int GetLayerDevice(ILayer<T> layer)
Parameters
layer (ILayer<T>): The layer to look up.
Returns
- int
The index of the device the layer is assigned to.
ToString()
Gets a summary of the sharding distribution.
public override string ToString()
Returns
- string
A human-readable summary of the sharding distribution.
UpdateParameters(T)
Updates parameters on all devices.
public void UpdateParameters(T learningRate)
Parameters
learningRate (T): Learning rate for the update.
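Putting the pieces together, a hedged sketch of one training step; the loss-gradient helper is hypothetical, and the exact update rule applied per device is not specified by this API:

```csharp
// Sketch of a single training step across all devices.
// ComputeLossGradient is a placeholder for your loss function's gradient.
var output = shard.Forward(input);
var lossGrad = ComputeLossGradient(output, target); // hypothetical
shard.Backward(lossGrad);                    // propagate gradients through every device
shard.UpdateParameters(learningRate: 1e-4f); // apply the update on each device
```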