Class ShardingConfiguration<T>

Namespace
AiDotNet.DistributedTraining
Assembly
AiDotNet.dll

Default implementation of sharding configuration for distributed training.

public class ShardingConfiguration<T> : IShardingConfiguration<T>

Type Parameters

T

The numeric type used for training values (for example, double)

Inheritance
ShardingConfiguration<T>
Implements
IShardingConfiguration<T>

Remarks

For Beginners: This class holds all the settings that control how distributed training works. You can create an instance with default settings or customize it for your needs.

Example:

// Assumes 'backend' is an existing ICommunicationBackend<double> instance.
var config = new ShardingConfiguration<double>(backend)
{
    AutoSyncGradients = true,          // Automatically sync after each step
    MinimumParameterGroupSize = 1024,  // Group small parameters together
    EnableGradientCompression = false  // No compression for now
};

Constructors

ShardingConfiguration(ICommunicationBackend<T>, double)

Creates a new sharding configuration with the specified communication backend and learning rate.

public ShardingConfiguration(ICommunicationBackend<T> communicationBackend, double learningRate = 0.01)

Parameters

communicationBackend ICommunicationBackend<T>

The communication backend to use

learningRate double

Learning rate for gradient application. Defaults to 0.01.

Remarks

For Beginners: This creates the configuration object that tells the system how to handle distributed training. You must provide a communication backend (the system that allows processes to talk to each other).

Exceptions

ArgumentNullException

Thrown if communicationBackend is null.
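
Example:

The following is a minimal sketch of the constructor in use. It assumes a backend instance is obtained elsewhere (backend creation is not covered on this page) and that ICommunicationBackend<T> is visible through the namespace shown. The class and method names are hypothetical.

```csharp
using System;
using AiDotNet.DistributedTraining; // assumption: ICommunicationBackend<T> is visible via this namespace

public static class ConstructorExample
{
    // Hypothetical helper: wraps the documented constructor.
    public static ShardingConfiguration<double> BuildConfig(ICommunicationBackend<double> backend)
    {
        // Override the default learning rate of 0.01 through the optional second argument.
        return new ShardingConfiguration<double>(backend, learningRate: 0.001);
    }

    // Exercises the documented null check on the backend argument.
    public static bool NullBackendIsRejected()
    {
        try
        {
            _ = new ShardingConfiguration<double>(null!);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

Omitting the second argument keeps the documented default of 0.01.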

Properties

AutoSyncGradients

Gets or sets whether gradients are synchronized automatically after the backward pass.

public bool AutoSyncGradients { get; set; }

Property Value

bool

Remarks

For Beginners: When true, gradients are automatically shared across all processes after each training step. This is usually what you want for standard training. You might set it to false if you want manual control over synchronization.

Default: true

CommunicationBackend

Gets the communication backend to use for distributed operations.

public ICommunicationBackend<T> CommunicationBackend { get; }

Property Value

ICommunicationBackend<T>

Remarks

For Beginners: This is the "communication system" that processes use to talk to each other. It could be an in-memory backend for testing or an MPI backend for real distributed training across multiple machines.

EnableGradientCompression

Gets or sets whether gradient compression is enabled to reduce communication costs.

public bool EnableGradientCompression { get; set; }

Property Value

bool

Remarks

For Beginners: Gradient compression reduces the size of the data sent between processes. It's like zipping a file before sending it: the file is faster to send, but compressing and decompressing it takes a little extra work. This can significantly speed up training on slower networks.

Default: false
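
Example:

This page does not say which compression scheme the library uses. As an illustration of the general idea only, the sketch below shows top-k sparsification, one common gradient compression technique: only the k largest-magnitude entries are sent, and the receiver treats the rest as zero. All names here are hypothetical and none of them are part of the library.

```csharp
using System;
using System.Linq;

public static class GradientCompressionSketch
{
    // Keeps the k entries with the largest absolute value and returns them as
    // (index, value) pairs, so only k pairs need to be transmitted.
    public static (int Index, double Value)[] CompressTopK(double[] gradient, int k)
    {
        return gradient
            .Select((value, index) => (Index: index, Value: value))
            .OrderByDescending(p => Math.Abs(p.Value))
            .Take(k)
            .OrderBy(p => p.Index)
            .ToArray();
    }

    // Rebuilds a dense gradient; entries that were dropped become zero.
    public static double[] Decompress((int Index, double Value)[] compressed, int length)
    {
        var dense = new double[length];
        foreach (var (index, value) in compressed)
        {
            dense[index] = value;
        }
        return dense;
    }
}
```

For the gradient { 0.1, -2.0, 0.05, 1.5 } with k = 2, only the entries at indices 1 and 3 are kept, so the round trip is lossy. That is the trade-off this kind of compression makes for smaller messages.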

LearningRate

Gets or sets the learning rate for gradient application during training.

public T LearningRate { get; set; }

Property Value

T

Remarks

For Beginners: The learning rate controls how much to update model parameters based on computed gradients. A typical default is 0.01. Lower values mean slower but more stable learning; higher values mean faster but potentially unstable learning.

Default: 0.01
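
Example:

To make the role of the learning rate concrete, the sketch below applies a plain gradient-descent step, parameter = parameter - learningRate * gradient. This is an illustration under the assumption of a basic update rule; the library's actual update logic is not shown on this page, and the names are hypothetical.

```csharp
using System;

public static class LearningRateSketch
{
    // Plain gradient-descent step: updated[i] = parameters[i] - learningRate * gradients[i].
    // Illustration only; not the library's implementation.
    public static double[] ApplyGradients(double[] parameters, double[] gradients, double learningRate)
    {
        if (parameters.Length != gradients.Length)
            throw new ArgumentException("parameters and gradients must have the same length");

        var updated = new double[parameters.Length];
        for (int i = 0; i < parameters.Length; i++)
        {
            updated[i] = parameters[i] - learningRate * gradients[i];
        }
        return updated;
    }
}
```

With a learning rate of 0.01, a gradient of 0.5 moves a parameter by 0.005. Raising the rate to 0.1 makes the same gradient move it by 0.05, which is faster but more likely to overshoot.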

MinimumParameterGroupSize

Gets or sets the minimum parameter group size for sharding.

public int MinimumParameterGroupSize { get; set; }

Property Value

int

Remarks

Parameters smaller than this might be grouped together to reduce communication overhead.

For Beginners: Sending many tiny messages is inefficient. This setting groups small parameters together into larger chunks before communicating them. Think of it like sending one big box instead of 100 tiny envelopes.

Default: 1024
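
Example:

The grouping idea can be sketched as a simple greedy pass. This is an assumption-laden illustration, not the library's algorithm: it treats sizes as element counts, gives each large parameter its own group, and packs small ones together until the running total reaches the minimum. All names are hypothetical.

```csharp
using System.Collections.Generic;

public static class ParameterGroupingSketch
{
    // Returns groups of parameter indices. Parameters with at least minGroupSize
    // elements get their own group; smaller ones are packed together until the
    // running total reaches minGroupSize.
    public static List<List<int>> GroupBySize(IReadOnlyList<int> parameterSizes, int minGroupSize)
    {
        var groups = new List<List<int>>();
        var pending = new List<int>();
        int pendingTotal = 0;

        for (int i = 0; i < parameterSizes.Count; i++)
        {
            if (parameterSizes[i] >= minGroupSize)
            {
                groups.Add(new List<int> { i }); // large enough to send on its own
                continue;
            }

            pending.Add(i);
            pendingTotal += parameterSizes[i];
            if (pendingTotal >= minGroupSize)
            {
                groups.Add(pending);
                pending = new List<int>();
                pendingTotal = 0;
            }
        }

        if (pending.Count > 0) groups.Add(pending); // leftover small parameters
        return groups;
    }
}
```

For sizes { 4096, 256, 512, 300, 2048, 100 } and a minimum of 1024, this yields four groups instead of six separate messages: the three small parameters at indices 1 to 3 travel together.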

Methods

CreateDefault(ICommunicationBackend<T>)

Creates a new sharding configuration with default settings and the specified backend.

public static ShardingConfiguration<T> CreateDefault(ICommunicationBackend<T> communicationBackend)

Parameters

communicationBackend ICommunicationBackend<T>

The communication backend to use

Returns

ShardingConfiguration<T>

A new configuration with default settings

Remarks

For Beginners: This is a convenient way to create a configuration with sensible defaults. The defaults are:

- AutoSyncGradients = true (automatically sync gradients)
- MinimumParameterGroupSize = 1024 (group small parameters)
- EnableGradientCompression = false (no compression for simplicity)
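
Example:

A short sketch of the factory method, checking the result against the defaults documented for this class. It assumes a backend instance is supplied by the caller and that ICommunicationBackend<T> is visible through the namespace shown. The helper name is hypothetical.

```csharp
using AiDotNet.DistributedTraining; // assumption: ICommunicationBackend<T> is visible via this namespace

public static class CreateDefaultExample
{
    // Hypothetical helper: returns true when the factory result matches the documented defaults.
    public static bool HasDocumentedDefaults(ICommunicationBackend<double> backend)
    {
        var config = ShardingConfiguration<double>.CreateDefault(backend);

        return config.AutoSyncGradients
            && config.MinimumParameterGroupSize == 1024
            && !config.EnableGradientCompression;
    }
}
```

Because the three properties have setters, the returned configuration can still be adjusted afterwards.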

CreateForHighBandwidth(ICommunicationBackend<T>)

Creates a configuration optimized for high-bandwidth networks (like NVLink between GPUs).

public static ShardingConfiguration<T> CreateForHighBandwidth(ICommunicationBackend<T> communicationBackend)

Parameters

communicationBackend ICommunicationBackend<T>

The communication backend to use

Returns

ShardingConfiguration<T>

A configuration optimized for high-bandwidth scenarios

Remarks

For Beginners: Use this when your GPUs or machines are connected with very fast networks. It disables compression (not needed with fast networks) and uses smaller parameter groups (communication is fast enough to handle many messages).

CreateForLowBandwidth(ICommunicationBackend<T>)

Creates a configuration optimized for low-bandwidth networks (like machines connected over Ethernet).

public static ShardingConfiguration<T> CreateForLowBandwidth(ICommunicationBackend<T> communicationBackend)

Parameters

communicationBackend ICommunicationBackend<T>

The communication backend to use

Returns

ShardingConfiguration<T>

A configuration optimized for low-bandwidth scenarios

Remarks

For Beginners: Use this when your machines are connected over slower networks like regular Ethernet. It enables compression to reduce the amount of data sent and uses larger parameter groups to minimize the number of messages.
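
Example:

One way to choose between the three factory methods is to switch on a description of the network. The sketch below assumes the caller already knows which kind of network it runs on. The NetworkKind enum and the helper are hypothetical and not part of the library, and ICommunicationBackend<T> is assumed to be visible through the namespace shown.

```csharp
using AiDotNet.DistributedTraining; // assumption: ICommunicationBackend<T> is visible via this namespace

// Hypothetical enum used only for this example.
public enum NetworkKind { Standard, HighBandwidth, LowBandwidth }

public static class PresetSelectionExample
{
    // Picks the documented factory method that matches the network description.
    public static ShardingConfiguration<double> ForNetwork(
        ICommunicationBackend<double> backend, NetworkKind kind)
    {
        switch (kind)
        {
            case NetworkKind.HighBandwidth:
                return ShardingConfiguration<double>.CreateForHighBandwidth(backend);
            case NetworkKind.LowBandwidth:
                return ShardingConfiguration<double>.CreateForLowBandwidth(backend);
            default:
                return ShardingConfiguration<double>.CreateDefault(backend);
        }
    }
}
```

The exact property values each preset sets are not listed on this page, so inspect the returned configuration if they matter for your setup.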