Interface IShardingConfiguration<T>
Namespace
- AiDotNet.DistributedTraining
Assembly
- AiDotNet.dll
Configuration for parameter sharding in distributed training.
public interface IShardingConfiguration<T>
Type Parameters
T
The numeric type.
Remarks
For Beginners: This configuration tells the sharding system how to divide up parameters and how to handle communication. Think of it as the "rules" for how the team of processes collaborates.
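A minimal sketch of a concrete configuration is shown below, assuming only the members documented on this page; the class name BasicShardingConfiguration is hypothetical and not part of AiDotNet.
using AiDotNet.DistributedTraining;

// Hypothetical configuration class; only the interface members documented on this page are assumed.
public sealed class BasicShardingConfiguration : IShardingConfiguration<double>
{
    public BasicShardingConfiguration(ICommunicationBackend<double> backend)
    {
        CommunicationBackend = backend;
    }

    public bool AutoSyncGradients => true;                        // sync after every backward pass
    public ICommunicationBackend<double> CommunicationBackend { get; }
    public bool EnableGradientCompression => false;               // send gradients uncompressed
    public double LearningRate => 0.01;                           // documented default
    public int MinimumParameterGroupSize => 1024;                 // documented default
}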
Properties
AutoSyncGradients
Gets whether to automatically synchronize gradients after backward pass.
bool AutoSyncGradients { get; }
Property Value
- bool
Remarks
For Beginners: When true, gradients are automatically shared across all processes after each training step. This is usually what you want for standard training. You might set it to false if you want manual control over synchronization.
Default: true
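The sketch below shows how a training step might branch on this flag; computeLocalGradients and syncGradients are stand-ins for your own training code, not AiDotNet APIs.
using System;
using AiDotNet.DistributedTraining;

static class ShardingSyncExample
{
    // computeLocalGradients and syncGradients stand in for whatever your training
    // code does at those points; they are not AiDotNet APIs.
    public static void TrainStep(IShardingConfiguration<double> config,
                                 Action computeLocalGradients,
                                 Action syncGradients)
    {
        computeLocalGradients();               // backward pass on this process's shard

        if (!config.AutoSyncGradients)
        {
            // Manual mode: you decide when to sync, for example only every N steps,
            // trading gradient freshness for less communication.
            syncGradients();
        }
        // When AutoSyncGradients is true, the sharding system performs the sync
        // for you after every backward pass.
    }
}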
CommunicationBackend
Gets the communication backend to use for distributed operations.
ICommunicationBackend<T> CommunicationBackend { get; }
Property Value
- ICommunicationBackend<T>
Remarks
For Beginners: This is the "communication system" that processes use to talk to each other. It could be an in-memory backend for testing or an MPI backend for real distributed training across multiple machines.
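For example, a test run might use an in-memory backend while a cluster run uses MPI. The backend type names below are assumptions for illustration, not confirmed AiDotNet types; BasicShardingConfiguration is the hypothetical class sketched earlier.
using AiDotNet.DistributedTraining;

// InMemoryCommunicationBackend<T> and MpiCommunicationBackend<T> are assumed
// type names for illustration; check the library for the actual backend classes.
bool useRealCluster = false;  // flip to true when running across multiple machines

ICommunicationBackend<double> backend;
if (useRealCluster)
    backend = new MpiCommunicationBackend<double>();       // real multi-machine training
else
    backend = new InMemoryCommunicationBackend<double>();  // fast single-process testing

var config = new BasicShardingConfiguration(backend);      // hypothetical class from the sketch above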
EnableGradientCompression
Gets whether to enable gradient compression to reduce communication costs.
bool EnableGradientCompression { get; }
Property Value
- bool
Remarks
For Beginners: Gradient compression reduces the size of data that needs to be sent between processes. It's like zipping a file before sending it - faster to send, but requires a tiny bit of extra work to compress/decompress. This can significantly speed up training on slower networks.
Default: false
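As a rough, illustrative estimate of why this helps on slow networks, assuming a hypothetical scheme that ships each gradient value in 16 bits instead of a 64-bit double:
using System;

// Back-of-the-envelope estimate of the savings; the 4x ratio is an assumption
// about the compression scheme, not a documented AiDotNet guarantee.
long parameterCount = 100_000_000;
long uncompressedBytes = parameterCount * sizeof(double);  // 800,000,000 bytes per step
long compressedBytes = parameterCount * sizeof(short);     // 200,000,000 bytes per step
Console.WriteLine($"Gradient traffic per step: {uncompressedBytes:N0} -> {compressedBytes:N0} bytes");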
LearningRate
Gets the learning rate for gradient application during training.
T LearningRate { get; }
Property Value
- T
Remarks
For Beginners: The learning rate controls how much to update model parameters based on computed gradients. A typical default is 0.01. Lower values mean slower but more stable learning; higher values mean faster but potentially unstable learning.
Default: 0.01
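For reference, the update this value drives is the standard stochastic gradient descent step, sketched here with plain arrays; AiDotNet's sharded gradient application may differ in detail.
// parameters and gradients here are illustrative plain arrays.
double learningRate = 0.01;
double[] parameters = { 0.5, -1.2, 3.0 };
double[] gradients = { 0.1, 0.4, -0.2 };

for (int i = 0; i < parameters.Length; i++)
{
    parameters[i] -= learningRate * gradients[i];  // new value = old value - learningRate * gradient
}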
MinimumParameterGroupSize
Gets the minimum parameter group size for sharding.
int MinimumParameterGroupSize { get; }
Property Value
- int
Remarks
Parameters smaller than this might be grouped together to reduce communication overhead.
For Beginners: Sending many tiny messages is inefficient. This setting groups small parameters together into larger chunks before communicating them. Think of it like sending one big box instead of 100 tiny envelopes.
Default: 1024
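A sketch of the bucketing idea follows; representing each parameter tensor as a double[] is an illustration, not AiDotNet's actual layout.
using System.Collections.Generic;

static class ParameterBucketing
{
    // Tensors smaller than the threshold are packed into shared buckets so each
    // communication call carries a larger payload.
    public static List<List<double[]>> Group(IEnumerable<double[]> tensors, int minimumGroupSize)
    {
        var groups = new List<List<double[]>>();
        var bucket = new List<double[]>();
        int bucketSize = 0;

        foreach (var tensor in tensors)
        {
            if (tensor.Length >= minimumGroupSize)
            {
                groups.Add(new List<double[]> { tensor }); // large enough to send alone
                continue;
            }

            bucket.Add(tensor);
            bucketSize += tensor.Length;
            if (bucketSize >= minimumGroupSize)            // bucket full: flush it
            {
                groups.Add(bucket);
                bucket = new List<double[]>();
                bucketSize = 0;
            }
        }

        if (bucket.Count > 0) groups.Add(bucket);          // remainder bucket
        return groups;
    }
}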