Class ZeRO3Optimizer<T, TInput, TOutput>

Namespace
AiDotNet.DistributedTraining
Assembly
AiDotNet.dll

Implements the ZeRO Stage 3 optimizer: full sharding, equivalent to FSDP.

public class ZeRO3Optimizer<T, TInput, TOutput> : FSDPOptimizer<T, TInput, TOutput>, IShardedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer

Type Parameters

T

The numeric type

TInput

The input type for the model

TOutput

The output type for the model

Inheritance
ShardedOptimizerBase<T, TInput, TOutput>
FSDPOptimizer<T, TInput, TOutput>
ZeRO3Optimizer<T, TInput, TOutput>
Implements
IShardedOptimizer<T, TInput, TOutput>
IOptimizer<T, TInput, TOutput>
IModelSerializer

Remarks

Strategy Overview: ZeRO-3 is equivalent to the FSDP optimizer. Both fully shard parameters, gradients, and optimizer states. This class derives from FSDPOptimizer<T, TInput, TOutput> and exists for ZeRO terminology consistency.

For Beginners: ZeRO-3 and FSDP optimizers are the same thing, so use whichever name you prefer. Parameters, gradients, and optimizer states are all sharded for maximum memory efficiency.
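To make "full sharding" concrete, the following is a minimal, self-contained sketch of how a flat vector could be divided into contiguous per-rank shards. It is an illustration only and is not taken from AiDotNet: the class name Zero3ShardingSketch and the method ShardRange are hypothetical, and the library's actual partitioning scheme may differ. Under full sharding, the same kind of split would apply to parameters, gradients, and optimizer states alike.

```csharp
using System;

// Hypothetical illustration; not part of AiDotNet.
public static class Zero3ShardingSketch
{
    // Returns the half-open range [Start, End) of elements that `rank` would own
    // if a flat vector of `length` elements were split as evenly as possible
    // across `worldSize` ranks.
    public static (int Start, int End) ShardRange(int length, int worldSize, int rank)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        if (worldSize <= 0) throw new ArgumentOutOfRangeException(nameof(worldSize));
        if (rank < 0 || rank >= worldSize) throw new ArgumentOutOfRangeException(nameof(rank));

        int baseSize = length / worldSize;
        int remainder = length % worldSize;

        // The first `remainder` ranks each take one extra element.
        int start = rank * baseSize + Math.Min(rank, remainder);
        int size = baseSize + (rank < remainder ? 1 : 0);
        return (start, start + size);
    }

    public static void Main()
    {
        const int parameterCount = 10; // assumed toy model size
        const int worldSize = 4;       // assumed number of ranks

        for (int rank = 0; rank < worldSize; rank++)
        {
            var (start, end) = ShardRange(parameterCount, worldSize, rank);
            Console.WriteLine($"rank {rank}: elements [{start}, {end})");
        }
    }
}
```

With 10 elements and 4 ranks, each rank holds only 2 or 3 elements of each tensor instead of all 10, which is where the memory saving comes from.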

Constructors

ZeRO3Optimizer(IOptimizer<T, TInput, TOutput>, IShardingConfiguration<T>)

public ZeRO3Optimizer(IOptimizer<T, TInput, TOutput> wrappedOptimizer, IShardingConfiguration<T> config)

Parameters

wrappedOptimizer IOptimizer<T, TInput, TOutput>
config IShardingConfiguration<T>