Class ZeRO3Optimizer<T, TInput, TOutput>
- Namespace
- AiDotNet.DistributedTraining
- Assembly
- AiDotNet.dll
Implements the ZeRO Stage 3 optimizer: full sharding, equivalent to FSDP.
public class ZeRO3Optimizer<T, TInput, TOutput> : FSDPOptimizer<T, TInput, TOutput>, IShardedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
Type Parameters
T: The numeric type
TInput: The input type for the model
TOutput: The output type for the model
- Inheritance
- ShardedOptimizerBase<T, TInput, TOutput> → FSDPOptimizer<T, TInput, TOutput> → ZeRO3Optimizer<T, TInput, TOutput>
- Implements
- IShardedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>
Remarks
Strategy Overview: ZeRO-3 is equivalent to the FSDP optimizer: parameters, gradients, and optimizer states are all fully sharded. This class acts as an alias for FSDPOptimizer so that ZeRO terminology can be used consistently.
For Beginners: The ZeRO-3 and FSDP optimizers are the same thing, so use whichever name you prefer. Parameters, gradients, and optimizer states are all sharded across workers for maximum memory efficiency.
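To make "full sharding" concrete, the following is a minimal, library-independent sketch of how a flat vector of parameters might be split into contiguous slices, one per worker. It is an illustration only: `ShardingSketch` and `ShardRange` are hypothetical names, and the even-split-with-remainder scheme is an assumption, not necessarily how AiDotNet partitions data. Under full sharding, each worker would keep only its own slice of the parameters, gradients, and optimizer state.

```csharp
using System;

public static class ShardingSketch
{
    // Hypothetical helper: returns the half-open range [Start, Start + Length)
    // of a flat parameter vector owned by the given rank.
    // Assumption: elements are split as evenly as possible, with any
    // remainder assigned one element each to the lowest-numbered ranks.
    public static (int Start, int Length) ShardRange(int totalParams, int worldSize, int rank)
    {
        if (totalParams < 0) throw new ArgumentOutOfRangeException(nameof(totalParams));
        if (worldSize <= 0) throw new ArgumentOutOfRangeException(nameof(worldSize));
        if (rank < 0 || rank >= worldSize) throw new ArgumentOutOfRangeException(nameof(rank));

        int baseSize = totalParams / worldSize;
        int remainder = totalParams % worldSize;

        // Ranks below `remainder` hold one extra element.
        int length = baseSize + (rank < remainder ? 1 : 0);
        int start = rank * baseSize + Math.Min(rank, remainder);
        return (start, length);
    }

    public static void Main()
    {
        const int totalParams = 10;
        const int worldSize = 4;

        for (int rank = 0; rank < worldSize; rank++)
        {
            var (start, length) = ShardRange(totalParams, worldSize, rank);
            Console.WriteLine($"rank {rank}: parameters [{start}, {start + length})");
        }
    }
}
```

With 10 parameters and 4 workers, ranks 0 and 1 each own three elements and ranks 2 and 3 each own two, so no worker stores the full vector.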
Constructors
ZeRO3Optimizer(IOptimizer<T, TInput, TOutput>, IShardingConfiguration<T>)
public ZeRO3Optimizer(IOptimizer<T, TInput, TOutput> wrappedOptimizer, IShardingConfiguration<T> config)
Parameters
wrappedOptimizer: IOptimizer<T, TInput, TOutput>
config: IShardingConfiguration<T>
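A sketch of how the constructor might be called, using only the signature shown above. The helper class `ZeRO3Usage` and its `Wrap` method are hypothetical, and the `AiDotNet.Interfaces` namespace for `IOptimizer<T, TInput, TOutput>` is an assumption. How the inner optimizer and the sharding configuration are created is not covered on this page, so both are taken as arguments here. The snippet requires a reference to AiDotNet.dll.

```csharp
using AiDotNet.DistributedTraining;
using AiDotNet.Interfaces; // assumption: namespace that declares IOptimizer<T, TInput, TOutput>

public static class ZeRO3Usage
{
    // Hypothetical helper: wraps an existing optimizer in a ZeRO3Optimizer.
    // Both arguments are supplied by the caller; this page does not
    // document how to construct them.
    public static ZeRO3Optimizer<T, TInput, TOutput> Wrap<T, TInput, TOutput>(
        IOptimizer<T, TInput, TOutput> wrappedOptimizer,
        IShardingConfiguration<T> config)
    {
        return new ZeRO3Optimizer<T, TInput, TOutput>(wrappedOptimizer, config);
    }
}
```

Because ZeRO3Optimizer derives from FSDPOptimizer, the returned object can be used wherever an FSDPOptimizer<T, TInput, TOutput> or IShardedOptimizer<T, TInput, TOutput> is expected.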