Class LinearWarmupScheduler
- Namespace
- AiDotNet.LearningRateSchedulers
- Assembly
- AiDotNet.dll
Implements linear learning-rate warmup followed by a constant or decaying schedule.
public class LinearWarmupScheduler : LearningRateSchedulerBase, ILearningRateScheduler
- Inheritance
- LearningRateSchedulerBase
LinearWarmupScheduler
- Implements
- ILearningRateScheduler
Examples
// Warm up for 1000 steps, then decay linearly over the remaining 9000 steps
var scheduler = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 1000,
    totalSteps: 10000,
    decayMode: LinearWarmupScheduler.DecayMode.Linear
);
Remarks
Linear warmup gradually increases the learning rate from a small initial value to the target learning rate over a specified number of warmup steps. This is commonly used in transformer training and helps stabilize early training dynamics.
For Beginners: When training starts, the model's weights are random and can produce large, unstable gradients. Starting with a very small learning rate and gradually increasing it (warmup) helps the model stabilize before moving to the full learning rate. Think of it like warming up an engine before driving at full speed.
This scheduler supports three modes after warmup:
- Constant: Keep the base learning rate after warmup
- Linear decay: Linearly decrease to a minimum value
- Cosine decay: Use cosine annealing to decrease to a minimum value
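The shape of each phase can be written out as a formula. The block below is a sketch using the common textbook forms, not a statement of what AiDotNet computes internally: the symbols are introduced here for illustration ($t$ is the step, $W$ is warmupSteps, $T$ is totalSteps, $\eta_0$ is warmupInitLr, $\eta_b$ is baseLearningRate, $\eta_e$ is endLr), and the library may differ in step offsets or clamping.

```latex
% Warmup phase, 0 \le t < W (assumed form)
\eta(t) = \eta_0 + (\eta_b - \eta_0)\,\frac{t}{W}

% Decay progress for t \ge W, clamped to [0, 1] (assumed form)
p(t) = \min\!\left(1,\ \frac{t - W}{T - W}\right)

% Constant mode
\eta(t) = \eta_b

% Linear decay
\eta(t) = \eta_b + (\eta_e - \eta_b)\,p(t)

% Cosine decay
\eta(t) = \eta_e + \tfrac{1}{2}\,(\eta_b - \eta_e)\,\bigl(1 + \cos(\pi\,p(t))\bigr)
```

Under these assumptions all three modes start the post-warmup phase at $\eta_b$, and the two decay modes reach $\eta_e$ at $t = T$.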
Constructors
LinearWarmupScheduler(double, int, int, double, DecayMode?, double)
Initializes a new instance of the LinearWarmupScheduler class.
public LinearWarmupScheduler(double baseLearningRate, int warmupSteps, int totalSteps = 0, double warmupInitLr = 0, LinearWarmupScheduler.DecayMode? decayMode = null, double endLr = 0)
Parameters
- baseLearningRate double
The target learning rate after warmup.
- warmupSteps int
Number of warmup steps.
- totalSteps int
Total number of training steps (required for decay modes). Default: 0
- warmupInitLr double
Initial learning rate at the start of warmup. Default: 0
- decayMode LinearWarmupScheduler.DecayMode?
Decay mode after warmup. When null, automatically selects Linear decay if endLr differs from baseLearningRate, otherwise uses Constant. Default: null (auto-detect)
- endLr double
Final learning rate after decay. Default: 0
Exceptions
- ArgumentException
Thrown when parameters are invalid.
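The auto-detect rule described for decayMode can be sketched as a small helper. This is an illustration of the documented rule only: HypotheticalDecayMode and DecayModeResolver are names invented for this sketch, and the exact floating-point comparison is an assumption (the library may compare with a tolerance).

```csharp
using System;

// Hypothetical stand-in for LinearWarmupScheduler.DecayMode; not the library type.
public enum HypotheticalDecayMode { Constant, Linear, Cosine }

public static class DecayModeResolver
{
    // Mirrors the documented rule: an explicit mode wins; otherwise choose Linear
    // when endLr differs from baseLearningRate, else Constant.
    public static HypotheticalDecayMode Resolve(
        HypotheticalDecayMode? decayMode, double baseLearningRate, double endLr)
    {
        if (decayMode.HasValue)
        {
            return decayMode.Value;
        }

        // Assumption: exact inequality. The real implementation may use a tolerance.
        return endLr != baseLearningRate
            ? HypotheticalDecayMode.Linear
            : HypotheticalDecayMode.Constant;
    }
}
```

Read literally, the rule means that leaving both decayMode and endLr at their defaults with a positive baseLearningRate resolves to Linear, because 0 differs from the base rate. For a flat schedule after warmup, passing DecayMode.Constant explicitly (or setting endLr equal to baseLearningRate) avoids relying on auto-detection. How the default totalSteps of 0 interacts with this rule is not stated on this page.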
Properties
CurrentDecayMode
Gets the decay mode.
public LinearWarmupScheduler.DecayMode CurrentDecayMode { get; }
Property Value
- LinearWarmupScheduler.DecayMode
TotalSteps
Gets the total number of steps.
public int TotalSteps { get; }
Property Value
- int
WarmupSteps
Gets the number of warmup steps.
public int WarmupSteps { get; }
Property Value
- int
Methods
ComputeLearningRate(int)
Computes the learning rate for a given step.
protected override double ComputeLearningRate(int step)
Parameters
- step int
The step number.
Returns
- double
The computed learning rate.
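As a rough picture of what a per-step computation for this kind of schedule can look like, here is a minimal standalone sketch. It is not the AiDotNet implementation: SketchDecayMode, WarmupSketch and the parameter names are hypothetical, and the interpolation details (zero-based steps, clamping progress at 1) are assumptions.

```csharp
using System;

// Hypothetical stand-in for the scheduler's decay modes; not the library type.
public enum SketchDecayMode { Constant, Linear, Cosine }

public static class WarmupSketch
{
    public static double LearningRate(
        int step, double baseLr, int warmupSteps, int totalSteps,
        double warmupInitLr, SketchDecayMode mode, double endLr)
    {
        // Warmup phase: interpolate linearly from warmupInitLr up to baseLr.
        if (step < warmupSteps)
        {
            return warmupInitLr + (baseLr - warmupInitLr) * step / warmupSteps;
        }

        // Constant mode, or no steps left to decay over: hold the base rate.
        if (mode == SketchDecayMode.Constant || totalSteps <= warmupSteps)
        {
            return baseLr;
        }

        // Fraction of the decay phase completed, clamped so the rate stays at endLr
        // if training runs past totalSteps.
        double progress = Math.Min(
            1.0, (double)(step - warmupSteps) / (totalSteps - warmupSteps));

        if (mode == SketchDecayMode.Linear)
        {
            return baseLr + (endLr - baseLr) * progress;
        }

        // Cosine annealing from baseLr down to endLr.
        return endLr + 0.5 * (baseLr - endLr) * (1.0 + Math.Cos(Math.PI * progress));
    }
}
```

With the values from the Examples section (base rate 0.001, 1000 warmup steps, 10000 total steps, linear decay to 0), this sketch gives 0.0005 halfway through warmup at step 500, the full 0.001 at step 1000, and 0.0005 again at step 5500, halfway through the decay phase.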
GetState()
Gets the scheduler state for serialization/checkpointing.
public override Dictionary<string, object> GetState()
Returns
- Dictionary<string, object>
A dictionary containing the scheduler state.
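For checkpointing, a Dictionary<string, object> such as the one GetState() returns can be written to JSON with System.Text.Json. The helper below is a sketch under that assumption: SchedulerStateIo is a name invented here, and this page does not list the keys the scheduler actually stores, so none are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for persisting a scheduler state dictionary; not part of AiDotNet.
public static class SchedulerStateIo
{
    // Serializes a state dictionary (the shape GetState() returns) to a JSON string.
    public static string ToJson(Dictionary<string, object> state)
    {
        return JsonSerializer.Serialize(state);
    }

    // Reads the JSON back. Values are returned as JsonElement rather than the
    // original int or double, so callers convert them explicitly
    // (for example with GetInt32() or GetDouble()).
    public static Dictionary<string, JsonElement> FromJson(string json)
    {
        return JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json)
            ?? new Dictionary<string, JsonElement>();
    }
}
```

A call such as SchedulerStateIo.ToJson(scheduler.GetState()) would produce the checkpoint string. The explicit JsonElement return type is a deliberate choice: deserializing into Dictionary<string, object> with System.Text.Json also yields JsonElement values, which is easy to miss when restoring numeric state.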