Class LinearWarmupScheduler

Namespace
AiDotNet.LearningRateSchedulers
Assembly
AiDotNet.dll

Implements linear learning rate warmup followed by a constant or decaying schedule.

public class LinearWarmupScheduler : LearningRateSchedulerBase, ILearningRateScheduler
Inheritance
LearningRateSchedulerBase
LinearWarmupScheduler
Implements
ILearningRateScheduler

Examples

// Warmup for 1000 steps, then linear decay over remaining 9000 steps
var scheduler = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 1000,
    totalSteps: 10000,
    decayMode: LinearWarmupScheduler.DecayMode.Linear
);

Remarks

Linear warmup gradually increases the learning rate from a small initial value to the target learning rate over a specified number of warmup steps. This is commonly used in transformer training and helps stabilize early training dynamics.

For Beginners: When training starts, the model's weights are random and can produce large, unstable gradients. Starting with a very small learning rate and gradually increasing it (warmup) helps the model stabilize before moving to the full learning rate. Think of it like warming up an engine before driving at full speed.

This scheduler supports three modes after warmup:
- Constant: Keep the base learning rate after warmup
- Linear decay: Linearly decrease to a minimum value
- Cosine decay: Use cosine annealing to decrease to a minimum value
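
The warmup phase and the three post-warmup modes can be illustrated with a standalone sketch. It does not call AiDotNet and is not the library's implementation: the interpolation formulas, the clamping at totalSteps, the string mode names, and the helper names (Warmup, DecayProgress, LinearDecay, CosineDecay, Schedule) are all assumptions chosen to match the description above.

```csharp
using System;

// Same settings as the example above: 1000 warmup steps, 10000 total steps.
foreach (int step in new[] { 0, 500, 1000, 5500, 10000 })
    Console.WriteLine($"{step}: {Schedule(step, "linear", 0.001, 1000, 10000):G4}");

// Hypothetical helpers, not part of AiDotNet.

// Linear ramp from initLr at step 0 to baseLr at step == warmupSteps.
double Warmup(int step, int warmupSteps, double initLr, double baseLr)
    => initLr + (baseLr - initLr) * step / warmupSteps;

// Fraction of the decay phase completed, clamped to [0, 1].
// Assumes totalSteps > warmupSteps.
double DecayProgress(int step, int warmupSteps, int totalSteps)
    => Math.Clamp((double)(step - warmupSteps) / (totalSteps - warmupSteps), 0.0, 1.0);

// Straight line from baseLr (progress 0) to endLr (progress 1).
double LinearDecay(double progress, double baseLr, double endLr)
    => baseLr + (endLr - baseLr) * progress;

// Half cosine wave from baseLr (progress 0) to endLr (progress 1).
double CosineDecay(double progress, double baseLr, double endLr)
    => endLr + 0.5 * (baseLr - endLr) * (1.0 + Math.Cos(Math.PI * progress));

// mode is "constant", "linear" or "cosine" (plain strings keep the sketch short).
double Schedule(int step, string mode, double baseLr, int warmupSteps,
                int totalSteps, double initLr = 0.0, double endLr = 0.0)
{
    if (step < warmupSteps)
        return Warmup(step, warmupSteps, initLr, baseLr);
    if (mode == "constant")
        return baseLr;
    double progress = DecayProgress(step, warmupSteps, totalSteps);
    return mode == "cosine"
        ? CosineDecay(progress, baseLr, endLr)
        : LinearDecay(progress, baseLr, endLr);
}
```

Under these assumptions the rate climbs from 0 to 0.001 over the first 1000 steps, is back at half the base rate by step 5500, and reaches endLr at step 10000. The cosine variant passes through the same midpoint but changes more slowly near both ends of the decay phase.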

Constructors

LinearWarmupScheduler(double, int, int, double, DecayMode?, double)

Initializes a new instance of the LinearWarmupScheduler class.

public LinearWarmupScheduler(double baseLearningRate, int warmupSteps, int totalSteps = 0, double warmupInitLr = 0, LinearWarmupScheduler.DecayMode? decayMode = null, double endLr = 0)

Parameters

baseLearningRate double

The target learning rate after warmup.

warmupSteps int

Number of warmup steps.

totalSteps int

Total number of training steps (required for decay modes). Default: 0

warmupInitLr double

Initial learning rate at start of warmup. Default: 0

decayMode LinearWarmupScheduler.DecayMode?

Decay mode after warmup. When null, automatically selects Linear decay if endLr differs from baseLearningRate, otherwise uses Constant. Default: null (auto-detect)
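
The auto-detect rule for a null decayMode can be written out as a small standalone sketch. ResolveDecayMode is a hypothetical helper, not an AiDotNet member; it returns mode names as strings instead of LinearWarmupScheduler.DecayMode values, and the tolerance used to compare the two rates is an assumption, since the documentation does not say how the comparison is made.

```csharp
using System;

Console.WriteLine(ResolveDecayMode(null, 0.001, 0.001));   // endLr equals the base rate
Console.WriteLine(ResolveDecayMode(null, 0.001, 0.0));     // endLr left at its default of 0
Console.WriteLine(ResolveDecayMode("Cosine", 0.001, 0.0)); // explicit mode

// Hypothetical helper mirroring the documented rule for decayMode == null.
string ResolveDecayMode(string? requested, double baseLearningRate, double endLr)
{
    // An explicitly requested mode is used as given.
    if (requested != null)
        return requested;

    // Assumption: rates are compared with a small tolerance.
    bool sameRate = Math.Abs(endLr - baseLearningRate) < 1e-12;
    return sameRate ? "Constant" : "Linear";
}
```

One consequence of the rule as documented: because endLr defaults to 0, leaving both decayMode and endLr unset selects Linear decay for any nonzero baseLearningRate, so totalSteps presumably needs to be supplied in that case. Passing endLr equal to baseLearningRate is what selects Constant.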

endLr double

Final learning rate after decay. Default: 0

Exceptions

ArgumentException

Thrown when parameters are invalid.

Properties

CurrentDecayMode

Gets the decay mode applied after warmup.

public LinearWarmupScheduler.DecayMode CurrentDecayMode { get; }

Property Value

LinearWarmupScheduler.DecayMode

TotalSteps

Gets the total number of steps.

public int TotalSteps { get; }

Property Value

int

WarmupSteps

Gets the number of warmup steps.

public int WarmupSteps { get; }

Property Value

int

Methods

ComputeLearningRate(int)

Computes the learning rate for a given step.

protected override double ComputeLearningRate(int step)

Parameters

step int

The step number.

Returns

double

The computed learning rate.

GetState()

Gets the scheduler state for serialization/checkpointing.

public override Dictionary<string, object> GetState()

Returns

Dictionary<string, object>

A dictionary containing the scheduler state.
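
As a rough illustration of checkpointing, a Dictionary<string, object> like the one GetState() returns can be round-tripped through System.Text.Json. The sketch below builds a stand-in dictionary instead of calling the scheduler, and every key and value in it is hypothetical; the actual contents of the state dictionary are not listed on this page.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Stand-in for the dictionary returned by scheduler.GetState().
// These keys and values are hypothetical.
var state = new Dictionary<string, object>
{
    ["current_step"] = 2500,
    ["warmup_steps"] = 1000,
    ["total_steps"] = 10000,
    ["base_lr"] = 0.001,
};

// Write the state into a checkpoint string.
string json = JsonSerializer.Serialize(state);

// Read it back. Values arrive as JsonElement, so convert each one explicitly.
var restored = JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json)!;
int currentStep = restored["current_step"].GetInt32();
double baseLr = restored["base_lr"].GetDouble();

Console.WriteLine($"resume at step {currentStep} with base rate {baseLr}");
```

Feeding the restored values back into a scheduler is left out here because this page does not document a method for loading state.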