Class LinearWarmupScheduler

Namespace: AiDotNet.LearningRateSchedulers

Assembly: AiDotNet.dll

Implements linear learning rate warmup followed by a constant or decaying schedule.

public class LinearWarmupScheduler : LearningRateSchedulerBase, ILearningRateScheduler

Inheritance: object → LearningRateSchedulerBase → LinearWarmupScheduler

Implements: ILearningRateScheduler

Inherited Members: LearningRateSchedulerBase._baseLearningRate

LearningRateSchedulerBase._currentLearningRate

LearningRateSchedulerBase._currentStep

LearningRateSchedulerBase._minLearningRate

LearningRateSchedulerBase.CurrentLearningRate

LearningRateSchedulerBase.BaseLearningRate

LearningRateSchedulerBase.CurrentStep

LearningRateSchedulerBase.Step()

LearningRateSchedulerBase.GetLearningRateAtStep(int)

LearningRateSchedulerBase.Reset()

LearningRateSchedulerBase.LoadState(Dictionary<string, object>)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Examples

// Warm up for 1000 steps, then decay linearly over the remaining 9000 steps
var scheduler = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 1000,
    totalSteps: 10000,
    decayMode: LinearWarmupScheduler.DecayMode.Linear
);
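As a rough sketch of how the scheduler above might be driven from a training loop: the loop reads CurrentLearningRate before each update and then calls Step(), both of which are inherited from LearningRateSchedulerBase. The training work itself is only a placeholder comment, and the read-then-step ordering is an assumption, not something this page specifies.

```csharp
using System;
using AiDotNet.LearningRateSchedulers;

var scheduler = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 1000,
    totalSteps: 10000,
    decayMode: LinearWarmupScheduler.DecayMode.Linear
);

for (int step = 0; step < scheduler.TotalSteps; step++)
{
    // Learning rate to use for this update.
    var lr = scheduler.CurrentLearningRate;

    // Placeholder: run one optimizer update with lr here (hypothetical, not shown).

    if (step % 1000 == 0)
    {
        Console.WriteLine($"step {step}: lr = {lr}");
    }

    // Advance the schedule by one step.
    scheduler.Step();
}
```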

Remarks

Linear warmup gradually increases the learning rate from a small initial value to the target learning rate over a specified number of warmup steps. This is commonly used in transformer training and helps stabilize early training dynamics.

For Beginners: When training starts, the model's weights are random and can produce large, unstable gradients. Starting with a very small learning rate and gradually increasing it (warmup) helps the model stabilize before moving to the full learning rate. Think of it like warming up an engine before driving at full speed.

This scheduler supports three modes after warmup:

- Constant: Keep the base learning rate after warmup
- Linear decay: Linearly decrease to a minimum value
- Cosine decay: Use cosine annealing to decrease to a minimum value
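The warmup phase and the three post-warmup modes can be sketched as a standalone function. This is an illustration of the general technique, not the library's implementation: the WarmupSketch class, its Mode enum, and the LearningRateAt function are hypothetical names, and the sketch assumes that the decay runs from the end of warmup to totalSteps and that endLr is the minimum value reached.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class WarmupSketch
{
    // Stand-in for the three post-warmup modes described above.
    public enum Mode { Constant, Linear, Cosine }

    public static double LearningRateAt(
        int step, double baseLr, int warmupSteps, int totalSteps,
        double warmupInitLr, double endLr, Mode mode)
    {
        // Warmup: interpolate linearly from warmupInitLr up to baseLr.
        if (step < warmupSteps)
        {
            return warmupInitLr + (baseLr - warmupInitLr) * step / warmupSteps;
        }

        // Constant mode (or no room left to decay): hold the base rate.
        if (mode == Mode.Constant || totalSteps <= warmupSteps)
        {
            return baseLr;
        }

        // Fraction of the decay phase completed, clamped to at most 1.
        double progress = Math.Min(
            1.0, (double)(step - warmupSteps) / (totalSteps - warmupSteps));

        if (mode == Mode.Linear)
        {
            // Straight line from baseLr down to endLr.
            return baseLr + (endLr - baseLr) * progress;
        }

        // Cosine annealing from baseLr down to endLr.
        return endLr + 0.5 * (baseLr - endLr) * (1.0 + Math.Cos(Math.PI * progress));
    }
}
```

With baseLr = 0.001, warmupSteps = 1000, totalSteps = 10000, and both warmupInitLr and endLr at 0, this sketch gives half the base rate at step 500 (midway through warmup) and again at step 5500 (midway through the decay phase) in both the linear and cosine modes.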

Constructors

LinearWarmupScheduler(double, int, int, double, DecayMode?, double)

Initializes a new instance of the LinearWarmupScheduler class.

public LinearWarmupScheduler(double baseLearningRate, int warmupSteps, int totalSteps = 0, double warmupInitLr = 0, LinearWarmupScheduler.DecayMode? decayMode = null, double endLr = 0)

Parameters

baseLearningRate double: The target learning rate after warmup.
warmupSteps int: Number of warmup steps.
totalSteps int: Total number of training steps (required for decay modes). Default: 0
warmupInitLr double: Initial learning rate at start of warmup. Default: 0
decayMode LinearWarmupScheduler.DecayMode?: Decay mode after warmup. When null, automatically selects Linear decay if endLr differs from baseLearningRate, otherwise uses Constant. Default: null (auto-detect)
endLr double: Final learning rate after decay. Default: 0
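To illustrate the auto-detect rule described for decayMode, the sketch below constructs two schedulers without passing decayMode and prints the mode each one resolves to. The specific values are arbitrary, and the behavior noted in the comments is taken from the parameter description above rather than verified against the implementation.

```csharp
using System;
using AiDotNet.LearningRateSchedulers;

// endLr equals baseLearningRate and decayMode is left null, so the
// parameter description suggests the mode resolves to Constant.
var constantAfterWarmup = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 500,
    totalSteps: 5000,
    endLr: 0.001
);
Console.WriteLine(constantAfterWarmup.CurrentDecayMode);

// endLr differs from baseLearningRate, so the description suggests
// the mode resolves to Linear.
var linearAfterWarmup = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 500,
    totalSteps: 5000,
    warmupInitLr: 1e-6,
    endLr: 1e-5
);
Console.WriteLine(linearAfterWarmup.CurrentDecayMode);
```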

Exceptions

ArgumentException: Thrown when parameters are invalid.

Properties

CurrentDecayMode

Gets the decay mode applied after warmup.

public LinearWarmupScheduler.DecayMode CurrentDecayMode { get; }

Property Value

LinearWarmupScheduler.DecayMode

TotalSteps

Gets the total number of steps.

public int TotalSteps { get; }

Property Value

int

WarmupSteps

Gets the number of warmup steps.

public int WarmupSteps { get; }

Property Value

int

Methods

ComputeLearningRate(int)

Computes the learning rate for a given step.

protected override double ComputeLearningRate(int step)

Parameters

step int: The step number.

Returns

double: The computed learning rate.

GetState()

Gets the scheduler state for serialization/checkpointing.

public override Dictionary<string, object> GetState()

Returns

Dictionary<string, object>: A dictionary containing the scheduler state.
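A checkpoint round trip might look like the sketch below, which pairs GetState() with the inherited LoadState(Dictionary<string, object>). The dictionary's keys are not documented on this page, so the sketch treats the state as opaque. Constructing the restored scheduler with the same arguments as the original, and expecting LoadState to restore the step counter, are both assumptions.

```csharp
using System;
using System.Collections.Generic;
using AiDotNet.LearningRateSchedulers;

var scheduler = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 1000,
    totalSteps: 10000,
    decayMode: LinearWarmupScheduler.DecayMode.Linear
);

// Advance partway into warmup before checkpointing.
for (int i = 0; i < 250; i++)
{
    scheduler.Step();
}

// Capture the state; persist it with whatever serializer the project uses.
Dictionary<string, object> state = scheduler.GetState();

// Later: rebuild a scheduler with the same configuration and restore into it.
var restored = new LinearWarmupScheduler(
    baseLearningRate: 0.001,
    warmupSteps: 1000,
    totalSteps: 10000,
    decayMode: LinearWarmupScheduler.DecayMode.Linear
);
restored.LoadState(state);

Console.WriteLine($"restored step: {restored.CurrentStep}, lr: {restored.CurrentLearningRate}");
```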
