Class LionOptimizerOptions<T, TInput, TOutput>
Configuration options for the Lion (Evolved Sign Momentum) optimization algorithm.
public class LionOptimizerOptions<T, TInput, TOutput> : GradientBasedOptimizerOptions<T, TInput, TOutput>
Type Parameters
- T
- TInput
- TOutput
Inheritance
- OptimizationAlgorithmOptions<T, TInput, TOutput>
- GradientBasedOptimizerOptions<T, TInput, TOutput>
- LionOptimizerOptions<T, TInput, TOutput>
Inherited Members
Remarks
Lion is a modern optimization algorithm discovered through symbolic program search that offers significant advantages over Adam, including roughly half the optimizer state memory and strong performance on large models. Unlike Adam, which maintains both a momentum and a variance estimate for every parameter, Lion keeps only a single momentum state and relies on sign-based updates for improved efficiency and generalization.
For Beginners: Think of Lion as a streamlined version of Adam that focuses on the direction of learning (not the magnitude). It's like a compass that only tells you which way to go, making decisions faster and using less memory. Lion is particularly effective for training large neural networks and transformers, where it can achieve better results than Adam while using half the memory.
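To make the mechanics concrete, a single Lion update step looks roughly like the following sketch. This is illustrative C#, not this library's implementation; the LionStep method and its variable names are hypothetical.

using System;

static class LionSketch
{
    static void LionStep(
        double[] weights, double[] grads, double[] momentum,
        double lr, double beta1, double beta2, double weightDecay)
    {
        for (int i = 0; i < weights.Length; i++)
        {
            // Blend past momentum with the current gradient (controlled by Beta1)...
            double c = beta1 * momentum[i] + (1 - beta1) * grads[i];

            // ...then step by the SIGN of that blend, plus decoupled weight decay.
            weights[i] -= lr * (Math.Sign(c) + weightDecay * weights[i]);

            // Update the single momentum state for the next step (controlled by Beta2).
            momentum[i] = beta2 * momentum[i] + (1 - beta2) * grads[i];
        }
    }
}

Because only the momentum array is stored (no per-parameter variance as in Adam), the optimizer state is about half the size.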
Properties
BatchSize
Gets or sets the batch size for mini-batch gradient descent.
public int BatchSize { get; set; }
Property Value
- int
A positive integer, defaulting to 32.
Remarks
For Beginners: The batch size controls how many examples the optimizer looks at before making an update to the model. The default of 32 is a good balance for Lion.
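As a quick illustration of what this means per epoch (hypothetical numbers, not library code):

// With 10,000 training examples and the default batch size of 32,
// one pass over the data performs ceil(10000 / 32) = 313 parameter updates.
int updatesPerEpoch = (int)Math.Ceiling(10_000 / 32.0); // 313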
Beta1
Gets or sets the exponential decay rate for the momentum interpolation (used for computing the update).
public double Beta1 { get; set; }
Property Value
- double
The beta1 value, defaulting to 0.9.
Remarks
For Beginners: Beta1 controls how much Lion blends the current gradient with past momentum when deciding which direction to move. A value of 0.9 means it gives 90% weight to the past momentum and 10% to the new gradient. This is like having inertia - you don't change direction immediately when you get new information. Higher values (closer to 1) create smoother updates but slower adaptation, while lower values respond more quickly to new gradients.
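A small worked example of this blend, using hypothetical values:

double beta1 = 0.9;
double m = 0.5;   // past momentum, pointing in the positive direction
double g = -1.0;  // new gradient, pointing in the negative direction
double c = beta1 * m + (1 - beta1) * g; // 0.9 * 0.5 + 0.1 * (-1.0) = 0.35
// Lion steps in the direction Math.Sign(c) = +1: the inertia of past
// momentum wins over a single contradictory gradient.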
Beta1DecreaseFactor
Gets or sets the factor by which Beta1 is decreased when fitness does not improve.
public double Beta1DecreaseFactor { get; set; }
Property Value
- double
The Beta1 decrease factor, defaulting to 0.98.
Remarks
For Beginners: When adaptive Beta1 is enabled and the optimizer is not improving, Beta1 is multiplied by this factor. A value of 0.98 means Beta1 decreases by 2% each time fitness doesn't improve. Lower Beta1 values make the optimizer more responsive to new gradients.
Beta1IncreaseFactor
Gets or sets the factor by which Beta1 is increased when fitness improves.
public double Beta1IncreaseFactor { get; set; }
Property Value
- double
The Beta1 increase factor, defaulting to 1.02.
Remarks
For Beginners: When adaptive Beta1 is enabled and the optimizer is improving, Beta1 is multiplied by this factor. A value of 1.02 means Beta1 increases by 2% each time fitness improves. Higher Beta1 values create smoother, more stable updates.
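Combined with Beta1DecreaseFactor, the adaptive rule described above amounts to something like the following sketch. The fitnessImproved flag and the clamping to MinBeta1/MaxBeta1 (both documented below) are assumptions about how the pieces fit together, not a copy of the library's code:

if (fitnessImproved)
    beta1 *= options.Beta1IncreaseFactor; // e.g. 0.90 * 1.02 = 0.918
else
    beta1 *= options.Beta1DecreaseFactor; // e.g. 0.90 * 0.98 = 0.882
beta1 = Math.Clamp(beta1, options.MinBeta1, options.MaxBeta1); // stay within [0.85, 0.95]

The same pattern applies to Beta2 with its own factors and bounds.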
Beta2
Gets or sets the exponential decay rate for updating the momentum state.
public double Beta2 { get; set; }
Property Value
- double
The beta2 value, defaulting to 0.99.
Remarks
For Beginners: Beta2 controls how much Lion remembers from its momentum history when updating the momentum state for the next iteration. A value of 0.99 means it retains 99% of the old momentum and incorporates 1% from the new gradient. This creates a long memory of past gradients, helping smooth out noisy updates. Think of it like a heavy flywheel that doesn't change speed quickly - it provides stability during training.
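A worked example of how slowly the momentum state moves with the default Beta2 (hypothetical numbers):

double beta2 = 0.99;
double m = 0.5;   // current momentum state
double g = -1.0;  // new, strongly opposing gradient
m = beta2 * m + (1 - beta2) * g; // 0.99 * 0.5 + 0.01 * (-1.0) = 0.485
// Even a strongly opposing gradient barely moves the flywheel.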
Beta2DecreaseFactor
Gets or sets the factor by which Beta2 is decreased when fitness does not improve.
public double Beta2DecreaseFactor { get; set; }
Property Value
- double
The Beta2 decrease factor, defaulting to 0.98.
Remarks
For Beginners: When adaptive Beta2 is enabled and the optimizer is not improving, Beta2 is multiplied by this factor. A value of 0.98 means Beta2 decreases by 2% each time fitness doesn't improve. Lower Beta2 values make the momentum state more responsive to recent changes.
Beta2IncreaseFactor
Gets or sets the factor by which Beta2 is increased when fitness improves.
public double Beta2IncreaseFactor { get; set; }
Property Value
- double
The Beta2 increase factor, defaulting to 1.02.
Remarks
For Beginners: When adaptive Beta2 is enabled and the optimizer is improving, Beta2 is multiplied by this factor. A value of 1.02 means Beta2 increases by 2% each time fitness improves. Higher Beta2 values create longer memory of past gradients for more stability.
InitialLearningRate
Gets or sets the initial learning rate for the Lion optimizer.
public override double InitialLearningRate { get; set; }
Property Value
- double
The learning rate, defaulting to 0.0001.
Remarks
For Beginners: The learning rate controls how big each step is during training. Lion typically uses a smaller learning rate (0.0001) compared to Adam (0.001) because sign-based updates provide more consistent step sizes. Think of it like setting the speed of your car - Lion moves more carefully but more reliably. You may need to tune this based on your problem, but 0.0001 (or 1e-4) is a good starting point for most applications.
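A hypothetical configuration using the defaults discussed on this page. The generic type arguments here are illustrative; substitute the numeric, input, and output types your model actually uses:

var options = new LionOptimizerOptions<double, double[], double[]>
{
    InitialLearningRate = 0.0001, // Lion's default; ~10x smaller than Adam's typical 0.001
    BatchSize = 32,
    WeightDecay = 0.01            // optional decoupled regularization (see WeightDecay below)
};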
MaxBeta1
Gets or sets the maximum allowed value for Beta1.
public double MaxBeta1 { get; set; }
Property Value
- double
The maximum Beta1 value, defaulting to 0.95.
Remarks
For Beginners: If adaptive Beta1 is enabled, this prevents it from becoming too high. A maximum of 0.95 ensures some responsiveness to new gradients.
MaxBeta2
Gets or sets the maximum allowed value for Beta2.
public double MaxBeta2 { get; set; }
Property Value
- double
The maximum Beta2 value, defaulting to 0.999.
Remarks
For Beginners: If adaptive Beta2 is enabled, this prevents it from becoming too high. A maximum of 0.999 ensures the momentum state can still adapt to changes.
MinBeta1
Gets or sets the minimum allowed value for Beta1.
public double MinBeta1 { get; set; }
Property Value
- double
The minimum Beta1 value, defaulting to 0.85.
Remarks
For Beginners: If adaptive Beta1 is enabled, this prevents it from dropping too low. A minimum of 0.85 ensures some momentum is always maintained.
MinBeta2
Gets or sets the minimum allowed value for Beta2.
public double MinBeta2 { get; set; }
Property Value
- double
The minimum Beta2 value, defaulting to 0.95.
Remarks
For Beginners: If adaptive Beta2 is enabled, this prevents it from dropping too low. A minimum of 0.95 ensures momentum state retains sufficient history.
UseAdaptiveBeta1
Gets or sets whether to automatically adjust Beta1 during training.
public bool UseAdaptiveBeta1 { get; set; }
Property Value
- bool
False by default, as Lion typically uses fixed betas.
Remarks
For Beginners: When enabled, Beta1 will be automatically adjusted based on training progress. However, Lion was designed to work well with fixed beta values, so this is disabled by default. Unlike Adam, Lion is less sensitive to beta parameter choices due to its sign-based updates. You typically don't need to enable this unless you're doing advanced experimentation.
UseAdaptiveBeta2
Gets or sets whether to automatically adjust Beta2 during training.
public bool UseAdaptiveBeta2 { get; set; }
Property Value
- bool
False by default, as Lion typically uses fixed betas.
Remarks
For Beginners: When enabled, Beta2 will be automatically adjusted based on training progress. However, Lion was designed to work well with fixed beta values, so this is disabled by default. The sign-based nature of Lion makes it robust to beta parameter variations.
WeightDecay
Gets or sets the weight decay (L2 regularization) coefficient.
public double WeightDecay { get; set; }
Property Value
- double
The weight decay value, defaulting to 0.0.
Remarks
For Beginners: Weight decay helps prevent overfitting by penalizing large parameter values. A value of 0.0 means no weight decay. When set to a small positive value (e.g., 0.01 or 0.1), it encourages the model to keep weights small, which often improves generalization to new data. Think of it like a tax on complexity - it encourages the model to be as simple as possible while still solving the problem. Lion applies weight decay in a decoupled manner, similar to AdamW.
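In decoupled form, the penalty is applied directly to the weight alongside the sign-based step rather than being added to the gradient before the momentum blend, mirroring the update sketched in the class remarks (illustrative, not the library's code):

// Decoupled weight decay: the shrink-toward-zero term scales with the
// weight itself and is unaffected by the sign-based direction.
weights[i] -= lr * (Math.Sign(c) + weightDecay * weights[i]);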