Class AdaMaxOptimizerOptions<T, TInput, TOutput>
Configuration options for the AdaMax optimization algorithm, a variant of Adam that uses the infinity norm.
public class AdaMaxOptimizerOptions<T, TInput, TOutput> : GradientBasedOptimizerOptions<T, TInput, TOutput>
Type Parameters
- T
- TInput
- TOutput
- Inheritance
OptimizationAlgorithmOptions<T, TInput, TOutput>
GradientBasedOptimizerOptions<T, TInput, TOutput>
AdaMaxOptimizerOptions<T, TInput, TOutput>
Remarks
AdaMax is a variant of the Adam optimizer that uses the infinity norm instead of the L2 norm in the update rule. This can make it more stable for certain types of problems, especially those with sparse gradients.
For Beginners: AdaMax is like a specialized version of a popular learning algorithm (Adam) that's particularly good at handling situations where most values are zero with occasional large values. Think of it as a specialized tool that works better than general-purpose tools for certain specific tasks.
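To make the difference from Adam concrete, the sketch below walks a single weight through the AdaMax update rule as published by Kingma and Ba (2015). This is an illustrative standalone program, not the library's internal implementation; the toy objective f(w) = w * w and all variable names are assumptions for demonstration only.

using System;

// Minimize f(w) = w * w with AdaMax; the gradient is 2w.
// m is the first moment: an exponential moving average of gradients (Beta1).
// u is the infinity norm: the exponentially decayed maximum of |gradient| (Beta2).
double beta1 = 0.9, beta2 = 0.999, learningRate = 0.002, epsilon = 1e-8;
double m = 0.0, u = 0.0, weight = 0.5;

for (int t = 1; t <= 1000; t++)
{
    double gradient = 2 * weight;                   // gradient of w * w
    m = beta1 * m + (1 - beta1) * gradient;         // same first-moment update as Adam
    u = Math.Max(beta2 * u, Math.Abs(gradient));    // decayed max of magnitudes replaces Adam's L2 average
    double mHat = m / (1 - Math.Pow(beta1, t));     // bias correction for the first moment
    weight -= learningRate * mHat / (u + epsilon);  // Epsilon guards against division by zero
}

Console.WriteLine(weight); // ends very close to 0

Because u is a maximum rather than an average, a run of zero gradients (common with sparse data) cannot drag the step-size scale toward zero the way Adam's squared-gradient average can, which is the stability advantage described above.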
Properties
BatchSize
Gets or sets the batch size for mini-batch gradient descent.
public int BatchSize { get; set; }
Property Value
- int
A positive integer, defaulting to 32.
Remarks
For Beginners: The batch size controls how many training examples the optimizer looks at before making each update to the model. The default of 32 is a good balance between the stability of each update and the speed of training for AdaMax.
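For a sense of scale, the batch size fixes how many parameter updates one pass over the data produces. A minimal illustration (the dataset size here is hypothetical):

int datasetSize = 10_000;                                        // hypothetical training set
int batchSize = 32;                                              // the default
int updatesPerEpoch = (datasetSize + batchSize - 1) / batchSize; // ceil(10000 / 32) = 313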
Beta1
Gets or sets the exponential decay rate for the first moment estimates.
public double Beta1 { get; set; }
Property Value
- double
The beta1 value, defaulting to 0.9.
Remarks
For Beginners: Beta1 controls how much the algorithm remembers about the direction it was moving in previous steps. A value of 0.9 means it gives about 90% importance to past directions and 10% to the new direction. Think of it like steering a boat - you don't want to change direction completely with every small wave (which would make for a zigzag path), but rather maintain your general course while making small adjustments.
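Concretely, the first moment is an exponential moving average of gradients. A minimal sketch of how Beta1 = 0.9 blends the accumulated direction with each new gradient (the gradient values are made up):

double beta1 = 0.9;
double m = 0.0;
foreach (double gradient in new[] { 1.0, 1.0, -1.0 })
{
    m = beta1 * m + (1 - beta1) * gradient; // keep 90% of the old direction, add 10% of the new
}
// m is about 0.071: even after the sign flip, the remembered direction stays positive,
// so one contrary gradient nudges the course instead of reversing it.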
Beta2
Gets or sets the exponential decay rate for the infinity norm of past gradients.
public double Beta2 { get; set; }
Property Value
- double
The beta2 value, defaulting to 0.999.
Remarks
For Beginners: Beta2 controls how much the algorithm remembers about the size of past steps. A value of 0.999 means it has a very long memory for step sizes. This helps the algorithm adapt to different parts of the learning process - taking appropriately sized steps whether it's making big initial adjustments or fine-tuning at the end. Think of it like remembering how difficult different parts of a hiking trail were, so you can pace yourself appropriately.
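Unlike Adam, AdaMax tracks a decayed maximum of gradient magnitudes rather than an average of their squares. The made-up values below show how one large gradient raises the scale u, and how slowly that memory fades under Beta2 = 0.999:

using System;

double beta2 = 0.999;
double u = 0.0;
foreach (double gradient in new[] { 0.1, 5.0, 0.1, 0.1 })
{
    u = Math.Max(beta2 * u, Math.Abs(gradient)); // either decay the old max or adopt the new magnitude
}
// u is about 4.99: the spike of 5.0 still dominates, so subsequent steps
// (proportional to m / u) stay small until the memory of the spike decays.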
Epsilon
Gets or sets a small constant added to denominators to prevent division by zero.
public double Epsilon { get; set; }
Property Value
- double
The epsilon value, defaulting to 1e-8 (0.00000001).
Remarks
For Beginners: Epsilon is a tiny safety value that prevents the algorithm from crashing when it would otherwise divide by zero. It's like having training wheels that only activate when needed. You typically don't need to change this unless you're experiencing numerical stability issues.
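A minimal illustration of the guard, with made-up values: before any gradient magnitude has accumulated, the infinity norm is still zero, and the update's division would blow up without Epsilon.

double u = 0.0;                       // no gradient magnitude accumulated yet
double epsilon = 1e-8;
double safeDenominator = u + epsilon; // 1e-8 instead of 0, so the division stays finite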
InitialLearningRate
Gets or sets the learning rate, which controls how quickly the model adapts to the problem.
public override double InitialLearningRate { get; set; }
Property Value
- double
The learning rate, defaulting to 0.002.
Remarks
For Beginners: The learning rate is like the size of steps you take when searching for something. A larger value (like 0.1) means taking bigger steps, which can help you find the solution faster but might cause you to step over it. A smaller value (like 0.001) means taking smaller steps, which is slower but more precise. The default of 0.002 is a good balance for most problems when using AdaMax.
LearningRateDecreaseFactor
Gets or sets the factor by which the learning rate decreases when performance worsens.
public double LearningRateDecreaseFactor { get; set; }
Property Value
- double
The learning rate decrease factor, defaulting to 0.95.
Remarks
For Beginners: When the model is getting worse, the learning rate will be decreased by this factor. A value of 0.95 means the learning rate becomes 95% of its previous value, causing the model to take smaller steps when it might be heading in the wrong direction. This is like slowing down when you're unsure of your path.
LearningRateIncreaseFactor
Gets or sets the factor by which the learning rate increases when performance improves.
public double LearningRateIncreaseFactor { get; set; }
Property Value
- double
The learning rate increase factor, defaulting to 1.05.
Remarks
For Beginners: When the model is improving, the learning rate will be increased by this factor. A value of 1.05 means the learning rate becomes 105% of its previous value, allowing the model to learn faster when it's on the right track. This is like increasing your pace when you're heading in the right direction.
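Together with LearningRateDecreaseFactor, this gives a simple "reward progress, punish regress" schedule. A hedged sketch of the idea; the library's actual adjustment logic and the loss values here are assumptions:

double learningRate = 0.002;
double previousLoss = 0.52, currentLoss = 0.48;  // hypothetical losses from two evaluations
bool improved = currentLoss < previousLoss;
learningRate *= improved ? 1.05                  // LearningRateIncreaseFactor: speed up on the right track
                         : 0.95;                 // LearningRateDecreaseFactor: slow down when getting worse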
MaxLearningRate
Gets or sets the maximum allowed learning rate, overriding the base class value with a value optimized for AdaMax.
public double MaxLearningRate { get; set; }
Property Value
- double
The maximum learning rate, defaulting to 0.1.
Remarks
For Beginners: This sets a ceiling on how fast the learning can get. Even if the algorithm wants to increase the learning rate further, it won't go above this value. For AdaMax, we cap it at 0.1 to prevent the algorithm from taking steps that are too large, which could cause instability. Think of it as a speed limit that keeps the algorithm from "overshooting" the optimal solution.
MinLearningRate
Gets or sets the minimum allowed learning rate, overriding the base class value with a value optimized for AdaMax.
public double MinLearningRate { get; set; }
Property Value
- double
The minimum learning rate, defaulting to 0.00001.
Remarks
For Beginners: This sets a floor on how slow the learning can get. Even if the algorithm wants to reduce the learning rate further, it won't go below this value. For AdaMax, we use a smaller minimum value (0.00001) than the base optimizer because AdaMax can benefit from very fine adjustments in certain situations. Think of it as ensuring the algorithm never slows down so much that it stops making progress.
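Putting it together, a typical configuration might look like the sketch below. Only the property names and defaults come from this page; the concrete type arguments (double, double[], double), the parameterless constructor, and the required namespace import are assumptions about how the class is consumed.

using System;

// Hypothetical usage; assumes the library's namespace has been imported.
var options = new AdaMaxOptimizerOptions<double, double[], double>
{
    InitialLearningRate = 0.002,
    BatchSize = 32,
    Beta1 = 0.9,
    Beta2 = 0.999,
    Epsilon = 1e-8,
    LearningRateIncreaseFactor = 1.05,
    LearningRateDecreaseFactor = 0.95,
    MaxLearningRate = 0.1,      // ceiling for the adaptive schedule
    MinLearningRate = 0.00001   // floor for the adaptive schedule
};

// The bounds clamp the adapted rate sketched under LearningRateIncreaseFactor:
double adaptedRate = 0.15; // hypothetical rate after several consecutive increases
adaptedRate = Math.Clamp(adaptedRate, options.MinLearningRate, options.MaxLearningRate); // -> 0.1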