Class AdamOptimizerOptions<T, TInput, TOutput>

Namespace
AiDotNet.Models.Options
Assembly
AiDotNet.dll

Configuration options for the Adam optimization algorithm, which combines the benefits of AdaGrad and RMSProp.

public class AdamOptimizerOptions<T, TInput, TOutput> : GradientBasedOptimizerOptions<T, TInput, TOutput>

Type Parameters

T
The numeric type used for calculations within the optimizer, typically float or double.

TInput
The type of the input data for the model being optimized.

TOutput
The type of the output data produced by the model being optimized.

Inheritance
OptimizationAlgorithmOptions<T, TInput, TOutput>
GradientBasedOptimizerOptions<T, TInput, TOutput>
AdamOptimizerOptions<T, TInput, TOutput>

Remarks

Adam (Adaptive Moment Estimation) is a popular optimization algorithm that computes adaptive learning rates for each parameter. It stores both an exponentially decaying average of past gradients (first moment) and past squared gradients (second moment).

For Beginners: Adam is like a smart learning assistant that remembers both the direction (momentum) and the size of previous steps. It automatically adjusts how big each step should be for each parameter, making it easier to train models without having to manually tune the learning rate. Adam is often a good default choice for many machine learning problems.
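
To make the two moments concrete, here is a minimal sketch of one Adam update for a single parameter, following the standard formulation (Kingma and Ba, 2015). It is illustrative only; the class and method names are made up, and the library's internal implementation may differ in details.

using System;

class AdamSketch
{
    // Toy objective f(x) = x^2, whose gradient is 2x.
    static double Gradient(double x) => 2.0 * x;

    static void Main()
    {
        double beta1 = 0.9, beta2 = 0.999;   // Beta1, Beta2 defaults
        double lr = 0.001, eps = 1e-8;       // InitialLearningRate, Epsilon defaults
        double m = 0.0, v = 0.0;             // first and second moment estimates
        double theta = 1.0;                  // the parameter being optimized

        for (int t = 1; t <= 1000; t++)
        {
            double g = Gradient(theta);
            m = beta1 * m + (1 - beta1) * g;      // decaying average of gradients
            v = beta2 * v + (1 - beta2) * g * g;  // decaying average of squared gradients
            double mHat = m / (1 - Math.Pow(beta1, t)); // bias correction
            double vHat = v / (1 - Math.Pow(beta2, t));
            theta -= lr * mHat / (Math.Sqrt(vHat) + eps);
        }

        Console.WriteLine($"theta after training: {theta}");
    }
}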

Properties

BatchSize

Gets or sets the batch size for mini-batch gradient descent.

public int BatchSize { get; set; }

Property Value

int

A positive integer, defaulting to 32.

Remarks

The batch size determines how many samples are processed before updating the model parameters. Larger batch sizes provide more stable gradient estimates but use more memory.

For Beginners: The batch size controls how many examples the optimizer looks at before making an update to the model:

  • BatchSize = 1: Update after each sample (true stochastic)
  • BatchSize = 32: Update after every 32 samples (typical mini-batch)
  • BatchSize = [entire dataset]: Batch gradient descent

The default of 32 is a good balance between speed and stability for Adam.
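
As a rough illustration of how the batch size drives the update schedule (the loop structure here is hypothetical, not the library's actual training loop):

int batchSize = 32;
int sampleCount = 1000;
// 1000 samples with BatchSize = 32 gives 32 parameter updates per pass
// over the data: 31 full batches of 32, plus one final batch of 8.
for (int start = 0; start < sampleCount; start += batchSize)
{
    int count = Math.Min(batchSize, sampleCount - start);
    // Average the gradient over samples [start, start + count)
    // and apply one Adam update with that averaged gradient.
}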

Beta1

Gets or sets the exponential decay rate for the first moment estimates.

public double Beta1 { get; set; }

Property Value

double

The beta1 value, defaulting to 0.9.

Remarks

For Beginners: Beta1 controls how much the algorithm remembers about the direction it was moving in previous steps. A value of 0.9 means it gives 90% importance to past directions and 10% to the new direction. Think of it like steering a boat - this parameter determines how much you consider your previous steering direction versus the new direction you want to go. Higher values (closer to 1) make for smoother but potentially slower learning.
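
In the standard Adam formulation, this corresponds to the first-moment update

m_t = Beta1 * m_(t-1) + (1 - Beta1) * g_t

so with the default Beta1 = 0.9, each step keeps 90% of the running direction and blends in 10% of the newest gradient g_t.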

Beta2

Gets or sets the exponential decay rate for the second moment estimates.

public double Beta2 { get; set; }

Property Value

double

The beta2 value, defaulting to 0.999.

Remarks

For Beginners: Beta2 controls how much the algorithm remembers about the size of previous steps for each parameter. A value of 0.999 means it gives 99.9% importance to past step sizes and only 0.1% to new information. This helps stabilize learning by preventing wild changes in step size. Think of it like remembering how bumpy the road has been for each wheel of your car, allowing you to adjust the suspension accordingly for a smoother ride.
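
The corresponding second-moment update in the standard formulation is

v_t = Beta2 * v_(t-1) + (1 - Beta2) * g_t^2

so with the default Beta2 = 0.999, each new squared gradient contributes only 0.1% to the running average, which is what keeps the per-parameter step sizes stable.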

Epsilon

Gets or sets a small constant added to denominators to prevent division by zero.

public double Epsilon { get; set; }

Property Value

double

The epsilon value, defaulting to 1e-8 (0.00000001).

Remarks

For Beginners: Epsilon is a tiny safety value that prevents the algorithm from crashing when it would otherwise divide by zero. It's like having training wheels that only activate when needed. You typically don't need to change this unless you're experiencing numerical stability issues.
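
In the standard update rule, Epsilon sits in the denominator of the parameter step:

theta_t = theta_(t-1) - LearningRate * m_hat / (sqrt(v_hat) + Epsilon)

When the second-moment estimate v_hat is near zero, Epsilon keeps the division well-defined.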

InitialLearningRate

Gets or sets the initial learning rate for the Adam optimizer.

public override double InitialLearningRate { get; set; }

Property Value

double

The learning rate, defaulting to 0.001.

Remarks

For Beginners: The learning rate controls how big each step is during training. A value of 0.001 means taking small, careful steps. If this value is too large, the model might miss the optimal solution by stepping too far. If it's too small, training will take a very long time. The default of 0.001 works well for most problems, which is why Adam is popular - it doesn't require much tuning of this value.
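
Since the learning rate is usually the first option worth tuning, here is a minimal configuration sketch. The type arguments (double scalars with array inputs and outputs) are illustrative; only the property names are taken from this page.

var options = new AdamOptimizerOptions<double, double[], double[]>
{
    InitialLearningRate = 0.001, // the default; lower it if training diverges
    BatchSize = 64,              // larger batches give smoother gradient estimates
    Beta1 = 0.9,
    Beta2 = 0.999
};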

MaxBeta1

Gets or sets the maximum allowed value for Beta1.

public double MaxBeta1 { get; set; }

Property Value

double

The maximum Beta1 value, defaulting to 0.999.

Remarks

For Beginners: This prevents Beta1 from becoming too large, which would make the algorithm rely too heavily on past directions and adapt too slowly to new information. Even if Beta1 keeps increasing, it won't go above this value. A maximum of 0.999 ensures the algorithm always incorporates at least some new directional information.

MaxBeta2

Gets or sets the maximum allowed value for Beta2.

public double MaxBeta2 { get; set; }

Property Value

double

The maximum Beta2 value, defaulting to 0.9999.

Remarks

For Beginners: This prevents Beta2 from becoming too large, which would make the algorithm rely too heavily on past step sizes and adapt too slowly. Even if Beta2 keeps increasing, it won't go above this value. A maximum of 0.9999 ensures the algorithm always incorporates at least some new information about step sizes, allowing it to adapt to changing conditions during training.

MinBeta1

Gets or sets the minimum allowed value for Beta1.

public double MinBeta1 { get; set; }

Property Value

double

The minimum Beta1 value, defaulting to 0.8.

Remarks

For Beginners: This prevents Beta1 from becoming too small, which would make the algorithm ignore past directions too much. Even if Beta1 keeps decreasing, it won't go below this value. A minimum of 0.8 ensures the algorithm always considers at least some of its previous momentum, preventing it from changing direction too abruptly.

MinBeta2

Gets or sets the minimum allowed value for Beta2.

public double MinBeta2 { get; set; }

Property Value

double

The minimum Beta2 value, defaulting to 0.8.

Remarks

For Beginners: This prevents Beta2 from becoming too small, which would make the algorithm ignore past step sizes too much. Even if Beta2 keeps decreasing, it won't go below this value. A minimum of 0.8 ensures the algorithm always considers at least some of its previous step size information, maintaining some stability in the learning process.
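
Taken together, the four bounds define the interval each beta may occupy. A sketch of the clamping they imply (illustrative; proposedBeta1 and proposedBeta2 are hypothetical names, and the library's internal adjustment logic is not documented here):

// Beta1 stays within [MinBeta1, MaxBeta1] = [0.8, 0.999] by default,
// Beta2 within [MinBeta2, MaxBeta2] = [0.8, 0.9999].
double beta1 = Math.Clamp(proposedBeta1, options.MinBeta1, options.MaxBeta1);
double beta2 = Math.Clamp(proposedBeta2, options.MinBeta2, options.MaxBeta2);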

UseAdaptiveBetas

Gets or sets whether to automatically adjust the Beta parameters during training.

public bool UseAdaptiveBetas { get; set; }

Property Value

bool

True to use adaptive betas (default), false otherwise.

Remarks

For Beginners: When enabled, the algorithm will automatically adjust how much it relies on past information based on how well it's performing. If the model is improving, it will trust its memory more. If performance worsens, it will pay more attention to new information. This helps the algorithm adapt to different phases of learning, like slowing down when approaching the destination and speeding up when far away.
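
The exact adjustment rule is internal to the library; the sketch below only illustrates the idea described above, with made-up scaling factors:

if (currentLoss < previousLoss)
{
    // Improving: lean more on accumulated history, up to the Max bounds.
    beta1 = Math.Min(beta1 * 1.01, options.MaxBeta1);
    beta2 = Math.Min(beta2 * 1.001, options.MaxBeta2);
}
else
{
    // Worsening: weight fresh gradient information more, down to the Min bounds.
    beta1 = Math.Max(beta1 * 0.99, options.MinBeta1);
    beta2 = Math.Max(beta2 * 0.999, options.MinBeta2);
}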