Class AdaDeltaOptimizerOptions<T, TInput, TOutput>
Configuration options for the AdaDelta optimization algorithm, which is an extension of AdaGrad that adapts learning rates based on a moving window of gradient updates.
public class AdaDeltaOptimizerOptions<T, TInput, TOutput> : GradientBasedOptimizerOptions<T, TInput, TOutput>
Type Parameters
T
TInput
TOutput
Inheritance
- OptimizationAlgorithmOptions<T, TInput, TOutput> → GradientBasedOptimizerOptions<T, TInput, TOutput> → AdaDeltaOptimizerOptions<T, TInput, TOutput>
Remarks
AdaDelta is an optimization algorithm that dynamically adapts the learning rate for each parameter based on historical gradient information. It addresses some limitations of earlier algorithms by using a moving average of squared gradients.
For Beginners: AdaDelta is like a smart learning system that automatically adjusts how quickly it learns based on past experience. Instead of using a fixed learning speed, it slows down for parameters that change a lot and speeds up for those that change little. This helps the model learn more efficiently without requiring manual tuning of the learning rate.
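A typical configuration might look like the following sketch. The property names are those documented below; the type arguments (double, double[], double[]) are illustrative placeholders, so substitute the types your model actually uses.

// Illustrative configuration; type arguments are placeholders.
var options = new AdaDeltaOptimizerOptions<double, double[], double[]>
{
    BatchSize = 32,        // examples per update
    Rho = 0.95,            // decay rate for the squared-gradient average
    Epsilon = 1e-6,        // numerical-stability constant
    UseAdaptiveRho = true  // let Rho adjust itself during training
};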
Properties
BatchSize
Gets or sets the batch size for mini-batch gradient descent.
public int BatchSize { get; set; }
Property Value
- int
A positive integer, defaulting to 32.
Remarks
For Beginners: The batch size controls how many examples the optimizer looks at before making a single update to the model. The default of 32 balances stable gradient estimates against frequent updates, and works well for AdaDelta.
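As a rough illustration of what this means for training (plain C#, not the library's internals):

int batchSize = 32;     // BatchSize
int sampleCount = 1024; // size of the training set in this example
// One parameter update per mini-batch:
int updatesPerEpoch = (sampleCount + batchSize - 1) / batchSize; // 32 updates per epoch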
Epsilon
Gets or sets a small constant added to denominators to prevent division by zero.
public double Epsilon { get; set; }
Property Value
- double
The epsilon value, defaulting to 0.000001.
Remarks
For Beginners: Epsilon is a tiny safety value that prevents the algorithm from crashing when it would otherwise divide by zero. It's like having a small backup plan that kicks in only when needed. You typically don't need to change this unless you're experiencing numerical stability issues.
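A minimal sketch of where Epsilon enters the computation (the variable names are illustrative, not the library's internals):

using System;

double epsilon = 1e-6;            // Epsilon
double avgSquaredGradient = 0.0;  // can be exactly zero early in training
// Adding epsilon keeps the denominator strictly positive:
double rms = Math.Sqrt(avgSquaredGradient + epsilon);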
InitialLearningRate
Gets or sets the initial learning rate for the AdaDelta optimizer, overriding the base class value.
public override double InitialLearningRate { get; set; }
Property Value
- double
The initial learning rate, defaulting to 1.0.
Remarks
While AdaDelta is designed to eliminate the need for manually setting a learning rate, this parameter serves as a scaling factor for the updates. The default value of 1.0 works well in most cases since AdaDelta automatically adapts the effective learning rate during training.
For Beginners: Unlike other optimization algorithms where the learning rate directly controls how big each learning step is, in AdaDelta this value is more like an initial scaling factor. Think of it as setting the overall speed limit rather than controlling each individual step. The default value of 1.0 is higher than in other algorithms because AdaDelta has built-in mechanisms that automatically adjust how it learns, making it less sensitive to this initial setting. In most cases, you won't need to change this value.
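The following single-parameter sketch shows the standard AdaDelta update rule with InitialLearningRate (lr) applied as a final scaling factor. The variable names are illustrative, and this is a sketch of the published algorithm rather than the library's actual implementation.

using System;

double lr = 1.0;                    // InitialLearningRate
double rho = 0.95, epsilon = 1e-6;  // Rho and Epsilon
double avgSqGrad = 0.0, avgSqUpdate = 0.0;

double Step(double gradient)
{
    // Running average of squared gradients
    avgSqGrad = rho * avgSqGrad + (1 - rho) * gradient * gradient;
    // Ratio of RMS values yields a per-parameter adaptive step
    double update = -(Math.Sqrt(avgSqUpdate + epsilon) / Math.Sqrt(avgSqGrad + epsilon)) * gradient;
    // Running average of squared updates
    avgSqUpdate = rho * avgSqUpdate + (1 - rho) * update * update;
    return lr * update;             // lr only rescales the already-adaptive step
}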
MaxRho
Gets or sets the maximum allowed value for Rho.
public double MaxRho { get; set; }
Property Value
- double
The maximum Rho value, defaulting to 0.9999.
Remarks
For Beginners: This prevents Rho from becoming too large, which would make the algorithm rely too heavily on past information and adapt too slowly. Even if Rho keeps increasing, it won't go above this value. A maximum of 0.9999 ensures the algorithm always incorporates at least some new information.
MinRho
Gets or sets the minimum allowed value for Rho.
public double MinRho { get; set; }
Property Value
- double
The minimum Rho value, defaulting to 0.5.
Remarks
For Beginners: This prevents Rho from becoming too small, which would make the algorithm ignore past information too much. Even if Rho keeps decreasing, it won't go below this value. A minimum of 0.5 ensures the algorithm always considers at least some historical information.
Rho
Gets or sets the decay rate for the moving average of squared gradients.
public double Rho { get; set; }
Property Value
- double
The decay rate, defaulting to 0.95.
Remarks
For Beginners: Rho controls how much the algorithm "remembers" about past gradients. A value of 0.95 means each update keeps 95% of the running average and blends in only 5% of the newest squared gradient. Higher values (closer to 1) make learning more stable but slower to adapt to changes. Think of it like a running average of your test scores where your accumulated history counts for 95% and each new score for only 5%.
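The moving average that Rho controls is a standard exponential decay; at Rho = 0.95 it effectively spans roughly 1 / (1 - Rho) = 20 recent steps. A minimal sketch with illustrative variable names:

double rho = 0.95;       // Rho
double avgSqGrad = 0.0;  // running average of squared gradients
foreach (double g in new[] { 0.5, -0.2, 0.1, 0.4 }) // example gradient values
{
    // Keep 95% of the history, blend in 5% of the newest squared gradient.
    avgSqGrad = rho * avgSqGrad + (1 - rho) * g * g;
}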
RhoDecreaseFactor
Gets or sets the factor by which Rho decreases when performance worsens.
public double RhoDecreaseFactor { get; set; }
Property Value
- double
The Rho decrease factor, defaulting to 0.99.
Remarks
For Beginners: When the model is getting worse, Rho will be decreased by this factor. A value of 0.99 means Rho becomes 99% of its previous value, making the algorithm pay more attention to new information when things aren't going well. This helps the model adapt more quickly when it needs to change course.
RhoIncreaseFactor
Gets or sets the factor by which Rho increases when performance improves.
public double RhoIncreaseFactor { get; set; }
Property Value
- double
The Rho increase factor, defaulting to 1.01.
Remarks
For Beginners: When the model is improving, Rho will be increased by this factor. A value of 1.01 means Rho becomes 101% of its previous value, making the algorithm rely slightly more on past information when things are going well. This helps stabilize learning when on the right track.
UseAdaptiveRho
Gets or sets whether to automatically adjust the Rho parameter during training.
public bool UseAdaptiveRho { get; set; }
Property Value
- bool
True to use adaptive Rho (default), false otherwise.
Remarks
For Beginners: When enabled, the algorithm will automatically adjust how much it relies on past information based on how well it's performing. If the model is improving, it will trust its memory more. If performance worsens, it will pay more attention to new information. This helps the algorithm adapt to different phases of learning.
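A sketch of the adaptive behavior described above (the general pattern implied by these options, not necessarily the library's exact logic): Rho is multiplied by RhoIncreaseFactor when performance improves, multiplied by RhoDecreaseFactor when it worsens, and kept inside [MinRho, MaxRho].

using System;

double rho = 0.95;                        // Rho
double minRho = 0.5, maxRho = 0.9999;     // MinRho, MaxRho
double increase = 1.01, decrease = 0.99;  // RhoIncreaseFactor, RhoDecreaseFactor

void AdjustRho(bool lossImproved)
{
    rho *= lossImproved ? increase : decrease;  // trust memory more, or less
    rho = Math.Clamp(rho, minRho, maxRho);      // never leave the allowed range
}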