Class GradientDescentOptimizerOptions<T, TInput, TOutput>

Namespace: AiDotNet.Models.Options
Assembly: AiDotNet.dll

Configuration options for the Gradient Descent optimizer, which is a fundamental algorithm for finding the minimum of a function by iteratively moving in the direction of steepest descent.

public class GradientDescentOptimizerOptions<T, TInput, TOutput> : GradientBasedOptimizerOptions<T, TInput, TOutput>

Type Parameters

T
The numeric type used for calculations (for example, float or double).
TInput
The type of the input data.
TOutput
The type of the output data.
Inheritance
OptimizationAlgorithmOptions<T, TInput, TOutput>
GradientBasedOptimizerOptions<T, TInput, TOutput>
GradientDescentOptimizerOptions<T, TInput, TOutput>

Remarks

Gradient Descent is one of the most widely used optimization algorithms in machine learning. It works by calculating the gradient (slope) of a loss function with respect to the model parameters, then updating those parameters in the opposite direction of the gradient to minimize the loss: θ ← θ − η∇L(θ), where η is the learning rate. This class inherits from GradientBasedOptimizerOptions, so all general gradient-based optimization settings are also available.

For Beginners: Think of Gradient Descent like finding the lowest point in a valley by always walking downhill. Imagine you're standing on a hilly landscape and want to reach the lowest point. You look around, figure out which direction is most steeply downhill, take a step in that direction, and repeat until you can't go any lower.

In machine learning, the "landscape" is the error or loss function (how wrong your model's predictions are), and the "lowest point" represents the best possible model parameters. Gradient descent helps your model learn by repeatedly adjusting its parameters to reduce prediction errors.

This is the most basic form of optimization used in many machine learning algorithms, including neural networks, linear regression, and logistic regression. The options in this class let you control how quickly the algorithm moves downhill and how it avoids certain pitfalls during the optimization process.
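
The update rule is easy to see in action on a tiny problem. The following sketch is purely illustrative and does not use AiDotNet's types: it minimizes the one-dimensional function f(x) = (x − 3)² using the plain gradient descent update described above.

using System;

class GradientDescentSketch
{
    static void Main()
    {
        // Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2(x - 3).
        double x = 0.0;            // starting parameter value
        double learningRate = 0.1; // step size: how fast we "walk downhill"

        for (int i = 0; i < 50; i++)
        {
            double gradient = 2.0 * (x - 3.0);
            x -= learningRate * gradient; // step opposite the gradient
        }

        Console.WriteLine(x); // converges toward the minimum at x = 3
    }
}

Each iteration moves x a fraction of the way toward the minimum; after 50 steps the value is effectively 3, the bottom of the "valley."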

Constructors

GradientDescentOptimizerOptions()

Initializes a new instance of the GradientDescentOptimizerOptions class with default settings.

public GradientDescentOptimizerOptions()

Remarks

The constructor initializes the regularization options with default values that are suitable for gradient descent optimization.

For Beginners: This constructor creates a new set of options for gradient descent with reasonable default values. When you create a new GradientDescentOptimizerOptions object without specifying any parameters, it will use these defaults, which work well for many common machine learning problems. You can then customize specific settings as needed for your particular task.
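
A minimal usage sketch, assuming double for T and double[]/double for TInput/TOutput (any supported type arguments work the same way):

using AiDotNet.Models.Options;

// Create the options with all defaults, then override only what you need.
var options = new GradientDescentOptimizerOptions<double, double[], double>
{
    BatchSize = 64 // override the documented default of 32
};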

Properties

BatchSize

Gets or sets the batch size for mini-batch gradient descent.

public int BatchSize { get; set; }

Property Value

int

A positive integer, or -1 for full-batch gradient descent; defaults to 32.

Remarks

For Beginners: The batch size controls how many training examples the optimizer looks at before making each update to the model's parameters. Smaller batches give noisier but more frequent updates; larger batches give smoother gradients at the cost of fewer updates per pass. The default of 32 is a good balance for most problems. Setting it to -1 uses full-batch gradient descent, which computes each update from all training samples.
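
For example, with 1,000 training samples the default batch size of 32 yields ⌈1000 / 32⌉ = 32 parameter updates per pass over the data, while -1 yields exactly one. A small sketch of that arithmetic (the sample count here is hypothetical):

using AiDotNet.Models.Options;

var options = new GradientDescentOptimizerOptions<double, double[], double>();
options.BatchSize = 32;   // mini-batch: noisier but more frequent updates
// options.BatchSize = -1; // full-batch: one smooth update per pass

int sampleCount = 1000; // hypothetical dataset size
int updatesPerEpoch = options.BatchSize == -1
    ? 1
    : (sampleCount + options.BatchSize - 1) / options.BatchSize; // ceiling division = 32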

RegularizationOptions

Gets or sets the regularization options to control overfitting during optimization.

public RegularizationOptions RegularizationOptions { get; set; }

Property Value

RegularizationOptions

The regularization options, defaulting to L2 regularization with a strength of 0.01.

Remarks

Regularization adds a penalty to the loss function based on the model's parameter values, which helps prevent overfitting by discouraging overly complex models. The setter ensures that the regularization options are never null by using default options if null is provided.

For Beginners: Regularization is like adding a "simplicity rule" to your model's training process. Without regularization, your model might become too complex and start memorizing the training data instead of learning general patterns (this is called "overfitting").

Think of it like learning to drive: you want to learn the general rules of the road, not memorize every specific turn you made during practice. Regularization penalizes your model for becoming too complex, encouraging it to find simpler solutions that are more likely to work well on new data.

The default setting uses "L2 regularization" (also called "ridge" or "weight decay"), which is like telling your model "don't make any single parameter too large." This tends to work well for most problems and helps your model generalize better to new data. The strength value of 0.01 provides a moderate amount of regularization; higher values would enforce simplicity more strongly.
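
The L2 penalty itself is simple: it adds strength × Σ wᵢ² to the loss, so each parameter is pulled toward zero in proportion to its size. The sketch below computes that term directly; it is a conceptual illustration of what the default options apply, not AiDotNet's internal RegularizationOptions code.

double strength = 0.01;                // the documented default strength
double[] weights = { 0.5, -1.2, 3.0 }; // hypothetical model parameters

double l2Penalty = 0.0;
foreach (double w in weights)
    l2Penalty += w * w;                // sum of squared parameters
l2Penalty *= strength;                 // scale by regularization strength

// The optimizer minimizes loss(weights) + l2Penalty, so large weights are
// discouraged in proportion to their squared magnitude.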