Class StochasticGradientDescentOptimizerOptions<T, TInput, TOutput>
Configuration options for Stochastic Gradient Descent (SGD) optimization, a widely used algorithm for training machine learning models with large datasets.
public class StochasticGradientDescentOptimizerOptions<T, TInput, TOutput> : GradientBasedOptimizerOptions<T, TInput, TOutput>
Type Parameters
T
TInput
TOutput
- Inheritance
  OptimizationAlgorithmOptions<T, TInput, TOutput> → GradientBasedOptimizerOptions<T, TInput, TOutput> → StochasticGradientDescentOptimizerOptions<T, TInput, TOutput>
- Inherited Members
Remarks
Stochastic Gradient Descent (SGD) is a variation of the gradient descent optimization algorithm that updates model parameters using gradients calculated from randomly selected subsets of the training data (mini-batches) rather than the entire dataset. This approach significantly reduces the computational cost per iteration, making it suitable for large-scale machine learning problems.

SGD introduces randomness into the optimization process, which can help escape local minima and potentially find better solutions. However, this randomness also leads to noisier updates and potentially slower convergence compared to full-batch gradient descent.

This class inherits from GradientBasedOptimizerOptions and overrides the MaxIterations property to provide a more appropriate default value for SGD optimization.
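The per-iteration update can be sketched as follows. The notation is introduced here for illustration only and is not taken from the library: θ denotes the model parameters, η the learning rate, B_t the mini-batch sampled at step t, and L_i the loss on sample i.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} \left( \frac{1}{|B_t|} \sum_{i \in B_t} L_i(\theta_t) \right)
```

With |B_t| = 1 this reduces to the single-sample update that the default BatchSize of 1 corresponds to.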
For Beginners: Stochastic Gradient Descent is a faster way to train machine learning models with large datasets.
When training machine learning models:
- We need to find the best parameters that minimize errors
- Traditional gradient descent uses the entire dataset for each update
- This becomes very slow with large datasets
Stochastic Gradient Descent solves this by:
- Using only a small random subset of data (mini-batch) for each update
- Making many faster, approximate updates instead of fewer exact ones
- Eventually converging to a good solution, often more quickly
This approach offers several benefits:
- Much faster iterations, especially with large datasets
- Can escape local minima due to the noise in updates
- Often finds good solutions faster in practice
- Enables training on datasets too large to fit in memory
This class lets you configure the SGD optimization process.
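For example, a minimal configuration sketch might look like the following. Only the options class and its BatchSize and MaxIterations properties are documented on this page; the concrete type arguments (double, double[,], double[]) and any surrounding training code are assumptions chosen for illustration.

```csharp
// Minimal sketch: the type arguments are placeholders, not taken from this page.
var options = new StochasticGradientDescentOptimizerOptions<double, double[,], double[]>
{
    BatchSize = 1,        // true stochastic updates (the default)
    MaxIterations = 1000  // at most 1000 epochs (the default)
};
```

The configured instance would then typically be handed to the corresponding SGD optimizer; that wiring is library-specific and not shown here.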
Properties
BatchSize
Gets or sets the batch size for stochastic gradient descent.
public int BatchSize { get; set; }
Property Value
- int
A positive integer, defaulting to 1 for true stochastic behavior.
Remarks
The batch size determines how many samples are processed before the model parameters are updated. A batch size of 1 represents true Stochastic Gradient Descent, processing one sample at a time; larger batch sizes produce mini-batch behavior while still using SGD's plain update rule.
For Beginners: The batch size controls how many examples the optimizer looks at before making an update to the model:
- BatchSize = 1: True stochastic - update after each sample (default)
- BatchSize = 32: Mini-batch - update after every 32 samples
- BatchSize = [entire dataset]: Batch gradient descent
Smaller batch sizes:
- More frequent updates (faster convergence initially)
- More noise in gradients (can help escape local minima)
- Less efficient use of vectorized operations
Larger batch sizes:
- Smoother gradient estimates
- Better use of GPU/vectorization
- May require adjusting the learning rate
The default of 1 gives true stochastic behavior. Consider using MiniBatchGradientDescentOptimizer if you want mini-batch behavior with additional features like adaptive learning rates.
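To make the trade-off concrete, the sketch below shows how BatchSize determines the number of parameter updates per pass over the data; the dataset size of 10,000 is a made-up value, not anything defined by this class.

```csharp
using System;

// Hypothetical illustration: 10,000 samples is an assumed dataset size.
int datasetSize = 10_000;

foreach (int batchSize in new[] { 1, 32, datasetSize })
{
    // Ceiling division: the final batch may hold fewer than batchSize samples.
    int updatesPerEpoch = (datasetSize + batchSize - 1) / batchSize;
    Console.WriteLine($"BatchSize = {batchSize}: {updatesPerEpoch} updates per epoch");
}
```

With BatchSize = 1 every sample triggers an update (10,000 per epoch here), while the full-dataset batch collapses to a single update per epoch.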
MaxIterations
Gets or sets the maximum number of iterations for the optimization algorithm.
public int MaxIterations { get; set; }
Property Value
- int
A positive integer, defaulting to 1000.
Remarks
This property specifies the maximum number of epochs that the SGD algorithm will perform. Each epoch processes all batches of data. The optimization will stop either when this number of epochs is reached or when another stopping criterion (such as convergence tolerance) is met, whichever comes first. The default value of 1000 is suitable for many applications, but may need adjustment based on the specific problem, dataset size, and batch size.
For Beginners: This setting limits how many complete passes through the data the algorithm will perform.
The maximum iterations (epochs) parameter:
- Sets an upper limit on training duration
- Prevents the algorithm from running indefinitely
- Serves as a safety mechanism if convergence isn't reached
The default value of 1000 epochs means:
- The algorithm will make at most 1000 complete passes through your data
- This is sufficient for many problems
When to adjust this value:
- Increase it for complex problems that need more iterations to converge
- Decrease it for simpler problems or when you need faster results
- Monitor validation metrics to determine if more iterations are helpful
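One way to sanity-check the epoch cap is to estimate the total number of parameter updates it implies together with BatchSize. The sketch below uses an assumed dataset size; only the two default values come from this page.

```csharp
using System;

// Hypothetical values; only the two defaults are documented on this page.
int datasetSize   = 10_000; // assumed dataset size
int batchSize     = 1;      // BatchSize default
int maxIterations = 1_000;  // MaxIterations default (epochs)

int updatesPerEpoch = (datasetSize + batchSize - 1) / batchSize;
long totalUpdates = (long)maxIterations * updatesPerEpoch;

Console.WriteLine($"Up to {totalUpdates:N0} parameter updates " +
                  $"({maxIterations} epochs, {updatesPerEpoch:N0} updates per epoch)");
```

If that budget looks excessive for a simple problem, lower MaxIterations; if validation metrics are still improving when the cap is reached, raise it.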