Class MiniBatchGradientDescentOptimizer<T, TInput, TOutput>

Namespace
AiDotNet.Optimizers
Assembly
AiDotNet.dll

Implements the Mini-Batch Gradient Descent optimization algorithm.

public class MiniBatchGradientDescentOptimizer<T, TInput, TOutput> : GradientBasedOptimizerBase<T, TInput, TOutput>, IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer

Type Parameters

T

The numeric type used for calculations, typically float or double.

TInput

The type of input data the model consumes (for example, a matrix of features).

TOutput

The type of output data the model produces (for example, a vector of target values).
Inheritance
OptimizerBase<T, TInput, TOutput>
GradientBasedOptimizerBase<T, TInput, TOutput>
MiniBatchGradientDescentOptimizer<T, TInput, TOutput>
Implements
IGradientBasedOptimizer<T, TInput, TOutput>
IOptimizer<T, TInput, TOutput>
IModelSerializer

Remarks

Mini-Batch Gradient Descent is a variation of gradient descent that splits the training data into small batches to calculate model error and update model coefficients. This approach strikes a balance between the efficiency of stochastic gradient descent and the stability of batch gradient descent.

For Beginners: Imagine you're trying to find the bottom of a valley while blindfolded. Mini-Batch Gradient Descent is like taking a few steps, checking your position, adjusting your direction, and repeating. It gives steadier progress than adjusting after every single step (Stochastic Gradient Descent), while still updating far more often than surveying the entire terrain before each adjustment (Batch Gradient Descent).
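
To make the mechanics concrete, here is a small, self-contained C# sketch of the mini-batch loop on a tiny linear-regression problem (fitting y ≈ w*x + b). It is a teaching illustration only and does not use AiDotNet types.

using System;
using System.Linq;

// Teaching sketch of mini-batch gradient descent; not AiDotNet code.
class MiniBatchSketch
{
    static void Main()
    {
        var rng = new Random(42);

        // Synthetic data: y = 2x + 1 plus a little noise.
        double[] xs = Enumerable.Range(0, 200).Select(i => i / 100.0).ToArray();
        double[] ys = xs.Select(x => 2.0 * x + 1.0 + 0.05 * (rng.NextDouble() - 0.5)).ToArray();

        double w = 0.0, b = 0.0, learningRate = 0.1;
        int batchSize = 20;

        for (int epoch = 0; epoch < 200; epoch++)
        {
            // Shuffle so every epoch visits the batches in a different order.
            int[] order = Enumerable.Range(0, xs.Length).OrderBy(_ => rng.Next()).ToArray();

            for (int start = 0; start < order.Length; start += batchSize)
            {
                int end = Math.Min(start + batchSize, order.Length);
                double gw = 0.0, gb = 0.0;

                // Average gradient of the squared error over just this mini-batch.
                for (int k = start; k < end; k++)
                {
                    int i = order[k];
                    double error = (w * xs[i] + b) - ys[i];
                    gw += error * xs[i];
                    gb += error;
                }

                int n = end - start;
                w -= learningRate * (gw / n); // one small step per batch
                b -= learningRate * (gb / n);
            }
        }

        Console.WriteLine($"w ≈ {w:F2}, b ≈ {b:F2}"); // approaches 2 and 1
    }
}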

Constructors

MiniBatchGradientDescentOptimizer(IFullModel<T, TInput, TOutput>, MiniBatchGradientDescentOptions<T, TInput, TOutput>?, IEngine?)

Initializes a new instance of the MiniBatchGradientDescentOptimizer class.

public MiniBatchGradientDescentOptimizer(IFullModel<T, TInput, TOutput> model, MiniBatchGradientDescentOptions<T, TInput, TOutput>? options = null, IEngine? engine = null)

Parameters

model IFullModel<T, TInput, TOutput>

The model to optimize.

options MiniBatchGradientDescentOptions<T, TInput, TOutput>

The configuration options for the optimizer. If null, default settings are used.

engine IEngine

The optional compute engine dependency used for calculations.

Remarks

This constructor sets up the optimizer with the provided options and dependencies. If no options are provided, it uses default settings. It also initializes a random number generator for shuffling data.

For Beginners: This is like setting up your hiking gear before starting the journey to find the valley's bottom. You're deciding on your strategy (options) and packing your tools (dependencies) that you'll use along the way.
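
A hedged construction sketch is shown below. The myModel variable, the Matrix<double>/Vector<double> placeholder type arguments, and the BatchSize property on the options object are assumptions for illustration; the actual option names may differ.

// myModel stands in for any IFullModel<double, Matrix<double>, Vector<double>> you already have.
var options = new MiniBatchGradientDescentOptions<double, Matrix<double>, Vector<double>>
{
    BatchSize = 32 // assumed option name, shown only to illustrate the pattern
};
var optimizer = new MiniBatchGradientDescentOptimizer<double, Matrix<double>, Vector<double>>(myModel, options);

// Omitting the options (or passing null) uses the default settings described above.
var defaultOptimizer = new MiniBatchGradientDescentOptimizer<double, Matrix<double>, Vector<double>>(myModel);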

Methods

Deserialize(byte[])

Deserializes a byte array to restore the optimizer's state.

public override void Deserialize(byte[] data)

Parameters

data byte[]

The byte array containing the serialized optimizer state.

Remarks

This method takes a byte array (previously created by Serialize) and uses it to restore the optimizer's state, including its base class state and options.

For Beginners: This is like using a detailed map and instructions to recreate your exact position and plan from a previous point in your journey. It allows you to pick up right where you left off, with all your strategies and progress intact.

Exceptions

InvalidOperationException

Thrown when the optimizer options cannot be deserialized.

GenerateGradientCacheKey(IFullModel<T, TInput, TOutput>, TInput, TOutput)

Generates a unique key for caching gradients based on the model and input data.

protected override string GenerateGradientCacheKey(IFullModel<T, TInput, TOutput> model, TInput X, TOutput y)

Parameters

model IFullModel<T, TInput, TOutput>

The symbolic model being optimized.

X TInput

The input data matrix.

y TOutput

The target output vector.

Returns

string

A string representing the unique gradient cache key.

Remarks

This method creates a unique identifier for caching gradients. It combines the base gradient cache key with specific parameters of the Mini-Batch Gradient Descent algorithm.

For Beginners: Imagine you're leaving markers along your hiking path. This method creates a unique label for each marker, combining information about where you are (the model and data) with specifics about how you're hiking (batch size and number of rounds). This helps you quickly recognize and use information from similar situations you've encountered before.
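
The remarks can be pictured with the following hedged sketch of such an override; the base key format and the BatchSize/MaxEpochs option names are assumptions, not the library's actual implementation.

// Illustrative sketch only - not the actual AiDotNet implementation.
protected override string GenerateGradientCacheKey(IFullModel<T, TInput, TOutput> model, TInput X, TOutput y)
{
    string baseKey = base.GenerateGradientCacheKey(model, X, y);

    // Append the batch-specific settings (assumed option names) so keys from
    // differently configured runs never collide.
    return $"{baseKey}_MiniBatchGD_{_options.BatchSize}_{_options.MaxEpochs}";
}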

GetOptions()

Gets the current optimization algorithm options.

public override OptimizationAlgorithmOptions<T, TInput, TOutput> GetOptions()

Returns

OptimizationAlgorithmOptions<T, TInput, TOutput>

The current MiniBatchGradientDescentOptions object.

Remarks

This method returns the current options used by the Mini-Batch Gradient Descent optimizer.

For Beginners: This is like checking your current hiking plan. It lets you see all the settings and strategies you're currently using in your journey to find the valley's bottom.
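
For example (a hedged sketch; the placeholder type arguments are assumptions):

// The method returns the base options type, so cast to the concrete options class
// to inspect mini-batch specific settings.
var currentOptions = (MiniBatchGradientDescentOptions<double, Matrix<double>, Vector<double>>)optimizer.GetOptions();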

InitializeAdaptiveParameters()

Initializes adaptive parameters for the optimization process.

protected override void InitializeAdaptiveParameters()

Remarks

This method sets up the initial learning rate for the optimization process based on the options provided.

For Beginners: This is like deciding how big your steps will be when you start your journey. The learning rate determines how much you adjust your position based on each batch of information you process.

Optimize(OptimizationInputData<T, TInput, TOutput>)

Performs the optimization process using Mini-Batch Gradient Descent.

public override OptimizationResult<T, TInput, TOutput> Optimize(OptimizationInputData<T, TInput, TOutput> inputData)

Parameters

inputData OptimizationInputData<T, TInput, TOutput>

The input data for the optimization process.

Returns

OptimizationResult<T, TInput, TOutput>

The result of the optimization process.

Remarks

This method implements the main optimization loop. It iterates through the data in mini-batches, calculating gradients and updating the model parameters for each batch. The process continues for a specified number of epochs or until a stopping criterion is met.

For Beginners: This is the actual journey to find the valley's bottom. You're taking steps (processing batches of data), checking your position (evaluating the model), and adjusting your direction (updating the model parameters). You do this repeatedly (for each epoch) until you're satisfied with your position or you've taken the maximum number of steps you allowed yourself.

DataLoader Integration: This optimizer uses the DataLoader batching infrastructure, which supports custom samplers (weighted, stratified, curriculum, importance, active learning), reproducible shuffling via RandomSeed, and the option to drop incomplete final batches. Set these options via GradientBasedOptimizerOptions.DataSampler, ShuffleData, DropLastBatch, and RandomSeed.
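
A hypothetical end-to-end call is sketched below. The property names on OptimizationInputData (XTrain, YTrain) and on the result (BestSolution) are assumptions made for illustration and may not match the library exactly.

// Assemble the training data (assumed property names) and run the optimization loop.
var inputData = new OptimizationInputData<double, Matrix<double>, Vector<double>>
{
    XTrain = trainingFeatures, // assumption
    YTrain = trainingTargets   // assumption
};

OptimizationResult<double, Matrix<double>, Vector<double>> result = optimizer.Optimize(inputData);
var fittedModel = result.BestSolution; // assumption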

ReverseUpdate(Vector<T>, Vector<T>)

Reverses a Mini-Batch Gradient Descent update to recover original parameters.

public override Vector<T> ReverseUpdate(Vector<T> updatedParameters, Vector<T> appliedGradients)

Parameters

updatedParameters Vector<T>

The parameters after the Mini-Batch Gradient Descent update.

appliedGradients Vector<T>

The gradients that were applied during the update.

Returns

Vector<T>

The original parameters before the update.

Remarks

Mini-Batch Gradient Descent uses the vanilla SGD update rule: params_new = params_old - lr * gradient. The reverse is therefore straightforward: params_old = params_new + lr * gradient.

For Beginners: This calculates where parameters were before a Mini-Batch GD update. Since Mini-Batch GD uses simple steps (parameter minus learning_rate times gradient), reversing just means adding back that step.
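
The arithmetic can be checked with plain doubles:

double learningRate = 0.1, original = 1.0, gradient = 0.5;

double updated = original - learningRate * gradient;   // forward step: 1.0 - 0.05 = 0.95
double recovered = updated + learningRate * gradient;  // reverse step: 0.95 + 0.05 = 1.0

Console.WriteLine(recovered); // prints 1 (up to floating-point rounding)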

Serialize()

Serializes the optimizer's state into a byte array.

public override byte[] Serialize()

Returns

byte[]

A byte array representing the serialized state of the optimizer.

Remarks

This method converts the current state of the optimizer, including its base class state and options, into a byte array. This is useful for saving the optimizer's state or transferring it between systems.

For Beginners: Think of this as taking a snapshot of your entire journey so far. It captures all the details of your current position, your hiking plan, and how you got there. This snapshot can be used to continue your journey later or share your exact situation with others.
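
A typical save-and-restore round trip looks like this. The file path is arbitrary, the file I/O is standard .NET, and Serialize/Deserialize are the methods documented on this page.

using System.IO;

// Capture the optimizer's current state and write it to disk.
byte[] state = optimizer.Serialize();
File.WriteAllBytes("optimizer-state.bin", state);

// Later (or on another machine), restore the state into an optimizer instance.
byte[] saved = File.ReadAllBytes("optimizer-state.bin");
optimizer.Deserialize(saved);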

UpdateAdaptiveParameters(OptimizationStepData<T, TInput, TOutput>, OptimizationStepData<T, TInput, TOutput>)

Updates the adaptive parameters of the optimizer based on the current and previous optimization steps.

protected override void UpdateAdaptiveParameters(OptimizationStepData<T, TInput, TOutput> currentStepData, OptimizationStepData<T, TInput, TOutput> previousStepData)

Parameters

currentStepData OptimizationStepData<T, TInput, TOutput>

Data from the current optimization step.

previousStepData OptimizationStepData<T, TInput, TOutput>

Data from the previous optimization step.

Remarks

This method adjusts the learning rate based on how the current step performed compared to the previous step. If improvement is seen, the learning rate may be increased; otherwise, it may be decreased.

For Beginners: This is like adjusting your step size based on how well you're doing. If you're making good progress, you might take slightly bigger steps. If you're not improving, you might take smaller, more careful steps.
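
The idea can be sketched library-independently; the growth/decay factors and bounds below are illustrative assumptions, not the exact values this optimizer uses.

double learningRate = 0.01;
double currentLoss = 0.42, previousLoss = 0.45;

if (currentLoss < previousLoss)
    learningRate = Math.Min(learningRate * 1.05, 0.1);   // improving: slightly larger steps, capped
else
    learningRate = Math.Max(learningRate * 0.5, 1e-6);   // not improving: smaller, more careful steps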

UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput>)

Updates the optimizer's options with new settings.

protected override void UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput> options)

Parameters

options OptimizationAlgorithmOptions<T, TInput, TOutput>

The new options to be applied to the optimizer.

Remarks

This method allows updating the optimizer's settings during runtime. It ensures that only compatible option types are used with this optimizer.

For Beginners: This is like changing your hiking strategy mid-journey. It makes sure you're only using strategies that work for this specific type of journey (Mini-Batch Gradient Descent).
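
The compatibility check described above follows a common pattern, sketched here with an assumed _options backing field; this is illustrative, not the library's actual code.

// Illustrative sketch of the type-compatibility check.
protected override void UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput> options)
{
    if (options is MiniBatchGradientDescentOptions<T, TInput, TOutput> mbgdOptions)
    {
        _options = mbgdOptions; // assumed backing field
    }
    else
    {
        throw new ArgumentException("Options must be of type MiniBatchGradientDescentOptions.", nameof(options));
    }
}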

Exceptions

ArgumentException

Thrown when the provided options are not of the correct type.

UpdateParametersGpu(IGpuBuffer, IGpuBuffer, int, IDirectGpuBackend)

Updates parameters on the GPU using the vanilla SGD rule (Mini-Batch Gradient Descent applies the same per-parameter update as SGD).

public override void UpdateParametersGpu(IGpuBuffer parameters, IGpuBuffer gradients, int parameterCount, IDirectGpuBackend backend)

Parameters

parameters IGpuBuffer

The GPU buffer containing the model parameters to update.

gradients IGpuBuffer

The GPU buffer containing the gradients to apply.

parameterCount int

The number of parameters to update.

backend IDirectGpuBackend

The GPU backend used to execute the update.

UpdateSolution(IFullModel<T, TInput, TOutput>, Vector<T>)

Updates the current solution based on the calculated gradient.

protected override IFullModel<T, TInput, TOutput> UpdateSolution(IFullModel<T, TInput, TOutput> currentSolution, Vector<T> gradient)

Parameters

currentSolution IFullModel<T, TInput, TOutput>

The current model solution.

gradient Vector<T>

The calculated gradient.

Returns

IFullModel<T, TInput, TOutput>

An updated symbolic model with improved coefficients.

Remarks

This method applies the gradient to the current solution, adjusting each coefficient by the gradient scaled by the learning rate.

For Beginners: This is like taking a step in the direction you think will lead you closer to the valley's bottom. The size of your step is determined by the learning rate, and the direction is given by the gradient.
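
The per-coefficient arithmetic looks like this, with plain doubles standing in for AiDotNet's Vector<T>:

double learningRate = 0.05;
double[] coefficients = { 1.0, -2.0, 0.5 };
double[] gradient = { 0.2, -0.4, 0.1 };

// Each coefficient moves opposite the gradient, scaled by the learning rate.
for (int i = 0; i < coefficients.Length; i++)
    coefficients[i] -= learningRate * gradient[i];

// coefficients is now { 0.99, -1.98, 0.495 }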