Class StochasticGradientDescentOptimizer<T, TInput, TOutput>
- Namespace
- AiDotNet.Optimizers
- Assembly
- AiDotNet.dll
Represents a Stochastic Gradient Descent (SGD) optimizer for machine learning models.
public class StochasticGradientDescentOptimizer<T, TInput, TOutput> : GradientBasedOptimizerBase<T, TInput, TOutput>, IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
Type Parameters
T: The numeric type used for calculations, typically float or double.
TInput: The type of the input data (for example, a feature matrix).
TOutput: The type of the output data (for example, a target vector).
- Inheritance
- OptimizerBase<T, TInput, TOutput> → GradientBasedOptimizerBase<T, TInput, TOutput> → StochasticGradientDescentOptimizer<T, TInput, TOutput>
- Implements
- IGradientBasedOptimizer<T, TInput, TOutput>
- IOptimizer<T, TInput, TOutput>
- IModelSerializer
Remarks
The StochasticGradientDescentOptimizer is a gradient-based optimization algorithm that iteratively adjusts model parameters to minimize the loss function. It uses a stochastic approach, updating parameters based on a subset of the training data in each iteration.
For Beginners: Think of this optimizer as a hiker trying to find the lowest point in a hilly landscape:
- The hiker (optimizer) takes steps downhill to find the lowest point (best model parameters)
- Instead of looking at the entire landscape at once, the hiker looks at small patches (subsets of data)
- The hiker adjusts their step size (learning rate) as they go
- This approach helps the hiker find a good low point quickly, even in a complex landscape
This method is efficient for large datasets and can often find good solutions quickly.
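To make the "small patches" idea concrete, here is a minimal sketch of the stochastic loop in plain C#. Shuffle and ComputeGradient are hypothetical helper functions, and trainingData, parameters, learningRate, epochs, and batchSize are illustrative stand-ins; none of these names come from AiDotNet.
// Sketch of the stochastic idea: update on a small batch at a time
// instead of the full dataset. Shuffle and ComputeGradient are
// hypothetical helpers, not part of AiDotNet.
for (int epoch = 0; epoch < epochs; epoch++)
{
    foreach (var batch in Shuffle(trainingData).Chunk(batchSize)) // batchSize = 1 gives "true" SGD
    {
        double[] gradient = ComputeGradient(parameters, batch);  // gradient on this batch only
        for (int i = 0; i < parameters.Length; i++)
            parameters[i] -= learningRate * gradient[i];         // one small step downhill
    }
}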
Constructors
StochasticGradientDescentOptimizer(IFullModel<T, TInput, TOutput>, StochasticGradientDescentOptimizerOptions<T, TInput, TOutput>?, IEngine?)
Initializes a new instance of the StochasticGradientDescentOptimizer class.
public StochasticGradientDescentOptimizer(IFullModel<T, TInput, TOutput> model, StochasticGradientDescentOptimizerOptions<T, TInput, TOutput>? options = null, IEngine? engine = null)
Parameters
model (IFullModel<T, TInput, TOutput>): The model whose parameters will be optimized.
options (StochasticGradientDescentOptimizerOptions<T, TInput, TOutput>): Options specific to the SGD optimizer. If null, default options are used.
engine (IEngine): An optional engine used to execute computations.
Remarks
This constructor sets up the SGD optimizer with the specified options and components. If no options are provided, default options are used.
For Beginners: This is like setting up your hiker with their gear before the hike:
- You can give the hiker special instructions (options) for how to search
- You can provide supporting tools, such as a compute engine, to help with the work
- If you don't provide instructions, the hiker will use a standard set
This setup ensures the optimizer is ready to start finding the best solution.
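A minimal construction sketch. The variable myModel and the option property names (LearningRate, MaxIterations) are illustrative assumptions, not confirmed by this page; only the constructor signature itself is documented above.
// Hypothetical setup; 'myModel' and the option property names are assumptions.
var options = new StochasticGradientDescentOptimizerOptions<double, Matrix<double>, Vector<double>>
{
    LearningRate = 0.01,   // illustrative property name
    MaxIterations = 1000   // illustrative property name
};

var optimizer = new StochasticGradientDescentOptimizer<double, Matrix<double>, Vector<double>>(
    myModel,        // the IFullModel to optimize
    options,        // pass null to fall back to the default options
    engine: null);  // no custom engine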
Methods
Deserialize(byte[])
Deserializes a byte array to restore the state of the StochasticGradientDescentOptimizer.
public override void Deserialize(byte[] data)
Parameters
data (byte[]): The byte array containing the serialized optimizer state.
Remarks
This method restores the state of the optimizer from a byte array, including its base class data and SGD-specific options. It uses a BinaryReader to read the serialized data and reconstruct the optimizer's state.
For Beginners: This is like unpacking the hiker's backpack after a journey:
- It reads the saved snapshot of the hiker's journey
- It restores both general hiking info and SGD-specific details
- If there's a problem reading the SGD-specific details, it reports an error
This allows you to continue from a previously saved state of the optimizer.
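A short save/restore round trip using the Serialize and Deserialize members documented on this page; System.IO.File is used only for persistence, and the file name is arbitrary.
// Save a snapshot of the optimizer's state...
byte[] snapshot = optimizer.Serialize();
File.WriteAllBytes("sgd-state.bin", snapshot);

// ...and later restore it (throws InvalidOperationException if the
// SGD-specific options cannot be deserialized).
optimizer.Deserialize(File.ReadAllBytes("sgd-state.bin"));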
Exceptions
- InvalidOperationException
Thrown when deserialization of optimizer options fails.
GenerateGradientCacheKey(IFullModel<T, TInput, TOutput>, TInput, TOutput)
Generates a unique cache key for gradient calculations.
protected override string GenerateGradientCacheKey(IFullModel<T, TInput, TOutput> model, TInput X, TOutput y)
Parameters
model (IFullModel<T, TInput, TOutput>): The symbolic model for which the gradient is being calculated.
X (TInput): The input data matrix.
y (TOutput): The target vector.
Returns
- string
A string representing the unique cache key.
Remarks
This method creates a unique identifier for caching gradient calculations. It combines the base cache key with SGD-specific parameters to ensure that cached gradients are only reused when all relevant parameters are identical.
For Beginners: This is like creating a unique label for each calculation the hiker does:
- It starts with a basic label (baseKey) that describes the general calculation
- It adds SGD-specific information like the current step size (learning rate) and how many steps the hiker is allowed to take (max iterations)
- This unique label helps the hiker remember and quickly recall previous calculations instead of redoing them unnecessarily
This improves efficiency by avoiding redundant calculations.
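Based on the description above, the key combines the base key with the SGD-specific settings. A sketch of that composition; the exact format is an implementation detail not confirmed here, and baseKey, learningRate, and maxIterations are stand-ins for the values described above.
// Illustrative only: the exact key format is an implementation detail.
string cacheKey = $"{baseKey}_lr:{learningRate}_maxIter:{maxIterations}";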
GetOptions()
Gets the current options for this optimizer.
public override OptimizationAlgorithmOptions<T, TInput, TOutput> GetOptions()
Returns
- OptimizationAlgorithmOptions<T, TInput, TOutput>
The current StochasticGradientDescentOptimizerOptions.
Remarks
This method returns the current configuration options of the SGD optimizer.
For Beginners: This is like asking the hiker what their current instructions are:
- You can see how the hiker is currently set up to search
- This includes things like how big their steps are, how many steps they're allowed to take, etc.
This is useful for understanding or checking the current setup of the optimizer.
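Because the declared return type is the base OptimizationAlgorithmOptions, callers who need SGD-specific settings can pattern-match on the concrete type. A small sketch:
var options = optimizer.GetOptions();
// The runtime object is SGD-specific even though the declared type is the base class.
if (options is StochasticGradientDescentOptimizerOptions<double, Matrix<double>, Vector<double>> sgdOptions)
{
    // Inspect SGD-specific settings here; property names depend on the options class.
}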
Optimize(OptimizationInputData<T, TInput, TOutput>)
Performs the optimization process to find the best solution for the given input data.
public override OptimizationResult<T, TInput, TOutput> Optimize(OptimizationInputData<T, TInput, TOutput> inputData)
Parameters
inputData (OptimizationInputData<T, TInput, TOutput>): The input data to optimize against.
Returns
- OptimizationResult<T, TInput, TOutput>
An optimization result containing the best solution found and associated metrics.
Remarks
This method implements the main SGD algorithm. It iteratively updates the model parameters based on the calculated gradient, applying momentum and adaptive learning rates if configured. The process continues until either the maximum number of iterations is reached or early stopping criteria are met.
For Beginners: This is the main journey of our hiker:
- Start at a random point on the hill (initialize random solution)
- For each epoch (pass through the data):
- Process the data in batches (the default BatchSize of 1 gives true stochastic updates)
- For each batch:
- Look around to decide which way is downhill (calculate gradient)
- Apply momentum if configured
- Take a step in that direction (update solution)
- Check if this is the lowest point found so far (evaluate and update best solution)
- Adjust step size if needed (update adaptive parameters)
- Decide whether to stop early if no progress is being made
- Return the lowest point found during the entire journey
This process helps find a good solution efficiently, even in complex landscapes.
DataLoader Integration: This optimizer uses the DataLoader batching infrastructure, which supports:
- Custom samplers (weighted, stratified, curriculum, importance, active learning)
- Reproducible shuffling via RandomSeed
- An option to drop incomplete final batches
- True stochastic behavior with BatchSize = 1 (the default)
Set these options via GradientBasedOptimizerOptions.DataSampler, ShuffleData, DropLastBatch, and RandomSeed.
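A minimal end-to-end call. The property names XTrain and YTrain are illustrative assumptions; consult OptimizationInputData for the actual member names.
// XTrain and YTrain are hypothetical property names.
var inputData = new OptimizationInputData<double, Matrix<double>, Vector<double>>
{
    XTrain = trainingFeatures,
    YTrain = trainingTargets
};

OptimizationResult<double, Matrix<double>, Vector<double>> result = optimizer.Optimize(inputData);
// 'result' carries the best solution found and its associated metrics.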
Serialize()
Serializes the current state of the StochasticGradientDescentOptimizer to a byte array.
public override byte[] Serialize()
Returns
- byte[]
A byte array representing the serialized state of the optimizer.
Remarks
This method saves the current state of the optimizer, including its base class data and SGD-specific options, into a byte array.
For Beginners: This is like taking a snapshot of the hiker's journey:
- It saves all the current settings and progress
- This saved data can be used later to continue from where you left off
- It includes both general hiking info and SGD-specific details
This is useful for saving progress or sharing the optimizer's current state.
UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput>)
Updates the optimizer's options with the provided options.
protected override void UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput> options)
Parameters
options (OptimizationAlgorithmOptions<T, TInput, TOutput>): The options to apply to this optimizer.
Remarks
This method ensures that only StochasticGradientDescentOptimizerOptions can be applied to this optimizer.
For Beginners: This is like giving the hiker new instructions mid-journey:
- You can only give instructions specific to this type of hike (SGD)
- If you try to give the wrong type of instructions, it will cause an error
This ensures that the optimizer always has the correct type of settings.
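The type check described above typically follows the pattern below; this is a sketch of the pattern, not the library's exact code.
// Pattern sketch: accept only the SGD-specific options type.
if (options is StochasticGradientDescentOptimizerOptions<T, TInput, TOutput> sgdOptions)
{
    // apply sgdOptions here
}
else
{
    throw new ArgumentException("Expected StochasticGradientDescentOptimizerOptions.", nameof(options));
}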
Exceptions
- ArgumentException
Thrown when the options are not of the expected type.
UpdateParametersGpu(IGpuBuffer, IGpuBuffer, int, IDirectGpuBackend)
Updates parameters on the GPU using vanilla SGD.
public override void UpdateParametersGpu(IGpuBuffer parameters, IGpuBuffer gradients, int parameterCount, IDirectGpuBackend backend)
Parameters
parameters (IGpuBuffer): The GPU buffer holding the model parameters to update.
gradients (IGpuBuffer): The GPU buffer holding the calculated gradients.
parameterCount (int): The number of parameters to update.
backend (IDirectGpuBackend): The GPU backend that executes the update.
UpdateSolution(IFullModel<T, TInput, TOutput>, Vector<T>)
Updates the current solution based on the calculated gradient.
protected override IFullModel<T, TInput, TOutput> UpdateSolution(IFullModel<T, TInput, TOutput> currentSolution, Vector<T> gradient)
Parameters
currentSolution (IFullModel<T, TInput, TOutput>): The current solution to update.
gradient (Vector<T>): The calculated gradient.
Returns
- IFullModel<T, TInput, TOutput>
A new IFullModel<T, TInput, TOutput> representing the updated solution.
Remarks
This method applies the gradient descent update rule, subtracting the gradient multiplied by the learning rate from the current solution's coefficients.
For Beginners: This is like the hiker taking a step:
- The direction to step is given by the gradient
- The size of the step is controlled by the learning rate
- The hiker moves from their current position in this direction and distance
This small step helps the hiker gradually move towards the lowest point.
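The update rule described above, written out as a standalone sketch over plain arrays; the actual method operates on the model's coefficients via Vector<T> and returns a new model rather than mutating the current one.
// The rule described above: updated[i] = current[i] - learningRate * gradient[i]
static double[] ApplyGradientStep(double[] coefficients, double[] gradient, double learningRate)
{
    var updated = new double[coefficients.Length];
    for (int i = 0; i < coefficients.Length; i++)
        updated[i] = coefficients[i] - learningRate * gradient[i]; // step opposite the gradient
    return updated;
}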