Class GradientDescentOptimizer<T, TInput, TOutput>
Namespace: AiDotNet.Optimizers
Assembly: AiDotNet.dll
Represents a Gradient Descent optimizer for machine learning models.
public class GradientDescentOptimizer<T, TInput, TOutput> : GradientBasedOptimizerBase<T, TInput, TOutput>, IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
Type Parameters
T: The numeric type used for calculations, typically float or double.
TInput: The type of the input data consumed by the model.
TOutput: The type of the output data produced by the model.
Inheritance
OptimizerBase<T, TInput, TOutput> → GradientBasedOptimizerBase<T, TInput, TOutput> → GradientDescentOptimizer<T, TInput, TOutput>
Implements
IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
Remarks
Gradient Descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. It takes steps proportional to the negative of the gradient of the function at the current point.
For Beginners: Imagine you're trying to find the lowest point in a valley:
- You start at a random point (initial model parameters)
- You look around to see which way is steepest downhill (calculate the gradient)
- You take a step in that direction (update the parameters)
- You repeat this process until you reach the bottom of the valley (optimize the model)
This optimizer helps the model learn by gradually adjusting its parameters to minimize errors.
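To make the update rule concrete, here is a minimal, library-independent sketch in C# that minimizes f(x) = x² with plain gradient descent (the learning rate and iteration count are arbitrary illustrative values, not AiDotNet defaults):

double x = 5.0;              // starting point (initial "parameter")
double learningRate = 0.1;   // step size
for (int i = 0; i < 100; i++)
{
    double gradient = 2.0 * x;      // derivative of f(x) = x² is 2x
    x -= learningRate * gradient;   // step against the gradient
}
Console.WriteLine(x);               // approaches 0.0, the minimum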
Constructors
GradientDescentOptimizer(IFullModel<T, TInput, TOutput>, GradientDescentOptimizerOptions<T, TInput, TOutput>?, IEngine?)
Initializes a new instance of the GradientDescentOptimizer class.
public GradientDescentOptimizer(IFullModel<T, TInput, TOutput> model, GradientDescentOptimizerOptions<T, TInput, TOutput>? options = null, IEngine? engine = null)
Parameters
model (IFullModel<T, TInput, TOutput>): The model to optimize.
options (GradientDescentOptimizerOptions<T, TInput, TOutput>?): Options for the Gradient Descent optimizer.
engine (IEngine?)
Remarks
For Beginners: This sets up the Gradient Descent optimizer with its initial settings. It's like preparing for your hike by choosing your starting point, deciding how big your steps will be, and how you'll adjust your path to avoid getting stuck in small dips.
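A construction sketch, assuming an existing model; the Matrix<double>/Vector<double> type arguments are illustrative choices, not requirements:

// 'model' is assumed to be an existing IFullModel<double, Matrix<double>, Vector<double>>.
// Omitting options and engine (or passing null) uses the defaults.
var optimizer = new GradientDescentOptimizer<double, Matrix<double>, Vector<double>>(model);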
Methods
Deserialize(byte[])
Restores the state of the Gradient Descent optimizer from a byte array.
public override void Deserialize(byte[] data)
Parameters
data (byte[]): The byte array containing the serialized optimizer state.
Remarks
This method deserializes both the base class data and the Gradient Descent-specific options from a byte array, typically created by the Serialize method. It reconstructs the optimizer's state, including all settings and progress information.
For Beginners: This is like unpacking your hiking gear and reading your saved plan:
- It takes the saved information (byte array) and uses it to set up the optimizer
- This allows you to continue optimizing from where you left off, or use someone else's setup
- It's the reverse process of Serialize, turning the saved data back into a working optimizer
Imagine you're starting a hike using a very detailed guide someone else wrote. This method helps you set everything up exactly as described in that guide.
Exceptions
- InvalidOperationException
Thrown when deserialization of optimizer options fails.
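A restore sketch, assuming the state was previously written to a file by Serialize (the file name is illustrative):

// Rebuild the optimizer's state from previously saved bytes.
byte[] saved = File.ReadAllBytes("optimizer-state.bin");
optimizer.Deserialize(saved); // throws InvalidOperationException if the options cannot be deserialized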
GenerateGradientCacheKey(IFullModel<T, TInput, TOutput>, TInput, TOutput)
Generates a unique key for caching gradients specific to the Gradient Descent optimizer.
protected override string GenerateGradientCacheKey(IFullModel<T, TInput, TOutput> model, TInput X, TOutput y)
Parameters
model (IFullModel<T, TInput, TOutput>): The current model being optimized.
X (TInput): The input features used for gradient calculation.
y (TOutput): The target values used for gradient calculation.
Returns
- string
A string that uniquely identifies the current gradient calculation scenario.
Remarks
This method extends the base class's gradient cache key generation by adding Gradient Descent-specific parameters. The resulting key is unique to the current state of the optimizer and the input data, allowing for efficient caching and retrieval of previously calculated gradients.
For Beginners: Think of this method as creating a unique label for each gradient calculation:
- It starts with a basic label (from the base class) that describes the model and data
- Then it adds specific details about the Gradient Descent optimizer, like how big steps it's taking (learning rate) and how many times it plans to adjust the model (max iterations)
- This unique label helps the optimizer remember and quickly find previous calculations, making the whole process faster and more efficient
It's like keeping a well-organized hiking journal where you can quickly look up information about specific points in your journey.
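Conceptually, the key might be composed like the following sketch; the exact format, field names (_options, LearningRate, MaxIterations), and layout here are assumptions for illustration, not the library's actual implementation:

protected override string GenerateGradientCacheKey(IFullModel<T, TInput, TOutput> model, TInput X, TOutput y)
{
    string baseKey = base.GenerateGradientCacheKey(model, X, y);
    // Append GD-specific settings so differently configured optimizers never share a cache entry.
    return $"{baseKey}_GD_lr:{_options.LearningRate}_maxIter:{_options.MaxIterations}";
}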
GetOptions()
Retrieves the current options for the Gradient Descent optimizer.
public override OptimizationAlgorithmOptions<T, TInput, TOutput> GetOptions()
Returns
- OptimizationAlgorithmOptions<T, TInput, TOutput>
The current Gradient Descent optimizer options.
Remarks
This method returns the current configuration options for the Gradient Descent optimizer. These options control various aspects of the optimization process, such as learning rate, maximum iterations, and regularization settings.
For Beginners: Think of this method as checking your current hiking plan:
- It tells you things like how big your steps are (learning rate)
- How long you plan to hike (maximum iterations)
- What rules you're following to avoid getting lost (regularization settings)
This information is useful if you want to understand or adjust how the optimizer is currently set up.
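A usage sketch; the declared return type is the base options class, and the cast to the GD-specific options type is an assumption about the runtime type:

// Inspect the optimizer's current configuration.
var options = optimizer.GetOptions();
var gdOptions = options as GradientDescentOptimizerOptions<double, Matrix<double>, Vector<double>>;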
Optimize(OptimizationInputData<T, TInput, TOutput>)
Performs the main optimization process using the Gradient Descent algorithm.
public override OptimizationResult<T, TInput, TOutput> Optimize(OptimizationInputData<T, TInput, TOutput> inputData)
Parameters
inputData (OptimizationInputData<T, TInput, TOutput>): The input data for the optimization process.
Returns
- OptimizationResult<T, TInput, TOutput>
The result of the optimization process.
Remarks
For Beginners: This is the heart of the Gradient Descent algorithm. It:
1. Starts with a random solution
2. Calculates how to improve the solution (the gradient)
3. Updates the solution by taking a step in the direction of improvement
4. Repeats this process many times
It's like repeatedly adjusting your path as you hike, always trying to move towards lower ground.
DataLoader Integration: This method uses the DataLoader API for efficient batch processing. It creates a batcher using CreateBatcher(OptimizationInputData<T, TInput, TOutput>, int) and notifies the sampler of epoch starts using NotifyEpochStart(int).
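A call sketch; how OptimizationInputData is populated (training features, targets, and so on) is an assumption here, so consult that type's documentation for its actual members:

// Run the full optimization loop and inspect the result.
var inputData = new OptimizationInputData<double, Matrix<double>, Vector<double>>();
// ... populate inputData with training data ...
OptimizationResult<double, Matrix<double>, Vector<double>> result = optimizer.Optimize(inputData);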
ReverseUpdate(Vector<T>, Vector<T>)
Reverses a Gradient Descent update to recover original parameters.
public override Vector<T> ReverseUpdate(Vector<T> updatedParameters, Vector<T> appliedGradients)
Parameters
updatedParameters (Vector<T>): The parameters after the GD update.
appliedGradients (Vector<T>): The gradients that were applied.
Returns
- Vector<T>
Original parameters before the update
Remarks
Gradient Descent uses the vanilla SGD update rule: params_new = params_old - lr * gradient. The reverse is therefore straightforward: params_old = params_new + lr * gradient.
For Beginners: This calculates where parameters were before a Gradient Descent update. Since GD uses simple steps (parameter minus learning_rate times gradient), reversing just means adding back that step.
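For example, with lr = 0.1, params_old = 2.0, and gradient = 5.0, the update gives params_new = 2.0 - 0.1 * 5.0 = 1.5; ReverseUpdate then recovers params_old = 1.5 + 0.1 * 5.0 = 2.0 (the method uses the optimizer's own learning rate, since lr is not a parameter).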
Serialize()
Converts the current state of the Gradient Descent optimizer into a byte array for storage or transmission.
public override byte[] Serialize()
Returns
- byte[]
A byte array representing the serialized state of the optimizer.
Remarks
This method serializes both the base class data and the Gradient Descent-specific options. It uses a combination of binary serialization for efficiency and JSON serialization for flexibility.
For Beginners: This is like packing up your hiking gear and writing down your plan:
- It saves all the important information about the optimizer's current state
- This saved information can be used later to recreate the optimizer exactly as it is now
- It's useful for saving your progress or sharing your optimizer setup with others
Think of it as creating a detailed snapshot of your hiking journey that you can use to continue from the same point later or allow someone else to follow your exact path.
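A save sketch (the file name is illustrative and pairs with the Deserialize example above):

// Capture the optimizer's current state and write it to disk.
byte[] state = optimizer.Serialize();
File.WriteAllBytes("optimizer-state.bin", state);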
UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput>)
Updates the options for the Gradient Descent optimizer.
protected override void UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput> options)
Parameters
options (OptimizationAlgorithmOptions<T, TInput, TOutput>): The new options to apply to the optimizer.
Remarks
For Beginners: This method allows you to change the settings of the optimizer while it's running. It's like adjusting your hiking strategy mid-journey based on the terrain you encounter.
Exceptions
- ArgumentException
Thrown when the provided options are not of the correct type.
UpdateParametersGpu(IGpuBuffer, IGpuBuffer, int, IDirectGpuBackend)
Updates parameters on the GPU using vanilla SGD.
public override void UpdateParametersGpu(IGpuBuffer parameters, IGpuBuffer gradients, int parameterCount, IDirectGpuBackend backend)
Parameters
parameters (IGpuBuffer): The GPU buffer holding the model parameters to update.
gradients (IGpuBuffer): The GPU buffer holding the computed gradients.
parameterCount (int): The number of parameters to update.
backend (IDirectGpuBackend): The GPU backend that executes the update.
UpdateSolution(IFullModel<T, TInput, TOutput>, Vector<T>)
Updates the current solution based on the calculated gradient.
protected override IFullModel<T, TInput, TOutput> UpdateSolution(IFullModel<T, TInput, TOutput> currentSolution, Vector<T> gradient)
Parameters
currentSolution (IFullModel<T, TInput, TOutput>): The current solution.
gradient (Vector<T>): The calculated gradient.
Returns
- IFullModel<T, TInput, TOutput>
The updated solution.
Remarks
For Beginners: This method adjusts the current solution to make it better. It's like taking a step in the direction you've determined will lead you downhill.