Class SgdGpuConfig
Namespace: AiDotNet.Interfaces
Assembly: AiDotNet.dll
Configuration for SGD (Stochastic Gradient Descent) optimizer on GPU.
public class SgdGpuConfig : IGpuOptimizerConfig
Inheritance: object → SgdGpuConfig
Implements: IGpuOptimizerConfig
Remarks
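SGD updates weights using: w = w - lr * (grad + weightDecay * w) + momentum * velocity, where w is a weight, lr is the learning rate, grad is the gradient of the loss with respect to w, and velocity is the momentum buffer carried over from earlier steps.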
For Beginners: SGD is the simplest optimizer. It moves weights in the direction opposite to the gradient, scaled by the learning rate. Momentum helps accelerate training by accumulating velocity from past updates.
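The update rule above can be sketched on the CPU as follows. This is a minimal sketch, not the library's GPU kernel. It assumes that `velocity` holds the total change applied to each weight on the previous step, which the formula does not state, and the names `SgdReference` and `Step` are hypothetical.

```csharp
using System;

// Hypothetical CPU reference for the documented rule; not part of AiDotNet.
public static class SgdReference
{
    // Applies: w = w - lr * (grad + weightDecay * w) + momentum * velocity
    // Assumption: velocity[i] stores the change applied to w[i] on the previous call.
    public static void Step(float[] w, float[] grad, float[] velocity,
                            float lr, float momentum, float weightDecay)
    {
        if (w.Length != grad.Length || w.Length != velocity.Length)
            throw new ArgumentException("All buffers must have the same length.");

        for (int i = 0; i < w.Length; i++)
        {
            // Gradient step (with L2 weight decay) plus the momentum contribution.
            float delta = -lr * (grad[i] + weightDecay * w[i]) + momentum * velocity[i];
            w[i] += delta;
            velocity[i] = delta; // remembered for the next step
        }
    }
}
```

With momentum set to 0 the velocity term vanishes, and each call reduces to plain gradient descent with L2 weight decay.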
Constructors
SgdGpuConfig(float, float, float, int)
Creates a new SGD GPU configuration.
public SgdGpuConfig(float learningRate, float momentum = 0.9f, float weightDecay = 0, int step = 0)
Parameters
learningRate (float): Learning rate for parameter updates.
momentum (float): Momentum coefficient (default 0.9).
weightDecay (float): Weight decay coefficient (default 0).
step (int): Current optimization step.
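The constructor's defaults and the init-only properties can be illustrated as below. So that the snippet compiles without referencing AiDotNet.dll, it uses a hypothetical stand-in class, `SgdConfigSketch`, that mirrors the documented constructor signature and properties. How the real constructor stores its arguments is an assumption here.

```csharp
// Hypothetical stand-in mirroring the documented constructor and init-only
// properties of SgdGpuConfig; it is not the library class.
public class SgdConfigSketch
{
    public float LearningRate { get; init; }
    public float Momentum { get; init; }
    public float WeightDecay { get; init; }
    public int Step { get; init; }

    public SgdConfigSketch(float learningRate, float momentum = 0.9f,
                           float weightDecay = 0f, int step = 0)
    {
        LearningRate = learningRate;
        Momentum = momentum;
        WeightDecay = weightDecay;
        Step = step;
    }
}

public static class SgdConfigExamples
{
    // Omitting momentum picks up the documented default of 0.9.
    public static SgdConfigSketch WithDefaultMomentum()
        => new SgdConfigSketch(learningRate: 0.01f);

    // SGD without momentum: pass 0 explicitly.
    public static SgdConfigSketch Vanilla()
        => new SgdConfigSketch(learningRate: 0.01f, momentum: 0f);

    // Init-only properties can also be set in an object initializer.
    public static SgdConfigSketch VanillaViaInitializer()
        => new SgdConfigSketch(0.01f) { Momentum = 0f, WeightDecay = 1e-4f };
}
```

Because the properties are init-only, a configuration cannot be mutated after construction; changing the learning rate mid-training means creating a new instance.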
Properties
LearningRate
Gets the learning rate for parameter updates.
public float LearningRate { get; init; }
Property Value
float
Momentum
Gets the momentum coefficient (typically 0.9).
public float Momentum { get; init; }
Property Value
float
Remarks
Momentum accumulates past gradients to smooth updates and can help the optimizer move past shallow local minima. Set it to 0 for plain SGD without momentum.
OptimizerType
Gets the type of optimizer (SGD, Adam, AdamW, etc.).
public GpuOptimizerType OptimizerType { get; }
Property Value
GpuOptimizerType
Step
Gets the current optimization step. Adam-family optimizers use this value for bias correction; the SGD update rule given in the class remarks does not reference it.
public int Step { get; init; }
Property Value
int
WeightDecay
Gets the weight decay (L2 regularization) coefficient.
public float WeightDecay { get; init; }
Property Value
float
Methods
ApplyUpdate(IDirectGpuBackend, IGpuBuffer, IGpuBuffer, GpuOptimizerState, int)
Applies the optimizer update to the given parameter buffer using its gradient.
public void ApplyUpdate(IDirectGpuBackend backend, IGpuBuffer param, IGpuBuffer gradient, GpuOptimizerState state, int size)
Parameters
backend (IDirectGpuBackend): The GPU backend to execute the update.
param (IGpuBuffer): Buffer containing the parameters to update (modified in-place).
gradient (IGpuBuffer): Buffer containing the gradients.
state (GpuOptimizerState): Optimizer state buffers (momentum, squared gradients, etc.).
size (int): Number of parameters to update.
Remarks
For Beginners: This method applies the optimizer's update rule directly on the GPU. Each optimizer type (SGD, Adam, etc.) implements its own update logic using GPU kernels. The state parameter contains any auxiliary buffers needed (like velocity for SGD with momentum, or m/v buffers for Adam).
Design Note: Following the Open/Closed Principle, each optimizer config knows how to apply its own update, so adding new optimizers doesn't require modifying layer code.
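The design note can be illustrated with a simplified sketch. Everything here is hypothetical: plain float arrays stand in for the GPU buffer types, and `IOptimizerConfigSketch`, `PlainSgdSketch`, `SignSgdSketch`, and `LayerSketch` are illustrative names rather than AiDotNet types. The point is only that the layer calls the interface, so a second optimizer (sign-based SGD here) can be added without editing the layer.

```csharp
using System;

// Hypothetical, simplified stand-in for an optimizer config interface.
public interface IOptimizerConfigSketch
{
    void ApplyUpdate(float[] param, float[] gradient, int size);
}

// Plain gradient descent: param -= lr * gradient.
public sealed class PlainSgdSketch : IOptimizerConfigSketch
{
    private readonly float _lr;
    public PlainSgdSketch(float lr) => _lr = lr;

    public void ApplyUpdate(float[] param, float[] gradient, int size)
    {
        for (int i = 0; i < size; i++)
            param[i] -= _lr * gradient[i];
    }
}

// A second optimizer added later (sign-based SGD); LayerSketch needs no change.
public sealed class SignSgdSketch : IOptimizerConfigSketch
{
    private readonly float _lr;
    public SignSgdSketch(float lr) => _lr = lr;

    public void ApplyUpdate(float[] param, float[] gradient, int size)
    {
        for (int i = 0; i < size; i++)
            param[i] -= _lr * Math.Sign(gradient[i]);
    }
}

public sealed class LayerSketch
{
    public float[] Weights { get; }
    public LayerSketch(float[] weights) => Weights = weights;

    // The layer depends only on the interface, never on a concrete optimizer.
    public void Update(IOptimizerConfigSketch optimizer, float[] gradient)
        => optimizer.ApplyUpdate(Weights, gradient, Weights.Length);
}
```

The alternative, a switch on optimizer type inside each layer, would require touching every layer whenever a new optimizer is introduced.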