Class AdamGpuConfig

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Configuration for the Adam optimizer on the GPU.

public class AdamGpuConfig : IGpuOptimizerConfig
Inheritance
object → AdamGpuConfig

Implements
IGpuOptimizerConfig

Remarks

Adam maintains moving averages of gradients (m) and squared gradients (v), with bias correction for the initial steps.

For Beginners: Adam is one of the most popular optimizers. It adapts the learning rate for each parameter based on:

- First moment (mean of gradients) - works like momentum.
- Second moment (variance of gradients) - adapts to the gradient magnitude.

This typically leads to faster convergence than plain SGD.
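The moving averages and bias correction can be sketched on the CPU for a single parameter as follows. This is an illustrative sketch only, not the GPU kernel this configuration drives; the objective, variable names, and hyperparameter values are placeholders rather than part of the AiDotNet API.

using System;

class AdamUpdateSketch
{
    static void Main()
    {
        float learningRate = 0.01f, beta1 = 0.9f, beta2 = 0.999f, epsilon = 1e-8f;
        float theta = 5f;     // parameter being optimized, here for f(x) = x^2
        float m = 0f, v = 0f; // first and second moment estimates

        for (int t = 1; t <= 1000; t++)
        {
            float g = 2f * theta;                         // gradient of x^2
            m = beta1 * m + (1f - beta1) * g;             // first moment: mean of gradients
            v = beta2 * v + (1f - beta2) * g * g;         // second moment: mean of squared gradients
            float mHat = m / (1f - MathF.Pow(beta1, t));  // bias correction for the initial steps
            float vHat = v / (1f - MathF.Pow(beta2, t));
            theta -= learningRate * mHat / (MathF.Sqrt(vHat) + epsilon);
        }

        Console.WriteLine(theta); // converges toward the minimum at 0
    }
}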

Constructors

AdamGpuConfig(float, float, float, float, float, int)

Creates a new Adam GPU configuration.

public AdamGpuConfig(float learningRate, float beta1 = 0.9, float beta2 = 0.999, float epsilon = 1E-08, float weightDecay = 0, int step = 0)

Parameters

learningRate float

Learning rate for parameter updates.

beta1 float

Exponential decay rate for first moment (default 0.9).

beta2 float

Exponential decay rate for second moment (default 0.999).

epsilon float

Numerical stability constant (default 1e-8).

weightDecay float

Weight decay coefficient (default 0).

step int

Current optimization step (default 0), used for bias correction.
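As an illustrative usage sketch based on the signature above (the hyperparameter values are arbitrary examples, not recommendations):

using AiDotNet.Interfaces;

// Omitted arguments fall back to the defaults listed above.
var adam = new AdamGpuConfig(learningRate: 0.001f, weightDecay: 0.01f);

// Init-only properties can also be set through an object initializer on top of the constructor.
var adamLooseEpsilon = new AdamGpuConfig(learningRate: 0.001f) { Epsilon = 1e-6f };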

Properties

Beta1

Gets the exponential decay rate for the first moment estimates (typically 0.9).

public float Beta1 { get; init; }

Property Value

float

Beta2

Gets the exponential decay rate for the second moment estimates (typically 0.999).

public float Beta2 { get; init; }

Property Value

float

Epsilon

Gets the small constant for numerical stability (typically 1e-8).

public float Epsilon { get; init; }

Property Value

float

LearningRate

Gets the learning rate for parameter updates.

public float LearningRate { get; init; }

Property Value

float

OptimizerType

Gets the type of optimizer (SGD, Adam, AdamW, etc.).

public GpuOptimizerType OptimizerType { get; }

Property Value

GpuOptimizerType

Step

Gets the current optimization step (used for bias correction in Adam-family optimizers).

public int Step { get; init; }

Property Value

int
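For example, when resuming training from a checkpoint, the saved step count could be carried forward so bias correction stays consistent. This is a hypothetical sketch; savedStep is a placeholder, not an AiDotNet API.

int savedStep = 1000; // e.g. restored from a checkpoint (placeholder value)
var resumed = new AdamGpuConfig(learningRate: 0.001f, step: savedStep);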

WeightDecay

Gets the weight decay (L2 regularization) coefficient.

public float WeightDecay { get; init; }

Property Value

float

Methods

ApplyUpdate(IDirectGpuBackend, IGpuBuffer, IGpuBuffer, GpuOptimizerState, int)

Applies the optimizer update to the given parameter buffer using its gradient.

public void ApplyUpdate(IDirectGpuBackend backend, IGpuBuffer param, IGpuBuffer gradient, GpuOptimizerState state, int size)

Parameters

backend IDirectGpuBackend

The GPU backend to execute the update.

param IGpuBuffer

Buffer containing the parameters to update (modified in-place).

gradient IGpuBuffer

Buffer containing the gradients.

state GpuOptimizerState

Optimizer state buffers (momentum, squared gradients, etc.).

size int

Number of parameters to update.

Remarks

For Beginners: This method applies the optimizer's update rule directly on the GPU. Each optimizer type (SGD, Adam, etc.) implements its own update logic using GPU kernels. The state parameter contains any auxiliary buffers needed (like velocity for SGD with momentum, or m/v buffers for Adam).

Design Note: Following the Open/Closed Principle, each optimizer config knows how to apply its own update, so adding new optimizers doesn't require modifying layer code.
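For illustration, a layer-side update might look like the following sketch. Only the ApplyUpdate signature above is assumed, along with the design note's suggestion that it is exposed through IGpuOptimizerConfig; creating the backend, buffers, and optimizer state is library-specific and not shown, and the method and variable names are hypothetical.

void UpdateParameters(IGpuOptimizerConfig config,
                      IDirectGpuBackend backend,
                      IGpuBuffer weights,
                      IGpuBuffer weightGradients,
                      GpuOptimizerState state,
                      int parameterCount)
{
    // The layer depends only on the config abstraction, so swapping AdamGpuConfig
    // for another optimizer config requires no change here (Open/Closed Principle).
    config.ApplyUpdate(backend, weights, weightGradients, state, parameterCount);
}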