Class BFGSOptimizer<T, TInput, TOutput>
- Namespace: AiDotNet.Optimizers
- Assembly: AiDotNet.dll
Implements the Broyden-Fletcher-Goldfarb-Shanno (BFGS) optimization algorithm.
public class BFGSOptimizer<T, TInput, TOutput> : GradientBasedOptimizerBase<T, TInput, TOutput>, IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
TInput: The type of the input data consumed by the model.
TOutput: The type of the output data produced by the model.
- Inheritance
- OptimizerBase<T, TInput, TOutput> → GradientBasedOptimizerBase<T, TInput, TOutput> → BFGSOptimizer<T, TInput, TOutput>
- Implements
- IGradientBasedOptimizer<T, TInput, TOutput>
- IOptimizer<T, TInput, TOutput>
- IModelSerializer
Remarks
BFGS is a quasi-Newton method for solving unconstrained nonlinear optimization problems. It approximates the Hessian matrix of second derivatives of the function to be minimized.
For Beginners: BFGS is an advanced optimization algorithm that tries to find the best solution by making smart steps based on the function's behavior. It's particularly good at handling complex problems where the function being optimized is smooth but potentially has many variables.
Constructors
BFGSOptimizer(IFullModel<T, TInput, TOutput>, BFGSOptimizerOptions<T, TInput, TOutput>?, IEngine?)
Initializes a new instance of the BFGSOptimizer class.
public BFGSOptimizer(IFullModel<T, TInput, TOutput> model, BFGSOptimizerOptions<T, TInput, TOutput>? options = null, IEngine? engine = null)
Parameters
model (IFullModel<T, TInput, TOutput>): The model to optimize.
options (BFGSOptimizerOptions<T, TInput, TOutput>?): The options for configuring the BFGS algorithm. If null, default settings are used.
engine (IEngine?): The computation engine (CPU or GPU) for vectorized operations.
Remarks
For Beginners: This constructor sets up the BFGS optimizer with its initial configuration. You can customize various aspects of how it works, or use default settings.
Methods
Deserialize(byte[])
Restores the state of the BFGS optimizer from a byte array.
public override void Deserialize(byte[] data)
Parameters
data (byte[]): The byte array containing the serialized state of the optimizer.
Remarks
For Beginners: This method takes a saved state of the BFGS Optimizer (in the form of a byte array) and uses it to restore the optimizer to that state. It's like loading a saved game, bringing back all the important settings and progress that were saved earlier.
GenerateGradientCacheKey(IFullModel<T, TInput, TOutput>, TInput, TOutput)
Generates a unique key for caching gradients in the BFGS optimization process.
protected override string GenerateGradientCacheKey(IFullModel<T, TInput, TOutput> model, TInput X, TOutput y)
Parameters
model (IFullModel<T, TInput, TOutput>): The current model.
X (TInput): The input data matrix.
y (TOutput): The target values vector.
Returns
- string
A string representing the unique cache key.
Remarks
For Beginners: This method creates a unique identifier for storing and retrieving gradients during the optimization process. It helps avoid recalculating gradients unnecessarily, which can save time. The key includes BFGS-specific information to ensure it's unique to this optimizer's current state.
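The library's actual key format is not documented here, but the idea is easy to illustrate: combine everything the gradient depends on into one string and hash it, so a cached gradient is reused only when the model, data, and optimizer state all match. A minimal Python sketch (the function name and the specific fields are hypothetical, not AiDotNet's implementation):

```python
import hashlib

def gradient_cache_key(model_id, data_fingerprint, iteration):
    """Combine the model identity, a fingerprint of the training data,
    and the optimizer's iteration count into one key, so that cached
    gradients are reused only when all three match."""
    raw = f"BFGS:{model_id}:{data_fingerprint}:{iteration}"
    return hashlib.sha256(raw.encode()).hexdigest()

# Identical inputs produce identical keys; any change in state yields a new key.
same = gradient_cache_key("model-1", "data-abc", 3)
again = gradient_cache_key("model-1", "data-abc", 3)
other = gradient_cache_key("model-1", "data-abc", 4)
```

Hashing keeps the key a fixed length regardless of how large the model or data descriptions grow.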
GetOptions()
Gets the current options for the BFGS optimizer.
public override OptimizationAlgorithmOptions<T, TInput, TOutput> GetOptions()
Returns
- OptimizationAlgorithmOptions<T, TInput, TOutput>
The current optimization options.
Remarks
For Beginners: This method lets you see what settings the BFGS optimizer is currently using.
InitializeAdaptiveParameters()
Initializes the adaptive parameters used in the BFGS algorithm.
protected override void InitializeAdaptiveParameters()
Remarks
For Beginners: This method sets up the initial state for the optimizer, including the learning rate and iteration count.
Optimize(OptimizationInputData<T, TInput, TOutput>)
Performs the main optimization process using the BFGS algorithm.
public override OptimizationResult<T, TInput, TOutput> Optimize(OptimizationInputData<T, TInput, TOutput> inputData)
Parameters
inputData (OptimizationInputData<T, TInput, TOutput>): The input data for the optimization process.
Returns
- OptimizationResult<T, TInput, TOutput>
The result of the optimization process.
Remarks
For Beginners: This is the heart of the BFGS algorithm. It iteratively improves the solution by updating the parameters based on the gradient and the approximated inverse Hessian matrix. The process continues until it reaches the maximum number of iterations or meets the convergence criteria.
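The overall shape of the loop described above can be sketched in a few lines. This is a language-agnostic illustration of the BFGS iteration in Python, not the library's C# implementation; it uses a simple backtracking line search where a production optimizer would apply its configured convergence criteria and step-size strategy:

```python
import numpy as np

def bfgs_minimize(f, grad, x0, max_iter=100, tol=1e-8):
    """Minimal BFGS loop: step along -H @ g, then refine the inverse
    Hessian approximation H from the observed parameter change s and
    gradient change y."""
    n = x0.size
    H = np.eye(n)                       # initial inverse Hessian guess
    x = x0.astype(float)
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:     # convergence criterion
            break
        p = -H @ g                      # quasi-Newton search direction
        t = 1.0
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5                    # backtracking line search
        x_new = x + t * p
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-12:                  # curvature condition keeps H positive definite
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Quadratic test problem f(x) = 0.5 x^T A x - b^T x, minimized at A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
x_star = bfgs_minimize(f, lambda x: A @ x - b, np.zeros(2))
```

On a quadratic like this, the approximation H converges toward the true inverse Hessian A⁻¹, which is why the later steps behave almost like Newton steps.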
DataLoader Integration: This method uses the DataLoader API for epoch management. BFGS typically operates on the full dataset because it builds an approximation of the inverse Hessian matrix that requires consistent gradients between iterations. The method notifies the sampler of epoch starts using NotifyEpochStart(int) for compatibility with curriculum learning and sampling strategies.
Serialize()
Converts the current state of the BFGS optimizer into a byte array for storage or transmission.
public override byte[] Serialize()
Returns
- byte[]
A byte array representing the serialized state of the optimizer.
Remarks
For Beginners: This method takes all the important information about the current state of the BFGS Optimizer and turns it into a format that can be easily saved or sent to another computer. It includes both the base optimizer data and BFGS-specific data.
UpdateAdaptiveParameters(OptimizationStepData<T, TInput, TOutput>, OptimizationStepData<T, TInput, TOutput>)
Updates the adaptive parameters of the optimizer.
protected override void UpdateAdaptiveParameters(OptimizationStepData<T, TInput, TOutput> currentStepData, OptimizationStepData<T, TInput, TOutput> previousStepData)
Parameters
currentStepData (OptimizationStepData<T, TInput, TOutput>): The current step data.
previousStepData (OptimizationStepData<T, TInput, TOutput>): The previous step data.
Remarks
For Beginners: This method adjusts the learning rate based on the performance of the current step compared to the previous step. If the current step improved the fitness score, the learning rate is increased; otherwise, it's decreased. This helps the optimizer adapt to the landscape of the problem.
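The increase-on-improvement, decrease-on-regression rule can be sketched as follows. The multipliers and bounds here are illustrative placeholders, not AiDotNet's actual defaults:

```python
def adapt_learning_rate(lr, improved, increase=1.05, decrease=0.5,
                        lr_min=1e-6, lr_max=1.0):
    """Grow the learning rate gently after an improving step and cut it
    sharply after a worsening one, clamped to [lr_min, lr_max]."""
    lr = lr * increase if improved else lr * decrease
    return min(max(lr, lr_min), lr_max)
```

The asymmetry (small increases, large decreases) is a common safeguard: overshooting with a too-large step can undo many iterations of progress, while a slightly conservative step only costs a little speed.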
UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput>)
Updates the options for the BFGS optimizer.
protected override void UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput> options)
Parameters
options (OptimizationAlgorithmOptions<T, TInput, TOutput>): The new options to be set.
Remarks
For Beginners: This method allows you to change the settings of the BFGS optimizer during runtime. It checks to make sure you're providing the right kind of options specific to the BFGS algorithm.
Exceptions
- ArgumentException
Thrown when the provided options are not of the correct type.
UpdateParameters(Vector<T>, Vector<T>)
Updates parameters using the BFGS algorithm with inverse Hessian approximation.
public override Vector<T> UpdateParameters(Vector<T> parameters, Vector<T> gradient)
Parameters
parameters (Vector<T>): The current parameter values.
gradient (Vector<T>): The gradient at the current parameters.
Returns
- Vector<T>
The updated parameters.
Remarks
For Beginners: This method implements the core BFGS update formula. It uses the inverse Hessian approximation to determine a search direction that typically converges faster than standard gradient descent.
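The rank-two update at the core of this method is standard BFGS. A self-contained Python sketch of the math (again an illustration of the algorithm, not the library's C# code): given the parameter change s and gradient change y between two iterations, the inverse Hessian approximation is updated so that it maps y back to s (the secant condition):

```python
import numpy as np

def bfgs_inverse_hessian_update(H, s, y):
    """Apply one BFGS rank-two update to the inverse Hessian
    approximation H, where s is the change in parameters and y is the
    change in gradients between two iterations."""
    rho = 1.0 / (s @ y)
    I = np.eye(H.shape[0])
    return (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
        + rho * np.outer(s, s)

H = np.eye(2)                 # start from the identity
s = np.array([1.0, 0.5])      # parameter change
y = np.array([2.0, 1.5])      # gradient change
H_new = bfgs_inverse_hessian_update(H, s, y)
# H_new satisfies the secant condition: H_new @ y equals s.
```

Because the update only needs s and y, BFGS never forms or inverts the true Hessian, which is what makes it practical for problems with many variables.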
UpdateParametersGpu(IGpuBuffer, IGpuBuffer, int, IDirectGpuBackend)
Updates parameters using GPU-accelerated BFGS.
public override void UpdateParametersGpu(IGpuBuffer parameters, IGpuBuffer gradients, int parameterCount, IDirectGpuBackend backend)
Parameters
parameters (IGpuBuffer): The GPU buffer holding the current parameter values.
gradients (IGpuBuffer): The GPU buffer holding the gradients.
parameterCount (int): The number of parameters to update.
backend (IDirectGpuBackend): The GPU backend used to perform the update.
Remarks
BFGS is a second-order quasi-Newton method that requires Hessian approximation. GPU implementation is not yet available due to the complexity of maintaining the inverse Hessian approximation across GPU memory.