Class DFPOptimizer<T, TInput, TOutput>
- Namespace: AiDotNet.Optimizers
- Assembly: AiDotNet.dll
Implements the Davidon-Fletcher-Powell (DFP) optimization algorithm for numerical optimization problems.
public class DFPOptimizer<T, TInput, TOutput> : GradientBasedOptimizerBase<T, TInput, TOutput>, IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
Type Parameters
- T: The numeric type used for calculations (e.g., float, double).
- TInput: The type of input data the model accepts.
- TOutput: The type of output data the model produces.
- Inheritance
- OptimizerBase<T, TInput, TOutput> → GradientBasedOptimizerBase<T, TInput, TOutput> → DFPOptimizer<T, TInput, TOutput>
- Implements
- IGradientBasedOptimizer<T, TInput, TOutput>, IOptimizer<T, TInput, TOutput>, IModelSerializer
- Inherited Members
- Extension Methods
Remarks
The DFP algorithm is a quasi-Newton method for solving unconstrained nonlinear optimization problems. It approximates the inverse Hessian matrix to determine the search direction, combining the efficiency of Newton's method with the stability of gradient descent.
For Beginners: This optimizer is like a smart navigator that learns from its past steps to make better decisions about which direction to move in the future. It's particularly good at handling complex optimization problems where the landscape of possible solutions is intricate.
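The payoff of approximating the inverse Hessian is easiest to see on an ill-conditioned quadratic, where a Newton-style direction reaches the minimum in one step while a raw gradient step does not. The sketch below is a language-agnostic illustration in Python with NumPy; the matrix, starting point, and step size are arbitrary choices, not anything taken from AiDotNet.

```python
import numpy as np

# Minimize f(x) = 0.5 * x^T A x on an ill-conditioned quadratic.
# The exact Newton direction -A^{-1} g points straight at the minimum,
# while the raw gradient direction does not; quasi-Newton methods like
# DFP build an approximation of A^{-1} from gradients alone.
A = np.array([[100.0, 0.0], [0.0, 1.0]])   # condition number 100
x = np.array([1.0, 1.0])
g = A @ x                                   # gradient of f at x

newton_step = -np.linalg.solve(A, g)        # lands exactly at the minimum
gradient_step = -0.01 * g                   # must creep along the narrow valley

print(np.allclose(x + newton_step, 0.0))    # True: one Newton step suffices
```

Quasi-Newton methods such as DFP earn this Newton-like behavior without ever forming or inverting the Hessian: they reconstruct an approximation of its inverse from the gradients observed along the way.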
Constructors
DFPOptimizer(IFullModel<T, TInput, TOutput>, DFPOptimizerOptions<T, TInput, TOutput>?, IEngine?)
Initializes a new instance of the DFPOptimizer class.
public DFPOptimizer(IFullModel<T, TInput, TOutput> model, DFPOptimizerOptions<T, TInput, TOutput>? options = null, IEngine? engine = null)
Parameters
- model (IFullModel<T, TInput, TOutput>): The model to optimize.
- options (DFPOptimizerOptions<T, TInput, TOutput>?): The options for configuring the DFP algorithm.
- engine (IEngine?): The computation engine (CPU or GPU) for vectorized operations.
Remarks
For Beginners: This constructor sets up the DFP optimizer with its initial configuration. You can customize various aspects of how it works, or use default settings.
Methods
Deserialize(byte[])
Deserializes the DFP optimizer from a byte array.
public override void Deserialize(byte[] data)
Parameters
- data (byte[]): The byte array containing the serialized optimizer state.
Remarks
For Beginners: This method reconstructs the optimizer's state from a series of bytes. It's used to restore a previously saved state of the optimizer, allowing you to continue from where you left off.
Exceptions
- InvalidOperationException
Thrown when deserialization of optimizer options fails.
GetOptions()
Retrieves the current options of the DFP optimizer.
public override OptimizationAlgorithmOptions<T, TInput, TOutput> GetOptions()
Returns
- OptimizationAlgorithmOptions<T, TInput, TOutput>
The current optimization algorithm options.
Remarks
For Beginners: This method allows you to check the current settings of the optimizer. It's useful if you need to inspect or copy the current configuration.
InitializeAdaptiveParameters()
Initializes the adaptive parameters used in the DFP algorithm.
protected override void InitializeAdaptiveParameters()
Remarks
For Beginners: This method sets up the initial learning rate for the optimizer. The learning rate determines how large a step the optimizer takes when improving the solution.
Optimize(OptimizationInputData<T, TInput, TOutput>)
Performs the main optimization process using the DFP algorithm.
public override OptimizationResult<T, TInput, TOutput> Optimize(OptimizationInputData<T, TInput, TOutput> inputData)
Parameters
- inputData (OptimizationInputData<T, TInput, TOutput>): The input data for the optimization process.
Returns
- OptimizationResult<T, TInput, TOutput>
The result of the optimization process.
Remarks
For Beginners: This is the heart of the DFP algorithm. It iteratively improves the solution by calculating gradients, determining search directions, and updating the solution. The process continues until it reaches the maximum number of iterations or meets the stopping criteria.
DataLoader Integration: This method uses the DataLoader API for epoch management. DFP typically operates on the full dataset because it builds an approximation of the inverse Hessian matrix that requires consistent gradients between iterations. The method notifies the sampler of epoch starts using NotifyEpochStart(int) for compatibility with curriculum learning and sampling strategies.
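The iterative structure described above (compute the gradient, move along the quasi-Newton direction, update the inverse-Hessian approximation, check the stopping criteria) can be sketched in a few lines. This is a minimal Python/NumPy illustration of the DFP loop on a small quadratic, not the AiDotNet implementation; the fixed step size, iteration cap, and tolerance are assumed values.

```python
import numpy as np

# A minimal sketch of the DFP iteration on a 2-D quadratic
# f(x) = 0.5 * x^T A x, whose gradient is A @ x. Each pass computes the
# gradient, steps along the quasi-Newton direction -H @ g, and applies
# the DFP rank-two correction to the inverse-Hessian approximation H.
A = np.array([[3.0, 1.0], [1.0, 2.0]])

def grad(x):
    return A @ x

x = np.array([5.0, -3.0])
H = np.eye(2)        # initial inverse-Hessian guess: the identity
lr = 0.1             # fixed step size (illustrative; real DFP adapts this)

for _ in range(200):
    g = grad(x)
    if np.linalg.norm(g) < 1e-8:          # stopping criterion
        break
    s = -lr * (H @ g)                     # step along the search direction
    x_new = x + s
    y = grad(x_new) - g                   # change in gradient
    Hy = H @ y
    # DFP rank-two update: H keeps mapping gradient changes to steps.
    H = H + np.outer(s, s) / (s @ y) - np.outer(Hy, Hy) / (y @ Hy)
    x = x_new

print(x)  # close to the minimizer [0, 0]
```

On a convex quadratic the curvature condition s·y > 0 holds automatically, which keeps H positive definite across updates; for general nonlinear problems a line search is what usually guarantees it.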
Serialize()
Serializes the DFP optimizer to a byte array.
public override byte[] Serialize()
Returns
- byte[]
A byte array representing the serialized state of the optimizer.
Remarks
For Beginners: This method converts the current state of the optimizer into a series of bytes. This is useful for saving the optimizer's state to a file or sending it over a network.
UpdateAdaptiveParameters(OptimizationStepData<T, TInput, TOutput>, OptimizationStepData<T, TInput, TOutput>)
Updates the adaptive parameters based on the optimization progress.
protected override void UpdateAdaptiveParameters(OptimizationStepData<T, TInput, TOutput> currentStepData, OptimizationStepData<T, TInput, TOutput> previousStepData)
Parameters
- currentStepData (OptimizationStepData<T, TInput, TOutput>): Data from the current optimization step.
- previousStepData (OptimizationStepData<T, TInput, TOutput>): Data from the previous optimization step.
Remarks
For Beginners: This method adjusts how large the optimizer's steps are. If the solution is improving, it may increase the step size to progress faster; if not, it may decrease the step size to be more careful.
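A common way to implement the behavior described above is to multiply the learning rate by a growth factor after an improving step and by a shrink factor after a worsening one, clamped to a safe range. The rule below is a hypothetical Python sketch; the factor values and bounds are assumptions, not the ones AiDotNet's DFP options use.

```python
# Illustrative adaptive step-size rule: grow the learning rate while the
# loss improves, shrink it when a step makes things worse. All constants
# here (grow, shrink, lr_min, lr_max) are assumed example values.
def adapt_learning_rate(lr, current_loss, previous_loss,
                        grow=1.05, shrink=0.5, lr_min=1e-6, lr_max=1.0):
    if current_loss < previous_loss:
        lr = min(lr * grow, lr_max)    # improving: take bolder steps
    else:
        lr = max(lr * shrink, lr_min)  # worsening: be more careful
    return lr

print(adapt_learning_rate(0.1, current_loss=0.8, previous_loss=1.0))  # ~0.105
```

The clamps matter: without an upper bound a long run of improvements can inflate the step until it overshoots, and without a lower bound a few bad steps can stall progress entirely.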
UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput>)
Updates the options for the DFP optimizer.
protected override void UpdateOptions(OptimizationAlgorithmOptions<T, TInput, TOutput> options)
Parameters
- options (OptimizationAlgorithmOptions<T, TInput, TOutput>): The new options to be set.
Remarks
For Beginners: This method allows you to change the settings of the optimizer during runtime. It ensures that only the correct type of options (specific to DFP) can be used.
Exceptions
- ArgumentException
Thrown when the provided options are not of type DFPOptimizerOptions.
UpdateParameters(Vector<T>, Vector<T>)
Updates parameters using the DFP algorithm with inverse Hessian approximation.
public override Vector<T> UpdateParameters(Vector<T> parameters, Vector<T> gradient)
Parameters
- parameters (Vector<T>): The current parameter values.
- gradient (Vector<T>): The gradient at the current parameters.
Returns
- Vector<T>
The updated parameters.
Remarks
For Beginners: This method implements the core DFP update formula. DFP uses the inverse Hessian approximation to determine a search direction that typically converges faster than standard gradient descent.
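The rank-two correction at the heart of DFP rebuilds the inverse-Hessian approximation H from the most recent parameter step s and gradient change y, so that the new H maps y back onto s (the secant condition). A minimal Python/NumPy sketch of the update in isolation, with arbitrary example values:

```python
import numpy as np

# The DFP inverse-Hessian update: a rank-two correction built from the
# parameter step s and gradient change y. After the update, H satisfies
# the secant condition H @ y == s.
def dfp_update(H, s, y):
    Hy = H @ y
    return H + np.outer(s, s) / (s @ y) - np.outer(Hy, Hy) / (y @ Hy)

H = np.eye(3)                        # current inverse-Hessian approximation
s = np.array([0.5, -0.2, 0.1])       # parameter change from the last step
y = np.array([1.0, 0.3, -0.4])       # gradient change from the last step
H_new = dfp_update(H, s, y)

print(np.allclose(H_new @ y, s))     # True: secant condition holds
```

Because the correction terms are symmetric, H stays symmetric, and as long as s·y > 0 it also stays positive definite, which keeps -H·g a descent direction.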
UpdateParametersGpu(IGpuBuffer, IGpuBuffer, int, IDirectGpuBackend)
Updates parameters using GPU-accelerated DFP.
public override void UpdateParametersGpu(IGpuBuffer parameters, IGpuBuffer gradients, int parameterCount, IDirectGpuBackend backend)
Parameters
- parameters (IGpuBuffer): The GPU buffer holding the current parameter values.
- gradients (IGpuBuffer): The GPU buffer holding the current gradients.
- parameterCount (int): The number of parameters to update.
- backend (IDirectGpuBackend): The GPU backend used for the computation.
Remarks
DFP is a quasi-Newton method that approximates the inverse Hessian. GPU implementation is not yet available due to the complexity of maintaining the dense inverse Hessian approximation matrix across GPU memory.