Interface IFineTuning<T, TInput, TOutput>
Namespace: AiDotNet.Interfaces
Assembly: AiDotNet.dll
Defines the contract for fine-tuning methods that adapt pre-trained models to specific tasks or preferences.
public interface IFineTuning<T, TInput, TOutput> : IModelSerializer
Type Parameters
T: The numeric data type used for calculations (e.g., float, double).
TInput: The input data type for the model.
TOutput: The output data type for the model.
Remarks
Fine-tuning encompasses a wide range of techniques for adapting models, from supervised fine-tuning (SFT) to advanced preference optimization methods like DPO, RLHF, and their variants.
For Beginners: Fine-tuning is like specialized training for an AI that already knows the basics. Just like a doctor goes through general education before specializing, AI models first learn general knowledge (pre-training) and then learn specific skills or behaviors (fine-tuning).
Categories of Fine-Tuning Methods:
- Supervised Fine-Tuning (SFT): train on labeled input-output pairs
- Reinforcement Learning: RLHF, PPO, GRPO - learn from reward signals
- Direct Preference Optimization: DPO, IPO, KTO, SimPO - learn from preference pairs
- Constitutional Methods: CAI, RLAIF - learn from AI-generated feedback guided by principles
- Self-Play Methods: SPIN - the model learns from itself
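Example
A minimal sketch of how calling code might inspect an implementation of this interface before training. It relies only on the members declared on this page; the helper class name and using directives are illustrative assumptions, and the concrete implementation class is supplied by the caller.
using System;
using AiDotNet.Interfaces;

public static class FineTuningInspector
{
    // Print the metadata every IFineTuning implementation exposes so callers can
    // decide which data and auxiliary models the method will need.
    public static void Describe<T, TInput, TOutput>(IFineTuning<T, TInput, TOutput> method)
    {
        Console.WriteLine($"Method:                {method.MethodName}");
        Console.WriteLine($"Category:              {method.Category}");
        Console.WriteLine($"Needs reference model: {method.RequiresReferenceModel}");
        Console.WriteLine($"Needs reward model:    {method.RequiresRewardModel}");
        Console.WriteLine($"Supports PEFT:         {method.SupportsPEFT}");
    }
}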
Properties
Category
Gets the category of this fine-tuning method.
FineTuningCategory Category { get; }
Property Value
FineTuningCategory
MethodName
Gets the name of this fine-tuning method.
string MethodName { get; }
Property Value
string
Remarks
Examples: "DPO", "RLHF", "SimPO", "ORPO", "SFT", "Constitutional AI"
RequiresReferenceModel
Gets whether this method requires a reference model.
bool RequiresReferenceModel { get; }
Property Value
bool
Remarks
Most preference methods (DPO, IPO) require a reference model for KL regularization. Reference-free methods (SimPO, ORPO) do not require one, making them more memory efficient.
RequiresRewardModel
Gets whether this method requires a reward model.
bool RequiresRewardModel { get; }
Property Value
bool
Remarks
RL-based methods (RLHF, PPO, GRPO) require a reward model. Direct preference methods (DPO, IPO, KTO) do not require one.
SupportsPEFT
Gets whether this method supports parameter-efficient fine-tuning (PEFT).
bool SupportsPEFT { get; }
Property Value
bool
Remarks
When true, this method can be combined with LoRA, QLoRA, or other PEFT techniques to reduce memory requirements during fine-tuning.
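Example
A hedged sketch of a preflight check that uses these three capability flags before training starts. The referenceModel and rewardModel parameters and the helper name are illustrative assumptions and not part of the IFineTuning contract.
using System;
using AiDotNet.Interfaces;

public static class FineTuningPreflight
{
    // Confirm that the auxiliary resources the method declares it needs were actually supplied.
    public static void Validate<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        object referenceModel,
        object rewardModel)
    {
        if (method.RequiresReferenceModel && referenceModel is null)
            throw new InvalidOperationException(
                $"{method.MethodName} needs a frozen reference model for KL regularization.");

        if (method.RequiresRewardModel && rewardModel is null)
            throw new InvalidOperationException(
                $"{method.MethodName} needs a reward model to score generated outputs.");

        if (!method.SupportsPEFT)
            Console.WriteLine($"{method.MethodName} updates all weights; expect higher memory use.");
    }
}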
Methods
EvaluateAsync(IFullModel<T, TInput, TOutput>, FineTuningData<T, TInput, TOutput>, CancellationToken)
Evaluates the fine-tuning quality of a model.
Task<FineTuningMetrics<T>> EvaluateAsync(IFullModel<T, TInput, TOutput> model, FineTuningData<T, TInput, TOutput> evaluationData, CancellationToken cancellationToken = default)
Parameters
model (IFullModel<T, TInput, TOutput>): The fine-tuned model to evaluate.
evaluationData (FineTuningData<T, TInput, TOutput>): Evaluation dataset.
cancellationToken (CancellationToken): Token for cancellation.
Returns
- Task<FineTuningMetrics<T>>
Metrics describing the fine-tuning quality.
Remarks
Different fine-tuning methods have different evaluation metrics:
- Preference methods: win rate against the reference, preference accuracy
- RL methods: reward scores, KL divergence from the base model
- Safety methods: harmlessness scores, refusal rates
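Example
A sketch of calling EvaluateAsync with an automatic timeout. The helper name and the 30-minute limit are illustrative; the members of FineTuningMetrics<T> are not documented on this page, so the metrics object is simply returned to the caller, and using directives for FineTuningData<T, TInput, TOutput> and FineTuningMetrics<T> are assumed to resolve from AiDotNet.
using System;
using System.Threading;
using System.Threading.Tasks;
using AiDotNet.Interfaces;

public static class FineTuningEvaluation
{
    // Evaluate a fine-tuned model on held-out data, cancelling automatically after 30 minutes.
    public static async Task<FineTuningMetrics<T>> EvaluateWithTimeoutAsync<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        IFullModel<T, TInput, TOutput> tunedModel,
        FineTuningData<T, TInput, TOutput> heldOutData)
    {
        using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(30));
        return await method.EvaluateAsync(tunedModel, heldOutData, cts.Token);
    }
}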
FineTuneAsync(IFullModel<T, TInput, TOutput>, FineTuningData<T, TInput, TOutput>, CancellationToken)
Fine-tunes a model using the configured method and provided training data.
Task<IFullModel<T, TInput, TOutput>> FineTuneAsync(IFullModel<T, TInput, TOutput> baseModel, FineTuningData<T, TInput, TOutput> trainingData, CancellationToken cancellationToken = default)
Parameters
baseModel (IFullModel<T, TInput, TOutput>): The pre-trained model to fine-tune.
trainingData (FineTuningData<T, TInput, TOutput>): The training data appropriate for this fine-tuning method.
cancellationToken (CancellationToken): Token for cancellation.
Returns
- Task<IFullModel<T, TInput, TOutput>>
The fine-tuned model.
Remarks
This method applies the fine-tuning algorithm to adapt the base model. The specific behavior depends on the method category:
- SFT: uses labeled examples from trainingData
- Preference-based (DPO, etc.): uses preference pairs from trainingData
- RL-based (RLHF, PPO): uses a reward model and the training data
For Beginners: This is where the actual training happens. You give it a model and training data, and it returns an improved model that's better at the specific task.
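Example
An end-to-end sketch: fine-tune a base model, then evaluate the result on held-out data, with cancellation wired through from the caller. The runner class and method names are illustrative assumptions; only the IFineTuning members shown on this page are relied on.
using System;
using System.Threading;
using System.Threading.Tasks;
using AiDotNet.Interfaces;

public static class FineTuningRunner
{
    // Fine-tune, then evaluate, reusing the same cancellation token for both steps.
    public static async Task<IFullModel<T, TInput, TOutput>> RunAsync<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        IFullModel<T, TInput, TOutput> baseModel,
        FineTuningData<T, TInput, TOutput> trainingData,
        FineTuningData<T, TInput, TOutput> evaluationData,
        CancellationToken cancellationToken = default)
    {
        Console.WriteLine($"Fine-tuning with {method.MethodName} ({method.Category})...");

        IFullModel<T, TInput, TOutput> tuned =
            await method.FineTuneAsync(baseModel, trainingData, cancellationToken);

        FineTuningMetrics<T> metrics =
            await method.EvaluateAsync(tuned, evaluationData, cancellationToken);

        Console.WriteLine($"Evaluation finished: {metrics}");
        return tuned;
    }
}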
GetOptions()
Gets the configuration options for this fine-tuning method.
FineTuningOptions<T> GetOptions()
Returns
- FineTuningOptions<T>
The configuration options for this fine-tuning method.
Reset()
Resets the fine-tuning method state.
void Reset()
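Example
A sketch of reusing a single IFineTuning instance across several runs: read the options before each run for logging, fine-tune, then call Reset() so the next run starts from a clean internal state. The loop structure and variable names are illustrative assumptions, and FineTuningOptions<T> members are not listed on this page, so only the options object is surfaced.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using AiDotNet.Interfaces;

public static class FineTuningReuse
{
    public static async Task RunManyAsync<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        IFullModel<T, TInput, TOutput> baseModel,
        IReadOnlyList<FineTuningData<T, TInput, TOutput>> datasets)
    {
        foreach (var data in datasets)
        {
            FineTuningOptions<T> options = method.GetOptions();
            Console.WriteLine($"Starting {method.MethodName} run with options {options}");

            // Persist or evaluate the tuned model as needed; it is discarded here for brevity.
            IFullModel<T, TInput, TOutput> tuned = await method.FineTuneAsync(baseModel, data);

            // Clear accumulated state so the next dataset starts fresh.
            method.Reset();
        }
    }
}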