Interface IFineTuning<T, TInput, TOutput>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Defines the contract for fine-tuning methods that adapt pre-trained models to specific tasks or preferences.

public interface IFineTuning<T, TInput, TOutput> : IModelSerializer

Type Parameters

T

The numeric data type used for calculations (e.g., float, double).

TInput

The input data type for the model.

TOutput

The output data type for the model.

Remarks

Fine-tuning encompasses a wide range of techniques for adapting models, from supervised fine-tuning (SFT) to advanced preference optimization methods like DPO, RLHF, and their variants.

For Beginners: Fine-tuning is like specialized training for an AI that already knows the basics. Just as a doctor completes general education before specializing, AI models first learn general knowledge (pre-training) and then learn specific skills or behaviors (fine-tuning).

Categories of Fine-Tuning Methods:

  • Supervised Fine-Tuning (SFT): Train on labeled input-output pairs
  • Reinforcement Learning: RLHF, PPO, GRPO - learn from reward signals
  • Direct Preference Optimization: DPO, IPO, KTO, SimPO - learn from preference pairs
  • Constitutional Methods: CAI, RLAIF - learn from AI-generated feedback with principles
  • Self-Play Methods: SPIN - the model learns from itself
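
Examples

The snippet below is a minimal sketch of how calling code might inspect an IFineTuning implementation before committing to it. Only the members documented on this page are used; the Describe helper itself is illustrative and not part of the library.

using System;
using AiDotNet.Interfaces;

public static class FineTuningInspection
{
    // Prints the metadata an IFineTuning implementation exposes so a caller
    // can judge whether the method fits the available data and hardware.
    public static void Describe<T, TInput, TOutput>(IFineTuning<T, TInput, TOutput> method)
    {
        Console.WriteLine($"Method:   {method.MethodName}");
        Console.WriteLine($"Category: {method.Category}");
        Console.WriteLine($"Requires reference model: {method.RequiresReferenceModel}");
        Console.WriteLine($"Requires reward model:    {method.RequiresRewardModel}");
        Console.WriteLine($"Supports PEFT:            {method.SupportsPEFT}");
    }
}

A DPO-style implementation, for instance, would typically report RequiresReferenceModel as true and RequiresRewardModel as false.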

Properties

Category

Gets the category of this fine-tuning method.

FineTuningCategory Category { get; }

Property Value

FineTuningCategory

MethodName

Gets the name of this fine-tuning method.

string MethodName { get; }

Property Value

string

Remarks

Examples: "DPO", "RLHF", "SimPO", "ORPO", "SFT", "Constitutional AI"

RequiresReferenceModel

Gets whether this method requires a reference model.

bool RequiresReferenceModel { get; }

Property Value

bool

Remarks

Most preference methods (DPO, IPO) require a reference model for KL regularization. Reference-free methods (SimPO, ORPO) do not, which makes them more memory-efficient.

RequiresRewardModel

Gets whether this method requires a reward model.

bool RequiresRewardModel { get; }

Property Value

bool

Remarks

RL-based methods (RLHF, PPO, GRPO) require a reward model. Direct preference methods (DPO, IPO, KTO) do not require one.
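
Examples

As a rough illustration of how RequiresReferenceModel and RequiresRewardModel might be used together, the sketch below verifies prerequisites before a run so that a missing component fails fast. The hasFrozenReferenceCopy and hasRewardModel flags are caller-tracked assumptions; this page does not document how those components are supplied to a concrete implementation.

using System;
using AiDotNet.Interfaces;

public static class FineTuningPreflight
{
    // Throws if the chosen method needs a component the caller has not prepared.
    // How a reference or reward model is actually supplied is implementation-specific;
    // these booleans simply record whether the caller has one ready.
    public static void EnsurePrerequisites<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        bool hasFrozenReferenceCopy,
        bool hasRewardModel)
    {
        if (method.RequiresReferenceModel && !hasFrozenReferenceCopy)
        {
            throw new InvalidOperationException(
                $"{method.MethodName} needs a frozen reference model for KL regularization.");
        }

        if (method.RequiresRewardModel && !hasRewardModel)
        {
            throw new InvalidOperationException(
                $"{method.MethodName} needs a trained reward model before training can start.");
        }
    }
}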

SupportsPEFT

Gets whether this method supports parameter-efficient fine-tuning (PEFT).

bool SupportsPEFT { get; }

Property Value

bool

Remarks

When true, this method can be combined with LoRA, QLoRA, or other PEFT techniques to reduce memory requirements during fine-tuning.
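
Examples

The sketch below shows one way SupportsPEFT might factor into method selection on memory-constrained hardware, keeping only candidates that support PEFT and do not hold a separate reference model in memory. Filtering candidates this way is an application-level pattern, not something the interface prescribes.

using System.Collections.Generic;
using System.Linq;
using AiDotNet.Interfaces;

public static class FineTuningSelection
{
    // Keeps only methods that can run with parameter-efficient fine-tuning
    // (e.g., LoRA or QLoRA) and that avoid the memory cost of a reference model.
    public static IReadOnlyList<IFineTuning<T, TInput, TOutput>> ForLowMemory<T, TInput, TOutput>(
        IEnumerable<IFineTuning<T, TInput, TOutput>> candidates)
    {
        return candidates
            .Where(m => m.SupportsPEFT && !m.RequiresReferenceModel)
            .ToList();
    }
}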

Methods

EvaluateAsync(IFullModel<T, TInput, TOutput>, FineTuningData<T, TInput, TOutput>, CancellationToken)

Evaluates the fine-tuning quality of a model.

Task<FineTuningMetrics<T>> EvaluateAsync(IFullModel<T, TInput, TOutput> model, FineTuningData<T, TInput, TOutput> evaluationData, CancellationToken cancellationToken = default)

Parameters

model IFullModel<T, TInput, TOutput>

The fine-tuned model to evaluate.

evaluationData FineTuningData<T, TInput, TOutput>

Evaluation dataset.

cancellationToken CancellationToken

Token for cancellation.

Returns

Task<FineTuningMetrics<T>>

Metrics describing the fine-tuning quality.

Remarks

Different fine-tuning methods have different evaluation metrics:

  • Preference methods: Win rate against the reference, preference accuracy
  • RL methods: Reward scores, KL divergence from the base model
  • Safety methods: Harmlessness scores, refusal rates
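
Examples

The call below sketches a typical evaluation pass using only the signature documented here. Because the members of FineTuningMetrics<T> are not documented on this page, the example returns the metrics object without reading specific scores; which scores are meaningful depends on the method's category, as listed above. The using directives assume these types resolve from AiDotNet.Interfaces; adjust them if the actual namespaces differ.

using System.Threading;
using System.Threading.Tasks;
using AiDotNet.Interfaces;

public static class FineTuningEvaluation
{
    // Runs the method's own evaluation over a held-out dataset. The token lets
    // callers abort a long-running evaluation.
    public static async Task<FineTuningMetrics<T>> EvaluateHeldOutAsync<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        IFullModel<T, TInput, TOutput> fineTunedModel,
        FineTuningData<T, TInput, TOutput> heldOutData,
        CancellationToken cancellationToken = default)
    {
        return await method.EvaluateAsync(fineTunedModel, heldOutData, cancellationToken);
    }
}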

FineTuneAsync(IFullModel<T, TInput, TOutput>, FineTuningData<T, TInput, TOutput>, CancellationToken)

Fine-tunes a model using the configured method and provided training data.

Task<IFullModel<T, TInput, TOutput>> FineTuneAsync(IFullModel<T, TInput, TOutput> baseModel, FineTuningData<T, TInput, TOutput> trainingData, CancellationToken cancellationToken = default)

Parameters

baseModel IFullModel<T, TInput, TOutput>

The pre-trained model to fine-tune.

trainingData FineTuningData<T, TInput, TOutput>

The training data appropriate for this fine-tuning method.

cancellationToken CancellationToken

Token for cancellation.

Returns

Task<IFullModel<T, TInput, TOutput>>

The fine-tuned model.

Remarks

This method applies the fine-tuning algorithm to adapt the base model. The specific behavior depends on the method category:

  • SFT: Uses labeled examples from trainingData
  • Preference-based (DPO, etc.): Uses preference pairs from trainingData
  • RL-based (RLHF, PPO): Uses a reward model and training data

For Beginners: This is where the actual training happens. You give it a model and training data, and it returns an improved model that's better at the specific task.
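
Examples

A minimal sketch of a fine-tuning run, assuming the training data has already been packaged into a FineTuningData<T, TInput, TOutput> instance (constructing that type is not covered on this page) and that the types resolve from AiDotNet.Interfaces. A CancellationTokenSource bounds how long the run may take.

using System;
using System.Threading;
using System.Threading.Tasks;
using AiDotNet.Interfaces;

public static class FineTuningRunner
{
    // Adapts a pre-trained model with the configured method and returns the
    // fine-tuned model produced by FineTuneAsync.
    public static async Task<IFullModel<T, TInput, TOutput>> RunAsync<T, TInput, TOutput>(
        IFineTuning<T, TInput, TOutput> method,
        IFullModel<T, TInput, TOutput> baseModel,
        FineTuningData<T, TInput, TOutput> trainingData)
    {
        // Abort the run if it exceeds two hours.
        using var cts = new CancellationTokenSource(TimeSpan.FromHours(2));

        return await method.FineTuneAsync(baseModel, trainingData, cts.Token);
    }
}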

GetOptions()

Gets the configuration options for this fine-tuning method.

FineTuningOptions<T> GetOptions()

Returns

FineTuningOptions<T>

Reset()

Resets the fine-tuning method state.

void Reset()