Enum FineTuningCategory

Namespace: AiDotNet.Interfaces
Assembly: AiDotNet.dll

Categories of fine-tuning methods.

public enum FineTuningCategory

Fields

Adversarial = 9

Adversarial methods that use game-theoretic approaches.

Examples: APO, GAPO

Constitutional = 5

Constitutional AI methods that use principles for self-improvement.

Examples: Constitutional AI, RLAIF

Contrastive = 8

Contrastive methods that learn from positive/negative examples.

Examples: NCA, Safe-NCA

DirectPreference = 2

Direct preference optimization methods - learn directly from preference pairs, without training a separate reward model.

Examples: DPO, IPO, KTO, SimPO, CPO, R-DPO

KnowledgeDistillation = 7

Knowledge distillation - transfers knowledge from a teacher model to a student model.

Examples: Standard distillation, response distillation

OddsRatioPreference = 3

Odds-ratio-based methods that combine SFT and preference learning in a single objective.

Examples: ORPO

RankingBased = 4

Ranking-based methods that learn from response rankings.

Examples: RSO, RRHF, SLiC-HF, PRO

ReinforcementLearning = 1

Reinforcement Learning - learns from reward signals.

Examples: RLHF, PPO, GRPO, REINFORCE

SelfPlay = 6

Self-play methods where the model improves by training against its own earlier outputs.

Examples: SPIN

SupervisedFineTuning = 0

Supervised Fine-Tuning - learns from labeled input-output pairs.

Examples: Standard SFT, instruction tuning

Remarks

For Beginners: These categories group fine-tuning methods by how they learn. Some learn from labeled data, others from preferences, and some from reward signals.