Class FineTuningData<T, TInput, TOutput>

Namespace
AiDotNet.Models.Options
Assembly
AiDotNet.dll

Container for fine-tuning training and evaluation data.

public class FineTuningData<T, TInput, TOutput>

Type Parameters

T

The numeric data type.

TInput

The input data type.

TOutput

The output data type.

Inheritance
FineTuningData<T, TInput, TOutput>

Remarks

This class holds data for various fine-tuning methods. Different methods use different subsets of the data:

  • SFT: uses Inputs and Outputs
  • Preference methods: use Inputs, ChosenOutputs, and RejectedOutputs
  • RL methods: use Inputs and Rewards
  • Ranking methods: use Inputs and RankedOutputs

For Beginners: Think of this as a container that holds all the training examples the model needs to learn from. Different training methods need different kinds of examples.
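
To make the "different subsets" idea concrete, here is a minimal sketch of which arrays each method family fills. FineTuningDataSketch is a hypothetical stand-in that mirrors only a few of the documented properties so the example is self-contained; it is not the AiDotNet class, and the empty-array defaults and the one-entry-per-input layout for Rewards are assumptions.

```csharp
using System;

// Hypothetical stand-in mirroring a few documented properties.
// It is NOT the AiDotNet class; it only makes this sketch self-contained.
public class FineTuningDataSketch<TInput, TOutput>
{
    public TInput[] Inputs { get; set; } = Array.Empty<TInput>();
    public TOutput[] Outputs { get; set; } = Array.Empty<TOutput>();
    public TOutput[] ChosenOutputs { get; set; } = Array.Empty<TOutput>();
    public TOutput[] RejectedOutputs { get; set; } = Array.Empty<TOutput>();
    public double[] Rewards { get; set; } = Array.Empty<double>();
}

public static class FineTuningDataShapes
{
    // SFT: one target output per input.
    public static FineTuningDataSketch<string, string> ForSft()
    {
        return new FineTuningDataSketch<string, string>
        {
            Inputs = new[] { "2 + 2 = ?", "Capital of France?" },
            Outputs = new[] { "4", "Paris" }
        };
    }

    // Pairwise preference: one chosen and one rejected output per input.
    public static FineTuningDataSketch<string, string> ForPreference()
    {
        return new FineTuningDataSketch<string, string>
        {
            Inputs = new[] { "Summarize: the cat sat on the mat." },
            ChosenOutputs = new[] { "A cat sat on a mat." },
            RejectedOutputs = new[] { "Dogs are great." }
        };
    }

    // RL: one scalar reward per input (assumed layout).
    public static FineTuningDataSketch<string, string> ForRl()
    {
        return new FineTuningDataSketch<string, string>
        {
            Inputs = new[] { "Write a haiku about rain." },
            Rewards = new[] { 0.8 }
        };
    }
}
```

In each case the arrays that a method does not use are simply left unset.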

Properties

Advantages

Gets or sets advantages for PPO-style methods.

public double[] Advantages { get; set; }

Property Value

double[]

ChosenOutputs

Gets or sets the chosen (preferred) outputs for preference learning.

public TOutput[] ChosenOutputs { get; set; }

Property Value

TOutput[]

Remarks

Used by DPO, IPO, KTO, SimPO, CPO, and other preference methods.

Count

Gets the number of samples in the dataset.

public int Count { get; }

Property Value

int

CritiqueRevisions

Gets or sets critique-revision pairs for constitutional methods.

public (TOutput Original, string Critique, TOutput Revised)[] CritiqueRevisions { get; set; }

Property Value

(TOutput Original, string Critique, TOutput Revised)[]

Remarks

Each tuple contains (original response, critique, revised response).
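
As an illustration of the tuple layout, the sketch below builds an array with the same element shape as the property, taking TOutput to be string. The class name and the sample text are hypothetical.

```csharp
using System;

public static class CritiqueRevisionSketch
{
    // Builds an array with the documented element shape, with TOutput = string.
    public static (string Original, string Critique, string Revised)[] Build()
    {
        return new (string Original, string Critique, string Revised)[]
        {
            (
                "The answer is obviously 42.",
                "The response is overconfident and gives no reasoning.",
                "One possible answer is 42, based on the following steps."
            )
        };
    }
}
```

The named tuple elements can then be read as `item.Original`, `item.Critique`, and `item.Revised`.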

DesirabilityLabels

Gets or sets binary labels indicating if outputs are desirable.

public bool[] DesirabilityLabels { get; set; }

Property Value

bool[]

Remarks

Used by KTO, which does not require pairwise data. True = desirable, false = undesirable.

HasDistillationData

Gets whether this data is suitable for distillation.

public bool HasDistillationData { get; }

Property Value

bool

HasPairwisePreferenceData

Gets whether this data is suitable for pairwise preference methods.

public bool HasPairwisePreferenceData { get; }

Property Value

bool

HasRLData

Gets whether this data is suitable for RL methods.

public bool HasRLData { get; }

Property Value

bool

HasRankingData

Gets whether this data is suitable for ranking methods.

public bool HasRankingData { get; }

Property Value

bool

HasSFTData

Gets whether this data is suitable for SFT.

public bool HasSFTData { get; }

Property Value

bool

HasUnpairedPreferenceData

Gets whether this data is suitable for KTO (unpaired preferences).

public bool HasUnpairedPreferenceData { get; }

Property Value

bool

Inputs

Gets or sets the input data samples.

public TInput[] Inputs { get; set; }

Property Value

TInput[]

Remarks

Required for all fine-tuning methods.

Outputs

Gets or sets the target outputs for supervised fine-tuning.

public TOutput[] Outputs { get; set; }

Property Value

TOutput[]

Remarks

Used by SFT methods. Each output corresponds to an input.

RankedOutputs

Gets or sets ranked outputs for ranking-based methods.

public TOutput[][] RankedOutputs { get; set; }

Property Value

TOutput[][]

Remarks

Used by RSO, RRHF, SLiC-HF, PRO. Each inner array contains outputs ranked from best to worst.
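
The jagged-array layout can be sketched as follows, taking TOutput to be string. The class, the Extremes helper, and the sample strings are hypothetical; the only thing taken from the description above is the best-to-worst ordering within each inner array.

```csharp
using System;

public static class RankedOutputsSketch
{
    // One inner array per input, ordered best to worst.
    public static string[][] Build()
    {
        return new[]
        {
            new[] { "Paris", "Paris, probably", "Lyon" },
            new[] { "4", "about four" }
        };
    }

    // Hypothetical helper: index 0 is the top-ranked output,
    // the last index is the lowest-ranked one.
    public static (TOutput Best, TOutput Worst) Extremes<TOutput>(TOutput[] ranking)
    {
        if (ranking.Length == 0)
        {
            throw new ArgumentException("Ranking must not be empty.", nameof(ranking));
        }
        return (ranking[0], ranking[ranking.Length - 1]);
    }
}
```

Inner arrays may have different lengths, which is why the property is a jagged array rather than a rectangular one.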

RejectedOutputs

Gets or sets the rejected outputs for preference learning.

public TOutput[] RejectedOutputs { get; set; }

Property Value

TOutput[]

Remarks

Used by DPO, IPO, SimPO, CPO, and other pairwise preference methods.

Rewards

Gets or sets reward values for RL-based methods.

public double[] Rewards { get; set; }

Property Value

double[]

Remarks

Used by RLHF, PPO, GRPO, REINFORCE.

SampleIds

Gets or sets optional sample identifiers for tracking.

public string[] SampleIds { get; set; }

Property Value

string[]

SampleWeights

Gets or sets optional sample weights for weighted training.

public double[] SampleWeights { get; set; }

Property Value

double[]

TeacherConfidences

Gets or sets teacher model confidence scores.

public double[] TeacherConfidences { get; set; }

Property Value

double[]

TeacherOutputs

Gets or sets teacher model logits/outputs for distillation.

public TOutput[] TeacherOutputs { get; set; }

Property Value

TOutput[]

Values

Gets or sets value estimates for critic-based methods.

public double[] Values { get; set; }

Property Value

double[]

Methods

Split(double, int?)

Splits the data into training and validation sets.

public (FineTuningData<T, TInput, TOutput> Train, FineTuningData<T, TInput, TOutput> Validation) Split(double validationRatio = 0.1, int? seed = null)

Parameters

validationRatio double

The fraction of the data to use for validation, between 0.0 and 1.0. Defaults to 0.1.

seed int?

Optional random seed for reproducibility.

Returns

(FineTuningData<T, TInput, TOutput> Train, FineTuningData<T, TInput, TOutput> Validation)

A tuple of (training data, validation data).
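
The page does not say how the split is performed. The sketch below shows one common approach, a seeded Fisher-Yates shuffle of sample indices followed by a cut at the validation count. It is not the library's implementation: SplitSketch and SplitIndices are hypothetical names, and the rounding rule is an assumption.

```csharp
using System;
using System.Linq;

public static class SplitSketch
{
    // Returns shuffled index sets for training and validation.
    public static (int[] Train, int[] Validation) SplitIndices(
        int count, double validationRatio = 0.1, int? seed = null)
    {
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }
        if (double.IsNaN(validationRatio) || validationRatio < 0.0 || validationRatio > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(validationRatio));
        }

        // A fixed seed makes the shuffle, and therefore the split, repeatable.
        Random rng = seed.HasValue ? new Random(seed.Value) : new Random();
        int[] indices = Enumerable.Range(0, count).ToArray();

        // Fisher-Yates shuffle.
        for (int i = indices.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }

        // Assumed rounding rule: nearest integer.
        int validationCount = (int)Math.Round(count * validationRatio);
        int[] validation = indices.Take(validationCount).ToArray();
        int[] train = indices.Skip(validationCount).ToArray();
        return (train, validation);
    }
}
```

With 10 samples and a ratio of 0.2, this sketch puts 2 indices in the validation set and 8 in the training set, and calling it twice with the same seed returns the same partition.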

Subset(int[])

Creates a subset of the data for the given indices.

public FineTuningData<T, TInput, TOutput> Subset(int[] indices)

Parameters

indices int[]

The indices to include in the subset.

Returns

FineTuningData<T, TInput, TOutput>

A new FineTuningData containing only the specified samples.
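
Conceptually, taking a subset means applying the same index list to every populated array so that samples stay aligned. The helper below sketches that idea for a single array. SubsetSketch and Pick are hypothetical names, and whether the library preserves index order or allows repeated indices is not stated on this page.

```csharp
using System;

public static class SubsetSketch
{
    // Copies the elements at the given positions, in the order requested.
    public static TItem[] Pick<TItem>(TItem[] source, int[] indices)
    {
        var result = new TItem[indices.Length];
        for (int i = 0; i < indices.Length; i++)
        {
            // An out-of-range index throws IndexOutOfRangeException.
            result[i] = source[indices[i]];
        }
        return result;
    }
}
```

For example, `SubsetSketch.Pick(new[] { "a", "b", "c", "d" }, new[] { 2, 0 })` returns an array containing "c" followed by "a".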