Class FineTuningData<T, TInput, TOutput>

Namespace: AiDotNet.Models.Options

Assembly: AiDotNet.dll

Container for fine-tuning training and evaluation data.

public class FineTuningData<T, TInput, TOutput>

Type Parameters

T: The numeric data type.
TInput: The input data type.
TOutput: The output data type.

Inheritance: object → FineTuningData<T, TInput, TOutput>

Inherited Members

object.Equals(object)
object.Equals(object, object)
object.GetHashCode()
object.GetType()
object.MemberwiseClone()
object.ReferenceEquals(object, object)
object.ToString()

Remarks

This class holds data for various fine-tuning methods. Different methods use different subsets of the data:

SFT: Uses Inputs and Outputs
Preference methods: Use Inputs, ChosenOutputs, and RejectedOutputs
RL methods: Use Inputs and Rewards
Ranking methods: Use Inputs and RankedOutputs

For Beginners: Think of this as a container that holds all the training examples the model needs to learn from. Different training methods need different kinds of examples.
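As a rough illustration of the method-to-property mapping above, the sketch below fills one container for SFT and another for pairwise preference learning. It assumes the class has a public parameterless constructor (this page does not list constructors) and uses double, string, and string as arbitrary type arguments; the sample strings are made up.

```csharp
using System;
using AiDotNet.Models.Options;

// Assumption: FineTuningData has a public parameterless constructor,
// so an object initializer can populate the settable properties.

// Supervised fine-tuning: each input is paired with one target output.
var sftData = new FineTuningData<double, string, string>
{
    Inputs = new[] { "Translate 'bonjour' to English.", "Translate 'merci' to English." },
    Outputs = new[] { "hello", "thank you" }
};

// Pairwise preference learning: each input has a chosen and a rejected output.
var preferenceData = new FineTuningData<double, string, string>
{
    Inputs = new[] { "Summarize: the meeting moved to 3 pm." },
    ChosenOutputs = new[] { "The meeting is now at 3 pm." },
    RejectedOutputs = new[] { "There is a meeting." }
};

Console.WriteLine($"SFT pairs: {sftData.Inputs.Length}");
Console.WriteLine($"Preference pairs: {preferenceData.Inputs.Length}");
```

Properties that a given method does not use are simply left unset here.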

Properties

Advantages

Gets or sets advantages for PPO-style methods.

public double[] Advantages { get; set; }

Property Value

double[]

ChosenOutputs

Gets or sets the chosen (preferred) outputs for preference learning.

public TOutput[] ChosenOutputs { get; set; }

Property Value

TOutput[]

Remarks

Used by DPO, IPO, KTO, SimPO, CPO, and other preference methods.

Count

Gets the number of samples in the dataset.

public int Count { get; }

Property Value

int

CritiqueRevisions

Gets or sets critique-revision pairs for constitutional methods.

public (TOutput Original, string Critique, TOutput Revised)[] CritiqueRevisions { get; set; }

Property Value

(TOutput Original, string Critique, TOutput Revised)[]

Remarks

Each tuple contains (original response, critique, revised response).
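The tuple shape can be written with C# named tuple elements. The following is a minimal sketch, assuming a public parameterless constructor and string outputs; the example text is invented for illustration.

```csharp
using System;
using AiDotNet.Models.Options;

// Assumption: FineTuningData has a public parameterless constructor.
var data = new FineTuningData<double, string, string>
{
    Inputs = new[] { "Explain how vaccines work." },
    CritiqueRevisions = new[]
    {
        // (original response, critique, revised response)
        (Original: "They just work.",
         Critique: "The answer gives no mechanism.",
         Revised: "They train the immune system to recognize a pathogen.")
    }
};

// Named elements keep each part of the tuple readable at the call site.
foreach (var item in data.CritiqueRevisions)
{
    Console.WriteLine($"{item.Original} -> {item.Revised} ({item.Critique})");
}
```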

DesirabilityLabels

Gets or sets binary labels indicating if outputs are desirable.

public bool[] DesirabilityLabels { get; set; }

Property Value

bool[]

Remarks

Used by KTO, which does not require pairwise data. A value of true marks an output as desirable; false marks it as undesirable.

HasDistillationData

Gets whether this data is suitable for distillation.

public bool HasDistillationData { get; }

Property Value

bool

HasPairwisePreferenceData

Gets whether this data is suitable for pairwise preference methods.

public bool HasPairwisePreferenceData { get; }

Property Value

bool

HasRLData

Gets whether this data is suitable for RL methods.

public bool HasRLData { get; }

Property Value

bool

HasRankingData

Gets whether this data is suitable for ranking methods.

public bool HasRankingData { get; }

Property Value

bool

HasSFTData

Gets whether this data is suitable for SFT.

public bool HasSFTData { get; }

Property Value

bool

HasUnpairedPreferenceData

Gets whether this data is suitable for KTO (unpaired preferences).

public bool HasUnpairedPreferenceData { get; }

Property Value

bool
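One way to use the Has* flags is as a guard before choosing a training method. The sketch below collects the flags into a readable summary. DescribeSupport is a hypothetical helper, not part of AiDotNet, and the example assumes a public parameterless constructor. This page does not state the exact conditions under which each flag is true, so no output is shown.

```csharp
using System;
using System.Collections.Generic;
using AiDotNet.Models.Options;

// Assumption: FineTuningData has a public parameterless constructor.
var data = new FineTuningData<double, string, string>
{
    Inputs = new[] { "What is 2 + 2?" },
    Outputs = new[] { "4" }
};

Console.WriteLine($"Usable for: {DescribeSupport(data)}");

// Hypothetical helper (not part of AiDotNet): lists the method families
// whose suitability flag is set on the given data.
static string DescribeSupport<T, TInput, TOutput>(FineTuningData<T, TInput, TOutput> d)
{
    var supported = new List<string>();
    if (d.HasSFTData) supported.Add("SFT");
    if (d.HasPairwisePreferenceData) supported.Add("pairwise preference");
    if (d.HasUnpairedPreferenceData) supported.Add("unpaired preference (KTO)");
    if (d.HasRLData) supported.Add("RL");
    if (d.HasRankingData) supported.Add("ranking");
    if (d.HasDistillationData) supported.Add("distillation");
    return supported.Count == 0 ? "none" : string.Join(", ", supported);
}
```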

Inputs

Gets or sets the input data samples.

public TInput[] Inputs { get; set; }

Property Value

TInput[]

Remarks

Required for all fine-tuning methods.

Outputs

Gets or sets the target outputs for supervised fine-tuning.

public TOutput[] Outputs { get; set; }

Property Value

TOutput[]

Remarks

Used by SFT methods. Each output corresponds to an input.

RankedOutputs

Gets or sets ranked outputs for ranking-based methods.

public TOutput[][] RankedOutputs { get; set; }

Property Value

TOutput[][]

Remarks

Used by RSO, RRHF, SLiC-HF, PRO. Each inner array contains outputs ranked from best to worst.
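Because RankedOutputs is a jagged array, each input can have a different number of candidates. The sketch below assumes a public parameterless constructor and also assumes that RankedOutputs[i] holds the candidates for Inputs[i], which this page does not state explicitly.

```csharp
using System;
using AiDotNet.Models.Options;

// Assumptions: public parameterless constructor; RankedOutputs[i] belongs to Inputs[i].
var rankingData = new FineTuningData<double, string, string>
{
    Inputs = new[] { "Name a prime number.", "Name a primary color." },
    RankedOutputs = new[]
    {
        // Ordered from best to worst.
        new[] { "7", "Seven, probably", "9" },
        new[] { "Red", "Pink" }
    }
};

for (int i = 0; i < rankingData.Inputs.Length; i++)
{
    string[] ranked = rankingData.RankedOutputs[i];
    Console.WriteLine($"{rankingData.Inputs[i]} best: {ranked[0]}, worst: {ranked[ranked.Length - 1]}");
}
```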

RejectedOutputs

Gets or sets the rejected outputs for preference learning.

public TOutput[] RejectedOutputs { get; set; }

Property Value

TOutput[]

Remarks

Used by DPO, IPO, SimPO, CPO, and other pairwise preference methods.

Rewards

Gets or sets reward values for RL-based methods.

public double[] Rewards { get; set; }

Property Value

double[]

Remarks

Used by RLHF, PPO, GRPO, REINFORCE.

SampleIds

Gets or sets optional sample identifiers for tracking.

public string[] SampleIds { get; set; }

Property Value

string[]

SampleWeights

Gets or sets optional sample weights for weighted training.

public double[] SampleWeights { get; set; }

Property Value

double[]

TeacherConfidences

Gets or sets teacher model confidence scores.

public double[] TeacherConfidences { get; set; }

Property Value

double[]

TeacherOutputs

Gets or sets teacher model logits/outputs for distillation.

public TOutput[] TeacherOutputs { get; set; }

Property Value

TOutput[]

Values

Gets or sets value estimates for critic-based methods.

public double[] Values { get; set; }

Property Value

double[]

Methods

Split(double, int?)

Splits the data into training and validation sets.

public (FineTuningData<T, TInput, TOutput> Train, FineTuningData<T, TInput, TOutput> Validation) Split(double validationRatio = 0.1, int? seed = null)

Parameters

validationRatio double: The ratio of data to use for validation (0.0 to 1.0).
seed int?: Optional random seed for reproducibility.

Returns

(FineTuningData<T, TInput, TOutput> Train, FineTuningData<T, TInput, TOutput> Validation): A tuple of (training data, validation data).
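A typical call holds out a fraction of the samples and fixes the seed so the split can be repeated. The sketch below assumes a public parameterless constructor and that Split tolerates properties left unset. How the validation count is rounded and whether samples are shuffled are not documented here, so the printed counts are not shown.

```csharp
using System;
using System.Linq;
using AiDotNet.Models.Options;

// Assumption: FineTuningData has a public parameterless constructor.
var data = new FineTuningData<double, string, string>
{
    Inputs = Enumerable.Range(0, 10).Select(i => $"prompt {i}").ToArray(),
    Outputs = Enumerable.Range(0, 10).Select(i => $"answer {i}").ToArray()
};

// Hold out 20% of the samples for validation; a fixed seed makes the split reproducible.
var (train, validation) = data.Split(validationRatio: 0.2, seed: 42);

Console.WriteLine($"Train: {train.Count}, Validation: {validation.Count}");
```

Omitting seed (or passing null) leaves the split unseeded, which is fine for exploration but makes runs harder to compare.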

Subset(int[])

Creates a subset of the data for the given indices.

public FineTuningData<T, TInput, TOutput> Subset(int[] indices)

Parameters

indices int[]: The indices to include in the subset.

Returns

FineTuningData<T, TInput, TOutput>: A new FineTuningData containing only the specified samples.
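Subset can be used to pick specific samples by position, for example to build a small debugging set. The sketch below assumes a public parameterless constructor, zero-based indices, and that Subset tolerates properties left unset.

```csharp
using System;
using AiDotNet.Models.Options;

// Assumption: FineTuningData has a public parameterless constructor.
var data = new FineTuningData<double, string, string>
{
    Inputs = new[] { "a", "b", "c", "d" },
    Outputs = new[] { "A", "B", "C", "D" }
};

// Keep only the first and third samples (indices assumed to be zero-based).
var debugSet = data.Subset(new[] { 0, 2 });

Console.WriteLine($"Subset size: {debugSet.Count}");
```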
