Interface IDifficultyEstimator<T, TInput, TOutput>

Namespace
AiDotNet.CurriculumLearning.Interfaces
Assembly
AiDotNet.dll

Interface for estimating the difficulty of training samples.

public interface IDifficultyEstimator<T, TInput, TOutput>

Type Parameters

T

The numeric type used for calculations.

TInput

The input data type.

TOutput

The output data type.

Remarks

For Beginners: A difficulty estimator measures how "hard" each training sample is for the model to learn. This is crucial for curriculum learning, as we want to present easy samples first and gradually introduce harder ones.

Common Difficulty Measures:

  • Loss-based: Samples with high loss are harder
  • Confidence-based: Low-confidence predictions indicate difficulty
  • Transfer-based: Performance gap between simple and complex models
  • Expert-defined: Domain knowledge about sample complexity

Research Background:

  • Bengio et al. (2009): Introduced curriculum learning with predefined easy-to-hard sample orderings
  • Kumar et al. (2010): Self-paced learning, which selects easy samples by their current training loss
  • Weinshall et al. (2018): Transfer teacher approach
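
To make the loss-based and confidence-based measures listed above concrete, here is a minimal sketch. It does not use any AiDotNet types: the class and method names are hypothetical, plain double values stand in for the library's numeric type T, and squared error is assumed as the loss purely for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of AiDotNet.
public static class DifficultyMeasureSketch
{
    // Loss-based: squared error between prediction and target (assumed loss).
    // A higher loss means a harder sample.
    public static double LossBased(double predicted, double expected)
    {
        double error = predicted - expected;
        return error * error;
    }

    // Confidence-based: one minus the highest class probability.
    // A lower confidence means a harder sample.
    public static double ConfidenceBased(double[] classProbabilities)
    {
        if (classProbabilities.Length == 0)
            throw new ArgumentException("At least one class probability is required.", nameof(classProbabilities));

        return 1.0 - classProbabilities.Max();
    }
}
```

Both functions follow the convention used throughout this interface: higher scores mean harder samples.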

Properties

Name

Gets the name of the difficulty estimator.

string Name { get; }

Property Value

string

RequiresModel

Gets a value indicating whether this estimator requires a model to estimate difficulty.

bool RequiresModel { get; }

Property Value

bool

Remarks

Some estimators (like loss-based) need the model to compute predictions, while others (like expert-defined) don't need the model at all.
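
One way a caller or implementation might act on this flag is to fail fast when a model is required but missing. The sketch below is an assumption about usage, not library behavior: the helper name is hypothetical, and object stands in for IFullModel<T, TInput, TOutput>.

```csharp
#nullable enable
using System;

// Hypothetical helper class, not part of AiDotNet.
public static class RequiresModelGuardSketch
{
    // Throws when an estimator needs a model but none was supplied.
    // 'model' is typed as object here only to keep the sketch self-contained.
    public static void EnsureModelAvailable(bool requiresModel, object? model, string estimatorName)
    {
        if (requiresModel && model is null)
        {
            throw new ArgumentNullException(
                nameof(model),
                $"Estimator '{estimatorName}' requires a model to estimate difficulty.");
        }
    }
}
```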

Methods

EstimateDifficulties(IDataset<T, TInput, TOutput>, IFullModel<T, TInput, TOutput>?)

Estimates difficulty scores for all samples in a dataset.

Vector<T> EstimateDifficulties(IDataset<T, TInput, TOutput> dataset, IFullModel<T, TInput, TOutput>? model = null)

Parameters

dataset IDataset<T, TInput, TOutput>

The dataset to estimate difficulties for.

model IFullModel<T, TInput, TOutput>

The model to use for estimation. Can be omitted for estimators that do not need one (see RequiresModel).

Returns

Vector<T>

Vector of difficulty scores (higher = harder).

Remarks

This method may be more efficient than calling EstimateDifficulty individually for each sample, as it can batch operations.
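
The difference between the two paths can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: arrays of double stand in for the dataset, the predictOne and predictBatch delegates are hypothetical stand-ins for the model, and squared error is the assumed difficulty score.

```csharp
using System;

// Hypothetical helper class, not part of AiDotNet.
public static class BatchEstimationSketch
{
    // Per-sample path: one model call for every sample.
    public static double[] EstimateOneByOne(double[] inputs, double[] targets, Func<double, double> predictOne)
    {
        var scores = new double[inputs.Length];
        for (int i = 0; i < inputs.Length; i++)
        {
            double error = predictOne(inputs[i]) - targets[i];
            scores[i] = error * error;
        }
        return scores;
    }

    // Batched path: a single model call covers the whole dataset.
    public static double[] EstimateBatched(double[] inputs, double[] targets, Func<double[], double[]> predictBatch)
    {
        double[] predictions = predictBatch(inputs);
        var scores = new double[inputs.Length];
        for (int i = 0; i < inputs.Length; i++)
        {
            double error = predictions[i] - targets[i];
            scores[i] = error * error;
        }
        return scores;
    }
}
```

Both paths produce the same scores. The batched one simply gives the model a chance to amortize per-call overhead.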

EstimateDifficulty(TInput, TOutput, IFullModel<T, TInput, TOutput>?)

Estimates the difficulty of a single sample.

T EstimateDifficulty(TInput input, TOutput expectedOutput, IFullModel<T, TInput, TOutput>? model = null)

Parameters

input TInput

The input sample.

expectedOutput TOutput

The expected output/label.

model IFullModel<T, TInput, TOutput>

The model to use for estimation. Can be omitted for estimators that do not need one (see RequiresModel).

Returns

T

Difficulty score (higher = harder). Scores typically fall in the range [0, 1], but this is not required.
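
Because scores are not guaranteed to fall in [0, 1], a consumer that expects that range may want to rescale them. The sketch below shows min-max normalization as one possible approach. The class is hypothetical and uses double[] in place of Vector<T>. Nothing here implies the library normalizes scores itself.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of AiDotNet.
public static class ScoreNormalizationSketch
{
    // Rescales raw difficulty scores linearly so the easiest sample maps to 0
    // and the hardest to 1. The relative ordering of samples is preserved.
    public static double[] MinMaxNormalize(double[] scores)
    {
        if (scores.Length == 0) return Array.Empty<double>();

        double min = scores.Min();
        double range = scores.Max() - min;

        // All scores equal: no sample is harder than another, so map everything to 0.
        if (range == 0.0) return new double[scores.Length];

        return scores.Select(s => (s - min) / range).ToArray();
    }
}
```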

GetSortedIndices(Vector<T>)

Gets the indices of samples sorted by difficulty (easy to hard).

int[] GetSortedIndices(Vector<T> difficulties)

Parameters

difficulties Vector<T>

The difficulty scores.

Returns

int[]

Indices sorted from easiest to hardest.
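
The easy-to-hard ordering amounts to an argsort over the scores. A minimal sketch, assuming double[] in place of Vector<T> and a hypothetical class name:

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of AiDotNet.
public static class SortedIndicesSketch
{
    // Returns sample indices ordered by ascending difficulty score.
    public static int[] GetSortedIndices(double[] difficulties)
    {
        // OrderBy is a stable sort, so samples with equal scores keep their original order.
        return Enumerable.Range(0, difficulties.Length)
            .OrderBy(i => difficulties[i])
            .ToArray();
    }
}
```

For scores { 0.7, 0.1, 0.4 } this returns { 1, 2, 0 }: sample 1 is the easiest and sample 0 the hardest.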

Reset()

Resets the estimator to its initial state.

void Reset()

Update(int, IFullModel<T, TInput, TOutput>)

Updates the difficulty estimator based on training progress.

void Update(int epoch, IFullModel<T, TInput, TOutput> model)

Parameters

epoch int

Current epoch number.

model IFullModel<T, TInput, TOutput>

Current model state.

Remarks

Some estimators adapt over time. For example, a loss-based estimator might recalculate difficulties as the model improves, since what was "hard" at epoch 1 might be "easy" at epoch 100.
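
The adaptive behavior described above could look roughly like the following sketch, which recomputes cached loss-based scores every few epochs. Everything in it is an assumption made for illustration: the class name, the refresh interval, the squared-error loss, and the use of a Func<double, double> delegate in place of IFullModel<T, TInput, TOutput>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical estimator, not part of AiDotNet.
public sealed class AdaptiveLossEstimatorSketch
{
    private readonly double[] _inputs;
    private readonly double[] _targets;
    private readonly int _refreshInterval;
    private readonly double[] _scores;

    public AdaptiveLossEstimatorSketch(double[] inputs, double[] targets, int refreshInterval)
    {
        if (inputs.Length != targets.Length)
            throw new ArgumentException("Inputs and targets must have the same length.");
        if (refreshInterval <= 0)
            throw new ArgumentOutOfRangeException(nameof(refreshInterval));

        _inputs = inputs;
        _targets = targets;
        _refreshInterval = refreshInterval;
        _scores = new double[inputs.Length];
    }

    // Most recently computed difficulty scores (all zero until the first refresh).
    public IReadOnlyList<double> Scores => _scores;

    // Recomputes scores with the current model, but only on every
    // refreshInterval-th epoch to limit the cost of re-scoring.
    public void Update(int epoch, Func<double, double> model)
    {
        if (epoch % _refreshInterval != 0) return;

        for (int i = 0; i < _inputs.Length; i++)
        {
            double error = model(_inputs[i]) - _targets[i];
            _scores[i] = error * error;
        }
    }

    // Returns the estimator to its initial state by clearing the cached scores.
    public void Reset() => Array.Clear(_scores, 0, _scores.Length);
}
```

As the model improves between refreshes, samples that scored high early on can drop toward zero, which changes the ordering a curriculum built on these scores would produce.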