Interface IDifficultyEstimator<T, TInput, TOutput>
- Namespace: AiDotNet.CurriculumLearning.Interfaces
- Assembly: AiDotNet.dll
Interface for estimating the difficulty of training samples.
public interface IDifficultyEstimator<T, TInput, TOutput>
Type Parameters
- T: The numeric type used for calculations.
- TInput: The input data type.
- TOutput: The output data type.
Remarks
For Beginners: A difficulty estimator measures how "hard" each training sample is for the model to learn. This is crucial for curriculum learning, as we want to present easy samples first and gradually introduce harder ones.
Common Difficulty Measures:
- Loss-based: Samples with high loss are harder
- Confidence-based: Low-confidence predictions indicate difficulty
- Transfer-based: Performance gap between simple and complex models
- Expert-defined: Domain knowledge about sample complexity
Research Background:
- Bengio et al. (2009): Introduced curriculum learning, training on examples ordered from easy to hard
- Kumar et al. (2010): Self-paced learning, where the current model's loss on each sample decides which samples count as easy
- Weinshall et al. (2018): Transfer teacher approach
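The loss-based and confidence-based measures listed above can be sketched for a single classification sample as follows. This is a minimal illustration, not AiDotNet code: the `DifficultyMeasures` class and its method names are hypothetical, and the model's output is assumed to be a plain array of class probabilities.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet: scores one classification sample
// from the class probabilities a model predicted for it.
public static class DifficultyMeasures
{
    // Loss-based: cross-entropy of the true class. Higher loss = harder sample.
    public static double LossBased(double[] predictedProbabilities, int trueClass)
    {
        // Clamp to avoid log(0) when the model assigns zero probability.
        double p = Math.Max(predictedProbabilities[trueClass], 1e-12);
        return -Math.Log(p);
    }

    // Confidence-based: one minus the largest predicted probability.
    // A confident prediction scores near 0, an uncertain one scores higher.
    public static double ConfidenceBased(double[] predictedProbabilities)
    {
        return 1.0 - predictedProbabilities.Max();
    }
}
```

Both functions follow the "higher = harder" convention used by this interface. Note that the confidence-based score ignores the label, so a confidently wrong prediction still looks easy under it.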
Properties
Name
Gets the name of the difficulty estimator.
string Name { get; }
Property Value
- string
RequiresModel
Gets whether this estimator requires the model to estimate difficulty.
bool RequiresModel { get; }
Property Value
- bool
Remarks
Some estimators (like loss-based) need the model to compute predictions, while others (like expert-defined) don't need the model at all.
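The distinction can be illustrated with two toy estimators. The sketch below uses a simplified, hypothetical stand-in interface (`ISimpleEstimator`) rather than the real `IDifficultyEstimator<T, TInput, TOutput>`, and reduces the model to a `Func<double, double>` so that the example compiles on its own.

```csharp
#nullable enable
using System;

// Simplified, hypothetical stand-in for IDifficultyEstimator: the model is
// reduced to a Func<double, double> so the example is self-contained.
public interface ISimpleEstimator
{
    bool RequiresModel { get; }
    double EstimateDifficulty(double input, double expectedOutput, Func<double, double>? model = null);
}

// Expert-defined: difficulty comes from domain knowledge (here, input magnitude).
public sealed class MagnitudeEstimator : ISimpleEstimator
{
    public bool RequiresModel => false;

    public double EstimateDifficulty(double input, double expectedOutput, Func<double, double>? model = null)
        => Math.Abs(input);
}

// Loss-based: needs predictions, so a missing model is treated as an error.
public sealed class SquaredErrorEstimator : ISimpleEstimator
{
    public bool RequiresModel => true;

    public double EstimateDifficulty(double input, double expectedOutput, Func<double, double>? model = null)
    {
        if (model is null)
            throw new ArgumentNullException(nameof(model), "This estimator requires a model.");

        double error = model(input) - expectedOutput;
        return error * error;
    }
}
```

A caller can check `RequiresModel` up front and fail early, or skip building a model entirely when the estimator does not need one.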
Methods
EstimateDifficulties(IDataset<T, TInput, TOutput>, IFullModel<T, TInput, TOutput>?)
Estimates difficulty scores for all samples in a dataset.
Vector<T> EstimateDifficulties(IDataset<T, TInput, TOutput> dataset, IFullModel<T, TInput, TOutput>? model = null)
Parameters
- dataset (IDataset<T, TInput, TOutput>): The dataset to estimate difficulties for.
- model (IFullModel<T, TInput, TOutput>): The model to use for estimation (optional for some estimators).
Returns
- Vector<T>
Vector of difficulty scores (higher = harder).
Remarks
This method may be more efficient than calling EstimateDifficulty individually for each sample, as it can batch operations.
EstimateDifficulty(TInput, TOutput, IFullModel<T, TInput, TOutput>?)
Estimates the difficulty of a single sample.
T EstimateDifficulty(TInput input, TOutput expectedOutput, IFullModel<T, TInput, TOutput>? model = null)
Parameters
- input (TInput): The input sample.
- expectedOutput (TOutput): The expected output/label.
- model (IFullModel<T, TInput, TOutput>): The model to use for estimation (optional for some estimators).
Returns
- T
Difficulty score (higher = harder). Scores typically fall in the range [0, 1], but this is not required.
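Because the [0, 1] range is typical but not guaranteed, code that consumes scores from arbitrary estimators may want to rescale them first. One possible approach is min-max normalization, sketched below on plain `double[]` arrays. The `DifficultyScaling` class is a hypothetical helper, not part of the library.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class DifficultyScaling
{
    // Rescales raw scores to [0, 1] with min-max normalization.
    // The relative order of the scores is preserved.
    public static double[] MinMaxNormalize(double[] scores)
    {
        if (scores.Length == 0) return Array.Empty<double>();

        double min = scores.Min();
        double max = scores.Max();
        double range = max - min;

        // All scores equal: there is no ordering information, so return zeros.
        if (range == 0.0) return new double[scores.Length];

        return scores.Select(s => (s - min) / range).ToArray();
    }
}
```

Min-max scaling is sensitive to outliers: a single very large loss compresses all other scores toward 0. Rank-based scaling is a common alternative when that matters.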
GetSortedIndices(Vector<T>)
Gets the indices of samples sorted by difficulty (easy to hard).
int[] GetSortedIndices(Vector<T> difficulties)
Parameters
- difficulties (Vector<T>): The difficulty scores.
Returns
- int[]
Indices sorted from easiest to hardest.
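A minimal sketch of the easy-to-hard ordering, assuming the scores are available as a plain `double[]` instead of a `Vector<T>`. The `DifficultyOrdering` class is hypothetical and only illustrates the idea; an actual implementation of this interface may sort differently.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class DifficultyOrdering
{
    // Returns sample indices ordered from lowest to highest difficulty.
    // OrderBy is a stable sort, so tied samples keep their original order.
    public static int[] SortedIndices(double[] difficulties)
    {
        return Enumerable.Range(0, difficulties.Length)
                         .OrderBy(i => difficulties[i])
                         .ToArray();
    }
}
```

For scores `{ 0.7, 0.1, 0.4 }` the result is `{ 1, 2, 0 }`: sample 1 is the easiest and sample 0 the hardest. A curriculum scheduler can then take a growing prefix of this array as training progresses.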
Reset()
Resets the estimator to its initial state.
void Reset()
Update(int, IFullModel<T, TInput, TOutput>)
Updates the difficulty estimator based on training progress.
void Update(int epoch, IFullModel<T, TInput, TOutput> model)
Parameters
- epoch (int): Current epoch number.
- model (IFullModel<T, TInput, TOutput>): Current model state.
Remarks
Some estimators adapt over time. For example, a loss-based estimator might recalculate difficulties as the model improves, since what was "hard" at epoch 1 might be "easy" at epoch 100.
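The adaptive behaviour described above can be sketched as an estimator that caches its scores and recomputes them only every few epochs. Everything in this example is an assumption made for illustration: the `RefreshingLossEstimator` class, the `refreshEvery` parameter, the squared-error loss, and the reduction of the model to a `Func<double, double>` are not taken from AiDotNet.

```csharp
#nullable enable
using System;

// Hypothetical adaptive estimator; the model is simplified to Func<double, double>.
public sealed class RefreshingLossEstimator
{
    private readonly double[] _inputs;
    private readonly double[] _targets;
    private readonly int _refreshEvery;
    private double[]? _cached;

    public RefreshingLossEstimator(double[] inputs, double[] targets, int refreshEvery)
    {
        if (inputs.Length != targets.Length)
            throw new ArgumentException("inputs and targets must have the same length.");
        if (refreshEvery <= 0)
            throw new ArgumentOutOfRangeException(nameof(refreshEvery));

        _inputs = inputs;
        _targets = targets;
        _refreshEvery = refreshEvery;
    }

    // Scores from the most recent refresh, or null before the first Update.
    public double[]? CachedDifficulties => _cached;

    // Recomputes squared-error difficulties on the first call and then
    // whenever the epoch number is a multiple of refreshEvery.
    public void Update(int epoch, Func<double, double> model)
    {
        if (_cached != null && epoch % _refreshEvery != 0) return;

        var scores = new double[_inputs.Length];
        for (int i = 0; i < _inputs.Length; i++)
        {
            double error = model(_inputs[i]) - _targets[i];
            scores[i] = error * error;
        }
        _cached = scores;
    }

    // Drops the cached scores, returning the estimator to its initial state.
    public void Reset() => _cached = null;
}
```

Refreshing on a fixed interval trades accuracy of the difficulty scores against the cost of a full pass over the data. With `refreshEvery` set to 1 the scores always reflect the current model.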