Class ActiveLearningSampler<T>

Namespace
AiDotNet.Data.Sampling
Assembly
AiDotNet.dll

A sampler for active learning that selects the most informative samples for labeling.

public class ActiveLearningSampler<T> : DataSamplerBase, IDataSampler

Type Parameters

T

The numeric type for uncertainty scores.

Inheritance
DataSamplerBase
ActiveLearningSampler<T>
Implements
IDataSampler

Remarks

ActiveLearningSampler implements uncertainty sampling and other active learning strategies to select unlabeled samples that would be most valuable to label.

For Beginners: In active learning, you don't have labels for all data. This sampler helps you decide which unlabeled samples to ask an expert to label:

  • Uncertainty sampling: Ask about samples the model is unsure about
  • Diversity sampling: Ask about samples that are different from what you've seen
  • Expected model change: Ask about samples that would change the model most

This can dramatically reduce labeling costs: in favorable cases, 50-90% fewer labels are needed.
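
The sampler consumes uncertainty scores but this page does not say how to compute them. As an illustration only, the sketch below derives a score from a model's predicted class probabilities using Shannon entropy, one common choice for uncertainty sampling. The UncertaintyScores class and its Entropy method are hypothetical helpers, not part of AiDotNet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet.
public static class UncertaintyScores
{
    // Shannon entropy (in nats) of a predicted class distribution.
    // Higher entropy means the model is less sure, which matches the
    // "higher = more uncertain" convention of UpdateUncertainty.
    public static double Entropy(IReadOnlyList<double> probabilities)
    {
        double entropy = 0.0;
        foreach (double p in probabilities)
        {
            if (p > 0.0) // treat 0 * log(0) as 0
            {
                entropy -= p * Math.Log(p);
            }
        }
        return entropy;
    }
}
```

A confident prediction such as { 1.0, 0.0 } scores 0, while an evenly split prediction such as { 0.5, 0.5 } scores ln 2, the maximum for two classes.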

Constructors

ActiveLearningSampler(int, ActiveLearningStrategy, double, int?)

Initializes a new instance of the ActiveLearningSampler class.

public ActiveLearningSampler(int datasetSize, ActiveLearningStrategy strategy = ActiveLearningStrategy.Uncertainty, double diversityWeight = 0.3, int? seed = null)

Parameters

datasetSize int

The total number of samples.

strategy ActiveLearningStrategy

The active learning selection strategy.

diversityWeight double

Weight given to diversity in the hybrid strategy, from 0 to 1.

seed int?

Optional random seed for reproducibility.
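
A minimal construction sketch based on the signature above. It assumes that ActiveLearningStrategy is declared in the same namespace as the sampler and that double is a supported type for T; neither is confirmed on this page.

```csharp
using AiDotNet.Data.Sampling;

// Rely on the documented defaults: Uncertainty strategy, diversityWeight 0.3, no seed.
var defaultSampler = new ActiveLearningSampler<double>(datasetSize: 1000);

// Spell out every argument and fix the seed for reproducible runs.
// Assumption: ActiveLearningStrategy is in AiDotNet.Data.Sampling.
var seededSampler = new ActiveLearningSampler<double>(
    datasetSize: 1000,
    strategy: ActiveLearningStrategy.Uncertainty,
    diversityWeight: 0.3,
    seed: 42);
```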

Fields

NumOps

Numeric operations for type T.

protected static readonly INumericOperations<T> NumOps

Field Value

INumericOperations<T>

Properties

LabeledCount

Gets the number of labeled samples.

public int LabeledCount { get; }

Property Value

int

Length

Gets the total number of samples this sampler will produce per epoch.

public override int Length { get; }

Property Value

int

Remarks

This may differ from the dataset size for oversampling or undersampling strategies.

UnlabeledCount

Gets the number of unlabeled samples.

public int UnlabeledCount { get; }

Property Value

int

Methods

GetIndicesCore()

Core implementation for generating indices. Override this in derived classes.

protected override IEnumerable<int> GetIndicesCore()

Returns

IEnumerable<int>

An enumerable of sample indices.

MarkAsLabeled(IEnumerable<int>)

Marks multiple samples as labeled.

public void MarkAsLabeled(IEnumerable<int> indices)

Parameters

indices IEnumerable<int>

The sample indices.

MarkAsLabeled(int)

Marks a sample as labeled.

public void MarkAsLabeled(int index)

Parameters

index int

The sample index.

SelectForLabeling(int)

Selects the most informative unlabeled samples for labeling.

public IEnumerable<int> SelectForLabeling(int numToSelect)

Parameters

numToSelect int

Number of samples to select.

Returns

IEnumerable<int>

Indices of selected samples.

UpdateUncertainties(IReadOnlyList<int>, IReadOnlyList<T>)

Batch updates uncertainty scores.

public void UpdateUncertainties(IReadOnlyList<int> indices, IReadOnlyList<T> uncertainties)

Parameters

indices IReadOnlyList<int>

The sample indices.

uncertainties IReadOnlyList<T>

The uncertainty scores.

UpdateUncertainty(int, T)

Updates the uncertainty score for a sample.

public void UpdateUncertainty(int index, T uncertainty)

Parameters

index int

The sample index.

uncertainty T

The uncertainty score (higher = more uncertain).
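
Taken together, the members documented on this page suggest a score, select, label cycle. The following is a sketch of one such round, not a definitive usage pattern. It assumes that indices[i] is paired with uncertainties[i] in the batch update, and that SelectForLabeling does not itself mark samples as labeled, so MarkAsLabeled is called explicitly. The random scores are a stand-in for a real model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using AiDotNet.Data.Sampling;

const int datasetSize = 100;
var sampler = new ActiveLearningSampler<double>(datasetSize, seed: 0);

// Hypothetical stand-in for a model: one pseudo-random uncertainty per sample.
var rng = new Random(0);
int[] indices = Enumerable.Range(0, datasetSize).ToArray();
double[] scores = indices.Select(_ => rng.NextDouble()).ToArray();

// Batch update (assumed pairing: indices[i] with scores[i]).
sampler.UpdateUncertainties(indices, scores);

// Ask for the 10 samples the sampler considers most informative.
List<int> toLabel = sampler.SelectForLabeling(10).ToList();

// ... send toLabel to an annotator, then record that labels now exist.
sampler.MarkAsLabeled(toLabel);

Console.WriteLine($"Labeled: {sampler.LabeledCount}, unlabeled: {sampler.UnlabeledCount}");
```

In a real loop, the model would be retrained on the newly labeled samples and the uncertainties refreshed before the next call to SelectForLabeling.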