Class SmoteAugmenter<T>

Namespace
AiDotNet.Augmentation.Tabular
Assembly
AiDotNet.dll

Implements SMOTE (Synthetic Minority Over-sampling Technique) for imbalanced datasets.

public class SmoteAugmenter<T> : TabularAugmenterBase<T>, IAugmentation<T, Matrix<T>>

Type Parameters

T

The numeric type for calculations.

Inheritance
AugmentationBase<T, Matrix<T>>
TabularAugmenterBase<T>
SmoteAugmenter<T>
Implements
IAugmentation<T, Matrix<T>>

Remarks

For Beginners: SMOTE creates new synthetic samples for the minority class by interpolating between existing minority samples and their nearest neighbors. This helps balance imbalanced datasets where one class has far fewer samples than others.

How it works:

  1. For each minority sample, find its k nearest neighbors (also from the minority class)
  2. Randomly select one of these neighbors
  3. Create a new sample at a random point on the line segment between the original sample and the selected neighbor
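
The three steps above can be sketched in plain C# to make the interpolation concrete. This is a minimal illustration on jagged `double` arrays, not the library's implementation: the `SmoteSketch` class and its method names are hypothetical, and Euclidean distance is assumed as the neighbor metric.

```csharp
using System;
using System.Linq;

// Hypothetical illustration of the SMOTE steps; not part of AiDotNet.
public static class SmoteSketch
{
    // Squared Euclidean distance between two feature vectors of equal length.
    public static double SquaredDistance(double[] a, double[] b)
    {
        double sum = 0.0;
        for (int j = 0; j < a.Length; j++)
        {
            double d = a[j] - b[j];
            sum += d * d;
        }
        return sum;
    }

    // Step 1: indices of the k minority samples closest to minority[index],
    // excluding the sample itself. Returns fewer than k if not enough samples exist.
    public static int[] NearestNeighbors(double[][] minority, int index, int k)
    {
        return Enumerable.Range(0, minority.Length)
            .Where(i => i != index)
            .OrderBy(i => SquaredDistance(minority[index], minority[i]))
            .Take(k)
            .ToArray();
    }

    // Steps 2 and 3: pick one neighbor at random, then interpolate between
    // the original sample and that neighbor with a random gap in [0, 1).
    public static double[] Synthesize(double[][] minority, int index, int k, Random rng)
    {
        int[] neighbors = NearestNeighbors(minority, index, k);
        if (neighbors.Length == 0)
        {
            throw new ArgumentException("At least two minority samples are required.");
        }

        double[] x = minority[index];
        double[] neighbor = minority[neighbors[rng.Next(neighbors.Length)]];
        double gap = rng.NextDouble();

        // new = x + gap * (neighbor - x), applied feature by feature.
        return x.Select((v, j) => v + gap * (neighbor[j] - v)).ToArray();
    }
}
```

Because every synthetic point lies between two existing minority samples, it stays inside the region the minority class already occupies, which is why the technique suits continuous features better than categorical ones.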

When to use:

  • Classification with severe class imbalance (e.g., fraud detection, rare disease)
  • When the minority class has too few samples to learn from
  • When undersampling the majority class would lose too much information

When NOT to use:

  • When classes are already balanced
  • For regression tasks (use other techniques)
  • When features are highly categorical (use SMOTE-NC instead)

Reference: Chawla et al., "SMOTE: Synthetic Minority Over-sampling Technique" (2002)

Constructors

SmoteAugmenter(int, double, double)

Creates a new SMOTE augmenter.

public SmoteAugmenter(int kNeighbors = 5, double samplingRatio = 1, double probability = 1)

Parameters

kNeighbors int

Number of nearest neighbors to use (default: 5).

samplingRatio double

Ratio of synthetic samples to generate, relative to the number of original minority samples (default: 1.0).

probability double

Probability of applying this augmentation (default: 1.0).

Properties

KNeighbors

Gets the number of nearest neighbors to consider.

public int KNeighbors { get; }

Property Value

int

Remarks

Default: 5

Higher values create more diverse synthetic samples but require more minority samples.

SamplingRatio

Gets the sampling ratio for synthetic sample generation.

public double SamplingRatio { get; }

Property Value

double

Remarks

Default: 1.0 (generate as many synthetic samples as original minority samples)

Values > 1.0 create more synthetic samples; values < 1.0 create fewer.
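
As a rough illustration of how the ratio translates into a sample count, the sketch below multiplies the minority sample count by the ratio. The `SamplingRatioSketch` class and `SyntheticCount` helper are hypothetical, and rounding to the nearest integer is an assumption; the library may round or truncate differently.

```csharp
using System;

// Hypothetical helper; not part of AiDotNet.
public static class SamplingRatioSketch
{
    // Number of synthetic samples for a given minority count and sampling ratio.
    // Assumption: the product is rounded to the nearest integer.
    public static int SyntheticCount(int minorityCount, double samplingRatio)
    {
        if (minorityCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(minorityCount));
        }
        if (samplingRatio < 0.0)
        {
            throw new ArgumentOutOfRangeException(nameof(samplingRatio));
        }

        return (int)Math.Round(minorityCount * samplingRatio, MidpointRounding.AwayFromZero);
    }
}
```

Under these assumptions, 40 minority samples yield 40 synthetic samples at a ratio of 1.0, 100 at a ratio of 2.5, and 20 at a ratio of 0.5.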

Methods

ApplyAugmentation(Matrix<T>, AugmentationContext<T>)

Performs the actual augmentation. Overrides the method that the base class requires derived augmenters to implement.

protected override Matrix<T> ApplyAugmentation(Matrix<T> data, AugmentationContext<T> context)

Parameters

data Matrix<T>

The input data.

context AugmentationContext<T>

The augmentation context.

Returns

Matrix<T>

The augmented data.

ApplySmoteWithLabels(Matrix<T>, Vector<T>, AugmentationContext<T>)

Applies SMOTE and returns combined original and synthetic data.

public (Matrix<T> Data, Vector<T> Labels) ApplySmoteWithLabels(Matrix<T> minorityData, Vector<T> minorityLabels, AugmentationContext<T> context)

Parameters

minorityData Matrix<T>

Matrix containing only minority class samples.

minorityLabels Vector<T>

Labels for the minority class.

context AugmentationContext<T>

The augmentation context.

Returns

(Matrix<T> Data, Vector<T> Labels)

Tuple of (combined data, combined labels) including both original and synthetic samples.
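
To show the shape of the returned tuple, the sketch below appends synthetic rows to the original minority rows and extends the label vector to match. It uses plain arrays rather than `Matrix<T>` and `Vector<T>`; the `CombineSketch` class is hypothetical, and it assumes all minority samples share one label, so every synthetic row reuses the first label.

```csharp
using System;
using System.Linq;

// Hypothetical illustration of combining original and synthetic samples; not part of AiDotNet.
public static class CombineSketch
{
    public static (double[][] Data, double[] Labels) Combine(
        double[][] original, double[] originalLabels, double[][] synthetic)
    {
        if (original.Length == 0 || original.Length != originalLabels.Length)
        {
            throw new ArgumentException("Data and labels must be non-empty and the same length.");
        }

        // Assumption: the minority class has a single label value.
        double minorityLabel = originalLabels[0];

        // Original rows come first, followed by the synthetic rows.
        double[][] data = original.Concat(synthetic).ToArray();
        double[] labels = originalLabels
            .Concat(Enumerable.Repeat(minorityLabel, synthetic.Length))
            .ToArray();

        return (data, labels);
    }
}
```

Keeping the original rows first means the leading portion of the combined result still lines up with the input, which makes it easy to tell real samples from synthetic ones afterwards.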

GenerateSyntheticSamples(Matrix<T>, AugmentationContext<T>)

Applies SMOTE to generate synthetic samples for the minority class.

public Matrix<T> GenerateSyntheticSamples(Matrix<T> minorityData, AugmentationContext<T> context)

Parameters

minorityData Matrix<T>

Matrix containing only minority class samples.

context AugmentationContext<T>

The augmentation context.

Returns

Matrix<T>

Matrix containing synthetic samples (original data is NOT included).

GetParameters()

Gets the parameters of this augmentation.

public override IDictionary<string, object> GetParameters()

Returns

IDictionary<string, object>

A dictionary of parameter names to values.