Class UMAP<T>

Namespace
AiDotNet.Preprocessing.DimensionalityReduction
Assembly
AiDotNet.dll

Uniform Manifold Approximation and Projection for dimensionality reduction.

public class UMAP<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
UMAP<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

UMAP is a nonlinear dimensionality reduction technique. It builds a weighted neighbor graph of the data in the original high-dimensional space, then optimizes a low-dimensional layout whose graph is as structurally similar to it as possible. Its theoretical foundations are Riemannian geometry and algebraic topology.

Key advantages over t-SNE:

  • Much faster (scales better to large datasets)
  • Preserves more global structure
  • Supports out-of-sample transformation
  • More deterministic results

For Beginners: UMAP creates visualizations similar to t-SNE, but:

  • It's faster, especially for large datasets
  • Distances between clusters are more meaningful
  • You can transform new data points without refitting
  • It works well both for visualization and as a preprocessing step for machine learning

Example use cases:

  • Visualizing high-dimensional data (gene expression, embeddings)
  • Preprocessing features for classification
  • Clustering analysis
  • Anomaly detection

Constructors

UMAP(int, int, double, double, UMAPMetric, int, double, double, double, double, int?, int[]?)

Creates a new instance of UMAP<T>.

public UMAP(int nComponents = 2, int nNeighbors = 15, double minDist = 0.1, double spread = 1, UMAPMetric metric = UMAPMetric.Euclidean, int nEpochs = 200, double learningRate = 1, double negativeSampleRate = 5, double localConnectivity = 1, double repulsionStrength = 1, int? randomState = null, int[]? columnIndices = null)

Parameters

nComponents int

Target dimensionality (usually 2 or 3). Defaults to 2.

nNeighbors int

Number of nearest neighbors used to approximate the local manifold structure around each point. Defaults to 15.

minDist double

Minimum distance between points in the low-dimensional embedding. Defaults to 0.1.

spread double

Effective scale of embedded points. Defaults to 1.0.

metric UMAPMetric

Distance metric to use. Defaults to Euclidean.

nEpochs int

Number of training epochs. Defaults to 200.

learningRate double

Learning rate for SGD. Defaults to 1.0.

negativeSampleRate double

Number of negative samples drawn per positive sample during optimization. Defaults to 5.

localConnectivity double

Local connectivity constraint. Defaults to 1.0.

repulsionStrength double

Repulsion strength during optimization. Defaults to 1.0.

randomState int?

Random seed for reproducibility. Defaults to null.

columnIndices int[]

The column indices to use, or null for all columns.
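
Because every constructor parameter is optional, C# named arguments are a convenient way to override only a few of them. The following is a minimal sketch with illustrative values, not recommended settings. It assumes that `UMAPMetric` is declared in the same namespace as `UMAP<T>`; the class and method names (`UmapConstructionSketch`, `Create`, `Describe`) are hypothetical.

```csharp
using System;
using AiDotNet.Preprocessing.DimensionalityReduction;

public static class UmapConstructionSketch
{
    // Builds a 3-component UMAP transformer.
    // Parameters that are not named here keep their documented defaults.
    public static UMAP<double> Create()
    {
        return new UMAP<double>(
            nComponents: 3,                 // target dimensionality
            nNeighbors: 30,                 // neighbors used for manifold approximation
            minDist: 0.05,                  // minimum distance between embedded points
            metric: UMAPMetric.Euclidean,   // assumption: enum is in the imported namespace
            randomState: 42);               // fixed seed for reproducibility
    }

    // Prints the configuration exposed through the documented read-only properties.
    public static void Describe(UMAP<double> umap)
    {
        Console.WriteLine(
            $"components={umap.NComponents}, neighbors={umap.NNeighbors}, " +
            $"minDist={umap.MinDist}, metric={umap.Metric}");
    }
}
```

Calling `UmapConstructionSketch.Describe(UmapConstructionSketch.Create())` should echo the values passed to the constructor.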

Properties

Embedding

Gets the embedding result.

public double[,]? Embedding { get; }

Property Value

double[,]

Metric

Gets the distance metric.

public UMAPMetric Metric { get; }

Property Value

UMAPMetric

MinDist

Gets the minimum distance parameter.

public double MinDist { get; }

Property Value

double

NComponents

Gets the number of components (dimensions).

public int NComponents { get; }

Property Value

int

NNeighbors

Gets the number of neighbors.

public int NNeighbors { get; }

Property Value

int

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool
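
Since InverseTransformCore is documented as unsupported, calling code that handles transformers generically may want to check this property before attempting an inverse. The sketch below shows one way to do that. It assumes a public `InverseTransform` method inherited from `TransformerBase` and assumes that `Matrix<T>` lives in `AiDotNet.LinearAlgebra`; neither is confirmed on this page, and the helper name `TryInverse` is hypothetical.

```csharp
#nullable enable
using AiDotNet.LinearAlgebra; // assumption: namespace that provides Matrix<T>
using AiDotNet.Preprocessing.DimensionalityReduction;

public static class UmapInverseGuardSketch
{
    // Returns the reconstructed data when the transformer reports support,
    // or null when inverse transformation is unavailable.
    public static Matrix<double>? TryInverse(UMAP<double> umap, Matrix<double> embedded)
    {
        if (!umap.SupportsInverseTransform)
        {
            return null;
        }

        // Assumption: InverseTransform is the public wrapper around InverseTransformCore.
        return umap.InverseTransform(embedded);
    }
}
```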

Methods

FitCore(Matrix<T>)

Fits UMAP and computes the embedding.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

GetEmbedding()

Gets the embedding computed during Fit for the training data.

public Matrix<T> GetEmbedding()

Returns

Matrix<T>

The embedding matrix for the training data.

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

TransformCore(Matrix<T>)

Transforms data using the fitted UMAP embedding.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

Remarks

This method always performs out-of-sample transformation using the learned embedding space. To get the original training embedding, use GetEmbedding().
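
The distinction between the training embedding and an out-of-sample projection can be sketched as follows. This is an illustration under assumptions: `Fit` and `Transform` are taken to be the public entry points inherited from `TransformerBase` that call FitCore and TransformCore, and `Matrix<T>` is assumed to live in `AiDotNet.LinearAlgebra`. The class and method names are hypothetical.

```csharp
using AiDotNet.LinearAlgebra; // assumption: namespace that provides Matrix<T>
using AiDotNet.Preprocessing.DimensionalityReduction;

public static class UmapOutOfSampleSketch
{
    // Fits UMAP on trainingData, then returns both the stored training
    // embedding and an out-of-sample projection of newData.
    public static (Matrix<double> Train, Matrix<double> Fresh) Embed(
        Matrix<double> trainingData,
        Matrix<double> newData)
    {
        var umap = new UMAP<double>(nComponents: 2, randomState: 42);

        // Assumption: public Fit inherited from TransformerBase.
        umap.Fit(trainingData);

        // Embedding computed during Fit for the training rows.
        Matrix<double> train = umap.GetEmbedding();

        // Assumption: public Transform inherited from TransformerBase.
        // Per the remarks above, this always takes the out-of-sample path,
        // even if newData happens to contain training rows.
        Matrix<double> fresh = umap.Transform(newData);

        return (train, fresh);
    }
}
```

Keeping the two calls separate avoids re-projecting the training rows through the out-of-sample path when the embedding stored during fitting is what is actually wanted.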