
Class OneHotEncoder<T>

Namespace
AiDotNet.Preprocessing.Encoders
Assembly
AiDotNet.dll

Encodes categorical values as one-hot (binary) vectors.

public class OneHotEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
OneHotEncoder<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

OneHotEncoder transforms categorical values into binary indicator columns. Each unique category value becomes a separate column with 1s and 0s indicating presence. This encoding is required for many machine learning algorithms that cannot work directly with categories.

For Beginners: This encoder converts categories into binary columns:
- Each unique value gets its own column
- A 1 indicates the category is present, 0 means it's not

Example for colors [red, green, blue, red]:

Becomes:
[1, 0, 0] (red)
[0, 1, 0] (green)
[0, 0, 1] (blue)
[1, 0, 0] (red)
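
The color example can be reproduced with plain arrays. The following is a minimal sketch of the technique, not the AiDotNet implementation: the class and method names (OneHotSketch, LearnCategories, Encode) are hypothetical, the colors are assumed to be stored as numeric codes (red = 0, green = 1, blue = 2), and categories are assumed to be kept in first-seen order, which this page does not confirm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class OneHotSketch
{
    // Collect distinct values in first-seen order (an assumption; a real
    // encoder might sort the categories instead).
    public static double[] LearnCategories(double[] column)
    {
        var seen = new List<double>();
        foreach (var value in column)
        {
            if (!seen.Contains(value)) seen.Add(value);
        }
        return seen.ToArray();
    }

    // Produce one indicator row per input value.
    public static double[][] Encode(double[] column, double[] categories)
    {
        var rows = new double[column.Length][];
        for (int i = 0; i < column.Length; i++)
        {
            rows[i] = new double[categories.Length]; // zero-initialized
            int index = Array.IndexOf(categories, column[i]);
            if (index < 0)
                throw new ArgumentException($"Unknown category {column[i]} at row {i}.");
            rows[i][index] = 1.0; // mark the matching category
        }
        return rows;
    }
}
```

Calling LearnCategories on the codes { 0, 1, 2, 0 } and passing the result to Encode yields four rows of width three, with the first and last rows identical because both encode red.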

Constructors

OneHotEncoder(bool, OneHotUnknownHandling, int[]?)

Creates a new instance of OneHotEncoder<T>.

public OneHotEncoder(bool dropFirst = false, OneHotUnknownHandling handleUnknown = OneHotUnknownHandling.Error, int[]? columnIndices = null)

Parameters

dropFirst bool

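If true, drops the indicator column for the first category, so a feature with k categories produces k - 1 columns. This avoids multicollinearity among the indicator columns. Defaults to false.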

handleUnknown OneHotUnknownHandling

How to handle categories encountered during transformation that were not seen during fitting. Defaults to Error.

columnIndices int[]

The column indices to encode, or null for all columns.
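
To make the effect of dropFirst and handleUnknown concrete, here is a self-contained sketch that encodes a single value. It is an illustration under stated assumptions rather than the library's code: UnknownHandlingSketch is a hypothetical stand-in for OneHotUnknownHandling (only the Error member is confirmed by this page; AllZeros is invented for the example), and EncodeValue is a hypothetical helper.

```csharp
using System;

// Hypothetical stand-in for OneHotUnknownHandling. Only "Error" appears in
// the documented signature; "AllZeros" is an assumption for illustration.
public enum UnknownHandlingSketch { Error, AllZeros }

public static class OneHotOptionsSketch
{
    public static double[] EncodeValue(
        double value,
        double[] categories,
        bool dropFirst,
        UnknownHandlingSketch handleUnknown)
    {
        // With dropFirst, the first category has no column of its own.
        int offset = dropFirst ? 1 : 0;
        var row = new double[categories.Length - offset];

        int index = Array.IndexOf(categories, value);
        if (index < 0)
        {
            if (handleUnknown == UnknownHandlingSketch.Error)
                throw new ArgumentException($"Unknown category: {value}");
            return row; // unknown value encoded as all zeros
        }

        // The dropped first category is also represented by all zeros.
        if (index >= offset) row[index - offset] = 1.0;
        return row;
    }
}
```

One consequence visible in this sketch: when the first category is dropped and unknown values are encoded as zeros, an unknown value and the first category produce the same row, so the two options interact.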

Properties

Categories

Gets the categories for each encoded column.

public List<double[]>? Categories { get; }

Property Value

List<double[]>

DropFirst

Gets whether the first category is dropped (to avoid multicollinearity).

public bool DropFirst { get; }

Property Value

bool

HandleUnknown

Gets how unknown categories are handled.

public OneHotUnknownHandling HandleUnknown { get; }

Property Value

OneHotUnknownHandling

NOutputFeatures

Gets the number of output features after transformation.

public int NOutputFeatures { get; }

Property Value

int

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Methods

FitCore(Matrix<T>)

Learns the categories from the training data.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

The training data matrix where each column is a feature.

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

The input feature names.

Returns

string[]

The output feature names with category suffixes.
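
As an illustration of "category suffixes", the sketch below builds one output name per retained category. The exact naming format used by GetFeatureNamesOut is not stated on this page, so the "name_category" pattern, the BuildNames helper, and the FeatureNameSketch class are all assumptions.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper; the real naming format may differ.
public static class FeatureNameSketch
{
    public static string[] BuildNames(string inputName, double[] categories, bool dropFirst)
    {
        var names = new List<string>();
        // Skip the first category when its column is dropped.
        for (int i = dropFirst ? 1 : 0; i < categories.Length; i++)
        {
            string suffix = categories[i].ToString(CultureInfo.InvariantCulture);
            names.Add($"{inputName}_{suffix}");
        }
        return names.ToArray();
    }
}
```

Under these assumptions, an input feature named "color" with categories { 0, 1, 2 } would produce three names, or two when the first category is dropped.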

InverseTransformCore(Matrix<T>)

Reverses the one-hot encoding to get original category values.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The one-hot encoded data.

Returns

Matrix<T>

The original categorical values.
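
Reversing a one-hot row amounts to finding the active indicator and looking up its category. The sketch below shows one way to do this for a single row, assuming the categories array comes from fitting; InverseOneHotSketch and Decode are hypothetical names, and treating an all-zero row as the first category when dropFirst is set is an assumption about how the dropped column would be recovered.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class InverseOneHotSketch
{
    public static double Decode(double[] row, double[] categories, bool dropFirst)
    {
        int offset = dropFirst ? 1 : 0;
        if (row.Length != categories.Length - offset)
            throw new ArgumentException("Row width does not match the category count.");

        // Find the position of the largest positive entry.
        int best = -1;
        double bestValue = 0.0;
        for (int i = 0; i < row.Length; i++)
        {
            if (row[i] > bestValue) { best = i; bestValue = row[i]; }
        }

        if (best < 0)
        {
            // All zeros: with dropFirst this is read as the dropped category.
            if (dropFirst) return categories[0];
            throw new ArgumentException("Row has no active indicator.");
        }
        return categories[best + offset];
    }
}
```

For categories { 10, 20, 30 }, the row { 0, 1, 0 } decodes to 20 without dropFirst, while with dropFirst the row { 0, 0 } decodes to 10 and { 0, 1 } decodes to 30.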

TransformCore(Matrix<T>)

Transforms the data by applying one-hot encoding.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The data to transform.

Returns

Matrix<T>

The one-hot encoded data.