Class HashingEncoder<T>

Namespace
AiDotNet.Preprocessing.Encoders
Assembly
AiDotNet.dll

Encodes categorical features using feature hashing (hashing trick).

public class HashingEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
HashingEncoder<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>
Inherited Members

Remarks

HashingEncoder uses a hash function to map categories to a fixed number of columns. This is useful for high-cardinality categorical features where one-hot encoding would create too many columns.

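Unlike encoders that learn a category-to-column mapping during fitting, HashingEncoder derives each output column from the hash alone. It stores no category mappings, which keeps memory use fixed and lets it encode categories that were not seen during fitting.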

For Beginners: Instead of creating one column per category:

- Hash encoding creates a fixed number of columns (e.g., 8)
- Each category is hashed to one of these columns
- Multiple categories may share the same column (collision)

Pros: Fixed memory, handles new categories, fast

Cons: Information loss from collisions, not reversible
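The bucket assignment described above can be sketched in a few lines. This is a standalone illustration, not AiDotNet's implementation: the `HashingSketch` class, its method names, the FNV-1a hash, and the use of the top hash bit for the sign are all assumptions made for the example, and the library may hash values differently.

```csharp
using System;
using System.Text;

// Standalone illustration of the hashing trick; hypothetical names, not AiDotNet code.
public static class HashingSketch
{
    // 32-bit FNV-1a: deterministic across runs, unlike string.GetHashCode().
    public static uint Fnv1a(string key)
    {
        unchecked
        {
            uint hash = 2166136261u;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619u; // intentional 32-bit wrap-around
            }
            return hash;
        }
    }

    // Encodes one category as a row of nComponents values with a single non-zero entry.
    public static double[] Encode(string category, int nComponents, bool alternateSign)
    {
        if (nComponents <= 0)
            throw new ArgumentOutOfRangeException(nameof(nComponents));

        uint hash = Fnv1a(category);
        int bucket = (int)(hash % (uint)nComponents);

        // Assumption: the top hash bit picks the sign when alternateSign is enabled.
        bool negative = alternateSign && (hash & 0x80000000u) != 0;

        var row = new double[nComponents];
        row[bucket] = negative ? -1.0 : 1.0;
        return row;
    }
}
```

Calling `HashingSketch.Encode("red", 8, true)` yields an 8-element row with exactly one entry set to +1 or -1. No fitting step is involved, so a category that never appeared before is encoded the same way as any other.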

Constructors

HashingEncoder(int, bool, int[]?)

Creates a new instance of HashingEncoder<T>.

public HashingEncoder(int nComponents = 8, bool alternateSign = true, int[]? columnIndices = null)

Parameters

nComponents int

Number of output features per encoded column. Defaults to 8.

alternateSign bool

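If true, hashed values are given alternating signs (+1 or -1) so that colliding categories tend to cancel rather than accumulate, which reduces collision bias. Defaults to true.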

columnIndices int[]

The column indices to encode, or null for all columns.
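A minimal construction sketch, using only the constructor and the properties documented on this page. The argument values are arbitrary, and the snippet assumes the project references the AiDotNet assembly. Fitting and transforming are left out because the public methods for those steps are inherited and not listed here.

```csharp
using System;
using AiDotNet.Preprocessing.Encoders;

// Hash columns 0 and 2 into 16 output features each, without sign alternation.
var encoder = new HashingEncoder<double>(
    nComponents: 16,
    alternateSign: false,
    columnIndices: new[] { 0, 2 });

// Read back the configuration through the documented properties.
Console.WriteLine($"Components per encoded column: {encoder.NComponents}");
Console.WriteLine($"Alternate sign: {encoder.AlternateSign}");
Console.WriteLine($"Supports inverse transform: {encoder.SupportsInverseTransform}");
```

Passing `columnIndices: null` (the default) encodes every column instead of a selected subset.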

Properties

AlternateSign

Gets whether alternating signs are applied to hashed values to reduce collision bias.

public bool AlternateSign { get; }

Property Value

bool

NComponents

Gets the number of hash components (output features per encoded column).

public int NComponents { get; }

Property Value

int

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Methods

FitCore(Matrix<T>)

Computes the output feature structure.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

The training data matrix.

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

The names of the input features, or null if none are provided.

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported for hash encoding.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>
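Why the inverse cannot exist can be shown with a small standalone sketch. Once there are more distinct categories than hash components, at least two categories must share a bucket, so the encoded value no longer identifies which one was seen. The `CollisionSketch` class and its multiplicative hash are hypothetical and chosen only for illustration; they are not the hash AiDotNet uses.

```csharp
using System;
using System.Collections.Generic;

// Standalone illustration; hypothetical names, not AiDotNet code.
public static class CollisionSketch
{
    // Illustrative bucket function (assumption): multiplicative hash on integer category codes.
    public static int Bucket(int categoryCode, int nComponents)
    {
        unchecked
        {
            uint mixed = (uint)categoryCode * 2654435761u;
            return (int)(mixed % (uint)nComponents);
        }
    }

    // Returns the first pair of distinct codes in [0, count) that share a bucket, or null if none do.
    public static (int First, int Second)? FindCollision(int count, int nComponents)
    {
        var firstSeen = new Dictionary<int, int>(); // bucket -> first code that landed there
        for (int code = 0; code < count; code++)
        {
            int bucket = Bucket(code, nComponents);
            if (firstSeen.TryGetValue(bucket, out int earlier))
                return (earlier, code);
            firstSeen[bucket] = code;
        }
        return null;
    }
}
```

With 9 categories and 8 components, `CollisionSketch.FindCollision(9, 8)` always returns a pair, whatever hash function is used: both codes map to the same column, and nothing in the encoded output distinguishes them.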

TransformCore(Matrix<T>)

Transforms the data using feature hashing.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The data to transform.

Returns

Matrix<T>

The hash-encoded data.