Class TargetEncoder<T>

Namespace
AiDotNet.Preprocessing.Encoders
Assembly
AiDotNet.dll

Encodes categorical features using target mean encoding.

public class TargetEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
TargetEncoder<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

TargetEncoder replaces each category with the mean of the target variable for that category. This creates a continuous feature that captures the relationship between the category and target.

To prevent overfitting, especially with rare categories, smoothing is applied:

encoding = (count * category_mean + smoothing * global_mean) / (count + smoothing)
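The smoothing formula above can be tried out in isolation. The following is a minimal sketch, not the library's implementation: the class name SmoothingSketch and the method name SmoothedEncoding are hypothetical, and the sample counts and means are made up for illustration.

```csharp
using System;

public static class SmoothingSketch
{
    // Hypothetical helper: blends a category's mean with the global mean.
    // The more samples a category has, the closer the result stays to its own mean.
    public static double SmoothedEncoding(int count, double categoryMean, double globalMean, double smoothing)
    {
        return (count * categoryMean + smoothing * globalMean) / (count + smoothing);
    }

    public static void Main()
    {
        const double globalMean = 0.5;

        // Rare category: 2 samples with a category mean of 1.0.
        double rare = SmoothedEncoding(2, 1.0, globalMean, 1.0);

        // Frequent category: 200 samples with the same category mean.
        double frequent = SmoothedEncoding(200, 1.0, globalMean, 1.0);

        Console.WriteLine($"rare: {rare}, frequent: {frequent}");
    }
}
```

With smoothing = 1, the rare category is pulled noticeably toward the global mean (2.5 / 3, roughly 0.83), while the frequent category stays close to its own mean (200.5 / 201, roughly 0.998).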

For Beginners: Instead of one-hot encoding (many columns), target encoding creates a single column per feature containing the average target value for each category:

- Category "A" with average target 0.8 becomes 0.8
- Category "B" with average target 0.3 becomes 0.3

This is especially useful for high-cardinality features where one-hot would create too many columns.
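To make the idea concrete, here is a small self-contained sketch that computes the plain (unsmoothed) per-category target mean for a single column. It uses arrays instead of Matrix<T> and Vector<T>, and the names CategoryMeanSketch and CategoryMeans are hypothetical; it illustrates the concept and is not how the library is implemented.

```csharp
using System;
using System.Collections.Generic;

public static class CategoryMeanSketch
{
    // Hypothetical helper: maps each category code to the mean target value of its rows.
    public static Dictionary<double, double> CategoryMeans(double[] categories, double[] targets)
    {
        if (categories.Length != targets.Length)
            throw new ArgumentException("Target length must match the number of rows.");

        var sums = new Dictionary<double, double>();
        var counts = new Dictionary<double, int>();
        for (int i = 0; i < categories.Length; i++)
        {
            // TryGetValue leaves sum and count at 0 the first time a category is seen.
            sums.TryGetValue(categories[i], out double sum);
            counts.TryGetValue(categories[i], out int count);
            sums[categories[i]] = sum + targets[i];
            counts[categories[i]] = count + 1;
        }

        var means = new Dictionary<double, double>();
        foreach (var pair in sums)
            means[pair.Key] = pair.Value / counts[pair.Key];
        return means;
    }

    public static void Main()
    {
        // Category codes 0 and 1 stand in for "A" and "B".
        double[] categories = { 0, 0, 1, 1 };
        double[] targets = { 1.0, 0.6, 0.2, 0.4 };

        foreach (var pair in CategoryMeans(categories, targets))
            Console.WriteLine($"category {pair.Key} -> {pair.Value}");
    }
}
```

In this made-up data, category 0 averages about 0.8 and category 1 about 0.3, so every row of a category is replaced by that single number, no matter how many distinct categories the column has.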

Constructors

TargetEncoder(double, double, TargetEncoderHandleUnknown, int[]?)

Creates a new instance of TargetEncoder<T>.

public TargetEncoder(double smoothing = 1, double minSamplesLeaf = 1, TargetEncoderHandleUnknown handleUnknown = TargetEncoderHandleUnknown.UseGlobalMean, int[]? columnIndices = null)

Parameters

smoothing double

Smoothing parameter. Higher values give more weight to the global mean and less to the per-category mean. Defaults to 1.0.

minSamplesLeaf double

Minimum number of samples a category needs before its own mean is computed. Categories with fewer samples use the global mean. Defaults to 1.

handleUnknown TargetEncoderHandleUnknown

How to handle unknown categories during transform. Defaults to UseGlobalMean.

columnIndices int[]

The column indices to encode, or null for all columns.
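The fallback behavior described for minSamplesLeaf and handleUnknown can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: "below" is read as count < minSamplesLeaf, only the UseGlobalMean behavior is modeled, smoothing is left out for brevity, and the names FallbackSketch, FittedValue, and Encode are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class FallbackSketch
{
    // Hypothetical helper: value stored for a category at fit time.
    // Assumption: a category with count < minSamplesLeaf falls back to the global mean.
    public static double FittedValue(int count, double categoryMean, double globalMean, double minSamplesLeaf)
    {
        return count < minSamplesLeaf ? globalMean : categoryMean;
    }

    // Hypothetical helper: lookup at transform time.
    // A category that was never seen during fitting gets the global mean.
    public static double Encode(IReadOnlyDictionary<double, double> map, double category, double globalMean)
    {
        return map.TryGetValue(category, out double value) ? value : globalMean;
    }

    public static void Main()
    {
        const double globalMean = 0.5;
        var map = new Dictionary<double, double>
        {
            // 40 samples: enough to keep the category's own mean.
            [0.0] = FittedValue(40, 0.8, globalMean, 5),
            // 2 samples: too few, so the global mean is stored instead.
            [1.0] = FittedValue(2, 0.1, globalMean, 5),
        };

        Console.WriteLine(Encode(map, 0.0, globalMean)); // well-supported category
        Console.WriteLine(Encode(map, 1.0, globalMean)); // rare category
        Console.WriteLine(Encode(map, 7.0, globalMean)); // unseen category
    }
}
```

Under these assumptions both the rare category and the unseen category end up with the global mean, which keeps a handful of noisy rows from producing extreme encoded values.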

Properties

EncodingMaps

Gets the encoding maps for each column.

public Dictionary<int, Dictionary<double, double>>? EncodingMaps { get; }

Property Value

Dictionary<int, Dictionary<double, double>>

HandleUnknown

Gets how unknown categories are handled during transform.

public TargetEncoderHandleUnknown HandleUnknown { get; }

Property Value

TargetEncoderHandleUnknown

Smoothing

Gets the smoothing parameter used during encoding.

public double Smoothing { get; }

Property Value

double

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Methods

Fit(Matrix<T>, Vector<T>)

Fits the encoder by learning the target means for each category.

public void Fit(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

The feature matrix to fit.

target Vector<T>

The target values used to compute means.

Exceptions

ArgumentException

If target length doesn't match data rows.

FitCore(Matrix<T>)

Not supported. Target encoding requires target values, so the encoder cannot be fitted through the base Fit method. Use Fit(Matrix<T>, Vector<T>) instead.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

The feature matrix.

Exceptions

InvalidOperationException

Always thrown, because fitting requires target values. Use Fit(Matrix<T>, Vector<T>) instead.

FitTransform(Matrix<T>, Vector<T>)

Fits the encoder and transforms the data in one step.

public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

The feature matrix to fit and transform.

target Vector<T>

The target values used to compute means.

Returns

Matrix<T>

The encoded data.
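A typical call might look like the sketch below. It uses only the constructor and FitTransform signatures documented on this page. The AiDotNet.LinearAlgebra namespace for Matrix<T> and Vector<T> is an assumption, as are the class name TargetEncoderUsage, the method name EncodeTrainingData, and the parameter values; the code needs a reference to AiDotNet.dll to compile.

```csharp
using AiDotNet.LinearAlgebra;           // assumed namespace for Matrix<T> and Vector<T>
using AiDotNet.Preprocessing.Encoders;

public static class TargetEncoderUsage
{
    // Hypothetical helper: fits the encoder on the given rows and returns them encoded.
    public static Matrix<double> EncodeTrainingData(Matrix<double> features, Vector<double> target)
    {
        var encoder = new TargetEncoder<double>(
            smoothing: 10.0,                 // lean more heavily on the global mean
            minSamplesLeaf: 5,               // rarer categories use the global mean
            columnIndices: new[] { 0, 2 });  // illustrative: encode only columns 0 and 2

        // target must have one value per row of features; otherwise an
        // ArgumentException is thrown, as documented for Fit.
        return encoder.FitTransform(features, target);
    }
}
```

Because the encoding is derived from the target, the encoder is normally fitted on training rows only, and the same fitted instance is then reused for validation or test rows so that their targets do not leak into the features.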

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

The input feature names, or null if none are provided.

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported for target encoding.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

TransformCore(Matrix<T>)

Transforms the data by replacing categories with their target means.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The data to transform.

Returns

Matrix<T>

The encoded data.