Class LeaveOneOutEncoder<T>

Namespace
AiDotNet.Preprocessing.Encoders
Assembly
AiDotNet.dll

Encodes categorical features using leave-one-out target encoding.

public class LeaveOneOutEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
LeaveOneOutEncoder<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

LeaveOneOutEncoder is similar to TargetEncoder but uses leave-one-out statistics to reduce overfitting. For each training sample, the encoding is computed from the targets of all other samples in the same category, excluding the current sample.

This reduces the risk of target leakage during training while still capturing the relationship between categories and the target variable.

For Beginners: Regular target encoding can overfit because it uses the same data to encode and train. Leave-one-out encoding prevents this:

  • When encoding row 1, it uses the average of all OTHER rows with the same category
  • This prevents the model from "cheating" by memorizing individual samples

Example: If "Category A" has 3 samples with targets [1, 0, 1], then, before any smoothing is applied:

  • Row 1 gets encoded as average of [0, 1] = 0.5
  • Row 2 gets encoded as average of [1, 1] = 1.0
  • Row 3 gets encoded as average of [1, 0] = 0.5
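
The arithmetic in the example above can be reproduced with a short, library-independent sketch. It is not the encoder's actual implementation: the class and method names (LeaveOneOutDemo, LeaveOneOutMeans) are hypothetical, smoothing is ignored, and rejecting a category with fewer than two samples is an assumption made here for simplicity, since this page does not say how the encoder handles that case.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class LeaveOneOutDemo
{
    // Leave-one-out mean for each row of a single category:
    // (sum of all targets - this row's target) / (count - 1).
    // Hypothetical helper for illustration; smoothing is not applied.
    public static double[] LeaveOneOutMeans(double[] targets)
    {
        if (targets.Length < 2)
            throw new ArgumentException(
                "At least two samples are needed to leave one out.", nameof(targets));

        double sum = targets.Sum();
        return targets.Select(y => (sum - y) / (targets.Length - 1)).ToArray();
    }

    public static void Main()
    {
        // "Category A" with targets [1, 0, 1], as in the example above.
        double[] encoded = LeaveOneOutMeans(new[] { 1.0, 0.0, 1.0 });

        // Format with the invariant culture so the decimal separator is always '.'.
        Console.WriteLine(string.Join(", ",
            encoded.Select(v => v.ToString(CultureInfo.InvariantCulture))));
        // prints "0.5, 1, 0.5"
    }
}
```

Subtracting each row's own target from the category total avoids recomputing a separate average per row, so the whole category is encoded in a single pass.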

Constructors

LeaveOneOutEncoder(double, LeaveOneOutHandleUnknown, int[]?)

Creates a new instance of LeaveOneOutEncoder<T>.

public LeaveOneOutEncoder(double smoothing = 1, LeaveOneOutHandleUnknown handleUnknown = LeaveOneOutHandleUnknown.UseGlobalMean, int[]? columnIndices = null)

Parameters

smoothing double

Smoothing parameter for regularization. Defaults to 1.0.

handleUnknown LeaveOneOutHandleUnknown

How to handle unknown categories. Defaults to UseGlobalMean.

columnIndices int[]

The column indices to encode, or null for all columns.

Properties

GlobalMean

Gets the global target mean.

public double GlobalMean { get; }

Property Value

double

HandleUnknown

Gets how unknown categories are handled.

public LeaveOneOutHandleUnknown HandleUnknown { get; }

Property Value

LeaveOneOutHandleUnknown

Smoothing

Gets the smoothing parameter.

public double Smoothing { get; }

Property Value

double

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Methods

Fit(Matrix<T>, Vector<T>)

Fits the encoder by computing category statistics.

public void Fit(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

The feature matrix to fit.

target Vector<T>

The target values.

FitCore(Matrix<T>)

Fits the encoder. Fitting requires target values, so use the Fit(Matrix<T>, Vector<T>) overload instead.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

FitTransform(Matrix<T>, Vector<T>)

Fits and transforms using leave-one-out encoding.

public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

The feature matrix.

target Vector<T>

The target values.

Returns

Matrix<T>

The encoded data with leave-one-out statistics.
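
A typical workflow might look like the sketch below, which encodes training rows with FitTransform and then encodes unseen rows without a target. Only the constructor and FitTransform signatures are taken from this page. The namespace of Matrix<T> and Vector<T>, the namespace of LeaveOneOutHandleUnknown, and the existence of a public Transform(Matrix<T>) method inherited from TransformerBase are assumptions and may differ in your AiDotNet version. The wrapper class LeaveOneOutWorkflow is hypothetical.

```csharp
using AiDotNet.Preprocessing.Encoders;
// Assumption: Matrix<T> and Vector<T> are defined in this namespace.
using AiDotNet.LinearAlgebra;

public static class LeaveOneOutWorkflow
{
    // Hypothetical helper: encodes column 0 of the training and test matrices.
    public static (Matrix<double> Train, Matrix<double> Test) Encode(
        Matrix<double> trainFeatures,
        Vector<double> trainTarget,
        Matrix<double> testFeatures)
    {
        var encoder = new LeaveOneOutEncoder<double>(
            smoothing: 1.0,
            handleUnknown: LeaveOneOutHandleUnknown.UseGlobalMean,
            columnIndices: new[] { 0 });

        // Training rows: each row is encoded from the other rows in its category.
        Matrix<double> encodedTrain = encoder.FitTransform(trainFeatures, trainTarget);

        // Test rows: no target is available.
        // Assumption: TransformerBase exposes a public Transform that calls TransformCore.
        Matrix<double> encodedTest = encoder.Transform(testFeatures);

        return (encodedTrain, encodedTest);
    }
}
```

Keeping the two calls separate matters: leave-one-out statistics only make sense for rows whose targets were seen during fitting, so test data goes through the target-free path.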

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

TransformCore(Matrix<T>)

Transforms the data using standard target encoding rather than leave-one-out statistics. Intended for test or inference data, where no target values are available.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The data to transform.

Returns

Matrix<T>

The encoded data.
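
To illustrate what standard target encoding means at inference time, here is a minimal, library-independent sketch: each category is replaced by the mean target observed for it during fitting, and a category never seen in training falls back to the global mean, mirroring the UseGlobalMean option. The names InferenceEncodingSketch, FitCategoryMeans, and Encode are hypothetical, and smoothing is deliberately left out because this page does not state the formula the encoder uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InferenceEncodingSketch
{
    // Builds category -> mean(target) from training data (no smoothing applied).
    public static Dictionary<double, double> FitCategoryMeans(
        double[] categories, double[] targets)
    {
        if (categories.Length != targets.Length)
            throw new ArgumentException("categories and targets must have the same length.");

        return categories
            .Zip(targets, (c, y) => (Category: c, Target: y))
            .GroupBy(p => p.Category)
            .ToDictionary(g => g.Key, g => g.Average(p => p.Target));
    }

    // Encodes unseen rows; unknown categories fall back to the global mean.
    public static double[] Encode(
        double[] categories, Dictionary<double, double> means, double globalMean)
    {
        return categories
            .Select(c => means.TryGetValue(c, out double m) ? m : globalMean)
            .ToArray();
    }

    public static void Main()
    {
        double[] trainCategories = { 0, 0, 0, 1, 1 };
        double[] trainTargets = { 1, 0, 1, 0, 0 };

        var means = FitCategoryMeans(trainCategories, trainTargets);
        double globalMean = trainTargets.Average();

        // Category 2 was never seen in training, so it receives the global mean.
        double[] encoded = Encode(new double[] { 0, 1, 2 }, means, globalMean);
        Console.WriteLine(string.Join(" | ", encoded));
    }
}
```

Unlike the leave-one-out path, nothing is excluded here: a test row has no target of its own that could leak into its encoding.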

TransformWithTarget(Matrix<T>, Vector<T>)

Transforms training data using leave-one-out encoding. Requires the target values for the rows being transformed.

public Matrix<T> TransformWithTarget(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

The data to transform.

target Vector<T>

The target values (for leave-one-out calculation).

Returns

Matrix<T>

The encoded data.