Class JamesSteinEncoder<T>

Namespace
AiDotNet.Preprocessing.Encoders
Assembly
AiDotNet.dll

Encodes categorical features using James-Stein shrinkage estimation.

public class JamesSteinEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
JamesSteinEncoder<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

JamesSteinEncoder uses Bayesian shrinkage to blend each category's target mean with the global target mean. Categories with more samples keep encodings closer to their own mean, while rare categories are shrunk toward the global mean.

The shrinkage formula is:

encoded = (1 - B) * category_mean + B * global_mean

where B is the shrinkage factor, based on sample size and variance. A B near 0 keeps the category mean; a B near 1 returns the global mean.

For Beginners: This encoder balances between:

- Trusting category-specific averages (when we have lots of data)
- Falling back to the overall average (when category data is sparse)

The balance is determined automatically using statistical theory.
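
The blend described above can be sketched in a few lines of standalone C#. This is an illustration only, not the library's code: the class name ShrinkageSketch, both helper methods, and in particular the way B is computed here (the category's sampling variance relative to the global variance) are assumptions. The page does not state which James-Stein variant AiDotNet uses.

```csharp
using System;

public static class ShrinkageSketch
{
    // Blend from the formula above: (1 - B) * category_mean + B * global_mean.
    public static double Encode(double categoryMean, double globalMean, double b)
        => (1.0 - b) * categoryMean + b * globalMean;

    // ASSUMPTION: one common way to pick B, not confirmed for AiDotNet.
    // B = (categoryVariance / n) / (categoryVariance / n + globalVariance),
    // clamped to [0, 1]. A larger n gives a smaller B, so less shrinkage.
    public static double ShrinkageFactor(double categoryVariance, int n, double globalVariance)
    {
        double samplingVariance = categoryVariance / n;
        double denominator = samplingVariance + globalVariance;
        if (denominator <= 0.0)
        {
            return 1.0; // no variance information: fall back to the global mean
        }
        return Math.Clamp(samplingVariance / denominator, 0.0, 1.0);
    }

    public static void Main()
    {
        const double globalMean = 0.5;
        const double globalVariance = 0.25;
        const double categoryMean = 0.9;
        const double categoryVariance = 0.25;

        // Compare a rare category (n = 2) with a frequent one (n = 200).
        foreach (int n in new[] { 2, 200 })
        {
            double b = ShrinkageFactor(categoryVariance, n, globalVariance);
            double encoded = Encode(categoryMean, globalMean, b);
            Console.WriteLine($"n={n}: B={b:F3}, encoded={encoded:F3}");
        }
    }
}
```

With these inputs the rare category is pulled noticeably toward the global mean, while the frequent category stays close to its own mean of 0.9.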

Constructors

JamesSteinEncoder(JamesSteinHandleUnknown, int[]?)

Creates a new instance of JamesSteinEncoder<T>.

public JamesSteinEncoder(JamesSteinHandleUnknown handleUnknown = JamesSteinHandleUnknown.UseGlobalMean, int[]? columnIndices = null)

Parameters

handleUnknown JamesSteinHandleUnknown

How to handle unknown categories. Defaults to UseGlobalMean.

columnIndices int[]

The column indices to encode, or null for all columns.

Properties

GlobalMean

Gets the global target mean.

public double GlobalMean { get; }

Property Value

double

HandleUnknown

Gets how unknown categories are handled.

public JamesSteinHandleUnknown HandleUnknown { get; }

Property Value

JamesSteinHandleUnknown

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Methods

Fit(Matrix<T>, Vector<T>)

Fits the encoder by computing James-Stein shrinkage estimates.

public void Fit(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

The feature matrix to fit.

target Vector<T>

The target values.

FitCore(Matrix<T>)

Fits the encoder. This encoder needs target values, so fit it through the specialized Fit(Matrix<T>, Vector<T>) method.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

FitTransform(Matrix<T>, Vector<T>)

Fits and transforms the data.

public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>

target Vector<T>

Returns

Matrix<T>
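
A minimal usage sketch, assuming the caller already has a feature matrix and a target vector. It uses only members listed on this page (the constructor, FitTransform, and GlobalMean). The wrapper class JamesSteinUsageSketch and its method are hypothetical, and the namespace that provides Matrix<T> and Vector<T> is an assumption that may differ between AiDotNet versions.

```csharp
using System;
using AiDotNet.Preprocessing.Encoders;
// ASSUMPTION: Matrix<T> and Vector<T> come from this namespace; adjust to your AiDotNet version.
using AiDotNet.LinearAlgebra;

public static class JamesSteinUsageSketch
{
    // Hypothetical helper: encodes column 0 of the training features.
    public static Matrix<double> EncodeTrainingData(Matrix<double> features, Vector<double> target)
    {
        var encoder = new JamesSteinEncoder<double>(
            handleUnknown: JamesSteinHandleUnknown.UseGlobalMean,
            columnIndices: new[] { 0 }); // pass null to encode all columns

        // Fit on the training data and return the encoded matrix in one call.
        Matrix<double> encoded = encoder.FitTransform(features, target);

        // GlobalMean is available once the encoder has been fitted.
        Console.WriteLine($"Global target mean: {encoder.GlobalMean}");
        return encoded;
    }
}
```

Because the encodings are derived from the target, it is usually safer to fit on training data only and reuse the fitted encoder for validation and test data.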

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

TransformCore(Matrix<T>)

Transforms the data using fitted encodings.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>