Class LeaveOneOutEncoder<T>
- Namespace
- AiDotNet.Preprocessing.Encoders
- Assembly
- AiDotNet.dll
Encodes categorical features using leave-one-out target encoding.
public class LeaveOneOutEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
- T
The numeric type for calculations (e.g., float, double).
- Inheritance
- TransformerBase<T, Matrix<T>, Matrix<T>>
LeaveOneOutEncoder<T>
- Implements
- IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
LeaveOneOutEncoder is similar to TargetEncoder but uses leave-one-out statistics to prevent overfitting. For each sample, the encoding is computed using all other samples in the same category, excluding the current sample.
This reduces the risk of target leakage during training while still capturing the relationship between categories and the target variable.
For Beginners: Regular target encoding can overfit because it uses the same data to encode and train. Leave-one-out encoding prevents this:
- When encoding row 1, it uses the average of all OTHER rows with the same category
- This prevents the model from "cheating" by memorizing individual samples
Example: If "Category A" has 3 samples with targets [1, 0, 1]:
- Row 1 gets encoded as average of [0, 1] = 0.5
- Row 2 gets encoded as average of [1, 1] = 1.0
- Row 3 gets encoded as average of [1, 0] = 0.5
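The arithmetic in the example above can be reproduced with a short, self-contained sketch. It uses plain arrays rather than the library's Matrix<T> and Vector<T> types, ignores the smoothing parameter, and the helper name LeaveOneOutMeans is hypothetical (it is not part of AiDotNet).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet: encodes each row as the mean
// target of the OTHER rows in the same category. Smoothing is ignored.
static double[] LeaveOneOutMeans(string[] categories, double[] targets)
{
    // First pass: accumulate the target sum and row count per category.
    var stats = new Dictionary<string, (double Sum, int Count)>();
    for (int i = 0; i < categories.Length; i++)
    {
        stats.TryGetValue(categories[i], out var current);
        stats[categories[i]] = (current.Sum + targets[i], current.Count + 1);
    }

    // Second pass: remove the current row's target before averaging.
    var result = new double[categories.Length];
    for (int i = 0; i < categories.Length; i++)
    {
        var (total, rows) = stats[categories[i]];
        // A category with a single row has no "other" rows to average.
        result[i] = rows > 1 ? (total - targets[i]) / (rows - 1) : double.NaN;
    }
    return result;
}

// "Category A" with targets [1, 0, 1], as in the example above.
double[] encoded = LeaveOneOutMeans(new[] { "A", "A", "A" }, new[] { 1.0, 0.0, 1.0 });
Console.WriteLine(string.Join(" | ", encoded));
```

The three encoded values are 0.5, 1.0 and 0.5, matching rows 1 to 3 above. Returning NaN for single-row categories is a choice made for this sketch only; the library's behavior in that case is not described here.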
Constructors
LeaveOneOutEncoder(double, LeaveOneOutHandleUnknown, int[]?)
Creates a new instance of LeaveOneOutEncoder<T>.
public LeaveOneOutEncoder(double smoothing = 1, LeaveOneOutHandleUnknown handleUnknown = LeaveOneOutHandleUnknown.UseGlobalMean, int[]? columnIndices = null)
Parameters
- smoothing (double)
Smoothing parameter for regularization. Defaults to 1.0.
- handleUnknown (LeaveOneOutHandleUnknown)
How to handle unknown categories. Defaults to UseGlobalMean.
- columnIndices (int[]?)
The column indices to encode, or null for all columns.
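This page does not state how the smoothing parameter enters the calculation. As an illustration only, the sketch below uses one common additive form, which blends the leave-one-out statistics with the global mean: (sum of other targets + smoothing × global mean) / (number of other rows + smoothing). The formula, the helper name SmoothedLeaveOneOut, and its parameters are all assumptions and may differ from what the library actually computes.

```csharp
using System;

// Hypothetical helper, not taken from AiDotNet. ASSUMED formula:
//   (otherSum + smoothing * globalMean) / (otherCount + smoothing)
static double SmoothedLeaveOneOut(
    double categorySum, int categoryCount, double rowTarget,
    double meanOfAllTargets, double smoothing)
{
    double otherSum = categorySum - rowTarget;   // targets of the other rows
    int otherCount = categoryCount - 1;          // number of other rows
    double denominator = otherCount + smoothing;

    // No other rows and no smoothing: nothing to average, so fall back
    // to the global mean (a choice made for this sketch).
    if (denominator == 0) return meanOfAllTargets;

    return (otherSum + smoothing * meanOfAllTargets) / denominator;
}

// Targets [1, 0, 1] in one category, so the global mean is 2/3.
double overallMean = 2.0 / 3.0;

// Row 1 (target 1): other rows sum to 1 over 2 rows.
double smoothedRow1 = SmoothedLeaveOneOut(2.0, 3, 1.0, overallMean, 1.0);

// A category with a single row: the result collapses to the global mean.
double singleton = SmoothedLeaveOneOut(1.0, 1, 1.0, overallMean, 1.0);

Console.WriteLine($"{smoothedRow1} {singleton}");
```

Under this assumed form, larger smoothing values pull small categories toward the global mean, and a single-row category never divides by zero as long as smoothing is positive.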
Properties
GlobalMean
Gets the global target mean.
public double GlobalMean { get; }
Property Value
- double
HandleUnknown
Gets how unknown categories are handled.
public LeaveOneOutHandleUnknown HandleUnknown { get; }
Property Value
- LeaveOneOutHandleUnknown
Smoothing
Gets the smoothing parameter.
public double Smoothing { get; }
Property Value
- double
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Methods
Fit(Matrix<T>, Vector<T>)
Fits the encoder by computing category statistics.
public void Fit(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>)
The feature matrix to fit.
- target (Vector<T>)
The target values.
FitCore(Matrix<T>)
Target-free fit hook inherited from the base class. Leave-one-out encoding needs target values, so fit the encoder with Fit(Matrix<T>, Vector<T>) instead.
protected override void FitCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
FitTransform(Matrix<T>, Vector<T>)
Fits and transforms using leave-one-out encoding.
public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>)
The feature matrix.
- target (Vector<T>)
The target values.
Returns
- Matrix<T>
The encoded data with leave-one-out statistics.
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
- inputFeatureNames (string[]?)
Returns
- string[]
InverseTransformCore(Matrix<T>)
Inverse transformation is not supported.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
Returns
- Matrix<T>
TransformCore(Matrix<T>)
Transforms the data using standard target encoding (for test/inference data).
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
The data to transform.
Returns
- Matrix<T>
The encoded data.
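At inference time there are no targets to leave out, so each category maps to a statistic learned from the training data. The sketch below illustrates that idea with plain arrays and a dictionary, falling back to the global mean for categories that were not seen during fitting (the documented UseGlobalMean default). The helper names FitCategoryMeans and EncodeForInference are hypothetical, and smoothing is omitted, so this is not the library's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: per-category mean target learned from training data.
static Dictionary<string, double> FitCategoryMeans(string[] categories, double[] targets) =>
    categories
        .Zip(targets, (category, target) => (category, target))
        .GroupBy(pair => pair.category)
        .ToDictionary(group => group.Key, group => group.Average(pair => pair.target));

// Hypothetical helper: look up the learned mean, or use the fallback
// (here the global mean) when the category was never seen in training.
static double EncodeForInference(
    string category, Dictionary<string, double> learnedMeans, double fallbackMean) =>
    learnedMeans.TryGetValue(category, out var mean) ? mean : fallbackMean;

string[] trainCategories = { "A", "A", "A", "B" };
double[] trainTargets = { 1.0, 0.0, 1.0, 0.0 };

var categoryMeans = FitCategoryMeans(trainCategories, trainTargets);
double trainGlobalMean = trainTargets.Average();

double knownA = EncodeForInference("A", categoryMeans, trainGlobalMean);   // mean of [1, 0, 1]
double unknownC = EncodeForInference("C", categoryMeans, trainGlobalMean); // falls back to global mean

Console.WriteLine($"{knownA} {unknownC}");
```

Unlike the leave-one-out path, every row of a known category receives the same value here, which is why this form suits test data but risks leakage if applied to the rows used for fitting.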
TransformWithTarget(Matrix<T>, Vector<T>)
Transforms the data using leave-one-out encoding (requires target for training data).
public Matrix<T> TransformWithTarget(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>)
The data to transform.
- target (Vector<T>)
The target values (for leave-one-out calculation).
Returns
- Matrix<T>
The encoded data.