Class MEstimateEncoder<T>
- Namespace
- AiDotNet.Preprocessing.Encoders
- Assembly
- AiDotNet.dll
Encodes categorical features using M-estimate regularization.
public class MEstimateEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
- T: The numeric type for calculations (e.g., float, double).
- Inheritance
- TransformerBase<T, Matrix<T>, Matrix<T>>
- MEstimateEncoder<T>
- Implements
- IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
MEstimateEncoder applies M-estimate smoothing to target encoding, which adds a regularization parameter 'm' that controls shrinkage toward the global mean.
The formula:
encoded = (n * category_mean + m * global_mean) / (n + m)
where n is the number of samples in the category and m is the smoothing parameter.
For Beginners: M-estimate is like adding m fake samples:
- Each fake sample has the global mean as its target
- Categories with few samples get pulled toward the global mean
- Categories with many samples stay close to their actual mean
- Higher m = more smoothing toward the global mean
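To make the shrinkage concrete, the formula can be evaluated by hand for a rare and a frequent category. The snippet below is a minimal standalone sketch: the MEstimate helper and all sample numbers are hypothetical and are not part of AiDotNet.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): evaluates the formula from the remarks.
static double MEstimate(int n, double categoryMean, double globalMean, double m)
{
    return (n * categoryMean + m * globalMean) / (n + m);
}

double overallMean = 0.5;

// Rare category: 2 samples with mean 1.0 are pulled noticeably toward 0.5.
double rare = MEstimate(2, 1.0, overallMean, 1.0);        // (2*1.0 + 1*0.5) / 3, about 0.8333

// Frequent category: 200 samples with mean 1.0 barely move.
double frequent = MEstimate(200, 1.0, overallMean, 1.0);  // (200*1.0 + 1*0.5) / 201, about 0.9975

// Same rare category with m = 10: much stronger smoothing.
double rareHeavy = MEstimate(2, 1.0, overallMean, 10.0);  // (2*1.0 + 10*0.5) / 12, about 0.5833

Console.WriteLine($"{rare} {frequent} {rareHeavy}");
```

With m = 1 the rare category moves about a third of the way toward the global mean, while the frequent category is almost unchanged. Raising m to 10 moves the rare category most of the way.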
Constructors
MEstimateEncoder(double, MEstimateHandleUnknown, int[]?)
Creates a new instance of MEstimateEncoder<T>.
public MEstimateEncoder(double m = 1, MEstimateHandleUnknown handleUnknown = MEstimateHandleUnknown.UseGlobalMean, int[]? columnIndices = null)
Parameters
- m (double): Smoothing parameter (number of virtual samples). Defaults to 1.0.
- handleUnknown (MEstimateHandleUnknown): How to handle unknown categories. Defaults to UseGlobalMean.
- columnIndices (int[]): The column indices to encode, or null for all columns.
Properties
GlobalMean
Gets the global target mean.
public double GlobalMean { get; }
Property Value
- double
HandleUnknown
Gets how unknown categories are handled.
public MEstimateHandleUnknown HandleUnknown { get; }
Property Value
- MEstimateHandleUnknown
M
Gets the smoothing parameter m.
public double M { get; }
Property Value
- double
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Methods
Fit(Matrix<T>, Vector<T>)
Fits the encoder by computing M-estimate encodings.
public void Fit(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>): The feature matrix to fit.
- target (Vector<T>): The target values.
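As an illustration of what fitting computes, the sketch below derives per-category encodings for a single column from plain arrays. It is a simplified stand-in under stated assumptions, not the library's implementation: FitColumn is a hypothetical helper, categories are assumed to be stored as numeric codes, and the data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an AiDotNet API): learns one column's encodings.
static (Dictionary<double, double> Encodings, double GlobalMean) FitColumn(
    double[] categories, double[] target, double m)
{
    // Global target mean: the value every category is shrunk toward.
    double globalMean = target.Average();
    var encodings = new Dictionary<double, double>();

    // Pair each category code with its target value, then group by category.
    var groups = categories
        .Zip(target, (c, t) => (Category: c, Target: t))
        .GroupBy(p => p.Category);

    foreach (var g in groups)
    {
        int n = g.Count();
        double categoryMean = g.Average(p => p.Target);
        // M-estimate: blend the category mean with m virtual samples at the global mean.
        encodings[g.Key] = (n * categoryMean + m * globalMean) / (n + m);
    }
    return (encodings, globalMean);
}

double[] column = { 0, 0, 0, 1 };   // category codes for four rows
double[] labels = { 1, 1, 0, 0 };   // target values for the same rows
var fitted = FitColumn(column, labels, 1.0);

// Category 0: n = 3, mean = 2/3, encoded = (2 + 0.5) / 4 = 0.625
// Category 1: n = 1, mean = 0,   encoded = (0 + 0.5) / 2 = 0.25
Console.WriteLine($"{fitted.Encodings[0.0]} {fitted.Encodings[1.0]} {fitted.GlobalMean}");
```

The single-sample category ends up halfway between its own mean and the global mean, which is the behavior m = 1 is meant to produce.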
FitCore(Matrix<T>)
Base-class fitting hook. This encoder needs target values, so fit it with Fit(Matrix<T>, Vector<T>) instead.
protected override void FitCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
FitTransform(Matrix<T>, Vector<T>)
Fits and transforms the data.
public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>)
- target (Vector<T>)
Returns
- Matrix<T>
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
- inputFeatureNames (string[])
Returns
- string[]
InverseTransformCore(Matrix<T>)
Inverse transformation is not supported.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
Returns
- Matrix<T>
TransformCore(Matrix<T>)
Transforms the data using fitted encodings.
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
Returns
- Matrix<T>
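The lookup step can be pictured as replacing each category with its learned value and falling back to the global mean for categories that were not seen during fitting, which is what the UseGlobalMean default describes. The snippet below is a hedged sketch of that idea only: TransformColumn, the numeric category codes, and the learned values are assumptions for illustration and do not reflect the library's internals.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an AiDotNet API): encodes one column with learned values.
static double[] TransformColumn(
    double[] categories,
    IReadOnlyDictionary<double, double> encodings,
    double globalMean)
{
    var result = new double[categories.Length];
    for (int i = 0; i < categories.Length; i++)
    {
        // Known category: use its fitted encoding.
        // Unknown category: fall back to the global mean (assumed UseGlobalMean behavior).
        result[i] = encodings.TryGetValue(categories[i], out double value)
            ? value
            : globalMean;
    }
    return result;
}

// Made-up encodings for two known categories; the global mean is 0.5.
var learned = new Dictionary<double, double> { [0.0] = 0.625, [1.0] = 0.25 };

// Category 7 was never seen during fitting, so it maps to 0.5.
double[] encoded = TransformColumn(new double[] { 0, 1, 7 }, learned, 0.5);
Console.WriteLine(string.Join(" ", encoded));
```

Because many categories can share one encoded value (every unseen category maps to the same number here), the original category cannot be recovered from the output in general, which is consistent with inverse transformation being unsupported.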