Class JamesSteinEncoder<T>
- Namespace: AiDotNet.Preprocessing.Encoders
- Assembly: AiDotNet.dll
Encodes categorical features using James-Stein shrinkage estimation.
public class JamesSteinEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
T
The numeric type for calculations (e.g., float, double).
- Inheritance: TransformerBase<T, Matrix<T>, Matrix<T>> → JamesSteinEncoder<T>
- Implements: IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
JamesSteinEncoder uses Bayesian shrinkage to blend each category's target mean with the global target mean. Categories with many samples are encoded close to their own mean, while rare categories shrink toward the global mean.
The shrinkage formula is encoded = (1 - B) * category_mean + B * global_mean, where B is the shrinkage factor, determined by sample size and variance. A B near 0 keeps the category mean; a B near 1 yields the global mean.
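As a worked instance of the formula, take illustrative numbers (assumptions for this example, not values from the library): a category mean of 0.80, a global mean of 0.50, and two different shrinkage factors. A small B leaves the encoding near the category mean, while a large B pulls it almost all the way to the global mean.

```latex
\begin{align*}
\text{encoded} &= (1 - B)\,\bar{y}_{\text{category}} + B\,\bar{y}_{\text{global}} \\[4pt]
B = 0.25:\quad \text{encoded} &= (0.75)(0.80) + (0.25)(0.50) = 0.600 + 0.125 = 0.725 \\
B = 0.90:\quad \text{encoded} &= (0.10)(0.80) + (0.90)(0.50) = 0.080 + 0.450 = 0.530
\end{align*}
```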
For Beginners: This encoder balances between:
- Trusting category-specific averages (when we have lots of data)
- Falling back to the overall average (when category data is sparse)
The balance is determined automatically using statistical theory.
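The blending described above can be sketched in a few lines of self-contained C#. This is an illustration only, not the AiDotNet implementation: the class `JamesSteinSketch`, its methods, and in particular the choice of shrinkage factor (the variance of the category mean divided by that variance plus the global variance, clamped to [0, 1]) are assumptions made for the example. The library's actual factor may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class JamesSteinSketch
{
    // Sample variance around a given mean; 0 when fewer than two values.
    private static double Variance(IReadOnlyList<double> values, double mean) =>
        values.Count < 2
            ? 0.0
            : values.Sum(v => (v - mean) * (v - mean)) / (values.Count - 1);

    // Learns one shrunken encoding per category and returns the global mean alongside.
    public static (Dictionary<string, double> Encodings, double GlobalMean) Fit(
        IReadOnlyList<string> categories, IReadOnlyList<double> target)
    {
        if (categories.Count != target.Count || target.Count == 0)
            throw new ArgumentException("Inputs must be non-empty and of equal length.");

        double globalMean = target.Average();
        double globalVar = Variance(target, globalMean);

        var encodings = new Dictionary<string, double>();
        var groups = categories.Zip(target, (c, y) => (c, y)).GroupBy(p => p.c, p => p.y);
        foreach (var group in groups)
        {
            List<double> ys = group.ToList();
            double categoryMean = ys.Average();

            // ASSUMED shrinkage factor: uncertainty of the category mean
            // relative to that uncertainty plus the global variance.
            double meanVar = Variance(ys, categoryMean) / ys.Count;
            double denom = meanVar + globalVar;

            // A single sample (or no variance at all) gives no basis for
            // trusting the category mean, so shrink fully to the global mean.
            double b = ys.Count < 2 || denom == 0.0
                ? 1.0
                : Math.Clamp(meanVar / denom, 0.0, 1.0);

            encodings[group.Key] = (1.0 - b) * categoryMean + b * globalMean;
        }
        return (encodings, globalMean);
    }

    // Unseen categories fall back to the global mean, mirroring the
    // UseGlobalMean behaviour named in the constructor documentation.
    public static double Encode(
        Dictionary<string, double> encodings, double globalMean, string category) =>
        encodings.TryGetValue(category, out double value) ? value : globalMean;
}
```

In this sketch a category observed only once is encoded as the global mean, and every encoding lies between the category mean and the global mean because the factor is clamped to [0, 1].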
Constructors
JamesSteinEncoder(JamesSteinHandleUnknown, int[]?)
Creates a new instance of JamesSteinEncoder<T>.
public JamesSteinEncoder(JamesSteinHandleUnknown handleUnknown = JamesSteinHandleUnknown.UseGlobalMean, int[]? columnIndices = null)
Parameters
handleUnknown JamesSteinHandleUnknown
How to handle unknown categories. Defaults to UseGlobalMean.
columnIndices int[]
The column indices to encode, or null for all columns.
Properties
GlobalMean
Gets the global target mean.
public double GlobalMean { get; }
Property Value
- double
HandleUnknown
Gets how unknown categories are handled.
public JamesSteinHandleUnknown HandleUnknown { get; }
Property Value
- JamesSteinHandleUnknown
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Methods
Fit(Matrix<T>, Vector<T>)
Fits the encoder by computing James-Stein shrinkage estimates.
public void Fit(Matrix<T> data, Vector<T> target)
Parameters
data Matrix<T>
The feature matrix to fit.
target Vector<T>
The target values.
FitCore(Matrix<T>)
Base fitting hook that takes no target. This encoder requires target values, so fit it with Fit(Matrix<T>, Vector<T>) instead.
protected override void FitCore(Matrix<T> data)
Parameters
data Matrix<T>
FitTransform(Matrix<T>, Vector<T>)
Fits and transforms the data.
public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)
Parameters
data Matrix<T>
target Vector<T>
Returns
- Matrix<T>
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
inputFeatureNames string[]
Returns
- string[]
InverseTransformCore(Matrix<T>)
Inverse transformation is not supported.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
data Matrix<T>
Returns
- Matrix<T>
TransformCore(Matrix<T>)
Transforms the data using fitted encodings.
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
data Matrix<T>
Returns
- Matrix<T>