Class WOEEncoder<T>

Namespace: AiDotNet.Preprocessing.Encoders

Assembly: AiDotNet.dll

Encodes categorical features using Weight of Evidence (WOE).

public class WOEEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T: The numeric type for calculations (e.g., float, double).

Inheritance: object

TransformerBase<T, Matrix<T>, Matrix<T>>

WOEEncoder<T>

Implements: IDataTransformer<T, Matrix<T>, Matrix<T>>

Inherited Members: TransformerBase<T, Matrix<T>, Matrix<T>>.NumOps

TransformerBase<T, Matrix<T>, Matrix<T>>.Engine

TransformerBase<T, Matrix<T>, Matrix<T>>.IsFitted

TransformerBase<T, Matrix<T>, Matrix<T>>.ColumnIndices

TransformerBase<T, Matrix<T>, Matrix<T>>.SupportsInverseTransform

TransformerBase<T, Matrix<T>, Matrix<T>>.Fit(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.Transform(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.FitTransform(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.InverseTransform(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.GetFeatureNamesOut(string[])

TransformerBase<T, Matrix<T>, Matrix<T>>.FitCore(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.TransformCore(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.InverseTransformCore(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.ValidateInputData(Matrix<T>)

TransformerBase<T, Matrix<T>, Matrix<T>>.EnsureFitted()

TransformerBase<T, Matrix<T>, Matrix<T>>.GetColumnsToProcess(int)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

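Weight of Evidence is commonly used in credit scoring and binary classification. It measures the strength of the relationship between a category and the binary target:

WOE = ln(Distribution of Events / Distribution of Non-Events)

Here, Distribution of Events is the share of all positive-class rows that fall in the category, and Distribution of Non-Events is the share of all negative-class rows that fall in it.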

Higher WOE values indicate categories more associated with the positive class, while lower (negative) values indicate association with the negative class.

For Beginners: WOE tells you how "good" or "bad" a category is for prediction:

- WOE > 0: Category is more likely to have positive outcomes
- WOE < 0: Category is more likely to have negative outcomes
- WOE ≈ 0: Category has no predictive power

Example in loan default prediction, where the positive class (event) is a default:

"Employed" might have WOE = -0.5 (less likely to default)
"Unemployed" might have WOE = +0.8 (more likely to default)
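
The formula above can be worked through by hand. The following is a minimal C# sketch, not the library's implementation: `WoeSketch` and `Woe` are hypothetical names, and adding the smoothing constant directly to the per-category counts is only an assumption about how a regularization value might be applied.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class WoeSketch
{
    // events / nonEvents: rows of this category with target 1 / target 0.
    // totalEvents / totalNonEvents: the same counts over the whole data set.
    // smoothing: ASSUMPTION. Added to each category count so that an empty
    // cell never produces ln(0) or a division by zero.
    public static double Woe(
        double events, double nonEvents,
        double totalEvents, double totalNonEvents,
        double smoothing = 0.5)
    {
        double distEvents = (events + smoothing) / totalEvents;
        double distNonEvents = (nonEvents + smoothing) / totalNonEvents;
        return Math.Log(distEvents / distNonEvents);
    }
}
```

With made-up counts of 50 defaults and 250 non-defaults overall, an "Employed" group holding 20 defaults and 180 non-defaults gives `WoeSketch.Woe(20, 180, 50, 250, smoothing: 0)` = ln(0.4 / 0.72), about -0.59. An "Unemployed" group holding the remaining 30 defaults and 70 non-defaults gives ln(0.6 / 0.28), about +0.76.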

Constructors

WOEEncoder(double, WOEHandleUnknown, int[]?)

Creates a new instance of WOEEncoder<T>.

public WOEEncoder(double regularization = 0.5, WOEHandleUnknown handleUnknown = WOEHandleUnknown.UseZero, int[]? columnIndices = null)

Parameters

regularization double: Smoothing constant added to counts to prevent division by zero and infinite WOE values. Defaults to 0.5.
handleUnknown WOEHandleUnknown: How to handle unknown categories. Defaults to UseZero.
columnIndices int[]?: The column indices to encode, or null for all columns.

Properties

HandleUnknown

Gets how unknown categories are handled.

public WOEHandleUnknown HandleUnknown { get; }

Property Value

WOEHandleUnknown

Regularization

Gets the regularization parameter to prevent infinite WOE values.

public double Regularization { get; }

Property Value

double

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

WOEValues

Gets the WOE values for each category.

public Dictionary<int, Dictionary<double, double>>? WOEValues { get; }

Property Value

Dictionary<int, Dictionary<double, double>>
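
A short sketch of how a value of this type could be read. It assumes, without confirmation from this page, that the outer key is the column index, the inner key is the category value, and the inner value is that category's WOE. `WoeLookupSketch` and `Lookup` are hypothetical names, and returning 0 for an unseen category simply mirrors what the name `WOEHandleUnknown.UseZero` suggests.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class WoeLookupSketch
{
    // Returns the stored WOE for (column, category), or 0.0 when the
    // dictionary is null or the column or category is missing.
    // ASSUMPTION: outer key = column index, inner key = category value.
    public static double Lookup(
        Dictionary<int, Dictionary<double, double>>? woeValues,
        int column, double category)
    {
        if (woeValues != null
            && woeValues.TryGetValue(column, out var perCategory)
            && perCategory.TryGetValue(category, out double woe))
        {
            return woe;
        }
        return 0.0;
    }
}
```

Because the property is declared nullable, a caller should handle `null` before indexing into it; the sketch folds that check into the lookup.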

Methods

CalculateInformationValue(Matrix<T>, Vector<T>)

Calculates Information Value (IV) for each feature.

public Dictionary<int, double> CalculateInformationValue(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>: The feature matrix.
target Vector<T>: The binary target.

Returns

Dictionary<int, double>: Dictionary mapping column index to IV value.

Remarks

IV measures the overall predictive power of a feature:

IV < 0.02: Not useful for prediction
0.02 < IV < 0.1: Weak predictor
0.1 < IV < 0.3: Medium predictor
0.3 < IV < 0.5: Strong predictor
IV > 0.5: Suspicious (possible overfitting)
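
For reference, IV is conventionally the sum over categories of (Distribution of Events - Distribution of Non-Events) × WOE. The sketch below computes it from raw counts and maps the result to the bands above. It is an illustration, not the library's code: `IvSketch`, `InformationValue`, and `Interpret` are hypothetical names, no smoothing is applied (so every count must be non-zero), and assigning an exact boundary value such as 0.1 to the higher band is an assumption, since the bands leave the boundaries unspecified.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class IvSketch
{
    // Each tuple holds (events, nonEvents) for one category of a feature.
    // ASSUMPTION: all counts are non-zero, because no smoothing is applied.
    public static double InformationValue(
        IReadOnlyList<(double Events, double NonEvents)> categories)
    {
        double totalEvents = 0, totalNonEvents = 0;
        foreach (var c in categories)
        {
            totalEvents += c.Events;
            totalNonEvents += c.NonEvents;
        }

        double iv = 0;
        foreach (var c in categories)
        {
            double distEvents = c.Events / totalEvents;
            double distNonEvents = c.NonEvents / totalNonEvents;
            // (difference of distributions) * WOE for this category
            iv += (distEvents - distNonEvents) * Math.Log(distEvents / distNonEvents);
        }
        return iv;
    }

    // Maps an IV to the bands listed in the remarks.
    // ASSUMPTION: a value exactly on a boundary falls into the higher band.
    public static string Interpret(double iv) =>
        iv < 0.02 ? "Not useful for prediction"
        : iv < 0.1 ? "Weak predictor"
        : iv < 0.3 ? "Medium predictor"
        : iv < 0.5 ? "Strong predictor"
        : "Suspicious (possible overfitting)";
}
```

For a made-up two-category feature with (20 events, 180 non-events) and (30 events, 70 non-events), `InformationValue` returns about 0.43, which `Interpret` places in the "Strong predictor" band.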

Fit(Matrix<T>, Vector<T>)

Fits the encoder by computing WOE values for each category.

public void Fit(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>: The feature matrix to fit.
target Vector<T>: The binary target values (0 or 1).

FitCore(Matrix<T>)

Fits the encoder (requires binary target via specialized Fit method).

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

FitTransform(Matrix<T>, Vector<T>)

Fits and transforms the data.

public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)

Parameters

data Matrix<T>: The feature matrix to fit and transform.
target Vector<T>: The binary target values (0 or 1).

Returns

Matrix<T>

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

TransformCore(Matrix<T>)

Transforms the data by replacing categories with WOE values.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>
