
Class Winsorizer<T>

Namespace
AiDotNet.Preprocessing.OutlierHandling
Assembly
AiDotNet.dll

Winsorizes data by replacing extreme values with percentile bounds.

public class Winsorizer<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
Winsorizer<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

Winsorization is a statistical technique that limits extreme values in the data to reduce the effect of outliers. Unlike trimming, which removes outliers, Winsorization replaces them with less extreme values.

This is equivalent to OutlierClipper but follows the traditional Winsorization terminology where you specify the percentage of data to Winsorize at each tail.

For Beginners: Winsorization is named after biostatistician Charles Winsor. Instead of removing outliers, it replaces them with the nearest "normal" values:

- If you Winsorize at 5%, the bottom 5% of values become equal to the 5th percentile
- The top 5% of values become equal to the 95th percentile

This preserves sample size while reducing outlier impact.
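
The percentile replacement described above can be sketched on a plain array, independent of the library. This is a minimal illustration and not the implementation of Winsorizer<T>: the names WinsorizeSketch, Percentile, and Winsorize are hypothetical, and the linear-interpolation percentile rule is an assumption, since the source does not state which rule the library uses.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class WinsorizeSketch
{
    // Percentile (p in 0..100) of an ascending-sorted array using linear
    // interpolation between neighbouring ranks. The interpolation rule is an
    // assumption; libraries differ on this detail.
    public static double Percentile(double[] sorted, double p)
    {
        if (sorted.Length == 0)
            throw new ArgumentException("Input must not be empty.", nameof(sorted));
        if (p < 0 || p > 100)
            throw new ArgumentOutOfRangeException(nameof(p), "Percentile must be in 0..100.");

        double rank = p * (sorted.Length - 1) / 100.0;
        int lo = (int)Math.Floor(rank);
        int hi = (int)Math.Ceiling(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    // Returns a copy of the input with both tails clamped to the percentile
    // bounds. The number of values is unchanged; only extremes are replaced.
    public static double[] Winsorize(double[] values, double lowerPercentile, double upperPercentile)
    {
        double[] sorted = values.OrderBy(v => v).ToArray();
        double lower = Percentile(sorted, lowerPercentile);
        double upper = Percentile(sorted, upperPercentile);
        return values.Select(v => Math.Min(Math.Max(v, lower), upper)).ToArray();
    }
}
```

Calling WinsorizeSketch.Winsorize(values, 5, 95) leaves values between the two bounds untouched and pulls anything outside them to the nearer bound, which is why the sample size is preserved.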

Constructors

Winsorizer(double, double, WinsorizerLimitType, int[]?)

Creates a new instance of Winsorizer<T>.

public Winsorizer(double lowerLimit = 5, double upperLimit = 95, WinsorizerLimitType limitType = WinsorizerLimitType.Percentile, int[]? columnIndices = null)

Parameters

lowerLimit double

The lower limit. For the percentile type, a percentile in the range 0-50; for the IQR type, a multiplier (e.g., 1.5). Defaults to 5.

upperLimit double

The upper limit. For the percentile type, a percentile in the range 50-100; for the IQR type, a multiplier (e.g., 1.5). Defaults to 95.

limitType WinsorizerLimitType

Type of limits to use. Defaults to Percentile.

columnIndices int[]

The column indices to Winsorize, or null for all columns.
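
For the IQR limit type, the page only says that the limits are multipliers. One common reading is the Tukey fence rule, where the bounds are Q1 minus a multiple of the interquartile range and Q3 plus a multiple of it. The sketch below shows that rule under this assumption; it is not confirmed to match what Winsorizer<T> computes, and IqrBoundsSketch and its members are hypothetical names.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class IqrBoundsSketch
{
    // Linear-interpolation percentile on an ascending-sorted array (assumed rule).
    private static double Percentile(double[] sorted, double p)
    {
        double rank = p * (sorted.Length - 1) / 100.0;
        int lo = (int)Math.Floor(rank);
        int hi = (int)Math.Ceiling(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    // Tukey-style fences: Q1 - k_lower * IQR and Q3 + k_upper * IQR.
    public static (double Lower, double Upper) Bounds(
        double[] values, double lowerMultiplier = 1.5, double upperMultiplier = 1.5)
    {
        if (values.Length == 0)
            throw new ArgumentException("Input must not be empty.", nameof(values));

        double[] sorted = values.OrderBy(v => v).ToArray();
        double q1 = Percentile(sorted, 25);
        double q3 = Percentile(sorted, 75);
        double iqr = q3 - q1;
        return (q1 - lowerMultiplier * iqr, q3 + upperMultiplier * iqr);
    }
}
```

Under this reading, a larger multiplier widens the fences and clips fewer values, whereas with percentile limits the share of clipped values is fixed by the chosen percentiles.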

Properties

LimitType

Gets the type of limit (percentile or IQR).

public WinsorizerLimitType LimitType { get; }

Property Value

WinsorizerLimitType

LowerBounds

Gets the computed lower bounds for each feature.

public double[]? LowerBounds { get; }

Property Value

double[]

LowerLimit

Gets the lower limit value.

public double LowerLimit { get; }

Property Value

double

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

UpperBounds

Gets the computed upper bounds for each feature.

public double[]? UpperBounds { get; }

Property Value

double[]

UpperLimit

Gets the upper limit value.

public double UpperLimit { get; }

Property Value

double

Methods

FitCore(Matrix<T>)

Computes the Winsorization bounds for each feature.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

The training data matrix.

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

Returns

string[]

InverseTransformCore(Matrix<T>)

Inverse transformation is not supported.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>

TransformCore(Matrix<T>)

Winsorizes the data by replacing extreme values with bounds.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The data to transform.

Returns

Matrix<T>

The Winsorized data.
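
The split between FitCore (compute bounds per feature) and TransformCore (replace values outside those bounds) can be sketched with plain arrays. This is an illustration under stated assumptions, not the library's code: ColumnWinsorizerSketch, Fit, and Transform are hypothetical names, double[,] stands in for Matrix<T>, the percentile rule is linear interpolation, and columns that are not selected are assumed to pass through unchanged.

```csharp
#nullable enable
using System;
using System.Linq;

// Hypothetical stand-in for illustration only; not part of AiDotNet.
public sealed class ColumnWinsorizerSketch
{
    private readonly double _lowerPercentile;
    private readonly double _upperPercentile;
    private readonly int[]? _columnIndices;

    public double[]? LowerBounds { get; private set; }
    public double[]? UpperBounds { get; private set; }

    public ColumnWinsorizerSketch(double lowerPercentile = 5, double upperPercentile = 95,
                                  int[]? columnIndices = null)
    {
        _lowerPercentile = lowerPercentile;
        _upperPercentile = upperPercentile;
        _columnIndices = columnIndices;
    }

    // Linear-interpolation percentile on an ascending-sorted array (assumed rule).
    private static double Percentile(double[] sorted, double p)
    {
        double rank = p * (sorted.Length - 1) / 100.0;
        int lo = (int)Math.Floor(rank);
        int hi = (int)Math.Ceiling(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    // Fit step: learn one lower and one upper bound per column from training data.
    public void Fit(double[,] data)
    {
        int rows = data.GetLength(0);
        int cols = data.GetLength(1);
        if (rows == 0)
            throw new ArgumentException("Training data must contain at least one row.", nameof(data));

        // Assumption: unselected columns get infinite bounds, so they are never clipped.
        double[] lower = Enumerable.Repeat(double.NegativeInfinity, cols).ToArray();
        double[] upper = Enumerable.Repeat(double.PositiveInfinity, cols).ToArray();

        foreach (int c in _columnIndices ?? Enumerable.Range(0, cols).ToArray())
        {
            double[] sorted = Enumerable.Range(0, rows)
                .Select(r => data[r, c])
                .OrderBy(v => v)
                .ToArray();
            lower[c] = Percentile(sorted, _lowerPercentile);
            upper[c] = Percentile(sorted, _upperPercentile);
        }

        LowerBounds = lower;
        UpperBounds = upper;
    }

    // Transform step: clamp each value to the bounds learned for its column.
    public double[,] Transform(double[,] data)
    {
        double[] lower = LowerBounds ?? throw new InvalidOperationException("Call Fit before Transform.");
        double[] upper = UpperBounds ?? throw new InvalidOperationException("Call Fit before Transform.");
        if (data.GetLength(1) != lower.Length)
            throw new ArgumentException("Column count differs from the fitted data.", nameof(data));

        var result = (double[,])data.Clone();
        for (int r = 0; r < data.GetLength(0); r++)
            for (int c = 0; c < data.GetLength(1); c++)
                result[r, c] = Math.Min(Math.Max(data[r, c], lower[c]), upper[c]);
        return result;
    }
}
```

Because the bounds are learned once and then reused, new data is clipped against the training distribution rather than its own. Clamping also discards the original values, which is consistent with inverse transformation not being supported.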