Class OneHotEncoder<T>
- Namespace
- AiDotNet.Preprocessing.Encoders
- Assembly
- AiDotNet.dll
Encodes categorical values as one-hot (binary) vectors.
public class OneHotEncoder<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
T
The numeric type for calculations (e.g., float, double).
- Inheritance
- TransformerBase<T, Matrix<T>, Matrix<T>>
- OneHotEncoder<T>
- Implements
- IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
OneHotEncoder transforms categorical values into binary indicator columns. Each unique category value becomes a separate column with 1s and 0s indicating presence. This encoding is required for many machine learning algorithms that cannot work directly with categories.
For Beginners: This encoder converts categories into binary columns:
- Each unique value gets its own column
- A 1 indicates the category is present; a 0 means it is not

Example for colors [red, green, blue, red]:
Becomes:
[1, 0, 0] (red)
[0, 1, 0] (green)
[0, 0, 1] (blue)
[1, 0, 0] (red)
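The colors example above can be reproduced with a few lines of standalone C#. This is a minimal sketch of the technique only: it does not use any AiDotNet types, the class and method names (`OneHotSketch`, `Encode`) are hypothetical, and it works on strings to mirror the example, whereas the encoder documented here operates on numeric `Matrix<T>` data. Column order is assumed to follow first appearance, which may differ from the library's ordering.

```csharp
using System;
using System.Collections.Generic;

// Standalone illustration of one-hot encoding; not part of AiDotNet.
public static class OneHotSketch
{
    // Returns one indicator row per input value.
    // Assumption: columns follow the order in which each category first appears.
    public static int[][] Encode(IReadOnlyList<string> values)
    {
        // Collect the distinct categories in first-occurrence order.
        var categories = new List<string>();
        foreach (string value in values)
        {
            if (!categories.Contains(value))
                categories.Add(value);
        }

        // Build one row per value: all zeros except a 1 in the category's column.
        var rows = new int[values.Count][];
        for (int i = 0; i < values.Count; i++)
        {
            rows[i] = new int[categories.Count];
            rows[i][categories.IndexOf(values[i])] = 1;
        }
        return rows;
    }
}
```

Calling `OneHotSketch.Encode(new[] { "red", "green", "blue", "red" })` yields four rows of width three, with the first and last rows identical because both encode red.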
Constructors
OneHotEncoder(bool, OneHotUnknownHandling, int[]?)
Creates a new instance of OneHotEncoder<T>.
public OneHotEncoder(bool dropFirst = false, OneHotUnknownHandling handleUnknown = OneHotUnknownHandling.Error, int[]? columnIndices = null)
Parameters
dropFirst bool
If true, drops the first category to avoid multicollinearity. Defaults to false.
handleUnknown OneHotUnknownHandling
How to handle unknown categories. Defaults to Error.
columnIndices int[]
The column indices to encode, or null for all columns.
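To make the effect of `dropFirst` and unknown-category handling concrete, here is a hedged, standalone sketch for a single value. With all indicator columns present, each row's indicators sum to 1, so any one column can be derived from the others; dropping the first column removes that redundancy, and the first category is then represented by all zeros. The names `DropFirstSketch`, `EncodeValue`, and the `ignoreUnknown` flag are hypothetical. The source confirms only that `OneHotUnknownHandling.Error` exists, so the all-zeros fallback shown here is an assumption about what a non-error mode might do, not the library's documented behavior.

```csharp
using System;

// Standalone illustration; not part of AiDotNet.
public static class DropFirstSketch
{
    // categories:    the distinct values learned for one column.
    // dropFirst:     omit the indicator for categories[0].
    // ignoreUnknown: hypothetical flag; false throws on an unseen value,
    //                true returns an all-zero row instead.
    public static double[] EncodeValue(
        double value, double[] categories, bool dropFirst, bool ignoreUnknown)
    {
        int offset = dropFirst ? 1 : 0;
        var row = new double[categories.Length - offset]; // starts as all zeros

        int index = Array.IndexOf(categories, value);
        if (index < 0)
        {
            if (!ignoreUnknown)
                throw new ArgumentException($"Unknown category: {value}");
            return row;
        }

        // When dropFirst is set, the first category keeps the all-zero row.
        if (index >= offset)
            row[index - offset] = 1.0;
        return row;
    }
}
```

Under these assumptions, a column with three categories produces three outputs per value by default and two when `dropFirst` is true.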
Properties
Categories
Gets the categories for each encoded column.
public List<double[]>? Categories { get; }
Property Value
- List<double[]>
DropFirst
Gets whether the first category is dropped (to avoid multicollinearity).
public bool DropFirst { get; }
Property Value
- bool
HandleUnknown
Gets how unknown categories are handled.
public OneHotUnknownHandling HandleUnknown { get; }
Property Value
- OneHotUnknownHandling
NOutputFeatures
Gets the number of output features after transformation.
public int NOutputFeatures { get; }
Property Value
- int
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Methods
FitCore(Matrix<T>)
Learns the categories from the training data.
protected override void FitCore(Matrix<T> data)
Parameters
data Matrix<T>
The training data matrix where each column is a feature.
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
inputFeatureNames string[]
The input feature names.
Returns
- string[]
The output feature names with category suffixes.
InverseTransformCore(Matrix<T>)
Reverses the one-hot encoding to get original category values.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
data Matrix<T>
The one-hot encoded data.
Returns
- Matrix<T>
The original categorical values.
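Reversing a one-hot encoding amounts to finding which indicator is set and looking up the matching category. The sketch below shows that idea for the group of indicators belonging to one original column. It is a standalone illustration with hypothetical names (`InverseSketch`, `DecodeValue`), not the library's implementation. In particular, mapping an all-zero row back to the first category when `dropFirst` is set is an assumption, as is the 0.5 threshold used to decide that an indicator is active.

```csharp
using System;

// Standalone illustration; not part of AiDotNet.
public static class InverseSketch
{
    // indicators: the one-hot outputs for a single original column.
    // categories: the distinct values learned for that column.
    // dropFirst:  whether the indicator for categories[0] was omitted.
    public static double DecodeValue(
        double[] indicators, double[] categories, bool dropFirst)
    {
        int offset = dropFirst ? 1 : 0;
        if (indicators.Length != categories.Length - offset)
            throw new ArgumentException("Indicator width does not match the category count.");

        // Find the first active indicator (assumed threshold: 0.5).
        int hot = Array.FindIndex(indicators, v => v > 0.5);
        if (hot >= 0)
            return categories[hot + offset];

        // Assumption: with dropFirst, all zeros stands for the dropped category.
        if (dropFirst)
            return categories[0];

        throw new ArgumentException("Row has no active indicator.");
    }
}
```

Without `dropFirst`, an all-zero row is ambiguous, so this sketch throws rather than guessing a category.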
TransformCore(Matrix<T>)
Transforms the data by applying one-hot encoding.
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
data Matrix<T>
The data to transform.
Returns
- Matrix<T>
The one-hot encoded data.