Class SelectKBest<T>
- Namespace
- AiDotNet.Preprocessing.FeatureSelection
- Assembly
- AiDotNet.dll
Selects the K highest-scoring features according to a scoring function.
public class SelectKBest<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
- T: The numeric type for calculations (e.g., float, double).
- Inheritance
- TransformerBase<T, Matrix<T>, Matrix<T>>
- SelectKBest<T>
- Implements
- IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
SelectKBest computes a score for each feature based on its relationship to the target variable, then keeps the K features with the highest scores.
Built-in scoring functions include:
- F-score for regression (linear relationship)
- Mutual information (any relationship type)
For Beginners: Not all features are equally useful for prediction. SelectKBest helps you:
- Reduce the number of features to improve model speed
- Remove noisy features that might hurt model accuracy
- Find the most informative features for understanding your problem
Example: From 100 features, select the 10 most related to your target.
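A minimal usage sketch of that example, using only the constructor, FitTransform, and SelectedFeatures members documented on this page. The helper name SelectTopTen is hypothetical. The using directive for Matrix<T> and Vector<T> is an assumption, because this page does not state their namespace, and SelectKBestScoreFunc is assumed to be reachable from the namespaces shown. Adjust both to your AiDotNet version.

```csharp
using System;
using AiDotNet.Preprocessing.FeatureSelection;
// Assumption: Matrix<T> and Vector<T> live in this namespace; this page does not say.
using AiDotNet.LinearAlgebra;

public static class SelectKBestUsage
{
    // Hypothetical helper: reduces a feature matrix to its 10 best-scoring columns.
    public static Matrix<double> SelectTopTen(Matrix<double> features, Vector<double> target)
    {
        var selector = new SelectKBest<double>(
            k: 10,
            scoreFunc: SelectKBestScoreFunc.FRegression);

        // Score every column against the target, then keep the top K columns.
        Matrix<double> reduced = selector.FitTransform(features, target);

        // SelectedFeatures is declared as int[]?, so guard before reading it.
        if (selector.SelectedFeatures is { } selected)
        {
            Console.WriteLine("Selected columns: " + string.Join(", ", selected));
        }

        return reduced;
    }
}
```

With a 100-column input, the returned matrix would be expected to have 10 columns, one per entry in SelectedFeatures.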
Constructors
SelectKBest(int, SelectKBestScoreFunc, int[]?)
Creates a new instance of SelectKBest<T>.
public SelectKBest(int k = 10, SelectKBestScoreFunc scoreFunc = SelectKBestScoreFunc.FRegression, int[]? columnIndices = null)
Parameters
- k (int): Number of features to select. Defaults to 10.
- scoreFunc (SelectKBestScoreFunc): Scoring function to use. Defaults to FRegression.
- columnIndices (int[]): The column indices to consider, or null for all columns.
Properties
K
Gets the number of features to select.
public int K { get; }
Property Value
- int
PValues
Gets the p-values for each feature (if applicable).
public double[]? PValues { get; }
Property Value
- double[]
ScoreFunc
Gets the scoring function used.
public SelectKBestScoreFunc ScoreFunc { get; }
Property Value
- SelectKBestScoreFunc
Scores
Gets the computed scores for each feature.
public double[]? Scores { get; }
Property Value
- double[]
SelectedFeatures
Gets the indices of selected features.
public int[]? SelectedFeatures { get; }
Property Value
- int[]
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Methods
Fit(Matrix<T>, Vector<T>)
Fits the selector by computing feature scores based on the target.
public void Fit(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>): The feature matrix to fit.
- target (Vector<T>): The target values.
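To make "computing feature scores based on the target" concrete, here is a self-contained sketch of one common univariate F-score for regression: the F-statistic derived from the Pearson correlation r between a single feature and the target, F = r² / (1 − r²) × (n − 2). This is an illustration under that assumption, not AiDotNet's confirmed implementation. The class and method names are hypothetical, and plain double arrays stand in for Matrix<T> and Vector<T>.

```csharp
using System;
using System.Linq;

public static class FRegressionSketch
{
    // F-statistic of a simple linear regression of target on one feature:
    // F = r^2 / (1 - r^2) * (n - 2), where r is the Pearson correlation.
    public static double FScore(double[] feature, double[] target)
    {
        if (feature.Length != target.Length)
            throw new ArgumentException("feature and target must have the same length.");
        int n = feature.Length;
        if (n < 3)
            throw new ArgumentException("At least 3 samples are required.");

        double meanX = feature.Average();
        double meanY = target.Average();
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++)
        {
            double dx = feature[i] - meanX;
            double dy = target[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }

        // A constant feature or a constant target carries no linear signal.
        if (sxx == 0.0 || syy == 0.0) return 0.0;

        double r2 = (sxy * sxy) / (sxx * syy);
        // A perfect linear fit has r^2 = 1, so the ratio diverges.
        if (r2 >= 1.0) return double.PositiveInfinity;
        return r2 / (1.0 - r2) * (n - 2);
    }
}
```

Calling FScore once per column yields one score per feature. A larger score indicates a stronger linear relationship with the target.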
FitCore(Matrix<T>)
Fits the selector using variance-based scoring when no target is provided.
protected override void FitCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
Remarks
When called without target values, this method falls back to variance-based feature selection. Features with higher variance are scored higher. For supervised selection, use Fit(Matrix<T> data, Vector<T> target) instead.
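The variance fallback described above can be sketched as follows. This is a self-contained illustration with hypothetical names, using a double[,] array in place of Matrix<T>. Two details are assumptions, not documented behavior: the population variance (dividing by the row count), and returning the selected indices in ascending column order.

```csharp
using System;
using System.Linq;

public static class VarianceSelectionSketch
{
    // Population variance of each column of a [rows, cols] array.
    public static double[] ColumnVariances(double[,] data)
    {
        int rows = data.GetLength(0), cols = data.GetLength(1);
        if (rows == 0)
            throw new ArgumentException("data must contain at least one row.");

        var variances = new double[cols];
        for (int j = 0; j < cols; j++)
        {
            double mean = 0.0;
            for (int i = 0; i < rows; i++) mean += data[i, j];
            mean /= rows;

            double sumSq = 0.0;
            for (int i = 0; i < rows; i++)
            {
                double d = data[i, j] - mean;
                sumSq += d * d;
            }
            variances[j] = sumSq / rows;
        }
        return variances;
    }

    // Indices of the k largest scores, returned in ascending column order.
    public static int[] TopK(double[] scores, int k)
    {
        return Enumerable.Range(0, scores.Length)
            .OrderByDescending(i => scores[i])
            .Take(Math.Min(k, scores.Length))
            .OrderBy(i => i)
            .ToArray();
    }
}
```

Because variance depends on each feature's scale, features measured in large units tend to win this ranking. That is one reason to prefer the supervised Fit overload when target values are available.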
FitTransform(Matrix<T>, Vector<T>)
Fits the selector and transforms the data in one step.
public Matrix<T> FitTransform(Matrix<T> data, Vector<T> target)
Parameters
- data (Matrix<T>)
- target (Vector<T>)
Returns
- Matrix<T>
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
- inputFeatureNames (string[])
Returns
- string[]
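As a rough sketch of how output names can follow from the selected indices: each selected index picks the matching input name. The names below are hypothetical, and the "x{index}" fallback used when no input names are supplied is purely an assumption for illustration. This page does not document what the library returns in that case.

```csharp
using System.Linq;

public static class FeatureNamesSketch
{
    // Maps selected column indices to names.
    // Assumption: falls back to "x<index>" when no input names are given.
    public static string[] NamesOut(int[] selectedFeatures, string[]? inputFeatureNames = null)
    {
        return selectedFeatures
            .Select(i => inputFeatureNames != null ? inputFeatureNames[i] : $"x{i}")
            .ToArray();
    }
}
```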
GetSupportMask()
Gets a boolean mask indicating which features are selected.
public bool[] GetSupportMask()
Returns
- bool[]
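The relationship between a list of selected indices and a boolean mask can be sketched as below. The names are hypothetical, and the total feature count is passed in explicitly because this sketch does not use the fitted selector's state.

```csharp
public static class SupportMaskSketch
{
    // Builds a mask with true at every selected column index and false elsewhere.
    public static bool[] ToMask(int[] selectedFeatures, int featureCount)
    {
        var mask = new bool[featureCount];
        foreach (int index in selectedFeatures)
        {
            mask[index] = true;
        }
        return mask;
    }
}
```

For example, selecting columns 0 and 3 out of 5 gives a mask whose entries 0 and 3 are true and whose other three entries are false.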
InverseTransformCore(Matrix<T>)
Inverse transformation is not supported.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
Returns
- Matrix<T>
TransformCore(Matrix<T>)
Selects the top K features from the data.
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>)
Returns
- Matrix<T>
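Conceptually, the transform copies only the selected columns into a narrower matrix. A self-contained sketch of that step, with hypothetical names and a double[,] array standing in for Matrix<T>:

```csharp
public static class ColumnSelectionSketch
{
    // Copies the selected columns, in the given order, into a new array
    // with the same number of rows.
    public static double[,] SelectColumns(double[,] data, int[] selectedFeatures)
    {
        int rows = data.GetLength(0);
        var result = new double[rows, selectedFeatures.Length];
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < selectedFeatures.Length; j++)
            {
                result[i, j] = data[i, selectedFeatures[j]];
            }
        }
        return result;
    }
}
```

The dropped columns are not stored anywhere in the output, which is consistent with inverse transformation not being supported.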