Class KNearestNeighborsRegression<T>

Namespace
AiDotNet.Regression
Assembly
AiDotNet.dll

Implements the K-Nearest Neighbors algorithm for regression, which predicts target values by averaging the values of the K closest training examples.

public class KNearestNeighborsRegression<T> : NonLinearRegressionBase<T>, INonLinearRegression<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
object
NonLinearRegressionBase<T>
KNearestNeighborsRegression<T>
Implements
INonLinearRegression<T>
IRegression<T>
IFullModel<T, Matrix<T>, Vector<T>>
IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Matrix<T>, Vector<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>
IGradientComputable<T, Matrix<T>, Vector<T>>
IJitCompilable<T>

Remarks

K-Nearest Neighbors (KNN) is a non-parametric and instance-based learning algorithm that makes predictions based on the similarity between the input and training samples. For regression, it computes the average of the target values of the K nearest neighbors to the query point. The algorithm doesn't build an explicit model but instead stores all training examples and performs computations at prediction time.

For Beginners: K-Nearest Neighbors is like asking your neighbors for advice.

Imagine you want to guess the price of a house:

  • You look at the K most similar houses to yours (the "nearest neighbors")
  • You take the average of their prices as your prediction

The "K" is just how many neighbors you consider. If K=3, you look at the 3 most similar houses.

This approach:

  • Is simple to understand: similar inputs should have similar outputs
  • Makes no assumptions about the data's structure
  • Works well when similar examples in your data actually have similar target values

Unlike most machine learning algorithms, KNN doesn't "learn" patterns during training - it simply remembers all examples and does the real work at prediction time by finding similar examples.
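
A minimal end-to-end sketch is shown below. The Matrix<T>/Vector<T> construction and the public Train method are assumed for illustration; check the actual factory methods and training entry point in your version of the library.

// Illustrative sketch: fit a KNN regressor on house data and predict a price.
// Matrix/Vector construction and Train(...) are assumed APIs, shown for clarity.
var knn = new KNearestNeighborsRegression<double>(
    new KNearestNeighborsOptions { K = 3 });

var features = new Matrix<double>(new double[,]
{
    { 120, 3 },   // square meters, bedrooms
    { 80,  2 },
    { 150, 4 },
    { 110, 3 }
});
var prices = new Vector<double>(new[] { 250_000.0, 180_000.0, 320_000.0, 240_000.0 });

knn.Train(features, prices);

// Predict for a new 100 m², 3-bedroom house: the result is the average
// price of the 3 nearest training houses.
var prediction = knn.Predict(new Matrix<double>(new double[,] { { 100, 3 } }));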

Constructors

KNearestNeighborsRegression(KNearestNeighborsOptions?, IRegularization<T, Matrix<T>, Vector<T>>?)

Initializes a new instance of the KNearestNeighborsRegression<T> class.

public KNearestNeighborsRegression(KNearestNeighborsOptions? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null)

Parameters

options KNearestNeighborsOptions

Optional configuration options for the KNN algorithm.

regularization IRegularization<T, Matrix<T>, Vector<T>>

Optional regularization strategy to prevent overfitting.

Remarks

This constructor creates a new K-Nearest Neighbors regression model with the specified options and regularization strategy. If no options are provided, default values are used. If no regularization is specified, no regularization is applied.

For Beginners: This is how you create a new K-Nearest Neighbors model.

The key setting is K - the number of neighbors to consider when making predictions:

  • Smaller K values (like 1 or 2): More sensitive to noise in the data
  • Larger K values (like 10 or 20): Smoother predictions but might miss important patterns

If you don't specify any options, the model will use reasonable default settings.

Example:

// Create a KNN model with default settings
var knn = new KNearestNeighborsRegression<double>();

// Create a KNN model with custom options
var options = new KNearestNeighborsOptions { K = 5 };
var customKnn = new KNearestNeighborsRegression<double>(options);

Properties

SoftKNNTemperature

Gets or sets the temperature parameter for soft KNN mode.

public T SoftKNNTemperature { get; set; }

Property Value

T

The temperature for softmax attention. Lower values produce sharper attention. Default is 1.0.

SupportsJitCompilation

Gets whether this model supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

true when UseSoftKNN is enabled and training data is available; false otherwise.

Remarks

When UseSoftKNN is enabled, KNN can be exported as a differentiable computation graph using attention-weighted averaging. The training data is embedded as constants in the computation graph.

When UseSoftKNN is disabled, JIT compilation is not supported because traditional hard KNN requires dynamic neighbor selection that cannot be represented as a static computation graph.
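
For example (a sketch; the training call is assumed and commented out):

// Soft KNN must be enabled, and training data present, before JIT export
var knn = new KNearestNeighborsRegression<double>();
knn.UseSoftKNN = true;
knn.SoftKNNTemperature = 0.1;  // sharper attention, closer to hard K selection

// knn.Train(x, y);  // training data must be stored...
bool canJit = knn.SupportsJitCompilation;  // ...before this returns true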

UseSoftKNN

Gets or sets whether to use soft (differentiable) KNN mode for JIT compilation.

public bool UseSoftKNN { get; set; }

Property Value

bool

true to enable soft KNN mode with attention-weighted outputs for JIT support; false (default) for traditional hard K-nearest neighbors.

Remarks

Soft KNN: Instead of selecting exactly K nearest neighbors and averaging their labels, soft KNN computes attention weights over ALL training samples based on distances. This makes the algorithm differentiable and JIT-compilable.

Formula: weights = softmax(-distances / temperature)

Output: weighted_output = sum(weights * labels)

Trade-offs:

  • Soft KNN is differentiable and JIT-compilable
  • Results are smooth approximations of hard K selection
  • Lower temperature = sharper attention (closer to hard K selection)
  • Higher temperature = softer attention (considers more neighbors)

Computational Note: Soft KNN computes attention over ALL training samples, which can be expensive for large training sets. The JIT-compiled version embeds all support vectors as constants, so the computation graph size scales with training set size.
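
To make the formula concrete, here is a small self-contained sketch of the weighting step in plain C#. It is illustrative only, not the library's internal implementation:

using System;
using System.Linq;

static class SoftKnnSketch
{
    // weights = softmax(-distances / temperature); output = sum(weights * labels)
    public static double Predict(double[] distances, double[] labels, double temperature)
    {
        // Negate and scale distances: closer samples get larger logits
        var logits = distances.Select(d => -d / temperature).ToArray();

        // Numerically stable softmax over ALL training samples
        double max = logits.Max();
        var exp = logits.Select(l => Math.Exp(l - max)).ToArray();
        double sum = exp.Sum();

        // Attention-weighted average of the training labels
        return exp.Zip(labels, (w, y) => (w / sum) * y).Sum();
    }
}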

Methods

Clone()

Creates a shallow copy of this KNN model including its training data.

public override IFullModel<T, Matrix<T>, Vector<T>> Clone()

Returns

IFullModel<T, Matrix<T>, Vector<T>>

A new KNearestNeighborsRegression instance with the same configuration and training data.

Remarks

This method overrides the base class Clone to ensure that KNN-specific training data (_xTrain and _yTrain) is properly copied. Without this override, cloned models would lose their training data and fail when Predict is called.

For Beginners: This method creates a copy of your trained model.

Unlike CreateInstance, which creates an empty model, Clone copies:

  • All the base class settings (support vectors, alphas, bias, options)
  • The training data that KNN needs to make predictions
  • The soft KNN settings if enabled

This is important because KNN stores all training examples and uses them at prediction time. A clone without training data would be unusable.
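
Example (assuming knn has already been trained):

// The clone keeps the stored training data, so it can predict immediately
var copy = knn.Clone();
var predictions = copy.Predict(newFeatures);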

CreateInstance()

Creates a new instance of the KNearestNeighborsRegression with the same configuration as the current instance.

protected override IFullModel<T, Matrix<T>, Vector<T>> CreateInstance()

Returns

IFullModel<T, Matrix<T>, Vector<T>>

A new KNearestNeighborsRegression instance with the same options and regularization as the current instance.

Remarks

This method creates a new instance of the KNearestNeighborsRegression model with the same configuration options and regularization settings as the current instance. This is useful for model cloning, ensemble methods, or cross-validation scenarios where multiple instances of the same model with identical configurations are needed.

For Beginners: This method creates a fresh copy of the model's blueprint.

When you need multiple versions of the same type of model with identical settings:

  • This method creates a new, empty model with the same configuration
  • It's like making a copy of a recipe before you start cooking
  • The new model has the same settings but no trained data
  • This is useful for techniques that need multiple models, like cross-validation

For example, when testing your model on different subsets of data, you'd want each test to use a model with identical settings.

DeepCopy()

Creates a deep copy of this KNN model including its training data.

public override IFullModel<T, Matrix<T>, Vector<T>> DeepCopy()

Returns

IFullModel<T, Matrix<T>, Vector<T>>

A new KNearestNeighborsRegression instance with independent copies of all data.

Remarks

This method overrides the base class DeepCopy to ensure that KNN-specific training data is properly deep copied. The resulting model is completely independent of the original - modifications to one will not affect the other.

For Beginners: This creates a completely independent copy of your model.

While Clone shares some data with the original (for efficiency), DeepCopy creates entirely new copies of everything including:

  • All training feature vectors
  • All training labels
  • All model parameters

Use DeepCopy when you need to modify the copy without affecting the original.
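
Example (the Train call on the copy is an assumed public training method):

// The deep copy is fully independent of the original
var independent = knn.DeepCopy();
independent.Train(otherFeatures, otherTargets);  // retrain the copy...
var original = knn.Predict(newFeatures);         // ...the original is unaffected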

Deserialize(byte[])

Loads a previously serialized K-Nearest Neighbors Regression model from a byte array.

public override void Deserialize(byte[] modelData)

Parameters

modelData byte[]

The byte array containing the serialized model.

Remarks

This method reconstructs a KNN model from a byte array that was previously created using the Serialize method. It restores the base class data, the number of neighbors (K), and the training data that is used for making predictions.

For Beginners: This method loads a previously saved model from a sequence of bytes.

Deserialization allows you to:

  • Load a model that was saved earlier
  • Use a model without having to retrain it
  • Share models between different applications

When you deserialize a model:

  • The value of K is restored
  • All training examples are loaded back into memory
  • The model is ready to make predictions immediately

Example:

// Load from a file
byte[] modelData = File.ReadAllBytes("knn.model");

// Deserialize the model
var knn = new KNearestNeighborsRegression<double>();
knn.Deserialize(modelData);

// Now you can use the model for predictions
var predictions = knn.Predict(newFeatures);

ExportComputationGraph(List<ComputationNode<T>>)

Exports the model's computation as a graph of operations.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

The input nodes for the computation graph.

Returns

ComputationNode<T>

The root node of the exported computation graph.

Remarks

When soft KNN mode is enabled, this exports the KNN model as a differentiable computation graph using SoftKNN(ComputationNode<T>, ComputationNode<T>, ComputationNode<T>, T?) operations. The training data (support vectors and labels) is embedded as constants in the graph.

Exceptions

NotSupportedException

Thrown when UseSoftKNN is false.

InvalidOperationException

Thrown when no training data is available.
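
A guarded call pattern (a sketch; building the input nodes is library-specific and not shown):

// Export only when soft KNN mode makes a static graph possible
knn.UseSoftKNN = true;
if (knn.SupportsJitCompilation)
{
    var inputNodes = new List<ComputationNode<double>>();  // populate per your graph API
    var graph = knn.ExportComputationGraph(inputNodes);
    // hand the graph to the JIT compiler
}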

GetModelType()

Gets the model type of the K-Nearest Neighbors Regression model.

protected override ModelType GetModelType()

Returns

ModelType

The model type enumeration value.

OptimizeModel(Matrix<T>, Vector<T>)

Optimizes the KNN model by storing the training data for later use in predictions.

protected override void OptimizeModel(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>

A matrix where each row represents a sample and each column represents a feature.

y Vector<T>

A vector of target values corresponding to each sample in x.

Remarks

This method "trains" the KNN model by storing the training data for later use during prediction. Unlike many other machine learning algorithms, KNN doesn't build a parametric model during training. Instead, it simply stores the training data and uses it to compute predictions at runtime by finding the K nearest neighbors to each query point.

For Beginners: KNN doesn't really "learn" during training - it just memorizes the examples.

While most machine learning models try to extract patterns during training, KNN takes a different approach:

  1. It simply stores all the training examples (both features and target values)
  2. When asked to make a prediction, it does the actual work of finding similar examples

Think of it like studying for an exam by memorizing all the examples in a textbook, rather than trying to understand the underlying rules. When given a new problem, you solve it by finding the most similar examples from the ones you memorized.

This is why KNN is sometimes called a "lazy learner" - it doesn't do much work during training, but has to work harder at prediction time.

Predict(Matrix<T>)

Predicts target values for the provided input features using the trained KNN model.

public override Vector<T> Predict(Matrix<T> input)

Parameters

input Matrix<T>

A matrix where each row represents a sample to predict and each column represents a feature.

Returns

Vector<T>

A vector of predicted values corresponding to each input sample.

Remarks

This method predicts target values for new input data by finding the K nearest neighbors from the training data for each input sample and computing the average of their target values. Unlike parametric models, KNN is a distance-based method that does not apply data regularization transformations. The prediction relies entirely on the stored training data and distance calculations.

For Beginners: This method uses your trained model to make predictions on new data.

For each input example, it:

  1. Calculates how similar the new example is to each training example (using distance)
  2. Finds the K most similar training examples
  3. Takes the average of their target values as the prediction

This method handles multiple inputs at once, making a separate prediction for each one.

Example:

// Make predictions
var predictions = knn.Predict(newFeatures);

PredictSingle(Vector<T>)

Predicts the target value for a single input feature vector.

protected override T PredictSingle(Vector<T> input)

Parameters

input Vector<T>

The feature vector of the sample to predict.

Returns

T

The predicted value for the input sample.

Remarks

This method predicts the target value for a single input feature vector by finding the K nearest neighbors from the training data and computing the average of their target values. The distance between the input and each training sample is computed using Euclidean distance.

For Beginners: This method makes a prediction for a single new data point.

The prediction process works like this:

  1. Calculate the distance between the new point and every training example
  2. Find the K training examples with the smallest distances (the nearest neighbors)
  3. Calculate the average of their target values
  4. Return this average as the prediction

For example, if you want to predict a house price and K=3, this method would:

  • Find the 3 most similar houses from the training data (say, priced at $300,000, $320,000, and $340,000)
  • Calculate the average of their prices: ($300,000 + $320,000 + $340,000) / 3 = $320,000
  • Return $320,000 as the predicted price

Serialize()

Serializes the K-Nearest Neighbors Regression model to a byte array for storage or transmission.

public override byte[] Serialize()

Returns

byte[]

A byte array containing the serialized model.

Remarks

This method converts the KNN model into a byte array that can be stored in a file, database, or transmitted over a network. The serialized data includes the base class data, the number of neighbors (K), and the training data that is used for making predictions.

For Beginners: This method saves your trained model as a sequence of bytes.

Serialization allows you to:

  • Save your model to a file
  • Store your model in a database
  • Send your model over a network
  • Keep your model for later use without having to retrain it

The serialized data includes:

  • The value of K (number of neighbors)
  • All the training examples (both features and target values)

Since KNN stores all training data, the serialized model can be quite large compared to other machine learning models.

Example:

// Serialize the model
byte[] modelData = knn.Serialize();

// Save to a file
File.WriteAllBytes("knn.model", modelData);