
Class KNeighborsOptions<T>

Namespace
AiDotNet.Models.Options
Assembly
AiDotNet.dll

Configuration options for K-Nearest Neighbors classifiers.

public class KNeighborsOptions<T> : ClassifierOptions<T>

Type Parameters

T

The data type used for calculations.

Inheritance
ClassifierOptions<T>
KNeighborsOptions<T>

Remarks

K-Nearest Neighbors (KNN) is a simple, instance-based learning algorithm that classifies samples based on the majority class among their k nearest neighbors in the feature space.

For Beginners: KNN is like asking your neighbors for advice!

When you need to classify a new sample:

  1. Find the k training samples closest to it
  2. Look at what classes those neighbors belong to
  3. Predict the most common class among those neighbors

Example: To predict if a movie is "Action" or "Comedy":

  • Find 5 similar movies (based on runtime, budget, etc.)
  • If 4 are Action and 1 is Comedy, predict "Action"

Key settings (a configuration sketch follows this list):

  • K (NNeighbors): How many neighbors to consider (default: 5)
  • Metric: How to measure distance (Euclidean, Manhattan, etc.)
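
Put together, a typical setup might look like the sketch below. The property names come from this page; the enum member names mirror the values listed in the property documentation but are assumptions until checked against the actual enum definitions, and members inherited from ClassifierOptions<T> are omitted.

    using AiDotNet.Models.Options;

    // A minimal configuration sketch. Property names are documented on this
    // page; the enum member names are assumed from the values listed here.
    var options = new KNeighborsOptions<double>
    {
        NNeighbors = 5,                      // ask the 5 closest neighbors
        Metric = DistanceMetric.Euclidean,   // straight-line distance
        Algorithm = KNNAlgorithm.Auto,       // let the library pick the search structure
        Weights = WeightingScheme.Uniform    // every neighbor's vote counts equally
    };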

Properties

Algorithm

Gets or sets the algorithm used to compute nearest neighbors.

public KNNAlgorithm Algorithm { get; set; }

Property Value

KNNAlgorithm

The search algorithm. Default is Auto.

Remarks

Auto chooses the best algorithm based on data characteristics. BruteForce computes all pairwise distances (O(n*d) per query). KDTree uses a tree structure for faster queries in low dimensions. BallTree is better for high-dimensional data.

For Beginners: How to find neighbors efficiently.

  • Auto: Let the algorithm choose (recommended)
  • BruteForce: Compare to every training sample (exact but slow on large datasets; see the sketch after this list)
  • KDTree: Use a tree structure (fast for small dimensions)
  • BallTree: Better for many dimensions
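
To make the BruteForce cost concrete, here is a conceptual sketch (not the library's code) of what an exhaustive scan does: every query touches all n training rows and all d features, hence O(n*d) per query. Tree-based structures win by pruning whole regions of the training set instead of scanning everything.

    using System;

    // Conceptual brute-force nearest-neighbor search, for illustration only.
    static int NearestIndex(double[][] train, double[] query)
    {
        int best = -1;
        double bestDist = double.MaxValue;
        for (int i = 0; i < train.Length; i++)        // n training samples
        {
            double sum = 0;
            for (int j = 0; j < query.Length; j++)    // d features
            {
                double diff = train[i][j] - query[j];
                sum += diff * diff;                   // squared Euclidean distance
            }
            if (sum < bestDist) { bestDist = sum; best = i; }
        }
        return best;   // index of the closest training sample
    }

    double[][] train = { new[] { 1.0, 2.0 }, new[] { 3.0, 4.0 } };
    int nearest = NearestIndex(train, new[] { 2.9, 4.1 });   // -> 1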

LeafSize

Gets or sets the leaf size for tree-based algorithms.

public int LeafSize { get; set; }

Property Value

int

The leaf size for KDTree or BallTree. Default is 30.

Remarks

This affects the speed of tree construction and query, as well as memory requirements. Larger values create shallower trees.
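
As a rough intuition (this says nothing about the library's internals): a space-partitioning tree stops splitting once a node holds at most LeafSize points, so its depth grows roughly like log2(n / LeafSize). The sketch below works that out for a hypothetical dataset.

    using System;

    int n = 100_000;   // training samples (hypothetical)
    foreach (int leafSize in new[] { 10, 30, 100 })
    {
        double depth = Math.Ceiling(Math.Log2((double)n / leafSize));
        Console.WriteLine($"LeafSize={leafSize,3}: ~{depth} levels");
    }
    // LeafSize= 10: ~14 levels  (deeper tree, less scanning per leaf)
    // LeafSize= 30: ~12 levels
    // LeafSize=100: ~10 levels  (shallower tree, more scanning per leaf)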

Metric

Gets or sets the distance metric used to find nearest neighbors.

public DistanceMetric Metric { get; set; }

Property Value

DistanceMetric

The distance metric. Default is Euclidean.

Remarks

The choice of metric affects which points are considered "nearest." Euclidean distance works well for continuous features with similar scales. Manhattan distance can be better for high-dimensional data.

For Beginners: This determines how we measure "closeness."

  • Euclidean: Straight-line distance (like a bird flying)
  • Manhattan: Distance along axes (like walking city blocks)
  • Minkowski: Generalization of both (with parameter p)

Euclidean is the most common choice for most problems.
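
The difference is easy to see on a single pair of points (illustration only):

    using System;

    double[] a = { 0.0, 0.0 };
    double[] b = { 3.0, 4.0 };

    // Euclidean: straight-line ("bird") distance = sqrt(3^2 + 4^2) = 5
    double euclidean = Math.Sqrt(Math.Pow(b[0] - a[0], 2) + Math.Pow(b[1] - a[1], 2));

    // Manhattan: axis-aligned ("city block") distance = |3| + |4| = 7
    double manhattan = Math.Abs(b[0] - a[0]) + Math.Abs(b[1] - a[1]);

Because the two metrics rank distances differently, they can disagree about which points are "nearest," and therefore about the prediction.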

NNeighbors

Gets or sets the number of neighbors to use for classification.

public int NNeighbors { get; set; }

Property Value

int

The number of neighbors (k). Default is 5.

Remarks

Smaller values of k make the model more sensitive to noise but can capture local patterns. Larger values provide smoother decision boundaries but may miss local patterns.

For Beginners: K is the number of neighbors to ask for their opinion.

  • K = 1: Only look at the single closest neighbor (very sensitive to noise)
  • K = 5: Look at 5 closest neighbors (good balance)
  • K = 20: Look at 20 neighbors (smoother but may ignore local patterns)

Odd values are often preferred to avoid ties in binary classification.
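
The sketch below (not the library's code) shows how the choice of k can flip the prediction for the same query:

    using System.Linq;

    // Class labels of the neighbors, ordered from closest to farthest.
    int[] neighborLabels = { 1, 0, 0, 1, 1, 0, 0, 0, 0 };

    static int MajorityVote(int[] labels, int k) =>
        labels.Take(k)
              .GroupBy(label => label)
              .OrderByDescending(g => g.Count())
              .First().Key;

    int k5 = MajorityVote(neighborLabels, 5);   // -> 1
    // k = 1 -> 1 (trusts one neighbor; noise-prone)
    // k = 5 -> 1 (three 1s vs two 0s)
    // k = 9 -> 0 (three 1s vs six 0s)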

P

Gets or sets the power parameter for the Minkowski metric.

public double P { get; set; }

Property Value

double

The Minkowski power parameter. Default is 2 (Euclidean).

Remarks

Only used when Metric is Minkowski. p = 1 is equivalent to Manhattan distance. p = 2 is equivalent to Euclidean distance.
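
For reference, the Minkowski distance is d(x, y) = (sum_i |x_i - y_i|^p)^(1/p); a direct sketch, for illustration only:

    using System;

    // Minkowski distance: (sum_i |x_i - y_i|^p) ^ (1/p)
    static double Minkowski(double[] x, double[] y, double p)
    {
        double sum = 0;
        for (int i = 0; i < x.Length; i++)
            sum += Math.Pow(Math.Abs(x[i] - y[i]), p);
        return Math.Pow(sum, 1.0 / p);
    }

    double d1 = Minkowski(new[] { 0.0, 0.0 }, new[] { 3.0, 4.0 }, 1.0);   // 7.0 (Manhattan)
    double d2 = Minkowski(new[] { 0.0, 0.0 }, new[] { 3.0, 4.0 }, 2.0);   // 5.0 (Euclidean)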

Weights

Gets or sets the weight function used in prediction.

public WeightingScheme Weights { get; set; }

Property Value

WeightingScheme

The weighting scheme. Default is Uniform.

Remarks

Uniform weighting treats all neighbors equally. Distance weighting gives closer neighbors more influence on the prediction.

For Beginners: Should all neighbors have equal say?

  • Uniform: Every neighbor's vote counts equally
  • Distance: Closer neighbors count more (weight = 1/distance)

Distance weighting often works better because the closest neighbors are usually more relevant.
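
The sketch below (not the library's implementation) shows how weighting can change the outcome: uniform voting and distance weighting disagree when one very close neighbor faces several distant ones.

    using System.Linq;

    // Each neighbor: its class label and its distance from the query.
    var neighbors = new (int Label, double Distance)[]
    {
        (0, 0.5),   // one very close neighbor of class 0
        (1, 2.0),   // three farther neighbors of class 1
        (1, 2.0),
        (1, 2.0),
    };

    // Distance weighting: each neighbor contributes weight = 1 / distance.
    int predicted = neighbors
        .GroupBy(n => n.Label)
        .OrderByDescending(g => g.Sum(n => 1.0 / n.Distance))
        .First().Key;

    // Uniform voting picks class 1 (3 votes to 1), but distance weighting
    // picks class 0: 1/0.5 = 2.0 outweighs 3 * (1/2.0) = 1.5.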