
K-Nearest Neighbors (KNN)

A simple, instance-based learning algorithm. Classifies data based on the majority vote of its neighbors.


Theory & Concept

KNN is a "lazy learner"—it doesn't actually learn a model during training. Instead, it memorizes the training data.

K-Nearest Neighbors infographic showing how distance and neighbor voting work.

When you ask it to predict a new point:

  1. It calculates the distance from the new point to all stored training points.
  2. It finds the K closest points ("neighbors").
  3. Classification: It takes a majority vote of the neighbors' classes.
  4. Regression: It takes the average of the neighbors' values.
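The four steps above can be sketched from scratch in a few lines. This is a toy illustration, not a production implementation; the function name `knn_predict` and the sample points are ours:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by a majority vote among its k nearest training points."""
    # Step 1: distance from the new point to every stored training point
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Step 2: indices of the k closest points
    nearest = np.argsort(dists)[:k]
    # Step 3: majority vote over the neighbors' classes
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy data: two well-separated clusters
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.2, 1.4]), k=3))  # -> 0
```

For regression (step 4), you would replace the vote with `np.mean(y_train[nearest])`.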

Mathematical Intuition

The core is the Distance Metric. The most common is Euclidean Distance:

d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}

For categorical variables, Hamming distance is used.
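To make the two metrics concrete, here is a minimal sketch (the helper names `euclidean` and `hamming` are ours, not from any library):

```python
import numpy as np

def euclidean(p, q):
    # square root of the sum of squared coordinate differences
    return float(np.sqrt(np.sum((np.asarray(q) - np.asarray(p)) ** 2)))

def hamming(p, q):
    # number of positions where the categorical values differ
    return sum(a != b for a, b in zip(p, q))

print(euclidean([0, 0], [3, 4]))  # -> 5.0 (the classic 3-4-5 triangle)
print(hamming(["red", "S", "cotton"], ["red", "M", "wool"]))  # -> 2
```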

The prediction for classification is:

y = \text{mode}(\{y_i : x_i \in N_k(x)\})

Where N_k(x) is the set of the K nearest neighbors.


Quick Readiness Check

Is this method a fit for your use case?

Best For

Small datasets, baselines, and systems where 'similarity' is intuitive (e.g., simple recommender systems).

Prerequisites

Feature scaling is mandatory. Otherwise, distances are dominated by features with the largest numeric ranges.
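A quick sketch of why scaling matters, using made-up age/income features (the numbers are illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical features: age (tens) vs. income (tens of thousands)
X = np.array([[25.0, 50_000.0],
              [30.0, 90_000.0],
              [60.0, 52_000.0]])

# Raw distances: income dominates, so the 60-year-old looks "nearest" to the 25-year-old
d_raw_01 = np.linalg.norm(X[0] - X[1])   # ~40000
d_raw_02 = np.linalg.norm(X[0] - X[2])   # ~2000

# After standardization, both features contribute comparably
Xs = StandardScaler().fit_transform(X)
d_scaled_01 = np.linalg.norm(Xs[0] - Xs[1])
d_scaled_02 = np.linalg.norm(Xs[0] - Xs[2])
print(d_raw_02 < d_raw_01, d_scaled_01 < d_scaled_02)  # -> True True
```

The nearest neighbor flips once income stops drowning out age, which is why the pipeline in the code snippet below puts a StandardScaler in front of the classifier.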

Strengths

Extremely simple. Zero training time. Makes no assumptions about data distribution.

Weaknesses

Slow Inference: O(N) per prediction, since every query must scan all stored training data. Also suffers from the Curse of Dimensionality: in high dimensions, all points become nearly equidistant, making the "nearest" neighbors less meaningful.
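The O(N) scan can be mitigated with tree-based indexes. scikit-learn's `KNeighborsClassifier` exposes this through its `algorithm` parameter; a small sketch with synthetic data (the dataset is illustrative):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 3))
y = (X[:, 0] > 0).astype(int)

# Brute force scans all N points per query; a KD-tree prunes most of them.
# (Tree indexes help in low dimensions but degrade back toward O(N) as dimensions grow.)
brute = KNeighborsClassifier(n_neighbors=5, algorithm="brute").fit(X, y)
tree = KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree").fit(X, y)

query = X[:100]
agree = (brute.predict(query) == tree.predict(query)).all()
print(agree)  # -> True: identical answers, different search cost
```

Both methods are exact, so predictions match; only the lookup strategy differs.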

Pro Tip

If asked "What happens if K is too small or too large?": a small K (e.g., K=1) gives high variance (overfitting); a large K gives high bias (underfitting).
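The bias-variance trade-off is easy to demonstrate by comparing train and test accuracy at two extremes of K (synthetic data; the specific scores depend on the random seed):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for k in (1, 25):
    m = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    scores[k] = (m.score(X_tr, y_tr), m.score(X_te, y_te))
    print(k, scores[k])

# K=1 memorizes the training set (train accuracy 1.0) but generalizes worse;
# a larger K smooths the decision boundary at the cost of more bias.
```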


Code Snippet

from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
 
# 1. Pipeline
model = make_pipeline(StandardScaler(), 
                      KNeighborsClassifier(n_neighbors=5))
 
# 2. Train (Just stores data)
model.fit(X_train, y_train)
 
# 3. Predict (Calculates distances)
preds = model.predict(X_test)

Parameter Tuning Cheat Sheet

| Parameter | Options / Range | Effect & Best Practice |
| --- | --- | --- |
| n_neighbors (K) | 1-50 | Start with sqrt(N) or the default of 5. Always pick an odd number for binary classification to avoid ties. |
| weights | uniform, distance | uniform: all neighbors count equally. distance: closer neighbors have more say (usually better). |
| metric | euclidean, manhattan | euclidean is standard. manhattan (taxicab) is often better for high-dimensional data. |
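All three parameters in the cheat sheet can be tuned together with cross-validation. A sketch using `GridSearchCV` on synthetic data (dataset and grid values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

pipe = Pipeline([("scale", StandardScaler()),
                 ("knn", KNeighborsClassifier())])

# Search over the three parameters from the cheat sheet.
# Pipeline step parameters are addressed with the "knn__" prefix.
grid = GridSearchCV(pipe, param_grid={
    "knn__n_neighbors": [3, 5, 7, 9, 11],
    "knn__weights": ["uniform", "distance"],
    "knn__metric": ["euclidean", "manhattan"],
}, cv=5)
grid.fit(X, y)
print(grid.best_params_)
```

Keeping the scaler inside the pipeline ensures each cross-validation fold is scaled using only its own training split, avoiding leakage.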