# KNN Algorithm in Machine Learning – A Simple Guide

KNN, short for K-Nearest Neighbors, is one of the simplest classification algorithms in supervised machine learning. It is a basic and very popular algorithm, and it is also called a lazy learner because it does almost no work at training time and defers all the computation to prediction time.

### Features of KNN algorithm

• One of the most common and simplest classification algorithms, yet it can give highly effective results.
• KNN can also be used for regression, by taking the average of the neighbors' values.
• Easy to interpret.
• It can also be used to fill in missing values.
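As a quick illustration of the regression use mentioned above, here is a minimal sketch (pure Python, one-dimensional, with made-up sample data) that predicts a value by averaging the targets of the k nearest neighbors:

```python
# Minimal KNN regression sketch: predict by averaging the target values
# of the k nearest training points. Data and k are illustrative assumptions.

def knn_regress(train_x, train_y, query, k=3):
    # Sort training points by distance to the query (1-D case).
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    # Average the target values of the k closest points.
    return sum(y for _, y in nearest) / k

xs = [1.0, 2.0, 3.0, 10.0]   # hypothetical inputs
ys = [1.0, 2.0, 3.0, 10.0]   # hypothetical targets
print(knn_regress(xs, ys, 2.5, k=2))  # averages the targets of x=2 and x=3
```

The same neighbor search drives both classification (majority vote) and regression (average), which is why the algorithm covers both tasks.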

The KNN algorithm depends entirely on the majority class among the neighbors. K represents the number of neighbors. In the picture below we take k = 5.

In our dataset, we have positive and negative data points. They are plotted as above, and we took k = 5.

Among the positive points, we draw a circle around the green point: it is surrounded entirely by positive points, so the green point is also classified as positive.

Among the negative points, we draw a circle around the yellow point: it is surrounded entirely by negative points, so the yellow point is also classified as negative.

Now the problem is the purple point. It is surrounded by both positive and negative points, so we cannot immediately decide which class it belongs to.

## Step by step through the KNN algorithm:

#### Step – 1:

Choose the number k of neighbors. This is the step we need to take the most care with, because it largely decides the algorithm's performance.

Note: For binary classification, k should be an odd integer; otherwise the vote can tie and we cannot tell which class the new data point belongs to. If k = 4, we sometimes get 2 positive and 2 negative neighbors.
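The tie problem with an even k can be seen with a toy vote count (a sketch; the neighbor labels are made up):

```python
from collections import Counter

# With k = 4, the neighbor labels can split 2-2, leaving no majority.
neighbor_labels = ["+", "+", "-", "-"]   # hypothetical 4 nearest neighbors
counts = Counter(neighbor_labels)
top_two = counts.most_common(2)
tie = len(top_two) == 2 and top_two[0][1] == top_two[1][1]
print(counts, "-> tie!" if tie else f"-> {top_two[0][0]}")
```

With an odd k, a two-class vote can never split evenly, so a majority always exists.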

#### Step – 2:

Now take the k nearest neighbors of the new data point, based on the Euclidean distance or the Manhattan distance. Euclidean distance is the most commonly used distance in machine learning algorithms.
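Both distances are one-liners; here is a small sketch of each for points given as coordinate tuples:

```python
import math

def euclidean(a, b):
    # Straight-line distance: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # City-block distance: sum of the absolute differences per coordinate.
    return sum(abs(x - y) for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))  # 5.0
print(manhattan((0, 0), (3, 4)))  # 7
```

Manhattan distance is sometimes preferred in high dimensions or when features move along axis-aligned grids; Euclidean is the default choice otherwise.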

#### Step – 3:

Now count the class labels among the k neighbors; the new data point is assigned to the majority class. For example, with k = 5, if 2 neighbors are negative and 3 are positive, the new data point is classified as positive.
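Putting the three steps together, here is a minimal from-scratch sketch (the 2-D training points and labels are hypothetical, invented for the example):

```python
import math
from collections import Counter

def knn_classify(train, query, k=5):
    """train: list of ((x, y), label) pairs. Returns the majority label
    among the k nearest neighbors by Euclidean distance."""
    # Step 2: sort training points by distance to the query point.
    by_dist = sorted(train, key=lambda p: math.dist(p[0], query))
    # Step 3: majority vote over the labels of the k closest points.
    labels = [label for _, label in by_dist[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical data: positives cluster near (1, 1), negatives near (5, 5).
train = [((1, 1), "+"), ((1, 2), "+"), ((2, 1), "+"),
         ((5, 5), "-"), ((6, 5), "-")]
print(knn_classify(train, (1.5, 1.5), k=5))  # 3 "+" vs 2 "-" -> "+"
```

Note that all the work happens inside `knn_classify` at query time; there is no training phase at all, which is exactly why KNN is called a lazy learner.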

### Cautions:

• Affected by outliers.
• Does not perform well on noisy or random datasets.
• Time and space complexity are very high: all of the training data must be stored while the model is in production.
• Not well suited for low-latency applications.

Okay! I hope you now understand how the KNN algorithm works; check out all of the machine learning algorithms. Please comment below if you have any queries about the K-NN algorithm. Thank you!
