**Logistic regression** is a classification algorithm, not a regression technique. For example, men and women are two categories, and we can classify a person into one of them based on features like **hair_length**, **height**, and **weight**.

Many people are confused about the difference between linear and logistic regression. If you don't know what linear regression is, please check here first: **Linear regression in machine learning**.

Let's talk about a few features of logistic regression.

## Features of Logistic Regression

- Easy to understand
- Robust to outliers
- Provides feature importance
- Fast to train and predict
- Works well on non-linear data (with feature transformation)

**Because of its speed, logistic regression is highly used in low-latency applications; its training time is also very low.**

In logistic regression, we assume that the data is linearly separable, so that we can draw a line or plane between the points. Let's assume that we have positive and negative points in our dataset and that they are linearly separable.

Here, **w** is the normal to the plane, and **π** is the plane that separates the green and red points. Green points are **+ve** and red points are **-ve**.

Now let's dive into the details of logistic regression and look at the picture below.

Here, **d_i** is the distance from a +ve (green) point to the plane, and **d_j** is the distance from the plane to a -ve (red) point. Notice that two points lie on the opposite side of the plane; let's call them misclassified points.

Let's consider **||w|| = 1** (a unit vector), so **d_i = w^T X_i**, where **w** and **X_i** point in the same direction, and **d_j = w^T X_j**, where **w** and **X_j** point in opposite directions.

- If **w^T X_i > 0**, then the model classifies the point as +ve, or **green**.
- If **w^T X_j < 0**, then the model classifies the point as -ve, or **red**.
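The sign rule above is easy to sketch in code. Here `w` is a hypothetical unit normal and the 2-D points are made up purely for illustration:

```python
import numpy as np

# Hypothetical unit normal to the plane pi (||w|| = 1)
w = np.array([0.6, 0.8])

# A few made-up 2-D points
points = np.array([
    [2.0, 1.0],    # lies on the +ve side of the plane
    [-1.0, -2.0],  # lies on the -ve side of the plane
])

# Signed distance of each point to the plane: d = w^T x
signed_distances = points @ w

# Classify by the sign of w^T x: > 0 -> +ve (green), < 0 -> -ve (red)
labels = np.where(signed_distances > 0, "+ve (green)", "-ve (red)")
for d, lab in zip(signed_distances, labels):
    print(f"w^T x = {d:+.2f} -> {lab}")
```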


#### Objective:

Here our main objective is to minimize misclassifications and maximize the number of correct predictions, i.e. points with **y_i * w^T X_i > 0**.
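With labels y_i in {+1, -1}, a point is correctly classified exactly when y_i * w^T X_i > 0, so the objective can be checked by counting. A small sketch with made-up data (the weight vector, points, and labels are all illustrative):

```python
import numpy as np

# Made-up weight vector, points, and {+1, -1} class labels
w = np.array([1.0, -1.0])
X = np.array([[2.0, 0.5], [0.5, 2.0], [-1.0, 1.0], [1.0, -1.0]])
y = np.array([+1, -1, -1, -1])

# y_i * w^T X_i > 0  <=>  the sign of the prediction matches the label
margins = y * (X @ w)
n_correct = int(np.sum(margins > 0))
print(f"correctly classified: {n_correct} / {len(y)}")  # 3 of 4 here
```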

#### How the LR model classifies the data:

**Case 1**: The actual class label **y_i** is **+ve** (a **green** point), and the point lies on the same side of the plane as **w**, so **w^T X_i > 0** and therefore **y_i * w^T X_i > 0**. The model classifies it as a **+ve** (green) point [**+ * + = +**]. These points are correctly classified.

**Case 2:** The actual class label **y_i** is **-ve** (a **red** point), and the point lies on the opposite side of the plane from **w**, so **w^T X_i < 0** and therefore **y_i * w^T X_i > 0**. The model classifies it as a **-ve** (red) point [**– * – = +**]. These points are correctly classified.

**Case 3:** The actual class label **y_i** is **-ve** (a **red** point), but the point lies on the +ve side of the plane, so **w^T X_i > 0** and therefore **y_i * w^T X_i < 0**. The model classifies it as a **+ve** (green) point [**– * + = –**]. These points are misclassified.

**Case 4:** The actual class label **y_i** is **+ve** (a **green** point), but the point lies on the -ve side of the plane, so **w^T X_i < 0** and therefore **y_i * w^T X_i < 0**. The model classifies it as a **-ve** (red) point [**+ * – = –**]. These points are misclassified.
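All four cases reduce to the sign of the product y_i * w^T X_i. A quick check, with one made-up signed distance per case:

```python
# (case name, actual label y_i, signed distance w^T X_i)
cases = [
    ("Case 1", +1, +2.0),  # green point on the +ve side
    ("Case 2", -1, -1.5),  # red point on the -ve side
    ("Case 3", -1, +0.8),  # red point on the +ve side
    ("Case 4", +1, -1.2),  # green point on the -ve side
]

for name, y_i, wTx in cases:
    margin = y_i * wTx
    verdict = "correctly classified" if margin > 0 else "misclassified"
    print(f"{name}: y_i * w^T X_i = {margin:+.2f} -> {verdict}")
```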

#### Optimization:

Here D_{n} = [ X_{i}, Y_{i} ] is the dataset consists of features **X _{i} **and class labels

**Y**. The optimization equation of logistic regression is

_{i} X_{i}, Y_{i }is already given in the dataset. We need to maximize the sum using **W**.

- The value of **y_i * w^T X_i** is **+ve** if the point is correctly classified.
- The value of **y_i * w^T X_i** is **-ve** if the point is misclassified.

**Note:** Outlier points can affect the signed distances, so sometimes we get high accuracy but a very low **W\*** objective value, and sometimes a high **W\*** objective value but very low accuracy.
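This outlier problem is easy to see numerically: a single extreme point can dominate the sum of signed distances even when most points are misclassified-free. A toy sketch with made-up margins for two hypothetical planes on the same data:

```python
import numpy as np

# Margins y_i * w^T X_i for two hypothetical planes (made-up numbers)
# Plane A: classifies all 5 points correctly, with modest margins
margins_a = np.array([1.0, 0.8, 1.2, 0.9, 1.1])

# Plane B: misclassifies one point, but a huge outlier inflates its sum
margins_b = np.array([1.0, 0.8, -0.5, 0.9, 100.0])

print("Plane A: accuracy", np.mean(margins_a > 0), "sum", margins_a.sum())
print("Plane B: accuracy", np.mean(margins_b > 0), "sum", margins_b.sum())
# Plane B wins on the raw sum despite the lower accuracy
```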

^{*}The **squashing** or **sigmoid function** helps us to squash all values in the range of** 0 to **1. Instead of using signed distance we

- if the signed distance is small, then use as it is.
- if the signed distance is high, then make it small.

The sigmoid function always squashes the signed distances below 1 which are high values.

where **X = y _{i}*w^{T}X_{i} **.

#### Properties of the sigmoid function

- The **minimum** value of the sigmoid function is **0**.
- The **maximum** value of the sigmoid function is **1**.
- If **x = 0**, then **f(x) = 0.5**.

Now the optimization function of logistic regression becomes

**W\* = argmax_W Σ_i σ(y_i * w^T X_i)**

**or**, equivalently (since log is monotonic),

**W\* = argmax_W Σ_i log σ(y_i * w^T X_i)**

Here our task is to find the **W** that maximizes this optimization function. Among candidate planes **π_i** with weights **W_i**, we pick the **W** that gives the maximum value of the objective; that plane's equation is our classifier.

I hope you now understand the topic of logistic regression. Please check out more machine learning algorithms here: **Machine learning Algorithms**.

**Please show your appreciation with a comment or subscribe to our newsletter below. If you have any doubts about the concept, please feel free to ask below. Thank you!**