Recall that in our linear regression example, Y consisted of continuous values (exam grades). In a classification problem, Y takes class labels (e.g. good or bad, a color, etc.). Let’s take another example: we want to find an algorithm that uses a person's age to predict whether or not they have a driving licence.

Our inputs are the ages of the people in our sample. The outputs are whether or not each person has a driving licence.

First, we have to convert Y into 0 or 1 so that we can work with it mathematically. Let’s say 0 when a person has no driving licence and 1 when they do.
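As a minimal sketch of this encoding step, assuming the raw answers are stored as the strings "yes" and "no" (the sample values and the names `ages`, `answers` and `encode_labels` are made up for illustration):

```python
# Hypothetical sample: ages and whether each person has a driving licence.
ages = [16, 17, 19, 24, 35, 52]
answers = ["no", "no", "yes", "no", "yes", "yes"]


def encode_labels(raw_answers):
    """Map "yes" to 1 and "no" to 0 so the labels can be used in calculations."""
    mapping = {"no": 0, "yes": 1}
    return [mapping[answer] for answer in raw_answers]


y = encode_labels(answers)
print(y)  # [0, 0, 1, 0, 1, 1]
```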

Our goal is to find an algorithm that correctly classifies our x. In our example, we want an algorithm that tells us whether a person is likely to have a driving licence, given their age.

In logistic regression, we assume that the dependent variable is binary (such a variable is also called a dummy variable) and is the outcome of a stochastic event. Here it means that the response can be either 0 or 1 (but not 0.3, for example). Thus, if we end up with a predicted Y equal to 0.3 for a certain x, it is turned into 0, meaning no driving licence. In logistic regression, we therefore proceed this way:

if predicted Y >= 0.5 => final predicted Y = 1

and if predicted Y < 0.5 => final predicted Y = 0
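The decision rule can be sketched in a few lines. The function name `to_class` and the `threshold` parameter are ours, and assigning a value of exactly 0.5 to class 1 is an assumption made here so that every value gets a label:

```python
def to_class(predicted_y, threshold=0.5):
    """Turn a predicted value between 0 and 1 into a final class label (0 or 1)."""
    # Assumption: a value exactly equal to the threshold is assigned to class 1.
    return 1 if predicted_y >= threshold else 0


print(to_class(0.3))  # 0: no driving licence
print(to_class(0.8))  # 1: has a driving licence
```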

Logistic regression works almost like linear regression. First, the boundary (hyperplane) that separates our data into two groups is linear, so its equation looks like the one used in linear regression:

Y = W0 + W1*X

The difference is that we want a predicted Y between 0 and 1 in order to make our prediction. To make this happen, logistic regression uses the logistic function (not the log function), which transforms our predicted Y into a value between 0 and 1.

This function, called the sigmoid or logistic function, is:

1 / (1 + e^{-Y} ) where Y = W0 + W1*X
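To see the formula at work, here is a small sketch that computes the linear part and then passes it through the sigmoid. The weights `w0` and `w1` are picked by hand purely for illustration (they are not fitted to any data), and the helper names `linear_score` and `sigmoid` are ours:

```python
import math


def linear_score(x, w0, w1):
    """The linear part: Y = W0 + W1*X."""
    return w0 + w1 * x


def sigmoid(score):
    """The logistic function 1 / (1 + e^{-Y}); the result lies between 0 and 1."""
    # Note: math.exp can overflow for extremely negative scores (below about -709);
    # that is not a concern for the small values used in this illustration.
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical weights chosen only for illustration (not learned from data).
w0, w1 = -9.0, 0.5

for age in (16, 18, 30):
    score = linear_score(age, w0, w1)
    print(age, score, round(sigmoid(score), 3))
```

With these made-up weights, an age of 18 gives a linear score of 0, and the sigmoid of 0 is exactly 0.5; younger ages fall below 0.5 and older ages above it.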

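It looks like an S-shaped curve: it approaches 0 for very negative Y, passes through 0.5 at Y = 0, and approaches 1 for large positive Y.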

As you can see, every predicted Y we obtain lies between 0 and 1.

What we get is the probability that Y = 1 for input x, given the parameters W0 and W1.
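Written out as an equation, this reads as follows. The notation P(Y = 1 | x; W0, W1) is our own shorthand, and the second expression simply uses the fact that the two class probabilities must sum to 1:

```latex
P(Y = 1 \mid x;\, W_0, W_1) = \frac{1}{1 + e^{-(W_0 + W_1 x)}}
\qquad
P(Y = 0 \mid x;\, W_0, W_1) = 1 - P(Y = 1 \mid x;\, W_0, W_1)
```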