# Logistic Regression: Calculating a Probability

Many problems require a probability estimate as output. Logistic regression is an extremely efficient mechanism for calculating probabilities. Practically speaking, you can use the returned probability in either of the following two ways:

• "As is"
• Converted to a binary category.

Let's consider how we might use the probability "as is." Suppose we create a logistic regression model to predict the probability that a dog will bark during the middle of the night. We'll call that probability:

$p(bark | night)$

If the logistic regression model predicts $$p(bark | night) = 0.05$$, then over a year, the dog's owners should be startled awake approximately 18 times:

\begin{align} startled &= p(bark | night) \cdot nights \\ &= 0.05 \cdot 365 \\ &\approx 18 \end{align}
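The same arithmetic can be checked in a few lines of Python. This is only a sketch of the calculation above; the variable names are illustrative and not part of any library.

```python
# Expected number of nights on which the owners are startled awake,
# computed as (probability per night) * (number of nights).
p_bark_given_night = 0.05  # model output from the example above
nights = 365               # one year

expected_startled = p_bark_given_night * nights
print(f"{expected_startled:.2f}")   # about 18.25
print(round(expected_startled))     # 18
```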

In many cases, you'll map the logistic regression output into the solution to a binary classification problem, in which the goal is to correctly predict one of two possible labels (e.g., "spam" or "not spam"). A later module focuses on that.

You might be wondering how a logistic regression model can ensure output that always falls between 0 and 1. As it happens, a sigmoid function, defined as follows, produces output having those same characteristics:

$$y = \frac{1}{1 + e^{-z}}$$

The sigmoid function yields the following plot:

Figure 1: Sigmoid function.
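To get a feel for the shape of the curve without the plot, the sigmoid can be evaluated at a few points. The following is a minimal sketch using only Python's standard library. The two-branch form is a common way to avoid overflow in `math.exp` for large negative inputs, and is an implementation choice rather than something prescribed here.

```python
import math

def sigmoid(z: float) -> float:
    """Return 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Large negative inputs approach 0, large positive inputs approach 1,
# and z = 0 maps to exactly 0.5.
for z in (-6.0, -2.0, 0.0, 2.0, 6.0):
    print(f"z = {z:+.1f}  sigmoid(z) = {sigmoid(z):.4f}")
```

Note that in floating-point arithmetic the result can round to exactly 0.0 or 1.0 for extreme inputs, even though the mathematical function never reaches those values.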

If $$z$$ represents the output of the linear layer of a model trained with logistic regression, then $$sigmoid(z)$$ will yield a value (a probability) between 0 and 1. In mathematical terms:

$$y' = \frac{1}{1 + e^{-z}}$$

where:

• $$y'$$ is the output of the logistic regression model for a particular example.
• $$z = b + w_1x_1 + w_2x_2 + \ldots + w_Nx_N$$
• The $$w$$ values are the model's learned weights, and $$b$$ is the bias.
• The $$x$$ values are the feature values for a particular example.
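The definitions above can be sketched as a short function that computes $$z$$ from the bias, weights, and features, and then applies the sigmoid. The function name `predict_probability` and the numeric values are hypothetical, chosen only so the arithmetic is easy to follow; they do not come from a trained model.

```python
import math

def predict_probability(weights, bias, features):
    """Compute y' = sigmoid(b + w_1*x_1 + ... + w_N*x_N) for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values for illustration only.
bias = 1.0
weights = [2.0, -1.0, 5.0]
features = [0.0, 10.0, 2.0]

# z = 1 + (2)(0) + (-1)(10) + (5)(2) = 1
y_prime = predict_probability(weights, bias, features)
print(round(y_prime, 3))  # 0.731
```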

Note that $$z$$ is also referred to as the log-odds: inverting the sigmoid shows that $$z$$ equals the log of the probability of the $$1$$ label (e.g., "dog barks") divided by the probability of the $$0$$ label (e.g., "dog doesn't bark"):

$$z = \log\left(\frac{y}{1-y}\right)$$
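To see where this comes from, solve $$y = \frac{1}{1 + e^{-z}}$$ for $$z$$: rearranging gives $$e^{-z} = \frac{1-y}{y}$$, and taking the log of both sides yields the expression above. The sketch below checks the round trip numerically; the helper name `log_odds` is illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(y):
    """Inverse of the sigmoid; defined only for 0 < y < 1."""
    return math.log(y / (1.0 - y))

# Applying log_odds to sigmoid(z) should recover z (up to rounding error).
for z in (-3.0, -0.5, 0.0, 1.0, 4.0):
    y = sigmoid(z)
    print(f"z = {z:+.1f}  y = {y:.4f}  log_odds(y) = {log_odds(y):+.4f}")
```

A probability of 0.5 corresponds to log-odds of 0, meaning both labels are equally likely.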

Here is the sigmoid function with ML labels:

Figure 2: Logistic regression output.
