Probability

Before diving into logistic regression, I am going to review probability.

What is probability?

Probability is the number of outcomes of interest divided by the number of all possible outcomes, assuming every outcome is equally likely. Mathematically, this is represented by:

P(A) = \frac{\text{outcomes of interest}}{\text{all possible outcomes}}

where P(A) is the probability of event A. A probability is a continuous value that ranges from zero to one, e.g. 0.1, 0.23, 0.99996, or 1.

For example, a bag has five red marbles and six blue marbles. The probability of randomly picking a red marble, P(red marble), would be:

P(\text{picking a red marble}) = \frac{5}{11}

That is five red marbles over all the marbles in the bag: five red marbles plus six blue marbles equals eleven marbles.

Likewise, starting from a full bag of eleven marbles, the probability of randomly picking a blue marble would be:

P(\text{picking a blue marble}) = \frac{6}{11}
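
The two marble calculations above can be sketched in a few lines of Python. This is only an illustration: the `bag` dictionary and the `probability` helper are names I made up for this example, and it assumes every marble is equally likely to be picked.

```python
from fractions import Fraction

# Bag contents from the example above: five red and six blue marbles.
bag = {"red": 5, "blue": 6}


def probability(color, bag):
    """Outcomes of interest divided by all possible outcomes."""
    total = sum(bag.values())  # 5 + 6 = 11 marbles
    return Fraction(bag[color], total)


p_red = probability("red", bag)
p_blue = probability("blue", bag)

print(p_red)           # 5/11
print(p_blue)          # 6/11
print(p_red + p_blue)  # 1
```

Using `Fraction` instead of floats keeps the results in the same exact form as the formulas, and the last line shows that the probabilities of all possible outcomes sum to one.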

Probability can be of two types: independent or dependent. Two events A and B are independent when the occurrence of one does not change the probability of the other. P(A) does not depend on whether B happened, and vice versa.

Dependent probability is commonly written P(A | B), the probability of A given B. In other words, it is the probability of A given that B is true. Recycling the example above, consider a bag that has five red marbles and six blue marbles. In addition, two of the red marbles are magic and one of the blue marbles is magic. To simplify, let A be picking a red marble and B be picking a blue marble.

What is the probability of randomly picking a magic marble given that B is true?

P(\text{picking magic marble} | B) = \frac{1}{6}

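Only one blue magic marble is in the bag. Given that a blue marble was picked, remember that only six blue marbles are in the bag. The probability of picking a magic marble given that B is true therefore reduces to one magic marble out of six blue marbles. Without that condition, the probability would be three magic marbles out of all eleven marbles.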
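
As a rough check of the conditional probability above, the bag can be enumerated marble by marble. The tuple layout and the helper names (`conditional_probability`, `is_magic`, `is_blue`) are my own choices for this sketch, not part of any library.

```python
from fractions import Fraction

# Each marble is a (color, is_magic) pair; counts follow the example above:
# two magic red, three plain red, one magic blue, five plain blue.
marbles = (
    [("red", True)] * 2
    + [("red", False)] * 3
    + [("blue", True)] * 1
    + [("blue", False)] * 5
)


def is_magic(marble):
    return marble[1]


def is_blue(marble):
    return marble[0] == "blue"


def conditional_probability(event, given, outcomes):
    """P(event | given): keep only outcomes where `given` holds, then count `event`."""
    given_outcomes = [o for o in outcomes if given(o)]
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))


p_magic_given_blue = conditional_probability(is_magic, is_blue, marbles)
p_magic = Fraction(sum(1 for m in marbles if is_magic(m)), len(marbles))

print(p_magic_given_blue)  # 1/6
print(p_magic)             # 3/11
```

Conditioning on B shrinks the set of possible outcomes from eleven marbles to the six blue ones, which is why the two printed values differ.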

Since this review is a primer for a deep dive into logistic regression, I want to mention one more fact that will help in understanding it: the probability of events A and B both occurring, assuming A and B are independent events, is defined as:

P(A \text{ and } B) = P(A) \cdot P(B)
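
To see the product rule at work, here is a small sketch that reuses the bag of five red and six blue marbles. It relies on an assumption that is not part of the examples above: the first marble is put back into the bag before the second draw, so the two draws are independent. The variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

bag = ["red"] * 5 + ["blue"] * 6

# Assumption: the first marble is returned to the bag before the second draw,
# which makes the two draws independent events.
p_red = Fraction(bag.count("red"), len(bag))    # P(first draw is red)
p_blue = Fraction(bag.count("blue"), len(bag))  # P(second draw is blue)
p_product = p_red * p_blue                      # P(A) * P(B)

# Cross-check by listing all 11 * 11 equally likely (first, second) pairs.
pairs = list(product(bag, repeat=2))
hits = sum(1 for first, second in pairs if first == "red" and second == "blue")
p_enumerated = Fraction(hits, len(pairs))

print(p_product)     # 30/121
print(p_enumerated)  # 30/121
```

Multiplying the two probabilities gives the same answer as counting every pair of draws directly. If the first marble were not returned, the draws would no longer be independent and this simple product would not apply.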

With probability as a base, I will start a series on defining and implementing a logistic regression algorithm.