Before diving into logistic regression, I am going to review probability.
What is probability?
Probability is the number of outcomes of interest versus the number of all possible outcomes. Mathematically this is represented by:
P(A) = (number of outcomes of interest) / (number of all possible outcomes)
where P(A) is the probability of event A. Probability is a continuous value that ranges from zero to one (e.g., 0.1, 0.99996, 0.23, 1).
For example, a bag has five red marbles and six blue marbles. The probability of randomly picking a red marble, P(red marble), would be:
P(red marble) = 5 / (5 + 6) = 5 / 11 ≈ 0.45
That is, five red marbles over all the possible marbles in the bag: five red marbles plus six blue marbles = eleven marbles.
Likewise, starting from a full bag of eleven marbles, the probability of randomly picking a blue marble would be:
P(blue marble) = 6 / 11 ≈ 0.55
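As a quick sanity check, here is a minimal Python sketch of the marble example. The `bag` dictionary and the `probability` helper are names I made up for illustration; the code simply divides the outcomes of interest by all possible outcomes, using exact fractions so the results are easy to compare by eye.

```python
from fractions import Fraction

# Bag contents from the example: five red and six blue marbles.
bag = {"red": 5, "blue": 6}


def probability(color, bag):
    """Outcomes of interest divided by all possible outcomes."""
    total = sum(bag.values())  # eleven marbles in a full bag
    return Fraction(bag[color], total)


p_red = probability("red", bag)
p_blue = probability("blue", bag)

print(p_red)           # 5/11
print(p_blue)          # 6/11
print(p_red + p_blue)  # 1, since every marble is either red or blue
```

Because red and blue are the only colors in the bag, the two probabilities add up to one.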
Probability can be of two types: independent or dependent. An independent probability is not affected by other events or their probabilities. P(A) does not depend on P(B), and vice versa, where A and B are any independent events.
Dependent probability is commonly written as P(A | B), the probability of A given B. In other words, the probability of A given that B is true. Reusing the example above, consider a bag with five red marbles and six blue marbles where, in addition, two of the red marbles are magic and one of the blue marbles is magic. To simplify, let A be picking a red marble and B be picking a blue marble.
What is the probability of randomly picking a magic marble given that B is true?
P(magic marble | B) = 1 / 6 ≈ 0.17
Only one magic blue marble is in the bag. Given that a blue marble was picked, remember that only six blue marbles are in the bag. The probability of picking a magic marble given that B is true therefore reduces to one magic marble out of six blue marbles.
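The same counting argument can be sketched in Python. The `counts` table and the `conditional_probability` helper are hypothetical names of my own, and the three non-magic red and five non-magic blue marbles are derived from the totals in the example. The sketch also computes the plain probability of a magic marble for comparison.

```python
from fractions import Fraction

# Marble counts from the example, keyed by (color, is_magic).
counts = {
    ("red", True): 2,
    ("red", False): 3,
    ("blue", True): 1,
    ("blue", False): 5,
}


def conditional_probability(counts, is_magic, given_color):
    """P(magic == is_magic | color == given_color), computed by counting."""
    # Restrict the sample space to marbles of the given color.
    given_total = sum(n for (color, _), n in counts.items() if color == given_color)
    return Fraction(counts[(given_color, is_magic)], given_total)


p_magic_given_blue = conditional_probability(counts, True, "blue")

# Unconditional probability of a magic marble, for comparison.
magic_total = sum(n for (_, magic), n in counts.items() if magic)
p_magic = Fraction(magic_total, sum(counts.values()))

print(p_magic_given_blue)  # 1/6
print(p_magic)             # 3/11
```

Knowing the marble is blue changes the chance that it is magic (1/6 instead of 3/11), which is what makes this a dependent probability.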
I am reviewing probability as a primer for a deep dive into logistic regression, so I want to mention one more fact that will help in understanding logistic regression: the probability of events A and B both occurring, assuming A and B are independent events, is defined as:
P(A and B) = P(A) × P(B)
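Here is a minimal sketch that checks the product rule for independent events, P(A and B) = P(A) × P(B), against brute-force counting. To get two independent events out of the marble bag, I am assuming two draws with replacement (the first marble goes back in the bag before the second draw); that setup is my own illustration and is not part of the example above.

```python
from fractions import Fraction
from itertools import product

# Full bag from the example: five red and six blue marbles.
bag = ["red"] * 5 + ["blue"] * 6

p_red = Fraction(bag.count("red"), len(bag))
p_blue = Fraction(bag.count("blue"), len(bag))

# Product rule for independent events:
# A = first draw is red, B = second draw is blue.
p_product = p_red * p_blue

# Brute-force check: enumerate every equally likely ordered pair of draws
# (with replacement) and count the pairs where A and B both happen.
pairs = list(product(bag, repeat=2))
hits = sum(1 for first, second in pairs if first == "red" and second == "blue")
p_counted = Fraction(hits, len(pairs))

print(p_product)  # 30/121
print(p_counted)  # 30/121
```

Without replacement the second draw would depend on the first, and the simple product of P(A) and P(B) would no longer apply.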
With probability as a base, I will start a series on defining and implementing a logistic regression algorithm.