Bootstrap Regression

Two bootstrap techniques for getting confidence intervals for regression coefficients are the case (or pair) bootstrap and the residual bootstrap. Case Bootstrap: a case, sometimes called a pair, is a response value paired with its predictor values. For each of the B bootstrap samples, draw cases with replacement from the original data to form the corresponding bootstrap sample. […]
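
A minimal sketch of the case bootstrap described above, in Python rather than anything taken from the post itself. The synthetic data, the helper names `fit_slope` and `case_bootstrap_ci`, the choice of resampling as many cases as the original data holds, and the percentile interval are all assumptions made for illustration.

```python
import random
import statistics


def fit_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) cases."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return sxy / sxx


def case_bootstrap_ci(pairs, B=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the slope via the case bootstrap."""
    rng = random.Random(seed)
    n = len(pairs)  # assumption: each resample has the same size as the data
    slopes = []
    for _ in range(B):
        # Resample whole (x, y) cases with replacement, keeping pairs intact.
        sample = rng.choices(pairs, k=n)
        if len({x for x, _ in sample}) < 2:
            continue  # slope is undefined when every resampled x is identical
        slopes.append(fit_slope(sample))
    slopes.sort()
    lower = slopes[int((alpha / 2) * len(slopes))]
    upper = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lower, upper


# Hypothetical data: y = 2 + 3x plus unit-variance Gaussian noise.
data_rng = random.Random(42)
cases = [(x, 2 + 3 * x + data_rng.gauss(0, 1)) for x in range(30)]

slope_hat = fit_slope(cases)
ci_low, ci_high = case_bootstrap_ci(cases)
print(f"slope estimate: {slope_hat:.3f}")
print(f"95% percentile interval: ({ci_low:.3f}, {ci_high:.3f})")
```

The residual bootstrap differs in that it keeps the predictors fixed and resamples the fitted model's residuals instead of whole cases.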

Definition of Statistics?

Depends on who you ask. I am taking formal statistics training at the moment and this is my observation… I see statistics defined as either the science of uncertainty or the extraction of information from data. I found that if the person comes from a mathematical background they tend to gravitate towards the science of uncertainty. If they have an engineering […]

Leave One Out Cross Validation

Cross validation is one of many methods for estimating out-of-sample error for predictive models. There are many flavors of cross validation: hold-out, k-fold, leave-one-out (LOOCV), etc. I whipped up a neat little visualization script in R to help understand LOOCV. Figure 1 shows how LOOCV works. The model is trained on the […]
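
A rough sketch of the LOOCV loop, written in Python and not the R visualization script mentioned above. The simple linear model, the squared-error loss, and the names `fit_line` and `loocv_mse` are assumptions chosen to keep the example short.

```python
import statistics


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def loocv_mse(xs, ys):
    """Leave-one-out cross validation estimate of mean squared error."""
    squared_errors = []
    for i in range(len(xs)):
        # Hold out observation i and train on everything else.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        # Score the model on the single held-out observation.
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return statistics.fmean(squared_errors)


# Hypothetical data, roughly linear with some scatter.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
print(f"LOOCV mean squared error: {loocv_mse(x_values, y_values):.4f}")
```

Each observation is used as the test set exactly once, so the model is refit as many times as there are observations.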

Standard Error of the Mean – Derivation

The standard error (SE) is an amazingly useful statistical device for defining confidence intervals. In layman’s terms, the standard error is a measure of how far a sample statistic is likely to be from its true value. This post will go through the process of deriving the SE of the mean. I have always wanted to dig deeper into where the […]
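
As a small illustration, assuming the usual estimate of the SE of the mean (the sample standard deviation divided by the square root of the sample size), a Python sketch might look like this. The function name `standard_error` and the sample values are made up for the example.

```python
import math
import statistics


def standard_error(values):
    """Estimated standard error of the mean: s / sqrt(n)."""
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical sample.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(sample)
mean = statistics.fmean(sample)

# Rough 95% interval using the normal approximation (1.96 standard errors).
print(f"mean = {mean:.3f}, SE = {se:.3f}")
print(f"approx. 95% CI: ({mean - 1.96 * se:.3f}, {mean + 1.96 * se:.3f})")
```

With a sample this small, a t-based multiplier would be the more careful choice than 1.96.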


In lieu of diving into logistic regression, I am going to review probability. What is probability? Outcomes of interest versus all possible outcomes. Mathematically this is represented by: P(A) = (number of outcomes of interest) / (number of all possible outcomes), where P(A) is the probability. Numerical values for probability range continuously from zero to one, e.g. (0.1, 0.99996, 0.23, 1). For example a bag […]