[Project 2] Day 5: Intro to Logistic Regression

Today I started learning logistic regression:

  • It is used when you are analyzing datasets in which one or more independent variables determine a categorical outcome.
  • It is primarily used for binary classification problems, where the goal is to predict one of two outcomes, such as whether an email is spam or not spam, whether a customer will buy a product or not, or whether a student will pass or fail an exam.
  • This type of statistical model (also known as logit model) is often used for classification and predictive analytics.
  • Logistic regression estimates the probability of an event occurring, such as voted or didn’t vote, based on a given dataset of independent variables.
  • Since the outcome is a probability, the dependent variable is bounded between 0 and 1.
  • In logistic regression, a logit transformation is applied on the odds—that is, the probability of success divided by the probability of failure. This is also commonly known as the log odds, or the natural logarithm of odds, and this logistic function is represented by the following formulas:

p_i = 1/(1 + exp(-(Beta_0 + Beta_1*X_1 + … + Beta_k*X_k)))

logit(p_i) = ln(p_i/(1 - p_i)) = Beta_0 + Beta_1*X_1 + … + Beta_k*X_k
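To make sure I understood how the two formulas fit together, I sketched them in Python (my own toy example, not from any article or library):

```python
import math

def sigmoid(z):
    """Logistic function: maps a log-odds value z to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

def logit(p):
    """Logit (log-odds): ln(p / (1 - p)), the inverse of the logistic function."""
    return math.log(p / (1 - p))

# The two functions are inverses of each other: applying logit to the
# probability produced by sigmoid recovers the original log-odds value.
z = 0.8
p = sigmoid(z)           # probability corresponding to log-odds of 0.8
print(round(logit(p), 6))
```

This is why the model can be written either way: as a probability via the logistic function, or as a linear equation in the log odds.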

  • In this logistic regression equation, logit(p_i) is the dependent or response variable and the X's are the independent variables.
  • The beta parameter, or coefficient, in this model is commonly estimated via maximum likelihood estimation (MLE).
  • This method tests different values of beta through multiple iterations to optimize for the best fit of log odds.
  • All of these iterations produce the log likelihood function, and logistic regression seeks to maximize this function to find the best parameter estimate.
  • Once the optimal coefficient (or coefficients, if there is more than one independent variable) is found, the conditional probability for each observation can be calculated; these are the model's predicted probabilities, and the sum of their logs is the log likelihood being maximized.
  • For binary classification, a probability below .5 predicts 0, while a probability of .5 or above predicts 1.
  • After the model has been computed, it’s best practice to evaluate how well the model predicts the dependent variable, which is called goodness of fit.
  • The Hosmer–Lemeshow test is a popular method to assess model fit.
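I tried to tie the MLE and thresholding bullets together with a minimal sketch: gradient ascent on the log likelihood for a single-feature model, then predicting with the .5 cutoff. The data, learning rate, and iteration count are all made up for illustration; in practice a library solver would do this.

```python
import math

def sigmoid(z):
    """Logistic function: converts log-odds into a probability."""
    return 1 / (1 + math.exp(-z))

# Hypothetical toy data: one feature x and a binary label y.
# Larger x values tend to go with y = 1, but the classes overlap.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 1), (0.0, 0),
        (0.5, 1), (1.0, 0), (1.5, 1), (2.0, 1)]

# Maximize the log likelihood sum(y*ln(p) + (1-y)*ln(1-p)) by gradient
# ascent on beta_0 (intercept) and beta_1 (slope). Each iteration nudges
# the coefficients toward a better fit of the log odds.
b0, b1 = 0.0, 0.0
lr = 0.1  # learning rate (assumed; small enough to converge on this data)
for _ in range(5000):
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in data)
    g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in data)
    b0 += lr * g0
    b1 += lr * g1

def predict(x):
    """Predict 1 when the estimated probability is .5 or above, else 0."""
    return 1 if sigmoid(b0 + b1 * x) >= 0.5 else 0

print(predict(-2.0), predict(2.0))
```

The loop is the "multiple iterations" idea from the bullets above: each pass re-evaluates the fit and adjusts the betas until the log likelihood stops improving.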
