[Project 1] Day 6: Understanding Linear Regression

  • In statistical analysis, ‘Regression’ is a method used to understand or determine the relationship between 2 or more variables. Using this relationship, we can estimate an unknown value which depends on our predicting variable.
    *A variable in statistics represents any quantity that can be measured or counted.
    There are 2 main types of variables:
    a) Categorical variables: the variables which represent groupings
    b) Quantitative variables: the variables which use numbers to represent measured or counted amounts
    In the case of the CDC data we have 3 quantitative variables, namely % Diabetes, % Inactivity and % Obesity. The categorical variables are ‘STATE’ and ‘COUNTY’.
  • In regression, we classify the variables we want to analyze as the ‘dependent variable’ (the quantity we want to predict) and the ‘independent variable’ (the quantity we predict from).
  • As the name suggests, ‘Linear Regression’ assumes that there is a linear relationship between the variables. The end result is a straight line drawn through the data points on a plot, chosen so that it best describes the relationship.
  • Simple Linear Regression fits a straight line in 2 dimensions to find the relationship between 2 variables.
    In our project we would be doing simple linear regression to show the relationships between % Diabetes & % Inactivity, % Diabetes & % Obesity and % Inactivity & % Obesity respectively. Each relationship is modeled by the equation y = β₀ + β₁x₁ + ε, where β₀ is the intercept, β₁ is the slope and ε is the error term.
  • Multiple Linear Regression, on the other hand, would fit a plane in a 3 dimensional plot, using the data of % Diabetes, % Inactivity and % Obesity.
  • Project work: On meeting with the group today, we decided to explore our options after calculating the R^2 value for the data. We have begun the ‘feature engineering’ process. On my part, I started feature scaling and am exploring the statistical tests which will be helpful for the project.
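
To make the ideas above concrete, here is a minimal sketch of simple linear regression, the R^2 value and one common form of feature scaling (standardization) in plain Python. This is not our project code: the function names (fit_simple_linear, r_squared, standardize) are hypothetical, and the numbers are made-up illustrative values, not the CDC data.

```python
from statistics import mean, pstdev


def fit_simple_linear(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = sum of co-deviations divided by sum of squared x deviations
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def r_squared(x, y, b0, b1):
    """Share of the variation in y that the fitted line explains."""
    y_bar = mean(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def standardize(values):
    """Feature scaling: shift to mean 0 and rescale to standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Made-up values for illustration only (NOT taken from the CDC data)
inactivity = [14.0, 16.5, 18.0, 20.5, 23.0]  # independent variable (x)
diabetes = [6.1, 7.0, 7.4, 8.6, 9.2]         # dependent variable (y)

b0, b1 = fit_simple_linear(inactivity, diabetes)
r2 = r_squared(inactivity, diabetes, b0, b1)
print(f"intercept b0 = {b0:.3f}, slope b1 = {b1:.3f}, R^2 = {r2:.3f}")
print("scaled inactivity:", [round(v, 3) for v in standardize(inactivity)])
```

The same fit could be obtained from a statistics library; writing it out by hand just shows where β₀, β₁ and R^2 come from.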
