Regression Analysis using SPSS Delivered by Hiren Kakkad | CEO & Co-founder Stat Modeller, Vadodara www.statmodeller.com

Post on 12-Jan-2022


Page 1: Regression Analysis using SPSS - Stat Modeller

Regression Analysis using SPSS

Delivered by Hiren Kakkad | CEO & Co-founder

Stat Modeller, Vadodara

www.statmodeller.com

Page 2: Regression Analysis using SPSS - Stat Modeller

Correlation

• In real life, we frequently find that a group of two or more variables move together.
• When they move together, we can say they are correlated.
• The variables may be (Y1, Y2), (X1, X2) or (X, Y).

Page 3: Regression Analysis using SPSS - Stat Modeller

Some Examples

Independent Variables | Dependent Variables
Years of Experience | Salary
Breakdown | Equipment Life
Credit Score | Loan Amount

Page 4: Regression Analysis using SPSS - Stat Modeller

Types of Correlation

Curvilinear relationship

Page 5: Regression Analysis using SPSS - Stat Modeller

Correlation

• If we find a relationship between the x and y variables, we can develop a model that predicts y when x is known.
• That prediction model is known as Regression Analysis.

Page 6: Regression Analysis using SPSS - Stat Modeller

Regression Analysis

• Whether a relationship exists: determine whether the independent variables explain a significant variation in the dependent variable
• Strength of the relationship: determine how much of the variation in the dependent variable can be explained by the independent variables
• Mathematical equation: determine the structure or form of the relationship
• Prediction of new values: predict the values of the dependent variable

Page 7: Regression Analysis using SPSS - Stat Modeller

Regression Analysis

Simple Linear Regression
• Only one dependent and one independent variable
• Predict CO2 vs. Engine Size
• Independent variable (x): Engine Size
• Dependent variable (y): CO2 Emissions

Multiple Linear Regression
• One dependent variable and multiple independent variables
• Predict CO2 vs. Engine Size and Cylinders
• Independent variables (xs): Engine Size, Cylinders
• Dependent variable (y): CO2 Emissions

Page 8: Regression Analysis using SPSS - Stat Modeller

Simple Linear Regression

Page 9: Regression Analysis using SPSS - Stat Modeller

Assumptions of Regression Analysis

• The random error ε is normally distributed
• The correlation between the dependent variable y and the independent variable x should be very high
• The data must be collected at random

Page 10: Regression Analysis using SPSS - Stat Modeller

Regression Analysis Understanding

Page 11: Regression Analysis using SPSS - Stat Modeller

Regression Analysis Understanding

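$$y = \beta_0 + \beta_1 x_1 + \varepsilon$$

• $y$: dependent variable
• $\beta_0$: intercept
• $\beta_1$: coefficient
• $x_1$: independent variable
• $\varepsilon$: error

Example: ice-cream sales (y) against temperature (x)

$$\text{Ice Cream Sales} = \beta_0 + \beta_1\,\text{Temp} + \varepsilon$$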

Page 12: Regression Analysis using SPSS - Stat Modeller

Data of Ice Cream Sales

Temperature °C (x) | Ice Cream Sales (y) | x − x̄ | y − ȳ | (x − x̄)(y − ȳ) | (x − x̄)² | (y − ȳ)²
14.2 | 210 | -4.5 | -187.5 | 839.1 | 20.0 | 35156.3
16.4 | 320 | -2.3 | -77.5 | 176.3 | 5.2 | 6006.3
11.9 | 180 | -6.8 | -217.5 | 1473.6 | 45.9 | 47306.3
15.2 | 327 | -3.5 | -70.5 | 245.0 | 12.1 | 4970.3
18.5 | 401 | -0.2 | 3.5 | -0.6 | 0.0 | 12.3
22.1 | 518 | 3.4 | 120.5 | 412.7 | 11.7 | 14520.3
19.4 | 407 | 0.7 | 9.5 | 6.9 | 0.5 | 90.3
25.1 | 609 | 6.4 | 211.5 | 1358.9 | 41.3 | 44732.3
23.4 | 539 | 4.7 | 141.5 | 668.6 | 22.3 | 20022.3
18.1 | 416 | -0.6 | 18.5 | -10.6 | 0.3 | 342.3
22.6 | 440 | 3.9 | 42.5 | 166.8 | 15.4 | 1806.3
17.2 | 403 | -1.5 | 5.5 | -8.1 | 2.2 | 30.3
x̄ = 18.675 | ȳ = 397.5 | | | Σ = 5328.5 | Σ = 177.0 | Σ = 174995.0

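$$\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{5328.5}{177.0} \approx 30.10$$

$$\beta_0 = \bar{y} - \beta_1 \bar{x} = 397.5 - 30.1045 \times 18.675 \approx -164.70$$

(The intercept uses the unrounded slope, 5328.5 / 177.0 ≈ 30.1045.)

$$\text{Ice Cream Sales} = -164.70 + 30.10\,\text{Temp}$$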
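The hand calculation above can be reproduced with a short script. This is a minimal Python sketch that is not part of the original slides; the helper name `fit_simple_ols` is hypothetical. Because it works from the raw data rather than the rounded column totals, its coefficients differ marginally from the slide values (a slope near 30.11 and an intercept near -164.75).

```python
# Least-squares fit of the ice cream data (values copied from the slide's table).
temps = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [210, 320, 180, 327, 401, 518, 407, 609, 539, 416, 440, 403]


def fit_simple_ols(x, y):
    """Return (intercept, slope) for y = b0 + b1 * x using least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_simple_ols(temps, sales)
print(f"Ice Cream Sales = {b0:.2f} + {b1:.2f} * Temp")
```

SPSS reports the same two quantities in the Coefficients table of its linear regression output.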

Page 13: Regression Analysis using SPSS - Stat Modeller

Case Study

• This dataset contains a subset of the fuel economy data that the EPA (Environmental Protection Agency) makes available on http://fueleconomy.gov.
• It contains only models which had a new release every year between 1999 and 2008.

Page 14: Regression Analysis using SPSS - Stat Modeller

Simple Linear Regression

Row | ENGINESIZE | CO2EMISSIONS
0 | 2.0 | 196
1 | 2.4 | 221
2 | 1.5 | 136
3 | 3.5 | 255
4 | 3.5 | 244
5 | 3.5 | 230
6 | 3.5 | 232
7 | 3.7 | 255
8 | 3.7 | 267
9 | 2.4 | ???

Using the above data, can we predict this value of CO2 emissions?

Independent variable: ENGINESIZE
Dependent variable: CO2EMISSIONS

$$\text{CO}_2\ \text{Emissions} = \beta_0 + \beta_1\,\text{Engine Size}$$
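To make the question concrete, the sketch below fits the simple model to the nine complete rows shown on this slide and predicts the missing value for an engine size of 2.4. It is an illustration only, written in plain Python rather than SPSS; the function name `predict_co2` is hypothetical, and nine rows are far too few for a reliable model.

```python
# Rows 0-8 from the slide; row 9 (engine size 2.4) is the value to predict.
engine_size = [2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]
co2 = [196, 221, 136, 255, 244, 230, 232, 255, 267]

n = len(engine_size)
x_bar = sum(engine_size) / n
y_bar = sum(co2) / n

# Least-squares estimates of the slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(engine_size, co2))
s_xx = sum((x - x_bar) ** 2 for x in engine_size)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar


def predict_co2(size):
    """Predicted CO2 emissions for a given engine size under the fitted line."""
    return intercept + slope * size


predicted = predict_co2(2.4)
print(f"CO2 Emissions = {intercept:.2f} + {slope:.2f} * Engine Size")
print(f"Predicted CO2 for engine size 2.4: {predicted:.1f}")
```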

Page 15: Regression Analysis using SPSS - Stat Modeller

Let’s Make Scatter Plot

Page 16: Regression Analysis using SPSS - Stat Modeller

Benefits of Linear Regression

• Very Fast

• No Parameter Tuning (unlike setting k in K-NN algorithm)

• Easy to understand, highly interpretable

Page 17: Regression Analysis using SPSS - Stat Modeller

Steps for the Simple Linear Regression

Check the X and Y Relationship
• Using scatter plot
• Using correlation coefficient

Check for the Assumptions
• Data is independent of order (runs test)
• Error term is normally distributed (K-S test or Shapiro test)

Develop a Model
• Estimate the values of β0 and β1

Check Model Accuracy
• Derive the r² value of the model
• See the p-values of the model and coefficients

Use for Prediction
• Predict values using the model
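The relationship check and the r² step can be sketched in a few lines. The example below is a plain Python illustration, not SPSS output, and reuses the ice cream data from the earlier slide; the helper name `pearson_r` is hypothetical. In simple linear regression with one predictor, the model's r² equals the square of the Pearson correlation coefficient, which is what the sketch relies on. The assumption tests and p-values listed above are not covered here.

```python
import math

# Ice cream data from the earlier slide.
temps = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [210, 320, 180, 327, 401, 518, 407, 609, 539, 416, 440, 403]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(temps, sales)
r_squared = r ** 2  # share of the variation in sales explained by temperature
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```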

Page 18: Regression Analysis using SPSS - Stat Modeller

Multiple Linear Regression

Page 19: Regression Analysis using SPSS - Stat Modeller

Multiple Linear Regression

Row | ENGINESIZE | CYLINDERS | FUELCONSUMPTION_COMB | CO2EMISSIONS
0 | 2.0 | 4 | 8.5 | 196
1 | 2.4 | 4 | 9.6 | 221
2 | 1.5 | 4 | 5.9 | 136
3 | 3.5 | 6 | 11.1 | 255
4 | 3.5 | 6 | 10.6 | 244
5 | 3.5 | 6 | 10.0 | 230
6 | 3.5 | 6 | 10.1 | 232
7 | 3.7 | 6 | 11.1 | 255
8 | 3.7 | 6 | 11.6 | 267
9 | 2.4 | 4 | 9.2 | ???

Using the above data, can we predict this value of CO2 emissions?

Page 20: Regression Analysis using SPSS - Stat Modeller

Regression Analysis Understanding

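$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$

• $y$: dependent variable
• $\beta_0$: intercept
• $\beta_1, \ldots, \beta_k$: coefficients
• $x_1, \ldots, x_k$: independent variables
• $\varepsilon$: error

$$\text{CO}_2\ \text{Emission} = \beta_0 + \beta_1\,\text{Engine Size} + \beta_2\,\text{Cylinders} + \beta_3\,\text{Fuel Consumption Comb} + \varepsilon$$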
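One way to estimate the coefficients of this model is to solve the normal equations (XᵀX)β = Xᵀy. The sketch below does that in plain Python for the nine complete rows from the multiple regression slide. It is an illustration under stated assumptions, not how SPSS computes its estimates: the helper names `solve_linear_system` and `predict` are hypothetical, and with only nine rows and strongly related predictors the individual coefficients should not be over-interpreted.

```python
# ENGINESIZE, CYLINDERS, FUELCONSUMPTION_COMB, CO2EMISSIONS (rows 0-8 of the slide).
rows = [
    (2.0, 4, 8.5, 196), (2.4, 4, 9.6, 221), (1.5, 4, 5.9, 136),
    (3.5, 6, 11.1, 255), (3.5, 6, 10.6, 244), (3.5, 6, 10.0, 230),
    (3.5, 6, 10.1, 232), (3.7, 6, 11.1, 255), (3.7, 6, 11.6, 267),
]


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Design matrix with a leading 1 for the intercept.
X = [[1.0, e, c, f] for e, c, f, _ in rows]
y = [row[3] for row in rows]
k = len(X[0])

# Normal equations: (X^T X) beta = X^T y.
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
beta = solve_linear_system(xtx, xty)


def predict(engine, cylinders, fuel):
    """Predicted CO2 emissions from the fitted coefficients."""
    return beta[0] + beta[1] * engine + beta[2] * cylinders + beta[3] * fuel


residuals = [yi - predict(*row[:3]) for row, yi in zip(rows, y)]
print("coefficients:", [round(b, 3) for b in beta])
print("prediction for row 9:", round(predict(2.4, 4, 9.2), 1))
```

The residuals are kept so that the assumption checks for multiple regression (constant variance, normality, autocorrelation) can be run on them afterwards.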

Page 21: Regression Analysis using SPSS - Stat Modeller

Assumptions of Multiple Linear Regression

Source: http://www.restore.ac.uk/srme/www/fac/soc/wie/research-new/srme/modules/mod3/3/index.html

Page 22: Regression Analysis using SPSS - Stat Modeller

1. Linear Relationship

• The model is a roughly linear one. This is slightly different from simple linear regression as we have multiple explanatory variables. This time we want the outcome variable to have a roughly linear relationship with each of the explanatory variables, taking into account the other explanatory variables in the model.

Page 23: Regression Analysis using SPSS - Stat Modeller

2. Homoscedasticity

• The homoscedasticity assumption means that the variance around the regression line is the same for all values of the predictor variable (X).
• We can check this by plotting the standardized residuals (errors) against the predicted values.

Page 24: Regression Analysis using SPSS - Stat Modeller

3. Outliers/influential cases

• As with simple linear regression, it is important to look out for cases which may have a disproportionate influence over your regression model.

Page 25: Regression Analysis using SPSS - Stat Modeller

4. Multicollinearity

• Multicollinearity exists when two or more of the explanatory variables are highly correlated.
• It also suggests that the two variables may actually represent the same underlying factor.
• It can also be checked using VIF (variance inflation factor) values. A VIF greater than 10 indicates a multicollinearity problem.
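The VIF for predictor j is defined as 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing predictor j on the other predictors. The sketch below is a plain Python illustration for the special case of a model with exactly two predictors, where R²ⱼ reduces to the squared correlation between them. It uses the ENGINESIZE and CYLINDERS columns from the nine complete rows of the multiple regression slide; the function name `vif_two_predictors` is hypothetical, and models with more predictors need a full auxiliary regression instead.

```python
# ENGINESIZE and CYLINDERS for rows 0-8 of the multiple regression slide.
engine_size = [2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]
cylinders = [4, 4, 4, 6, 6, 6, 6, 6, 6]


def vif_two_predictors(a, b):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    r_squared = s_ab ** 2 / (s_aa * s_bb)  # squared Pearson correlation
    return 1.0 / (1.0 - r_squared)


vif = vif_two_predictors(engine_size, cylinders)
print(f"VIF = {vif:.2f}")
```

For these nine rows the value comes out above the threshold of 10, which matches the intuition that engine size and cylinder count largely measure the same thing.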

Page 26: Regression Analysis using SPSS - Stat Modeller

5. Normally distributed residuals

• The error $y - \hat{y}$ should be normally distributed

Page 27: Regression Analysis using SPSS - Stat Modeller

6. Autocorrelation

• Autocorrelation refers to the degree of correlation between the values of the same variables across different observations in the data. ... In a regression analysis, autocorrelation of the regression residuals can also occur if the model is incorrectly specified.
• A common method of testing for autocorrelation is the Durbin-Watson test.
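The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below is a plain Python illustration; both residual lists are made-up values chosen only to show the two extremes, and the function name `durbin_watson` is hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in observation order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, not taken from the slides.
alternating = [1.0, -1.0, 1.0, -1.0]       # sign flips every step
drifting = [-2.0, -1.0, 0.0, 1.0, 2.0]     # neighbours are similar

print(durbin_watson(alternating))  # above 2: negative autocorrelation
print(durbin_watson(drifting))     # below 2: positive autocorrelation
```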

Page 28: Regression Analysis using SPSS - Stat Modeller

Let’s run this in SPSS

Page 29: Regression Analysis using SPSS - Stat Modeller

Stat Modeller

ROBUST KIT OF SOLUTIONS

Page 30: Regression Analysis using SPSS - Stat Modeller

About Us

Stat Modeller was formed in 2019 and provides training and consultancy services in Operational Excellence and in the application of Statistical Tools and Data Science Tools to solve problems in various segments.

We have a team of experts with vast experience in academia, industry, research, consulting, etc.

Page 31: Regression Analysis using SPSS - Stat Modeller

Why Stat Modeller

Data analysis is an immense part of any problem solving or research. In industry as well as in research, data plays a vital role, and it is collected in large quantities. The challenge is deciding which technique to use and how.

To overcome these challenges, Stat Modeller provides solutions to industries, organizations, institutes, universities and individuals who want their data analyzed with the right techniques. Stat Modeller has a team of experts with vast experience in industry, academia, research, consulting, etc., who are committed to providing reliable and quick service to our valued clients. Client satisfaction is our ultimate objective.

Page 32: Regression Analysis using SPSS - Stat Modeller

Services

Domain
• Data Science
• Business Transformation in Industries
• Research Projects
• Training in Institutes/Universities

Page 33: Regression Analysis using SPSS - Stat Modeller

Services

Data Science

• Machine Learning

• R and R Studio

• Python

• SAS

• SPSS

• Minitab

• Excel and Advanced Excel

Business Transformation

• Six Sigma

• Lean

• 5-S

• Kaizen

• Kanban

• QMS

• SPC and SQC and many more

Research Projects

• Research Projects

• Survey Analysis

• Marketing Research etc.

Institutes/ Universities

• Workshops

• Trainings

• Certification Course for Students etc.

Page 34: Regression Analysis using SPSS - Stat Modeller

Our Expertise

Page 35: Regression Analysis using SPSS - Stat Modeller

Clients in various Domain

Agro Economics

Agro Business Management

Dairy Economics

Home Science

Mechanical Engineering

Pharmaceutical Sciences

Financial Management

Management Studies

Business Studies

Marketing Management

Library Science

Page 36: Regression Analysis using SPSS - Stat Modeller

Workshop on Basics of SPSS at BVM College of Engineering, Vallabh Vidyanagar

Page 37: Regression Analysis using SPSS - Stat Modeller

Workshop on Role of SPSS in Research at DDU, Nadiad

Page 38: Regression Analysis using SPSS - Stat Modeller

3 Days Workshop on Basics of Python at Department of Statistics, Sardar Patel University

Page 39: Regression Analysis using SPSS - Stat Modeller

2 + 1 Days Workshop on BASE SAS at Department of Statistics, Sardar Patel University

Page 40: Regression Analysis using SPSS - Stat Modeller

Training on R at Mumbai University

Training on R at AERC, Vallabh Vidyanagar

Training on R at FDP, SPU

Training on R at HRDC, Gujarat University

Training on R at Charusat University, Changa

Training on SPSS at Charusat University, Changa

Page 41: Regression Analysis using SPSS - Stat Modeller

Mr. Hiren Kakkad

o CEO & Co-founder of Stat Modeller

o More than 8 years of industrial experience

o Certified Lean Six Sigma Black Belt

o Certified Auditor for ISO 9001

o Trained 3500+ participants

o Guided 100+ Improvement projects

o Assisted 25+ Research Projects

o Trainer in R, Python, SPSS, Minitab, Power BI, Excel, Advanced Excel

Page 42: Regression Analysis using SPSS - Stat Modeller

Mr. Mehul Gandhi

o Business Associate of Stat Modeller

o More than 7 years of industrial experience

o Trained Lean Six Sigma Black Belt

o Certified Auditor for ISO 9001

o Trained 150+ participants

o Guided 5+ Improvement projects & Process Time Study

o Trainer - Excel, Advanced Excel, 5S, Kaizen, Quality Tools, ISO 9001 and many more.

Page 43: Regression Analysis using SPSS - Stat Modeller

Contact us

D-503, Sharnam Happy Homes,

Sayaji Township Road,

Sayajipura, Vadodara - 390019

+91 9898233268

[email protected]

www.statmodeller.com

Page 44: Regression Analysis using SPSS - Stat Modeller

You can register on https://statmodeller.com/events/

Page 45: Regression Analysis using SPSS - Stat Modeller

Thank you

Any Questions?