

Continuous Regression Analysis – Session 6
Data Collection and Data Analysis in Information Systems Research
Ph.D. Seminar Presentation
Martin Wolf (09.05.2008)
Supervisor: Dr. Oliver Hinz

Chair of Business Administration, esp. Information Management
Prof. Dr. Wolfgang König
Johann Wolfgang Goethe University

Agenda (Session 7)

09.05.2008 Slide 2/44


Agenda (Part I)

1. Goals of Regression Analysis
2. Underlying Assumptions
3. Exemplary Regression Analysis (SPSS)
4. Summary
5. Questions


Part I: 1. Goals 2. Assumptions 3. Exemplary Regression Analysis 4. Summary


Goals of Regression Analysis

Examines the linear dependency between one (bivariate regression) or more (multiple regression) independent variable(s) and one dependent variable (explanatory approach)

Application of the least squares method to minimize the error between the sample data and the linear model

Domain of interest: analysis of time series, prediction of causal relationships, root cause analysis (e.g. individual differences – computer skill)

Regression function: $\hat{y}_k = b_0 + \sum_i b_i \cdot x_{k,i}$



Least Squares Method

[Figure not preserved: illustration of the least squares method] (Source: Skiera 2005)
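The least squares idea can be sketched for the bivariate case in a few lines of Python. This is a minimal illustration and not part of the original slides: the function names `fit_line` and `sse` and the toy data are assumptions chosen for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared errors of y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = y_bar - b1 * x_bar      # intercept
    return b0, b1

def sse(x, y, b0, b1):
    """Sum of squared errors between the observed y and the line b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical toy data (not from the presentation)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
print(b0, b1, sse(x, y, b0, b1))
```

Any other pair of coefficients, for example a slightly shifted intercept, yields a larger sum of squared errors than the fitted pair.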


Regression Results

Regression coefficients

R²: Goodness of fit

F-ratio: Significance of the overall model

t-test: Significance of the individual regression coefficients
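These quantities follow standard textbook definitions. The notation below is an assumption, since the slides do not define symbols: $n$ observations, $J$ independent variables, $SS_{reg}$, $SS_{res}$ and $SS_{tot}$ for the regression, residual and total sums of squares, and $s_{b_j}$ for the standard error of coefficient $b_j$.

```latex
R^2 = \frac{SS_{reg}}{SS_{tot}} = 1 - \frac{SS_{res}}{SS_{tot}}, \qquad
F = \frac{SS_{reg} / J}{SS_{res} / (n - J - 1)}, \qquad
t_j = \frac{b_j}{s_{b_j}}
```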


Underlying Assumptions (I)

Linear dependency between the independent variables and the dependent variable

Dependent and independent variables have to be measured at metric level (except dummy variables)

Independent variables have to be uncorrelated (no multicollinearity)
-> Collinearity statistics: Tolerance >= 0.1
-> Correlation matrix

Residuals have to be uncorrelated (no autocorrelation)
-> Durbin-Watson coefficient ≈ 2
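Two of the checks named above reduce to short formulas. The sketch below is an illustration under assumptions: the helper names are hypothetical, the residual series are made-up numbers, and `r2_j` stands for the R² obtained when predictor j is regressed on the remaining predictors.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def tolerance(r2_j):
    """Share of predictor j's variance that the other predictors do not explain."""
    return 1.0 - r2_j

def vif(r2_j):
    """Variance inflation factor, the reciprocal of the tolerance."""
    return 1.0 / tolerance(r2_j)

# Hypothetical residual series (not from the presentation)
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every step: negative autocorrelation
trending = [1.0, 1.0, -1.0, -1.0]      # neighbours resemble each other: positive autocorrelation

dw_alt = durbin_watson(alternating)
dw_trend = durbin_watson(trending)
print(dw_alt, dw_trend, tolerance(0.5), vif(0.5))
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation, values well below 2 point to positive and values well above 2 to negative autocorrelation.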


Underlying Assumptions (II)

Residuals have to follow a normal distribution
-> Kolmogorov-Smirnov test
-> Plots (normal probability plot, histogram)
-> n > 50: central limit theorem

No heteroscedasticity of the residuals
-> e.g. White's general test for heteroscedasticity
-> Plot (standardized residuals against standardized predicted values)

Data set has to represent a random sample

No outliers (check DFBETA, standard deviation as distance measure)
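Using the standard deviation as a distance measure for outliers can be sketched as follows. This is a hedged illustration: the function names, the cut-off of 3 standard errors and the residual values are assumptions, not taken from the slides.

```python
from math import sqrt

def standardized_residuals(residuals, n_predictors):
    """Divide each residual by the standard error of the estimate."""
    n = len(residuals)
    s = sqrt(sum(e * e for e in residuals) / (n - n_predictors - 1))
    return [e / s for e in residuals]

def flag_outliers(residuals, n_predictors, threshold=3.0):
    """Return the indices of cases whose standardized residual exceeds the threshold."""
    z = standardized_residuals(residuals, n_predictors)
    return [i for i, zi in enumerate(z) if abs(zi) > threshold]

# Hypothetical residuals: twenty small values and one extreme case at index 20
residuals = [0.1, -0.1] * 10 + [2.0]
outliers = flag_outliers(residuals, n_predictors=1)
print(outliers)
```

A flagged case is a candidate for closer inspection (for example via DFBETA), not automatically a case to delete.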


Exemplary Regression Analysis

Example data set: Consequences of a reduction of weekly work time from 40 to 38.5 hours in 80 industries in Baden-Württemberg (1985)

Research question: How does a change in work time influence employment?

Variables:
av85.10   ∆ employment (compared to 1984)
uv85.10   ∆ revenue (compared to 1984)
stv85.10  ∆ overtime hours (compared to 1984)
azv       dichotomous variable (reduction of work time)


SPSS Syntax File

* Compute linear regression and save standardized residuals.
* Calculate Durbin-Watson coefficient (check for autocorrelation).
* Calculate collinearity statistics (check for multicollinearity).
* Generate histogram and normal P-P plot of the residuals (check for normality).
* Generate scatterplot of standardized residuals against standardized predicted values (check for heteroscedasticity).
* Display model summary.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT av85.10
  /METHOD=ENTER uv85.10 stv85.10 azv
  /SAVE ZRESID
  /RESIDUALS DURBIN HIST(ZRESID) NORM(ZRESID)
  /SCATTERPLOT=(*ZRESID ,*ZPRED).

* Kolmogorov-Smirnov test of the residuals.
* (Check whether the residuals follow a normal distribution).
NPAR TESTS
  /K-S(NORMAL)=ZRE_1
  /MISSING ANALYSIS.


SPSS Output File

Variables Entered/Removed(b)

Model   Variables Entered           Variables Removed   Method
1       azv, uv85.10, stv85.10(a)   .                   Enter

a All requested variables entered.
b Dependent Variable: av85.10

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .709(a)   .502       .482                .04454                       1.873

a Predictors: (Constant), azv, uv85.10, stv85.10
b Dependent Variable: av85.10

ANOVA(b)

Model          Sum of Squares   df   Mean Square   F        Sig.
1 Regression   .152             3    .051          25.551   .000(a)
  Residual     .151             76   .002
  Total        .303             79

a Predictors: (Constant), azv, uv85.10, stv85.10
b Dependent Variable: av85.10
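The model summary can be cross-checked against the ANOVA table with a few lines of arithmetic. The sketch below uses only the rounded values printed in the output, so small deviations from the SPSS figures are expected; the variable names are illustrative.

```python
# Values copied from the ANOVA table (rounded to three decimals by SPSS)
ss_reg, ss_res, ss_tot = 0.152, 0.151, 0.303
df_reg, df_res = 3, 76
n = df_reg + df_res + 1                      # 80 industries

r2 = ss_reg / ss_tot                         # share of explained variance
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - df_reg - 1)
f_ratio = (ss_reg / df_reg) / (ss_res / df_res)

print(round(r2, 3), round(adj_r2, 3), round(f_ratio, 2))
```

R² and adjusted R² reproduce the reported .502 and .482. The recomputed F-ratio is about 25.5 rather than 25.551 because the sums of squares in the table are already rounded.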




Coefficients(a)

                Unstandardized       Standardized                    Collinearity
                Coefficients         Coefficients                    Statistics
Model           B       Std. Error   Beta           t        Sig.    Tolerance   VIF
1 (Constant)    -.029   .010                        -2.979   .004
  uv85.10       .389    .059         .564           6.606    .000    .900        1.111
  stv85.10      -.354   .137         -.241          -2.591   .011    .757        1.321
  azv           .044    .012         .361           3.742    .000    .703        1.423

a Dependent Variable: av85.10

Collinearity Diagnostics(a)

                                                Variance Proportions
Model   Dimension   Eigenvalue   Condition Index   (Constant)   uv85.10   stv85.10   azv
1       1           2.036        1.000             .05          .03       .04        .06
        2           1.154        1.329             .01          .38       .17        .03
        3           .666         1.748             .01          .56       .23        .13
        4           .144         3.764             .93          .02       .56        .78

a Dependent Variable: av85.10

Residuals Statistics(a)

                       Minimum   Maximum   Mean     Std. Deviation   N
Predicted Value        -.0943    .1226     .0086    .04388           80
Residual               -.07976   .16050    .00000   .04369           80
Std. Predicted Value   -2.345    2.598     .000     1.000            80
Std. Residual          -1.791    3.603     .000     .981             80

a Dependent Variable: av85.10
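The coefficient output above can be read as a fitted equation and partly re-derived by hand. The following sketch uses the rounded table values; the function `predict`, the example inputs and the coding azv = 1 for industries with reduced work time are assumptions made for illustration.

```python
# (B, Std. Error) copied from the coefficients table
coefficients = {
    "(Constant)": (-0.029, 0.010),
    "uv85.10": (0.389, 0.059),
    "stv85.10": (-0.354, 0.137),
    "azv": (0.044, 0.012),
}
tolerances = {"uv85.10": 0.900, "stv85.10": 0.757, "azv": 0.703}

# t = B / Std. Error and VIF = 1 / Tolerance, up to rounding of the inputs
t_values = {name: b / se for name, (b, se) in coefficients.items()}
vifs = {name: 1.0 / tol for name, tol in tolerances.items()}

def predict(uv, stv, azv):
    """Predicted change in employment from the fitted regression function."""
    return -0.029 + 0.389 * uv - 0.354 * stv + 0.044 * azv

# Hypothetical industry: revenue +5 %, overtime unchanged, work time reduced
example = predict(uv=0.05, stv=0.0, azv=1)

# Largest standardized residual = largest residual / standard error of the estimate
max_std_residual = 0.16050 / 0.04454

print(t_values, vifs, example, max_std_residual)
```

The recomputed t-values and VIFs agree with the table up to rounding. The largest standardized residual of about 3.6 lies more than three standard errors from the regression surface, which makes that case a candidate for the outlier check named among the assumptions.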


One-Sample Kolmogorov-Smirnov Test

                                             Residual
N                                            80
Normal Parameters(a,b)     Mean              .0000000
                           Std. Deviation    .98082889
Most Extreme Differences   Absolute          .074
                           Positive          .074
                           Negative          -.036
Kolmogorov-Smirnov Z                         .659
Asymp. Sig. (2-tailed)                       .778

a Test distribution is Normal.
b Calculated from data.
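The two summary figures of the test can be reproduced approximately. The sketch assumes the usual asymptotic formulas, Z = sqrt(n) * D and the Kolmogorov series for the two-sided p-value; the function name `kolmogorov_p_value` is illustrative.

```python
from math import exp, sqrt

def kolmogorov_p_value(z, terms=100):
    """Asymptotic two-sided p-value: 2 * sum over k >= 1 of (-1)^(k-1) * exp(-2 k^2 z^2)."""
    return 2.0 * sum((-1) ** (k - 1) * exp(-2.0 * k * k * z * z)
                     for k in range(1, terms + 1))

n = 80
d_absolute = 0.074                    # most extreme absolute difference (rounded)

z_from_d = sqrt(n) * d_absolute       # close to the reported Z, up to rounding of D
p_value = kolmogorov_p_value(0.659)   # p-value for the reported Z

print(round(z_from_d, 3), round(p_value, 3))
```

With a p-value of about .778, the hypothesis that the standardized residuals are normally distributed is not rejected at the usual 5 % level.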


Summary

Regression analysis is a means of root cause analysis and prediction, provided that a linear dependency can be assumed

Requires a sufficiently large random sample for a significant model (at least 5 × the number of independent variables)

Strict assumptions have to be fulfilled





Agenda (Part II)

1. Background
2. Research Question
3. Utilized Model
4. Results
5. Summary (Pros and Cons)
6. Questions

Part II: 1. Background 2. Research Question 3. Utilized Model 4. Results 5. Summary


Background

A vessel traffic service (VTS) was introduced for the lower Mississippi in late 1977 in order to prevent rammings and collisions of vessels

VTS is an example of a Decision Support System (DSS)

Literature: utilization is used as a surrogate for success, but has only been measured as a dichotomous variable and has produced no consistent results



Research Question

Is there a linear causal relationship between DSS usage and system performance (fewer vessel accidents)?


Utilization as an Intervening Variable

[Figure not preserved; surviving labels: Backward Linkages, Forward Linkages] (Source: Trice and Treacy 1988)


Utilized Linear Regression Model


Model Summary

Explanatory Variables         Coefficients
Lagged Accidents Rate (-1)    (value not preserved)
Length of DSS Use             -7.6515*
Traffic Level                 0.0599
DSS Utilization               -5.0437*
River Stage                   -0.3084
Dec-Jan Weather               1.0529*
Oct-Nov Weather               1.1119*

R²                            0.4283
D-W Statistic                 1.9933
D.F.                          133
F-Ratio                       11.0727**

** p<0.01; * p<0.05
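The F-ratio and R² of a regression are linked through the degrees of freedom, which allows a rough plausibility check of the reported figures. The sketch below rests on an assumption the slide does not confirm, namely that D.F. denotes the residual degrees of freedom; the function names are illustrative.

```python
def f_from_r2(r2, n_predictors, df_residual):
    """F-ratio implied by R², the number of predictors and the residual df."""
    return (r2 / n_predictors) / ((1.0 - r2) / df_residual)

def implied_predictors(r2, f_ratio, df_residual):
    """Number of predictors implied by R², the F-ratio and the residual df."""
    return r2 * df_residual / ((1.0 - r2) * f_ratio)

# Part I example: R² = .502, 3 predictors, 76 residual df (reported F = 25.551)
f_part1 = f_from_r2(0.502, 3, 76)

# Part II model: R² = 0.4283, F = 11.0727, D.F. = 133 (assumed to be residual df)
k_part2 = implied_predictors(0.4283, 11.0727, 133)

print(round(f_part1, 2), round(k_part2, 2))
```

Under this assumption the reported figures are consistent with roughly nine regressors, which is more than the seven explanatory variables listed in the table. Whether the original model contained further variables cannot be determined from the slide.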


Results

Significant negative correlation of DSS utilization and length of DSS use with the objective performance criterion (number of vessel accidents)


Pros

Objective justification of the DSS introduction (IT is an enabler)

Utilization of a broad model

Relatively high fit of the model (R² = 0.4283)

High significance of the model (F-ratio significant at p < 0.01)


Cons

Units of the coefficients are not specified exactly (-> standardized coefficients)

Peak utilization was aggregated for DSS usage

No specification of how the weather indicator was derived

Assumptions were not addressed

Momentum already showed a decreasing trend
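The first point refers to standardized coefficients, which remove the measurement units and make predictors comparable. A minimal sketch of the conversion, assuming the usual definition beta = b * s_x / s_y; the function names and the toy data are hypothetical.

```python
from statistics import mean, stdev

def slope(x, y):
    """Unstandardized least squares slope of y on x (bivariate case)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def standardized_beta(b, x, y):
    """Rescale a coefficient by the standard deviations of predictor and outcome."""
    return b * stdev(x) / stdev(y)

# Hypothetical toy data (not from the study)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b = slope(x, y)
beta = standardized_beta(b, x, y)
print(b, beta)
```

In the bivariate case the standardized coefficient equals the correlation between x and y, so it no longer depends on the units in which either variable is measured.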


Literature

Le Blanc and Kozar (1990): An Empirical Investigation of the Relationship Between DSS Usage and System Performance: A Case Study of a Navigation Support System. In: MIS Quarterly, 14(3), pp. 263-277.


Thank you very much for your attention!
