VIENNA | 28 MAY 2015
TESTING U-SHAPED, INVERTED U-SHAPED OR OTHER NONLINEAR RELATIONSHIPS IN BUSINESS
Sönke Albers, Professor of Marketing and Innovation, Kühne Logistics University, Hamburg
AGENDA
• Rationale for U-shape or inverted U-shape
• Testing for U-shapes
• Alternative regressions (weighting inverse to frequency)
• Alternative functional forms
• Nonparametric methods:
  • Trend lines in Excel
  • Spline regressions
  • Kernel regression
  • Local polynomial smoothing
RATIONALE FOR U-SHAPE
• Operations: Cost curves are typically assumed to be U-shaped, with a minimum somewhere in the relevant range.
• CSR: Barnett and Salomon (SMJ 2012) found a U-shaped relationship between corporate financial performance (ROA) and corporate social performance.
• Happiness: Happiness is U-shaped with respect to age (Frijters and Beatton, 2012).
Barnett, Michael L. and Robert M. Salomon (2012): Does it pay to be really good? Addressing the shape of the relationship between social and financial performance, Strategic Management Journal, 33 (11), 1304-1320.
Frijters, Paul and Tony Beatton (2012): The mystery of the U-shaped relationship between happiness and age, Journal of Economic Behavior & Organization, 82 (2/3), 525-542.
RATIONALE FOR INVERTED U-SHAPE
• Profit: Financial performance depends on sales and costs and thus must exhibit a maximum.
• Sales: Manchanda and Chintagunta (MarkLett 2004) argue that a sales response function might exhibit super-saturation, where sales decline after a certain point because of too much selling effort.
Manchanda, Puneet and Pradeep K. Chintagunta (2004): Responsiveness of Physician Prescription Behavior to Salesforce Effort: An Individual Level Analysis, Marketing Letters, 15 (2-3), 129-145.
SATISFACTION–LOYALTY FUNCTIONAL FORMS AND MODERATING FACTORS
Functional form | Existence | Moderating factors: product category and market characteristics | Moderating factors: customer economic and demographic variables
1 Linear | Ngobo (1999); Streukens and Ruyter (2004) | Anderson (1994); Seiders et al. (2005) | Homburg and Giering (2001); Magi (2003); Seiders et al. (2005)
2 Concave | Jones and Sasser (1995); Ngobo (1999) | None | None
3 Convex | Jones and Sasser (1995); Keiningham et al. (2003) | None | None
4 S-shaped | Ngobo (1999) | None | None
5 Inverse S-shaped | Homburg et al. (2005); Keiningham et al. (2003) | None | None
Nonlinear (undefined) | Mittal and Kamakura (2001) | None | Mittal and Kamakura (2001)
Dong, Songting, Min Ding, Rajdeep Grewal, and Ping Zhao (2011): Functional forms of the satisfaction–loyalty relationship, International Journal of Research in Marketing, 28 (1), 38-50.
TYPICAL TEST FOR U-SHAPE
In most empirical work trying to identify U shapes, the researcher includes a nonlinear (usually quadratic) term in an otherwise standard regression model.
• If this term is significant and, in addition,
• the estimated extremum is within the data range,
it is common to conclude that there is a U-shaped relationship.
Lind, Jo Thori and Halvor Mehlum (2010): With or Without U? The Appropriate Test for a U-Shaped Relationship, Oxford Bulletin of Economics and Statistics, 72 (1), 109-118.
QUADRATIC REGRESSION LEADS TO SIGNIFICANT RELATIONSHIP BUT DOES NOT REPRESENT SHAPE
[Figure: scatter plot of Sales (vertical axis, 0 to 700) over a horizontal axis from 0 to 200, with three fitted trend lines]
• Power: y = 72.798x^0.3664, R² = 0.2345
• Logarithmic: y = 80.032 ln(x) + 11.537, R² = 0.2062
• Quadratic: y = -0.011x² + 3.2774x + 183.76, R² = 0.1973
QUADRATIC REGRESSION FOR EXAMPLE DATA
Regression Statistics
Multiple R | 0.44417807
R Square | 0.19729416
Adjusted R Square | 0.19568392
Standard Error | 95.7654713
Observations | 1000

ANOVA
 | df | SS | MS | F | Significance F
Regression | 2 | 2247350.79 | 1123675.4 | 122.524509 | 2.6386E-48
Residual | 997 | 9143512.42 | 9171.0255 | |
Total | 999 | 11390863.2 | | |

 | Coefficients | Standard Error | t Stat | P-value
Intercept | 183.760289 | 10.1702193 | 18.0684686 | 2.4001E-63
Budget | 3.27737825 | 0.32226047 | 10.1699667 | 3.4904E-23
Budget2 | -0.01099834 | 0.00235221 | -4.67575754 | 3.3313E-06
Optimum: 148.99
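As a worked check of the reported optimum: the turning point of a fitted quadratic y = b0 + b1·x + b2·x² lies at x* = -b1 / (2·b2). The minimal Python sketch below plugs in the Budget and Budget2 coefficients from the regression output. The function and variable names are illustrative only, and the data range of 1.7 to 195.72 is the interval reported in the utest output elsewhere in these slides.

```python
def quadratic_extremum(b1, b2):
    """Return the turning point x* = -b1 / (2*b2) of y = b0 + b1*x + b2*x**2."""
    if b2 == 0:
        raise ValueError("no turning point: the quadratic coefficient is zero")
    return -b1 / (2.0 * b2)

# Coefficients copied from the regression output (Budget, Budget2).
b_budget = 3.27737825
b_budget2 = -0.01099834

x_star = quadratic_extremum(b_budget, b_budget2)
shape = "inverted U (maximum)" if b_budget2 < 0 else "U (minimum)"

# Data interval as reported by the utest output in these slides.
x_min, x_max = 1.7, 195.72
inside = x_min <= x_star <= x_max

print(f"extremum at Budget = {x_star:.2f}, shape: {shape}, inside data range: {inside}")
```

Both conditions of the typical test hold here (significant quadratic term, extremum inside the data range), which is exactly the situation in which the slides argue that further checks are needed.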
TESTING OF SIGNIFICANCE OF QUADRATIC TERM
Testing the significance of the quadratic term can be misleading: the linear and quadratic terms are highly correlated, and this non-essential correlation inflates the standard errors. A better test is whether the additional variance explained by introducing the quadratic term is significant. Another way to avoid the problem is to transform the polynomial terms.
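One common way to formalize the "additional variance" test is the incremental F-test for nested regressions, sketched below. The symbols are notation introduced here, not taken from the slides: R²_lin for the model without the added polynomial terms, R²_quad for the model with them, n observations, k regressors in the larger model, and q added terms.

```latex
F = \frac{\left(R^2_{\text{quad}} - R^2_{\text{lin}}\right)/q}
         {\left(1 - R^2_{\text{quad}}\right)/(n - k - 1)}
\;\sim\; F_{q,\,n-k-1} \quad \text{under } H_0
```

With a single added term (q = 1) this F statistic equals the squared t statistic of that term, so the test differs from the individual t-tests mainly when several polynomial terms are added jointly.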
TRANSFORMING QUADRATIC TERMS
It is recommended to mean-center the original and quadratic variables, which removes non-essential correlation and thus leads to meaningful standard errors. However, the interpretation of the coefficients changes. Another alternative is to orthogonalize the polynomial variables. For a quadratic term this can be done by regressing the quadratic term on the linear term and using only the residuals, which are orthogonal to the linear term. For a polynomial of degree larger than two, the orthpoly command in Stata can be used to obtain a set of orthogonal variables.
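The two transformations can be illustrated with a small pure-Python sketch. It uses hypothetical, right-skewed budget values (not the data behind the slides) and compares the correlation between the linear and quadratic term for the raw, the mean-centered, and the orthogonalized version. Here the quadratic term is regressed on the linear term and the residuals are kept; all function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def correlation(u, v):
    """Pearson correlation of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def residualize(target, regressor):
    """Residuals from a simple OLS regression of target on regressor (with intercept)."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    intercept = mt - slope * mr
    return [t - (intercept + slope * r) for r, t in zip(regressor, target)]

# Hypothetical, right-skewed budget values (assumption, for illustration only).
budget = [2, 5, 8, 10, 15, 20, 30, 45, 60, 90, 120, 180]

raw_sq = [x ** 2 for x in budget]               # raw quadratic term
m = mean(budget)
centered = [x - m for x in budget]
centered_sq = [c ** 2 for c in centered]        # square of the centered variable
orthogonal_sq = residualize(raw_sq, budget)     # part of x^2 not explained by x

r_raw = correlation(budget, raw_sq)
r_centered = correlation(centered, centered_sq)
r_orthogonal = correlation(budget, orthogonal_sq)
print(r_raw, r_centered, r_orthogonal)
```

In this sketch centering lowers the correlation but does not eliminate it, because the hypothetical values are skewed; the residual-based term is uncorrelated with the linear term by construction.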
RESULTS OF DIFFERENT TRANSFORMATIONS OF THE QUADRATIC TERM
Regression Statistics
Multiple R | 0.44417807
R Square | 0.19729416
Adjusted R Square | 0.19568392
Standard Error | 95.7654713
Observations | 1000

 | df | SS | MS | F | Significance F
Regression | 2 | 2247350.79 | 1123675.395 | 122.5245089 | 2.6386E-48
Residual | 997 | 9143512.42 | 9171.025498 | |
Total | 999 | 11390863.2 | | |

 | Coefficients | Standard Error | t Stat | P-value
Intercept | 183.760289 | 10.1702193 | 18.0684686 | 0.0000
Budget | 3.27737825 | 0.32226047 | 10.1699667 | 0.0000
Budget2 | -0.01099834 | 0.002352205 | -4.67575754 | 0.0000

 | Coefficients | Standard Error | t Stat | P-value
Intercept | 317.169735 | 3.313744506 | 95.7133944 | 0.0000
Budget centered | 2.2072784 | 0.143486002 | 15.3832316 | 0.0000
Budget2 centered | -0.01099834 | 0.002352205 | -4.67575754 | 0.0000

 | Coefficients | Standard Error | t Stat | P-value
Intercept | 218.84814 | 6.864418523 | 31.8815263 | 0.0000
Budget | 1.89176875 | 0.126629319 | 14.9394213 | 0.0000
Budget2 orthogonal | -0.01099834 | 0.002352205 | -4.67575754 | 0.0000
QUADRATIC RELATIONSHIP LEADS TO WRONG CONCLUSION
[Figure: Sales per Physician (vertical axis, 0 to 8000) over a horizontal axis from 0 to 70; chart labels: Frequency, Moving Average; two fitted polynomial trend lines]
• Cubic: y = 0.0101x³ - 0.5157x² + 45.799x + 21.503
• Quadratic: y = 0.1528x² + 34.691x + 57.834
Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122, p. 112
SUFFICIENT TEST FOR U- OR INVERTED U-SHAPE
The typical test with a quadratic term provides only a necessary condition for a U-shape, not a sufficient one. Sufficient tests are given when
• a visual inspection suggests such a relationship (Albers 2012)
• the observations below (above) the estimated extremum exhibit a negative (positive) slope in the regression, with the signs reversed for an inverted U-shape (Kostyshak 2015); this is testable with utest in Stata
Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122
Kostyshak, Scott (2015): Non-parametric Testing of U-shaped Relationships, Working Paper, Princeton
U-TEST IN STATA FOR U- OR INVERTED U-SHAPE
. utest Budget Budget2

Specification: f(x)=x^2
Extreme point: 148.9942

Test:
  H1: Inverse U shape
  vs. H0: Monotone or U shape

 | Lower bound | Upper bound
Interval | 1.7 | 195.72
Slope | 3.239984 | -1.027812
t-value | 10.28821 | -1.613219
P>t | 5.78e-24 | .0535067

Overall test of presence of a Inverse U shape:
  t-value = 1.61
  P>t = .0535
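The slopes in the utest output can be reproduced by hand: for f(x) = b0 + b1·x + b2·x², the slope at a point x is b1 + 2·b2·x. The minimal sketch below evaluates it at the two interval bounds using the coefficients from the quadratic regression in these slides. It reproduces only the point estimates, not the t-values or p-values, and the names are illustrative.

```python
def slope_at(x, b1, b2):
    """Derivative of b0 + b1*x + b2*x**2 with respect to x."""
    return b1 + 2.0 * b2 * x

b_budget, b_budget2 = 3.27737825, -0.01099834   # from the quadratic regression
lower, upper = 1.7, 195.72                      # data interval from the utest output

slope_lower = slope_at(lower, b_budget, b_budget2)
slope_upper = slope_at(upper, b_budget, b_budget2)

# An inverted U needs a positive slope at the lower bound and a negative one
# at the upper bound; this sign check alone ignores sampling uncertainty.
signs_ok = slope_lower > 0 and slope_upper < 0
print(f"slope at lower bound: {slope_lower:.6f}")
print(f"slope at upper bound: {slope_upper:.6f}")
print("signs consistent with inverted U:", signs_ok)
```

The signs fit an inverted U, but the utest output reports P>t = .0535 for the slope at the upper bound, so the downward part is not significant at the 5% level.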
QUADRATIC REGRESSION FOR EXAMPLE DATA
Quadratic regression as reported before: Sales = 183.760289 + 3.27737825 Budget - 0.01099834 Budget2, R Square 0.19729416, Observations 1000, Optimum 148.99.
[Figure: scatter plot restricted to horizontal-axis values from 170 to 200 (vertical axis 0 to 600) with linear trend line y = 4.2087x - 321.88, R² = 0.4342]
ALTERNATIVE REGRESSION FOR DETECTING SHAPES
Although OLS has good properties for predicting the response variable given a random draw of independent variables, these properties do not directly translate to estimating the global shape of a regression function when the distribution of the independent variable is not uniform (Albers 2012, Kostyshak 2015).
=> A solution is inverse frequency weighting with GLS (Albers 2012, Albers 2015, Kostyshak 2015).
Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122
Albers, Sönke (2015): What Drives Publication Productivity at German Universities?, Schmalenbach Business Review, 67 (1), 6-33
Kostyshak, Scott (2015): Non-parametric Testing of U-shaped Relationships, Working Paper, Princeton
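A minimal sketch of the idea, under stated assumptions: observations are grouped into equal-width bins of the independent variable, each observation is weighted by the inverse of the number of observations in its bin, and a quadratic is fitted by weighted least squares (the diagonal special case of GLS). The binning rule, the bin width, and the hypothetical data are assumptions made for illustration; the cited papers may construct the weights differently.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]       # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def inverse_frequency_weights(x, bin_width):
    """Weight each observation by 1 / (number of observations in its bin)."""
    bins = [int(v // bin_width) for v in x]
    counts = {}
    for b in bins:
        counts[b] = counts.get(b, 0) + 1
    return [1.0 / counts[b] for b in bins]

def weighted_quadratic_fit(x, y, w):
    """Weighted least squares for y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, v, v * v] for v in x]
    xtwx = [[sum(wi * r[i] * r[j] for wi, r in zip(w, rows)) for j in range(3)]
            for i in range(3)]
    xtwy = [sum(wi * r[i] * yi for wi, r, yi in zip(w, rows, y)) for i in range(3)]
    return solve_linear_system(xtwx, xtwy)

# Hypothetical data: many small budgets, few large ones, exact quadratic response.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 15, 18, 25, 40, 70, 110, 160, 190]
y = [200 + 3.0 * v - 0.01 * v * v for v in x]

w = inverse_frequency_weights(x, bin_width=20)
b0, b1, b2 = weighted_quadratic_fit(x, y, w)
print(b0, b1, b2)
```

With these weights every occupied bin contributes the same total weight, so the sparsely populated high-budget region influences the fitted shape as much as the crowded low-budget region.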
INVERSE FREQUENCY WEIGHTED REGRESSION
Sales | Coef. | Std. Err. | t | P>t
Budget | 2.357705 | 0.323625 | 7.29 | 0
Budget2 | -0.0055941 | 0.001877 | -2.98 | 0.003
_cons | 215.4278 | 11.68436 | 18.44 | 0
Optimum: 210.731396 (outside data interval)
ALTERNATIVE FUNCTIONAL FORMS
Name Function Elasticity
Constant Elasticity
Semi-logarithmic
Diminishing Elasticity
Modified Exponential
Log-Reciprocal
Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122, p. 113
LOG-LOG OLS REGRESSION IS DIFFERENT FROM NLS
Estimating constant elasticity response functions with a linearized log-log function can lead to dramatically different results for the optimal call level compared to a nonlinear estimation.
Functional Form | Optimal No. Calls
Log-Log | 41.60
Constant Elasticity | 8.29
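The difference can be made explicit by writing down the two objective functions for a constant elasticity response y = a·x^b. The notation (a, b, observation index i) is introduced here for illustration and is not taken from the slides.

```latex
\text{Log-log OLS:}\quad \min_{a,b}\ \sum_i \bigl(\ln y_i - \ln a - b \ln x_i\bigr)^2
\qquad\qquad
\text{NLS:}\quad \min_{a,b}\ \sum_i \bigl(y_i - a\,x_i^{\,b}\bigr)^2
```

The first criterion penalizes relative (multiplicative) deviations, the second absolute (additive) deviations, so the estimates of a and b, and any optimum derived from them, generally differ.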
OPTIMUM IMPLICATIONS OF DIFFERENT FUNCTIONAL FORMS
Function evaluated (rows) at the optimal value of (columns) | Constant Elasticity (OLS) | Semilog | Log-reciprocal | Constant Elasticity (NLS) | Modified Exponential
Constant Elasticity (OLS) | 41.60 | 30.91 | -236.08 | -635.11 | -2,775.27
Semilog | 228.67 | 255.37 | -38.94 | -519.37 | -2,993.94
Log-reciprocal | -48.85 | -80.12 | 90.44 | -97.79 | -2,292.02
Constant Elasticity (NLS) | 8.29 | 14.52 | 37.62 | 43.67 | -23.19
Modified Exponential | 3.16 | 6.88 | 34.30 | 61.04 | 117.39
Comparison of profits computed for the functional forms (in rows) at the optimal values of the functional forms (in columns).
Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122, p. 114
TREND LINES IN EXCEL
SPLINE REGRESSION (1)
The spline model is generally defined as piecewise polynomials of degree n whose function values and first n-1 derivatives agree at the points where they join. The linear spline model is a seamless compound function consisting of a series of linear equations. Although the pieces are separate, neighboring functions meet at the knots. There is an R package, mgcv, authored by Simon Wood. This command in Stata allows spline regressions:
bspline, xvar(age) knots(0 25 50 75 100) p(3) gen(_bs3k)
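To illustrate how a linear spline stays continuous at the knots, the sketch below builds a truncated-power basis in plain Python: one column for x plus one column max(0, x - k) per knot. This is a simpler basis than the B-splines generated by the Stata command above, and the knots and coefficients are hypothetical values chosen for illustration.

```python
def linear_spline_basis(x, knots):
    """Truncated-power basis for a continuous piecewise-linear function.

    Returns [1, x, max(0, x - k1), max(0, x - k2), ...].
    """
    return [1.0, float(x)] + [max(0.0, x - k) for k in knots]

def evaluate_spline(x, knots, coefficients):
    """Evaluate the spline as the dot product of basis values and coefficients."""
    basis = linear_spline_basis(x, knots)
    return sum(b * c for b, c in zip(basis, coefficients))

# Hypothetical interior knots and coefficients (illustration only):
# slope +2 up to x = 50, then 2 - 3 = -1 up to x = 75, then -1 + 1 = 0.
knots = [50, 75]
coefficients = [10.0, 2.0, -3.0, 1.0]

for x in (0, 25, 50, 60, 75, 100):
    print(x, evaluate_spline(x, knots, coefficients))
```

Each knot coefficient is the change in slope at that knot, so in a regression on these basis columns a significant negative coefficient after a positive initial slope would point to a bend in the relationship.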
SPLINE REGRESSION (2)
Multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables. The software "MARS" is distributed by Salford Systems.
SPLINE REGRESSION OF DIFFERENT ORDER
• Linear splines
• Cubic splines allow continuous mapping of non-polynomial shapes of data
LOWESS SMOOTHER IN STATA
[Figure: running mean smoother of Sales (0 to 600) against Budget (0 to 200); bandwidth = .8]
LOCAL POLYNOMIAL SMOOTHING IN STATA
[Figure: local polynomial smooth of Sales (0 to 600) against Budget (0 to 200); kernel = epanechnikov, degree = 3, bandwidth = 25.43]
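The mechanics of kernel smoothing can be sketched in a few lines of Python. The example is a local-constant (degree 0, Nadaraya-Watson) smoother with an Epanechnikov kernel, which is simpler than the degree-3 local polynomial shown in the Stata plot. The data, the bandwidth of 25, and the function names are hypothetical and chosen only for illustration.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero outside."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_smooth(x0, x, y, bandwidth):
    """Local-constant (Nadaraya-Watson) estimate of E[y | x = x0]."""
    weights = [epanechnikov((xi - x0) / bandwidth) for xi in x]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations within the bandwidth of x0")
    return sum(w * yi for w, yi in zip(weights, y)) / total

# Hypothetical data with an inverted-U pattern (illustration only).
x = list(range(0, 201, 10))
y = [200 + 3.0 * v - 0.01 * v * v for v in x]

for x0 in (50, 100, 150):
    print(x0, round(kernel_smooth(x0, x, y, bandwidth=25.0), 2))
```

Because the estimate at each point uses only nearby observations, the smoother imposes no global functional form; the bandwidth controls how much local detail survives.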
CONCLUSION
• In theory, many relationships should have a U-shape or inverted U-shape, in particular those involving cost minima and profit maxima.
• The typical test, estimating a quadratic regression and checking the significance of the quadratic term, is not sufficient.
• The optimum should be within the data interval, regressions on the observations below and above the optimum should yield slopes of opposite sign, and the functional form should show face validity by inspection.
• An alternative is the inverse frequency weighted regression.
• Nonparametric alternatives include Excel trend lines, regression splines, and local polynomial smoothing.
Thank you very much for your attention!