
WIEN | 28 MAY 2015

TESTING U-SHAPED, INVERTED U-SHAPED OR OTHER NONLINEAR RELATIONSHIPS IN BUSINESS

Sönke Albers, Professor of Marketing and Innovation, Kühne Logistics University, Hamburg

AGENDA


• Rationale for U-shape or inverted U-shape

• Testing for U-shapes

• Alternative regressions (Weighting inverse to frequency)

• Alternative functional forms

• Nonparametric methods:

  • Trend lines in Excel

  • Spline regressions

  • Kernel regression

  • Local polynomial smoothing

RATIONALE FOR U-SHAPE

• Operations: Cost curves are typically assumed to be U-shaped, with a minimum somewhere along the curve.

• CSR: Barnett and Salomon (SMJ 2012) found a U-shaped relationship between corporate financial performance (ROA) and corporate social performance.

• Happiness: Happiness is U-shaped with respect to age (Frijters and Beatton, 2012).

Barnett, Michael L. and Robert M. Salomon (2012): Does it pay to be really good? Addressing the shape of the relationship between social and financial performance, Strategic Management Journal, 33 (11), 1304-1320.

Frijters, Paul and Tony Beatton (2012): The mystery of the U-shaped relationship between happiness and age, Journal of Economic Behavior & Organization, 82 (2/3), 525-542.

RATIONALE FOR INVERTED U-SHAPE

• Profit: Financial performance depends on both sales and costs and must therefore exhibit a maximum.

• Sales: Manchanda and Chintagunta (MarkLett 2004) argue that a sales response function might exhibit super-saturation, where sales decline beyond a certain point because of too much selling effort.

Manchanda, Puneet and Pradeep K. Chintagunta (2004): Responsiveness of Physician Prescription Behavior to Salesforce Effort: An Individual Level Analysis, Marketing Letters, 15 (2-3), 129-145.

SATISFACTION–LOYALTY FUNCTIONAL FORMS AND MODERATING FACTORS

Functional form characteristics | Existence | Moderating factors: product category and market characteristics | Moderating factors: customer economic and demographic variables
1 Linear | Ngobo (1999); Streukens and Ruyter (2004) | Anderson (1994); Seiders et al. (2005) | Homburg and Giering (2001); Magi (2003); Seiders et al. (2005)
2 Concave | Jones and Sasser (1995); Ngobo (1999) | None | None
3 Convex | Jones and Sasser (1995); Keiningham et al. (2003) | None | None
4 S-shaped | Ngobo (1999) | None | None
5 Inverse S-shaped | Homburg et al. (2005); Keiningham et al. (2003) | None | None
Nonlinear (undefined) | Mittal and Kamakura (2001) | None | Mittal and Kamakura (2001)

Dong, Songting, Min Ding, Rajdeep Grewal, and Ping Zhao (2011): Functional forms of the satisfaction–loyalty relationship, International Journal of Research in Marketing, 28 (1), 38–50

TYPICAL TEST FOR U-SHAPE

In most empirical work trying to identify U shapes, the researcher includes a nonlinear (usually quadratic) term in an otherwise standard regression model. If

• this term is significant and, in addition,

• the estimated extremum is within the data range,

it is common to conclude that there is a U-shaped relationship.

Lind, Jo Thori and Halvor Mehlum (2010): With or Without U? The Appropriate Test for a U-Shaped Relationship, Oxford Bulletin of Economics and Statistics, 72 (1), 109-118
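The two criteria above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name `quadratic_u_check` and the synthetic data are hypothetical and not part of the presentation. The last lines recompute the extremum from the Budget and Budget2 coefficients reported in the regression output later in the deck.

```python
import numpy as np

def quadratic_u_check(x, y):
    """Fit y = b0 + b1*x + b2*x^2 by OLS and report the usual two criteria."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x, x**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_quad = beta[2] / np.sqrt(cov[2, 2])      # criterion 1: significance
    extremum = -beta[1] / (2.0 * beta[2])      # stationary point of the parabola
    inside = bool(x.min() < extremum < x.max())  # criterion 2: within data range
    return {"beta": beta, "t_quad": t_quad, "extremum": extremum, "inside_range": inside}

# Synthetic illustration (hypothetical data with a true maximum at x = 5)
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 10.0, 500)
y_demo = 2.0 + 4.0 * x_demo - 0.4 * x_demo**2 + rng.normal(0.0, 1.0, 500)
result = quadratic_u_check(x_demo, y_demo)
print(result["t_quad"], result["extremum"], result["inside_range"])

# Extremum implied by the coefficients reported in the deck's regression output
reported_extremum = -3.27737825 / (2.0 * -0.01099834)
print(round(reported_extremum, 2))
```

Passing both checks is what the following slides call a necessary, not a sufficient, condition.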

QUADRATIC REGRESSION LEADS TO SIGNIFICANT RELATIONSHIP BUT DOES NOT REPRESENT SHAPE

[Figure: scatter plot of Sales (vertical axis, 0 to 700) over the range 0 to 200, with three fitted trend lines]

• Power: y = 72.798 x^0.3664, R² = 0.2345

• Logarithmic: y = 80.032 ln(x) + 11.537, R² = 0.2062

• Quadratic: y = -0.011 x^2 + 3.2774 x + 183.76, R² = 0.1973

QUADRATIC REGRESSION FOR EXAMPLE DATA

Regression Statistics
Multiple R | 0.44417807
R Square | 0.19729416
Adjusted R Square | 0.19568392
Standard Error | 95.7654713
Observations | 1000

ANOVA
Source | df | SS | MS | F | Significance F
Regression | 2 | 2247350.79 | 1123675.4 | 122.524509 | 2.6386E-48
Residual | 997 | 9143512.42 | 9171.0255 | |
Total | 999 | 11390863.2 | | |

Variable | Coefficients | Standard Error | t Stat | P-value
Intercept | 183.760289 | 10.1702193 | 18.0684686 | 2.4001E-63
Budget | 3.27737825 | 0.32226047 | 10.1699667 | 3.4904E-23
Budget2 | -0.01099834 | 0.00235221 | -4.67575754 | 3.3313E-06

Optimum: 148.99

TESTING OF SIGNIFICANCE OF QUADRATIC TERM

Testing the significance of the quadratic term alone might be misleading: the linear and quadratic terms are highly correlated, and this non-essential correlation inflates the standard errors.

A better test is whether the additional variance explained by introducing the quadratic term is significant.

Another way of avoiding the problem is to transform the polynomial terms.
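The test of additional explained variance can be sketched as a comparison of nested models. The following Python sketch is an assumption-laden illustration: the function names and the synthetic data are hypothetical. It returns the incremental F statistic for adding x² to a linear model; the p-value lookup is left out to keep the example dependency-free. For 1 and roughly 1000 degrees of freedom the 5% critical value is about 3.85.

```python
import numpy as np

def sum_squared_residuals(X, y):
    """Residual sum of squares of an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def incremental_f(x, y):
    """F statistic for adding x^2 to the model y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ones = np.ones_like(x)
    sse_linear = sum_squared_residuals(np.column_stack([ones, x]), y)
    sse_quadratic = sum_squared_residuals(np.column_stack([ones, x, x**2]), y)
    dof = len(y) - 3  # residual degrees of freedom of the larger model
    f_stat = (sse_linear - sse_quadratic) / (sse_quadratic / dof)
    return f_stat, dof

# Synthetic illustration (hypothetical data with genuine curvature)
rng = np.random.default_rng(2)
x_demo = rng.uniform(0.0, 10.0, 400)
y_demo = 1.0 + 3.0 * x_demo - 0.3 * x_demo**2 + rng.normal(0.0, 1.0, 400)
f_stat, dof = incremental_f(x_demo, y_demo)
print(f_stat, dof)
```

With a single added term, this F statistic equals the squared t statistic of that term in the larger model; the comparison becomes a distinct test once several polynomial terms are added together.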

TRANSFORMING QUADRATIC TERMS

It is recommended to mean-center the original variable and to form the quadratic term from the centered variable, which removes non-essential correlation and thus leads to meaningful standard errors. However, the interpretation of the coefficients changes.

An alternative is to orthogonalize the polynomial variables. For a quadratic term, this can be done by regressing the quadratic term on the linear term and using only the residuals, which are orthogonal to the linear term.

For a polynomial of degree larger than two, the orthpoly command in Stata can be used to obtain a set of orthogonal variables.
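Both transformations can be sketched as follows in Python with NumPy. The helper `ols` and the synthetic data are hypothetical, chosen only for illustration. The sketch fits the raw, mean-centered, and orthogonalized specifications and compares the coefficient and standard error of the quadratic term across them.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and their standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

# Synthetic illustration (hypothetical data)
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 1000)
y = 2.0 + 4.0 * x - 0.4 * x**2 + rng.normal(0.0, 3.0, 1000)
ones = np.ones_like(x)

# (a) raw linear and quadratic terms
b_raw, se_raw = ols(np.column_stack([ones, x, x**2]), y)

# (b) mean-centered variable and its square
xc = x - x.mean()
b_cen, se_cen = ols(np.column_stack([ones, xc, xc**2]), y)

# (c) quadratic term orthogonalized: residual of x^2 regressed on a constant and x
Z = np.column_stack([ones, x])
gamma, *_ = np.linalg.lstsq(Z, x**2, rcond=None)
x2_orth = x**2 - Z @ gamma
b_ort, se_ort = ols(np.column_stack([ones, x, x2_orth]), y)

print("corr(x, x^2):      ", np.corrcoef(x, x**2)[0, 1])
print("corr(x, x2_orth):  ", np.corrcoef(x, x2_orth)[0, 1])
print("quadratic coef:    ", b_raw[2], b_cen[2], b_ort[2])
print("quadratic std.err.:", se_raw[2], se_cen[2], se_ort[2])
print("linear coef:       ", b_raw[1], b_cen[1], b_ort[1])
```

In this sketch the transformations change the linear coefficient and its standard error, while the quadratic coefficient and its standard error stay the same, which is the pattern the Budget2 rows on the results slide also show.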

RESULTS OF DIFFERENT TRANSFORMATIONS OF THE QUADRATIC TERM

Regression Statistics
Multiple R | 0.44417807
R Square | 0.19729416
Adjusted R Square | 0.19568392
Standard Error | 95.7654713
Observations | 1000

Source | df | SS | MS | F | Significance F
Regression | 2 | 2247350.79 | 1123675.395 | 122.5245089 | 2.6386E-48
Residual | 997 | 9143512.42 | 9171.025498 | |
Total | 999 | 11390863.2 | | |

Original variables
Variable | Coefficients | Standard Error | t Stat | P-value
Intercept | 183.760289 | 10.1702193 | 18.0684686 | 0.0000
Budget | 3.27737825 | 0.32226047 | 10.1699667 | 0.0000
Budget2 | -0.01099834 | 0.002352205 | -4.67575754 | 0.0000

Mean-centered variables
Variable | Coefficients | Standard Error | t Stat | P-value
Intercept | 317.169735 | 3.313744506 | 95.7133944 | 0.0000
Budget centered | 2.2072784 | 0.143486002 | 15.3832316 | 0.0000
Budget2 centered | -0.01099834 | 0.002352205 | -4.67575754 | 0.0000

Orthogonalized quadratic term
Variable | Coefficients | Standard Error | t Stat | P-value
Intercept | 218.84814 | 6.864418523 | 31.8815263 | 0.0000
Budget | 1.89176875 | 0.126629319 | 14.9394213 | 0.0000
Budget2 orthogonal | -0.01099834 | 0.002352205 | -4.67575754 | 0.0000

QUADRATIC RELATIONSHIP LEADS TO WRONG CONCLUSION

[Figure: Sales per Physician (0 to 8000) plotted against Frequency (0 to 70), with a Moving Average series and two polynomial trend lines]

• Cubic: y = 0.0101 x^3 - 0.5157 x^2 + 45.799 x + 21.503

• Quadratic: y = 0.1528 x^2 + 34.691 x + 57.834

Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122, p. 112

SUFFICIENT TEST FOR U- OR INVERTED U-SHAPE

The typical test with a quadratic term provides only a necessary condition for a U-shape, not a sufficient one. Sufficient tests are given when

• a visual inspection suggests such a relationship (Albers 2012)

• the observations below (above) the optimal value exhibit a negative (positive) slope in the regression (Kostyshak 2015); for an inverted U-shape the signs are reversed. This is testable with utest in Stata.

Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122

Kostyshak, Scott (2015): Non-parametric Testing of U-shaped Relationships, Working Paper, Princeton

U-TEST IN STATA FOR U- OR INVERTED U-SHAPE

. utest Budget Budget2

Specification: f(x)=x^2
Extreme point: 148.9942

Test:
     H1: Inverse U shape
 vs. H0: Monotone or U shape

                Lower bound    Upper bound
Interval             1.7          195.72
Slope           3.239984       -1.027812
t-value         10.28821       -1.613219
P>t             5.78e-24        .0535067

Overall test of presence of a Inverse U shape:
     t-value =     1.61
     P>t     =    .0535
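The slopes in this output are the derivative of the fitted quadratic evaluated at the two ends of the data interval. As a small check in Python, using only the Budget and Budget2 coefficients from the quadratic regression slide and the interval bounds from the output above (the function and constant names are hypothetical):

```python
def slope_at(x, b1, b2):
    """Derivative of b0 + b1*x + b2*x**2 with respect to x."""
    return b1 + 2.0 * b2 * x

# Coefficients and data interval as reported in the presentation
B1, B2 = 3.27737825, -0.01099834
LOWER, UPPER = 1.7, 195.72

slope_low = slope_at(LOWER, B1, B2)
slope_high = slope_at(UPPER, B1, B2)

# Sign pattern required for an inverted U: rising at the lower bound, falling at the upper
inverted_u_signs = slope_low > 0 and slope_high < 0
print(round(slope_low, 6), round(slope_high, 6), inverted_u_signs)
```

The sign pattern alone is not the test: utest additionally checks whether each slope differs significantly from zero. Here the slope at the upper bound has P>t = .0535, which is why the overall test is not significant at the 5% level.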

QUADRATIC REGRESSION FOR EXAMPLE DATA


[Figure: linear trend line fitted to the observations between 170 and 200 on the horizontal axis, i.e. above the estimated optimum of 148.99 (vertical axis 0 to 600): y = 4.2087 x - 321.88, R² = 0.4342]

ALTERNATIVE REGRESSION FOR DETECTING SHAPES

Although OLS has good properties for predicting the response variable given a random draw of the independent variables, these properties do not directly translate to estimating the global shape of a regression function when the distribution of the independent variable is not uniform (Albers 2012, Kostyshak 2015).

A solution is to use inverse frequency weighting with GLS (Albers 2012, Albers 2015, Kostyshak 2015).

Albers, Sönke (2012): Optimizable and Implementable Aggregate Response Modeling for Marketing Decision Support, International Journal of Research in Marketing, 29 (2), 111-122

Albers, Sönke (2015): What Drives Publication Productivity at German Universities?, Schmalenbach Business Review, 67 (1), 6-33

Kostyshak, Scott (2015): Non-parametric Testing of U-shaped Relationships, Working Paper, Princeton
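One way to picture inverse frequency weighting is the Python sketch below. It is an assumption, not the procedure of the cited papers: the binning into equal-width intervals, the number of bins, and the function names are all hypothetical, and weighted least squares is used as the diagonal special case of GLS. Each observation is weighted by the inverse of the number of observations in its bin, so that sparsely populated regions of x count as much as dense ones.

```python
import numpy as np

def inverse_frequency_weights(x, n_bins=20):
    """Weight each observation by 1 / (number of observations in its x-bin)."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    w = 1.0 / counts[idx]
    return w / w.mean()  # normalize so that the weights average to one

def quadratic_fit(x, y, w=None):
    """(Weighted) least squares fit of y = b0 + b1*x + b2*x^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x, x**2])
    sw = np.ones_like(x) if w is None else np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic illustration (hypothetical): skewed x and a saturating, monotone response
rng = np.random.default_rng(3)
x_demo = np.minimum(rng.exponential(40.0, 1000), 200.0)
y_demo = 300.0 * (1.0 - np.exp(-x_demo / 60.0)) + rng.normal(0.0, 30.0, 1000)
w_demo = inverse_frequency_weights(x_demo)
beta_ols = quadratic_fit(x_demo, y_demo)
beta_wls = quadratic_fit(x_demo, y_demo, w_demo)
print("unweighted optimum:", -beta_ols[1] / (2.0 * beta_ols[2]))
print("weighted optimum:  ", -beta_wls[1] / (2.0 * beta_wls[2]))

# Optimum implied by the weighted coefficients reported on the next slide
reported_optimum = -2.357705 / (2.0 * -0.0055941)
print(round(reported_optimum, 4), reported_optimum > 195.72)
```

The last two lines recompute the optimum from the coefficients on the following slide and compare it with the upper bound of the data interval (195.72) shown in the utest output.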

INVERSE FREQUENCY WEIGHTED REGRESSION

Sales | Coef. | Std. Err. | t | P>t
Budget | 2.357705 | 0.323625 | 7.29 | 0
Budget2 | -0.0055941 | 0.001877 | -2.98 | 0.003
_cons | 215.4278 | 11.68436 | 18.44 | 0

Optimum: 210.731396 (outside data interval)

ALTERNATIVE FUNCTIONAL FORMS


Name Function Elasticity

Constant Elasticity

Semi-logarithmic

Diminishing Elasticity

Modified Exponential

Log-Reciprocal


LOG-LOG OLS REGRESSION IS DIFFERENT FROM NLS

Estimating constant elasticity response functions with a linearized log-log function can lead to dramatically different results for the optimal call level compared to a nonlinear estimation.

Functional Form | Optimal No. Calls
Log-Log | 41.60
Constant Elasticity | 8.29
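Why the two estimators can disagree may be easier to see in a small Python sketch. Everything here is hypothetical: the synthetic data, the margin and cost per call, and the simple grid search that stands in for a full nonlinear least squares routine. The constant elasticity function is taken as y = a·x^b. The log-log fit minimizes squared errors of ln(y), while NLS minimizes squared errors of y itself, so the two generally give different (a, b) and therefore different profit-maximizing call levels.

```python
import numpy as np

def fit_loglog(x, y):
    """OLS on ln(y) = ln(a) + b*ln(x); returns (a, b)."""
    b, ln_a = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(ln_a)), float(b)

def fit_nls_power(x, y, b_grid):
    """Least squares for y = a*x**b in the original scale via a grid over b.
    For a fixed b, the best a has the closed form sum(y*x^b) / sum(x^(2b))."""
    best = None
    for b in b_grid:
        xb = x**b
        a = float(xb @ y / (xb @ xb))
        sse = float(np.sum((y - a * xb) ** 2))
        if best is None or sse < best[2]:
            best = (a, float(b), sse)
    return best

def optimal_calls(a, b, margin, cost):
    """Maximizer of margin*a*x**b - cost*x, valid for 0 < b < 1."""
    return (margin * a * b / cost) ** (1.0 / (1.0 - b))

# Synthetic illustration (hypothetical data with multiplicative, non-lognormal noise)
rng = np.random.default_rng(4)
calls = rng.uniform(1.0, 50.0, 500)
sales = 100.0 * calls**0.3 * rng.uniform(0.2, 1.8, 500)

a_ll, b_ll = fit_loglog(calls, sales)
b_grid = np.append(np.linspace(0.01, 0.99, 99), b_ll)  # include the log-log estimate
a_nls, b_nls, sse_nls = fit_nls_power(calls, sales, b_grid)
sse_ll = float(np.sum((sales - a_ll * calls**b_ll) ** 2))

MARGIN, COST = 1.0, 2.0  # hypothetical profit parameters
print("log-log:", a_ll, b_ll, optimal_calls(a_ll, b_ll, MARGIN, COST))
print("NLS:    ", a_nls, b_nls, optimal_calls(a_nls, b_nls, MARGIN, COST))
```

Because the optimum depends on both a and b through a power of 1/(1 - b), even moderate differences in the estimates can move the implied optimal call level considerably.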

OPTIMUM IMPLICATIONS OF DIFFERENT FUNCTIONAL FORMS

Function evaluated \ Optimal value of | Constant Elasticity (OLS) | Semilog | Log-reciprocal | Constant Elasticity (NLS) | Modified Exponential
Constant Elasticity (OLS) | 41.60 | 30.91 | -236.08 | -635.11 | -2,775.27
Semilog | 228.67 | 255.37 | -38.94 | -519.37 | -2,993.94
Log-reciprocal | -48.85 | -80.12 | 90.44 | -97.79 | -2,292.02
Constant Elasticity (NLS) | 8.29 | 14.52 | 37.62 | 43.67 | -23.19
Modified Exponential | 3.16 | 6.88 | 34.30 | 61.04 | 117.39

Comparison of profits computed for the functional forms (in rows) at the optimal values of the functional forms (in columns).


TREND LINES IN EXCEL


SPLINE REGRESSION (1)

A spline is generally defined as a piecewise polynomial of degree n whose function values and first n-1 derivatives agree at the points where the pieces join (the knots).

A linear spline is a continuous compound function consisting of a series of linear segments. Although the segments are separate functions, neighboring segments meet at the knots.

In R, the package mgcv by Simon Wood fits models with spline terms.

In Stata, the following command generates a spline basis for spline regressions:

    bspline, xvar(age) knots(0 25 50 75 100) p(3) gen(_bs3k)
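A linear spline regression can be sketched in Python with a truncated-power ("hinge") basis. This is a minimal illustration under stated assumptions, not a port of the Stata command above: that command builds cubic B-splines (p(3)), whereas the sketch uses degree 1, and the function names, knots, and data are hypothetical.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns: 1, x, and one hinge term max(x - k, 0) per knot."""
    x = np.asarray(x, dtype=float)
    hinges = [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack([np.ones_like(x), x, *hinges])

def fit_linear_spline(x, y, knots):
    """OLS on the hinge basis; each hinge coefficient is a change in slope."""
    beta, *_ = np.linalg.lstsq(linear_spline_basis(x, knots), y, rcond=None)
    return beta

def predict_linear_spline(x, beta, knots):
    return linear_spline_basis(x, knots) @ beta

# Hypothetical example: an inverted-V response that peaks at age 50
KNOTS = (25.0, 50.0, 75.0)
age = np.linspace(0.0, 100.0, 201)
response = 2.0 * age - 3.0 * np.maximum(age - 50.0, 0.0)
beta = fit_linear_spline(age, response, KNOTS)
print(np.round(beta, 6))
print(predict_linear_spline([49.0, 50.0, 51.0], beta, KNOTS))
```

Because every basis column is continuous in x, the fitted segments meet at the knots by construction, and the slope on any segment is the sum of the linear coefficient and the hinge coefficients of all knots to its left.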

SPLINE REGRESSION (2)

Multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables. The software "MARS" is distributed by Salford Systems.

SPLINE REGRESSION OF DIFFERENT ORDER

• Linear splines

• Cubic splines allow a continuous mapping of non-polynomial shapes in the data

LOWESS SMOOTHER IN STATA

[Figure: running mean smoother (lowess) of Sales (0 to 600) against Budget (0 to 200); bandwidth = .8]

LOCAL POLYNOMIAL SMOOTHING IN STATA

[Figure: local polynomial smooth of Sales (0 to 600) against Budget (0 to 200); kernel = epanechnikov, degree = 3, bandwidth = 25.43]
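The idea behind such a smooth can be sketched in Python: at each evaluation point, a low-order polynomial is fitted by kernel-weighted least squares to the nearby observations, and its value at that point is the smoothed estimate. The sketch below is a simplified illustration, not Stata's implementation. The function names and data are hypothetical, and the kernel is the textbook Epanechnikov form on [-1, 1], whose scaling may differ from the one Stata uses under the same name.

```python
import numpy as np

def epanechnikov(u):
    """Textbook Epanechnikov kernel, zero outside [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_polynomial_smooth(x, y, grid, bandwidth, degree=3):
    """Kernel-weighted polynomial fit around each grid point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fitted = np.full(len(grid), np.nan)
    for i, x0 in enumerate(grid):
        w = epanechnikov((x - x0) / bandwidth)
        keep = w > 0
        if keep.sum() <= degree:
            continue  # too few observations in the window; leave NaN
        # Polynomial in (x - x0): the intercept is the fitted value at x0
        X = np.vander(x[keep] - x0, degree + 1, increasing=True)
        sw = np.sqrt(w[keep])
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
        fitted[i] = beta[0]
    return fitted

# Hypothetical example: a saturating response observed with noise
rng = np.random.default_rng(5)
budget = rng.uniform(0.0, 200.0, 1000)
sales = 400.0 * (1.0 - np.exp(-budget / 50.0)) + rng.normal(0.0, 60.0, 1000)
grid = np.linspace(10.0, 190.0, 10)
smooth = local_polynomial_smooth(budget, sales, grid, bandwidth=25.43, degree=3)
print(np.round(smooth, 1))
```

Setting degree=0 turns the same routine into a kernel regression (a locally weighted mean), which relates it to the running mean smoother on the previous slide.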

CONCLUSION

• In theory, many relationships should have a U-shape or an inverted U-shape, in particular those involving cost minima and profit maxima.

• The typical test of these shapes, estimating a quadratic regression and checking the significance of the quadratic term, is not sufficient.

• The optimum should lie within the data interval, regressions on the observations below and above the optimum should exhibit slopes of opposite sign, and the functional form should show face validity on inspection.

• An alternative is inverse frequency weighted regression.

• Nonparametric alternatives include Excel trend lines, regression splines, and local polynomial smoothing.


Thank you very much for your attention!
