statistics in finance - m&a and gdp growth
Statistical research on M&A
Group Coursework Submission Form
Specialist Masters Programme
Please list all names of group members:
(Surname, first name)
1. Wei Bo
2. Agrawal Minakshi
3. Lemercier Jean
GROUP NUMBER:
MSc in: Finance
Module Code: SMM 248
Module Title: STATISTICS IN FINANCE
Lecturer: Professor Ana-Maria Fuertes
Submission Date: 09 DECEMBER 2013
Declaration:
By submitting this work, we declare that this work is entirely our own except those parts duly identified
and referenced in my submission. It complies with any specified word limits and the requirements
and regulations detailed in the coursework instructions and any other relevant programme and module
documentation. In submitting this work we acknowledge that we have read and understood the regulations
and code regarding academic misconduct, including that relating to plagiarism, as specified in the
Programme Handbook. We also acknowledge that this work will be subject to a variety of checks for
academic misconduct.
We acknowledge that work submitted late without a granted extension will be subject to penalties, as
outlined in the Programme Handbook. Penalties will be applied for a maximum of five days lateness,
after which a mark of zero will be awarded.
18
11/27/2013
Statistics in Finance
Mergers & Acquisition effects on
Economic Growth
Bo Wei – Minakshi Agrawal – Jean Lemercier
Introduction
Global mergers and acquisitions (M&A) volumes amounted to USD 2.1 trillion for the first nine months of 2013, a
17% increase on 2012 levels. Although this USD 2.1 trillion figure is far from the 2007 peak, when M&A volumes were in
excess of USD 4 trillion, many economists and experts forecast that volumes will continue to increase over the
next few years. Variations in M&A volumes are observed to follow patterns or waves, in which volumes increase
substantially (1990s, 2001, 2008, see graph) and then suddenly drop. Consequently, a growing body of literature
focuses on explaining the determinants of changes in M&A volumes. Some work points to fundamental factors
such as industrial, economic and regulatory shocks, as in the article “What drives merger waves?” (Harford, 2004).
Other papers, such as “The Free Cash Flow Theory of Takeovers: A Financial Perspective on Mergers and
Acquisitions and the Economy” (Jensen, 1987), point to factors such as agency costs, excess free cash flows and
attempts at market timing.
Research on the consequences of M&A activity and on post-merger results at the firm level shows that there is
widespread debate over whether M&A creates value once the cost (bid premium) is taken into account. However, the
consensus is often that target companies’ performance improves post-acquisition (M. Healy, 1990). The reasoning
commonly used to justify acquired companies’ outperformance is that M&A activity creates value by reducing
agency costs and creating synergies between companies.
If acquisitions lead to sustainable long-term productivity gains at the firm level, one could argue that acquisitions at
the aggregate level may have an impact on economic growth. The literature offers two competing theories. The first
explains why M&A activity can induce GDP growth, thanks to increased productivity created through synergies and
lower agency costs. The second supports the idea that M&A transactions are detrimental to the economy in
the sense that they mostly result in more market control for the acquiring company, which either squeezes out smaller
companies that lack the scale and size to remain viable or gives them an incentive to cut down on R&D to remain
competitive. Although the relationship between these two variables is hard to define, a correlation can
reasonably be expected.
The aim of this paper is to investigate whether changes in the volume of M&A activity are correlated with growth at the
aggregate level, i.e. economic growth.
Motivation
It was reported that nine thousand billion dollars was spent by North American and Western European firms on
mergers and acquisitions (M&A) between 1995 and 1999. As a rough comparison, this is about seven times the GDP of the
United Kingdom. There is an extensive literature on the determinants and consequences
of mergers and acquisitions. However, the impact of M&A on economic growth is seldom explored. Monitoring the
relationship between M&A activity and economic growth could prove valuable for forecasting economic growth or
recessions. In our model, we use data from 2001 to 2013, which includes the financial crisis. We also want
to examine how M&A activity behaved after the financial crisis, with an emphasis on the relationship between M&A and
economic growth.
Data Description
Our model will feature the following variables:
Y: Real GDP growth (QoQ, %, United Kingdom)
X1: M&A Volumes in the United Kingdom (% Change in sterling volume, QoQ)
X2: Deal count (% Change QoQ)
X3: Average premium paid over the period (%, Quarterly)
X4: Bank of England base rate (%)
All the data has been taken from Bloomberg.
X1 X2 X3 X4 Y
Mean 0.174249 -0.002463 0.194067 3.029412 0.368627
Median -0.043002 0.001887 0.185500 4.000000 0.500000
Maximum 2.929415 0.269231 0.466800 5.750000 1.300000
Minimum -0.774920 -0.320059 0.068600 0.500000 -2.500000
Std. Dev. 0.733997 0.117647 0.076554 2.059519 0.790314
Skewness 1.305859 -0.257670 1.069145 -0.298630 -1.680696
Kurtosis 5.242891 3.496132 4.606199 1.319012 6.506831
Jarque-Bera 25.18472 1.087412 15.19833 6.762682 50.14324
Probability 0.000003 0.580593 0.000501 0.034002 0.000000
Sum 8.886678 -0.125589 9.897400 154.5000 18.80000
Sum Sq. Dev. 26.93761 0.692037 0.293024 212.0809 31.22980
Observations 51 51 51 51 51
Table 1 - Summary statistics
There are 51 observations in our model from the year 2001 to 2013.
Mean: The mean GDP growth over the period is 0.37% per quarter, which corresponds to an average yearly growth of
approximately 1.5%. The average premium paid over the 2001-13 period amounts to 19.4%, which is broadly consistent with
the literature: on average, market premiums for listed stocks are positive and fluctuate between 20% and 50%
depending on markets and business cycles (Laamanen, 2007, “On the role of acquisition premium in acquisition
research”).
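The annualised figure can be checked by compounding the mean quarterly growth rate over four quarters. The short sketch below assumes that the Table 1 mean of Y is a quarterly percentage; the function name is ours and purely illustrative.

```python
def annualise_quarterly_growth(quarterly_pct):
    """Compound a quarterly growth rate (in %) over four quarters; result in %."""
    return ((1.0 + quarterly_pct / 100.0) ** 4 - 1.0) * 100.0


mean_quarterly_growth = 0.368627  # mean of Y from Table 1, in % per quarter
annual_growth = annualise_quarterly_growth(mean_quarterly_growth)
print(f"Annualised mean growth: {annual_growth:.2f}%")
```

Simply multiplying the quarterly mean by four gives almost the same number, because the compounding effect is negligible at growth rates this small.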
Jarque-Bera: The Jarque-Bera statistic can be used to test the hypothesis that the observations follow a normal
distribution. It is built from the sample skewness and kurtosis, and under the null hypothesis of normality it follows a
chi-square distribution with two degrees of freedom. A statistic close to zero (a high p-value) means that normality
cannot be rejected. X2 has the lowest Jarque-Bera statistic (1.09, p-value 0.58), so normality is not rejected for it, as
opposed to the other factors related to M&A (X3 and X1), for which normality is rejected at the 1% level. This could
mean that the number of deals varies in a more “normal” manner, with a roughly symmetrical number of positive and
negative values and thinner tails, compared with the premium paid for example (see Figure 1). This seems reasonable,
as many companies still need to acquire others in times of negative GDP growth but are more reluctant to pay high
premiums. However, we have to take into account the small size of our sample (T=51), which could induce a bias: the
same test should be conducted on a longer time period to be more conclusive.
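The Jarque-Bera values in Table 1 can be recovered from the reported skewness and kurtosis. The sketch below uses the standard formula JB = T/6 · (S² + (K − 3)²/4) and the closed-form tail probability of a chi-square distribution with two degrees of freedom; the function name is an illustrative choice of ours.

```python
import math


def jarque_bera(skewness, kurtosis, n_obs):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    jb = n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-square distribution with 2 degrees of freedom,
    # the upper-tail probability has the closed form exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Skewness and kurtosis of X2 (deal count) and X3 (premium) from Table 1, T = 51.
jb_x2, p_x2 = jarque_bera(-0.257670, 3.496132, 51)
jb_x3, p_x3 = jarque_bera(1.069145, 4.606199, 51)
print(f"X2: JB = {jb_x2:.4f}, p-value = {p_x2:.4f}")
print(f"X3: JB = {jb_x3:.4f}, p-value = {p_x3:.6f}")
```

The decision rule is based on the p-value rather than on the raw size of the statistic: normality is rejected when the p-value falls below the chosen significance level.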
Std. Deviation: X4 has the highest standard deviation, which shows that the Bank of England base rate fluctuated
more than the other variables during this period (bearing in mind that the variables are not all expressed in the same
units), whereas X3 (average premium paid over the period), with the lowest standard deviation, seems relatively
constant. The high figure for X4 could perhaps be explained by the sharp cut in rates during the 2007/08 crisis, which
our sample includes (within two quarters, rates were cut from 5% to 0.5%).
X1 X2 X3 X4 Y
X1 1.000000 0.140336 0.024210 0.001954 -0.200199
X2 0.140336 1.000000 -0.084778 -0.058191 0.174809
X3 0.024210 -0.084778 1.000000 -0.286419 -0.438661
X4 0.001954 -0.058191 -0.286419 1.000000 0.304387
Y -0.200199 0.174809 -0.438661 0.304387 1.000000
Table 2 - Correlation matrix
The correlation matrix shows low correlations between our variables. This is encouraging for our model, as
multicollinearity would inflate the standard errors of our estimates. It means that we have selected regressors that are
not too closely related to each other. However, even if the pairwise correlations between regressors are low (the
largest in absolute value is corr(X3, X4) = -0.29), this is not enough to prove that there is no multicollinearity in a
multiple linear regression model such as ours: a variable could be weakly correlated with each of the others
individually, yet highly correlated with a linear combination of two or more of them.
Table 5 - Coefficients of regression with fewer observations (51 - 6 = 45)
Table 6 - Coefficients of regression with fewer regressors (3)
Model Specification
Dependent Variable: Y
Method: Least Squares
Date: 12/03/13 Time: 13:47
Sample (adjusted): 1 51
Included observations: 51 after adjustments
Variable Coefficient Std. Error t-Statistic Prob.
C 0.878025 0.349749 2.510442 0.0156
X1 -0.235082 0.134480 -1.748077 0.0871
X2 1.260784 0.844930 1.492176 0.1425
X3 -3.677868 1.339949 -2.744783 0.0086
X4 0.082003 0.049681 1.650595 0.1056
R-squared 0.297841 Mean dependent var 0.368627
Adjusted R-squared 0.236783 S.D. dependent var 0.790314
S.E. of regression 0.690436 Akaike info criterion 2.189908
Sum squared resid 21.92830 Schwarz criterion 2.379303
Log likelihood -50.84265 Hannan-Quinn criter. 2.262281
F-statistic 4.878048 Durbin-Watson stat 0.931785
Prob(F-statistic) 0.002306
Table 3 - Linear regression
Is there a multicollinearity problem in your regression?
Multicollinearity inflates the standard errors (S.E.) of the coefficient estimates. As a result, the t-statistics
of the regressors are understated when there is multicollinearity (the t-statistic for the null
hypothesis β = 0 equals β̂ / S.E.(β̂)). As a direct consequence, the individual null hypotheses tend not to
be rejected, even though the joint test (F-statistic) rejects its null hypothesis. In our case, the null hypothesis is
rejected at the 10% significance level (two-sided test) for two of the four regressors, namely X1 and X3. In
addition, X4 is very close to being significant at the 10% level (p-value = 0.1056). Therefore the t-statistic
“symptom” does not point to multicollinearity in our model.
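To make the link between standard errors and t-statistics concrete, the sketch below recomputes the t-statistics of Table 3 as coefficient divided by standard error. The dictionary layout and function name are illustrative assumptions; the numbers are copied from Table 3.

```python
# Coefficients and standard errors copied from Table 3.
table3 = {
    "X1": (-0.235082, 0.134480),
    "X2": (1.260784, 0.844930),
    "X3": (-3.677868, 1.339949),
    "X4": (0.082003, 0.049681),
}


def t_statistic(coefficient, std_error):
    """t-statistic for the null hypothesis that the coefficient equals zero."""
    return coefficient / std_error


t_stats = {name: t_statistic(b, se) for name, (b, se) in table3.items()}
for name, t in t_stats.items():
    print(f"{name}: t = {t:.3f}")

# A standard error inflated by a factor of two halves the t-statistic.
inflated_t_x3 = t_statistic(table3["X3"][0], 2.0 * table3["X3"][1])
print(f"X3 with doubled S.E.: t = {inflated_t_x3:.3f}")
```

The last line illustrates the mechanism described above: for a given coefficient, any inflation of the standard error shrinks the t-statistic proportionally.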
In a model with multicollinearity, parameter estimates change notably when observations are
excluded or added. We tested this by deleting 6 observations (observations No. 5, 10, 15, 20, 25 and 30) and running
a new regression: the coefficients did not change significantly (see below).
In a model with multicollinearity, the parameter estimates also change significantly when one regressor is
dropped. This is not the case in our model when X2 is dropped (see below).
Full model (51 observations):
Variable Coefficient
C 0.878025
X1 -0.235082
X2 1.260784
X3 -3.677868
X4 0.082003
Table 4 - Linear Regression Coefficients
Six observations excluded (45 observations):
Variable Coefficient
C 0.890713
X1MINUS6 -0.230501
X2MINUS6 1.292053
X3MINUS6 -3.757550
X4MINUS6 0.088370
X2 dropped (three regressors):
Variable Coefficient
C 0.932787
X1 -0.206127
X3 -3.899481
X4 0.075433
After running an auxiliary regression of X4 on a constant, X1, X2 and X3, the resulting R^2 is 0.08, which is well below the 0.8 rule-of-thumb threshold, meaning there is no apparent multicollinearity. In conclusion, our model does not seem to suffer from any significant multicollinearity and therefore does not need to be corrected for it.
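The auxiliary R^2 can also be expressed as a variance inflation factor, VIF = 1 / (1 - R^2), which measures how much the variance of a coefficient is inflated by its correlation with the other regressors. The sketch below is an illustration only: the VIF is a standard diagnostic that we did not compute in the original analysis, and the function name is ours.

```python
def variance_inflation_factor(aux_r_squared):
    """VIF implied by the R^2 of an auxiliary regression of one regressor on the others."""
    if not 0.0 <= aux_r_squared < 1.0:
        raise ValueError("auxiliary R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - aux_r_squared)


vif_x4 = variance_inflation_factor(0.08)        # auxiliary R^2 for X4 reported in the text
vif_threshold = variance_inflation_factor(0.8)  # the 0.8 rule of thumb used in the text
print(f"VIF for X4: {vif_x4:.3f}")
print(f"VIF at the 0.8 threshold: {vif_threshold:.1f}")
```

An auxiliary R^2 of 0.08 corresponds to a VIF close to 1, far below the value of 5 implied by the 0.8 threshold.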
Significance of the R²
Let’s test the significance of our model’s R² (29.8%):
H0: β1=β2=β3=β4=0 (equivalently R²=0) / H1: at least one of β1, β2, β3, β4 ≠ 0 (equivalently R²>0)
F-statistic = [(RSSr-RSSu)/J] / [RSSu/(T-K)] = 4.88 (p-value = 0.0023 or 0.23%)
We reject the null hypothesis at all conventional significance levels (10%, 5%, 1%): R² is significant (in other words,
at least one of the regressors is significant). We can therefore say that the “fit” of our model, 29.8%, is statistically
significant. This makes sense since two of our individual regressors (X1 and X3) are significant at the 10% level.
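When the restricted model contains only a constant, the F-statistic above can be rewritten in terms of R² as F = (R²/J) / ((1 − R²)/(T − K)). The sketch below applies this identity to the figures of Table 3 (T = 51, K = 5 including the constant, J = 4); the function name is an illustrative assumption.

```python
def f_statistic_from_r2(r_squared, n_obs, n_params):
    """Overall-significance F-statistic; n_params counts the intercept."""
    n_restrictions = n_params - 1  # all slope coefficients are set to zero under H0
    explained = r_squared / n_restrictions
    unexplained = (1.0 - r_squared) / (n_obs - n_params)
    return explained / unexplained


f_linear = f_statistic_from_r2(0.297841, n_obs=51, n_params=5)
print(f"F-statistic of the linear model: {f_linear:.4f}")
```

The result agrees with the F-statistic reported in Table 3 up to rounding of R².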
Is the model well specified (correct functional form and no omitted variables)?
Testing the functional form
Let’s check whether the change in Y with respect to each regressor appears to be constant or not. If it does not,
then a nonlinear model could potentially better capture the relationship.
From the four scatterplots, it is quite hard to infer the validity of the functional form of our model. However, we can
see that for X2 and X3, namely the deal count and the average premium paid over the period, a curved pattern seems
to appear for extreme values (see curves on the two scatterplots). The pattern nevertheless remains predominantly
linear (see line for deal count).
Conclusion on the functional form: the “suspicion” of a wrong functional form seems quite low. We have ruled out
the possibility of an interaction model, but it is still possible that a non-linear model explains our dependent
variable better.
[Figure: four scatterplots of UK GDP real growth QoQ (vertical axis) against each regressor (horizontal axis): Volume of M&A percentage change QoQ; Average premium over the period (%); Deal Count % change QoQ; UK BOE Base rate (%)]
Omitted variable(s)
It is very hard to assess which variables we have missed whose omission could induce biased coefficients and a biased
error term (autocorrelation of Et).
Many different factors explain GDP (at least in part), both quantitative (infrastructure level, commodity prices,
productivity gains…) and qualitative (education levels, consumer and business confidence...), to the extent that it is
nearly impossible to know which variables could be missing from our model.
We therefore run the Ramsey RESET test to check for a functional form or omitted variable issue.
Ramsey RESET test
Ramsey RESET Test
Equation: LINEARREGRESSION1
Specification: Y C X1 X2 X3 X4
Omitted Variables: Powers of fitted values from 2 to 3
Value df Probability
F-statistic 6.721729 (2, 44) 0.0028
Likelihood ratio 13.59719 2 0.0011
The Ramsey Regression Specification Error Test (RESET) is used to detect an incorrect functional form or omitted
variables. It adds two new terms to the regression: the square and the cube of the fitted values of Y from the
original regression. We then compute an F-statistic to check whether the two new terms are jointly significant. In
our case they are highly significant, since the F-statistic delivers a p-value of 0.0028 (0.28%): the test rejects the
“joint” null hypothesis. This means that our model is misspecified: the variables would better capture the
dependent variable in a non-linear model and/or there are omitted variables.
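The two statistics in the RESET output are linked through the residual sums of squares of the restricted (original) and unrestricted (augmented) regressions. As a hedged consistency check, the sketch below backs out the ratio RSSr/RSSu implied by the reported F-statistic and uses it to compute the likelihood ratio T · ln(RSSr/RSSu), which holds for linear models with Gaussian errors. Function names are ours; only the F-statistic, its degrees of freedom and T = 51 come from the output.

```python
import math


def rss_ratio_from_f(f_stat, n_restrictions, resid_df):
    """RSS_restricted / RSS_unrestricted implied by an F-statistic."""
    return 1.0 + n_restrictions * f_stat / resid_df


def likelihood_ratio(n_obs, rss_ratio):
    """LR statistic for Gaussian linear models: T * ln(RSS_r / RSS_u)."""
    return n_obs * math.log(rss_ratio)


ratio = rss_ratio_from_f(6.721729, n_restrictions=2, resid_df=44)
lr_stat = likelihood_ratio(51, ratio)
# Chi-square with 2 degrees of freedom: upper-tail probability is exp(-x / 2).
lr_p_value = math.exp(-lr_stat / 2.0)
print(f"Implied RSS ratio: {ratio:.4f}")
print(f"Likelihood ratio: {lr_stat:.4f}, p-value: {lr_p_value:.4f}")
```

Both statistics lead to the same conclusion here, which is expected since they are built from the same two residual sums of squares.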
Before “correcting” for this possible bias using either GLS (Generalised Least Squares) or Newey-West robust
standard errors (corrected standard errors that take the possible bias into account), we will try to estimate some
non-linear models to see if they better capture the changes in our dependent variable (Y).
Non Linear Regressions
1) y=c+β1X1^3+β2X2^3+β3X3^3+β4logX4
Variable Coefficient Std. Error t-Statistic Prob.
C 0.551001 0.124268 4.433962 0.0001
X1^3 -0.085530 0.024040 -3.557864 0.0009
X2^3 16.57297 12.15265 1.363733 0.1793
X3^3 -19.69698 5.552571 -3.547363 0.0009
LOGX4 0.164905 0.081524 2.022776 0.0489
R-squared 0.463842 Mean dependent var 0.368627
Adjusted R-squared 0.417220 S.D. dependent var 0.790314
S.E. of regression 0.603326 Akaike info criterion 1.920176
Sum squared resid 16.74411 Schwarz criterion 2.109571
Log likelihood -43.96450 Hannan-Quinn criter. 1.992550
F-statistic 9.948903 Durbin-Watson stat 1.270721
Prob(F-statistic) 0.000007
2) y=c+β1X1+β2X2+β3X3^3+β4logX4
Variable Coefficient Std. Error t-Statistic Prob.
C 0.574841 0.133602 4.302622 0.0001
X1 -0.284838 0.125149 -2.275991 0.0275
X2 1.411744 0.780537 1.808684 0.0770
X3^3 -23.29767 5.825209 -3.999456 0.0002
LOGX4 0.156002 0.086490 1.803695 0.0778
R-squared 0.393733 Mean dependent var 0.368627
Adjusted R-squared 0.341014 S.D. dependent var 0.790314
S.E. of regression 0.641561 Akaike info criterion 2.043068
Sum squared resid 18.93361 Schwarz criterion 2.232463
Log likelihood -47.09824 Hannan-Quinn criter. 2.115442
F-statistic 7.468533 Durbin-Watson stat 0.980911
Prob(F-statistic) 0.000101
3) y=c+β1X1^3+β2X2^3+β3X3^3+β4X4^3
Variable Coefficient Std. Error t-Statistic Prob.
C 0.555785 0.156478 3.551850 0.0009
X1^3 -0.080232 0.024765 -3.239687 0.0022
X2^3 16.47492 12.56506 1.311170 0.1963
X3^3 -20.58295 5.751660 -3.578611 0.0008
X4^3 0.001790 0.001535 1.165911 0.2497
R-squared 0.432910 Mean dependent var 0.368627
Adjusted R-squared 0.383598 S.D. dependent var 0.790314
S.E. of regression 0.620486 Akaike info criterion 1.976266
Sum squared resid 17.71012 Schwarz criterion 2.165661
Log likelihood -45.39478 Hannan-Quinn criter. 2.048639
F-statistic 8.778961 Durbin-Watson stat 1.198208
Prob(F-statistic) 0.000024
In these three non-linear regressions, we have mostly used cubic terms, as a cube preserves the difference between
negative and positive changes whereas a square does not. In addition, the log function has
been used on the interest rate (X4). It seems that our non-linear models explain our dependent variable better, as the
adjusted R^2 (overall fit of the model, adjusted for degrees of freedom) is higher for these three regressions than it
is for our linear model.
Non-linear regression No. 1 (y=c+β1X1^3+β2X2^3+β3X3^3+β4logX4) seems to be the most instructive, with the highest
adjusted R^2 (0.42), and its regressors seem to explain even more, as they reject the null hypothesis more clearly
(except X2). The joint test (F-statistic) also rejects the null hypothesis, meaning our R-squared of 0.46 is
significant. For X2 and X3, this model shows coefficients that are considerably larger in absolute value than in the
linear model; this is easily explained by the nature of the model itself: the cube of a fraction smaller than one in
absolute value (as is typically the case for X2 and X3) is much smaller than the initial value, which mechanically
increases the coefficient.
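The two properties of the cubic transformation used in this argument can be illustrated numerically. The values below are hypothetical quarterly changes chosen by us for illustration, not observations from our sample.

```python
def cube(x):
    return x ** 3


def square(x):
    return x ** 2


# Hypothetical fractional changes (e.g. -0.32 means -32%), not sample data.
changes = [-0.32, -0.04, 0.19, 0.27]

for c in changes:
    # The cube keeps the sign of the change; the square is always non-negative.
    print(f"x = {c:+.2f}  x^2 = {square(c):+.4f}  x^3 = {cube(c):+.4f}")

# For |x| < 1 the cube is smaller in magnitude than x itself, so a regression
# on x^3 needs a larger coefficient to produce the same effect on Y.
shrinks = all(abs(cube(c)) < abs(c) for c in changes)
print(f"Cube shrinks every value: {shrinks}")
```

The shrinking effect only holds for values below one in absolute value; a regressor that sometimes exceeds one in absolute value would see those observations amplified instead.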
This improvement in R^2 is consistent with the Ramsey RESET test on the linear model, which indicated a possibly
wrong functional form (more specifically, the presence of non-linearity): it would then make
sense to obtain a better fit with a non-linear model.
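The model comparison can be reproduced from the R-squared and log-likelihood values in the regression outputs. The sketch below assumes T = 51 and K = 5 for all four models, and uses per-observation forms of the Akaike and Schwarz criteria, which appear to match the figures in our outputs; the function names are illustrative.

```python
import math


def adjusted_r2(r2, n_obs, n_params):
    """Adjusted R-squared: penalises R-squared for the number of parameters."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params)


def akaike(log_lik, n_obs, n_params):
    """Akaike criterion per observation: (-2 logL + 2k) / T."""
    return (-2.0 * log_lik + 2.0 * n_params) / n_obs


def schwarz(log_lik, n_obs, n_params):
    """Schwarz criterion per observation: (-2 logL + k ln T) / T."""
    return (-2.0 * log_lik + n_params * math.log(n_obs)) / n_obs


# (R-squared, log likelihood) copied from the four regression outputs.
models = {
    "linear": (0.297841, -50.84265),
    "non-linear 1": (0.463842, -43.96450),
    "non-linear 2": (0.393733, -47.09824),
    "non-linear 3": (0.432910, -45.39478),
}

T, K = 51, 5
for name, (r2, log_lik) in models.items():
    print(f"{name:13s} adj. R2 = {adjusted_r2(r2, T, K):.4f}  "
          f"AIC = {akaike(log_lik, T, K):.4f}  SIC = {schwarz(log_lik, T, K):.4f}")

best_model = min(models, key=lambda m: akaike(models[m][1], T, K))
print(f"Lowest AIC: {best_model}")
```

Since all four models have the same number of parameters, the adjusted R-squared and the information criteria rank them in the same order.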
However, it is possible that there are still omitted variables in this non-linear model, which would bias the
significance of our variables. In order to check this, we will look at error autocorrelation.
Auto Correlated Errors
Autocorrelation of the errors violates assumption No. 3 of the regression model. Its direct effect (for positive
autocorrelation) is that the estimated error variance is biased, appearing artificially smaller, which in turn makes
the fit of our model look better through lower standard errors and higher t-statistics, sometimes misleadingly
rejecting the null hypothesis.
For all these reasons, it is crucial to try to identify the presence of autocorrelation in this non-linear model
(y=c+β1X1^3+β2X2^3+β3X3^3+β4logX4). The first element that can give an indication of error
autocorrelation is the scatterplot of the error term against the previous (t-1) error term:
[Figure: two scatterplots of the non-linear model residuals: RESIDNONLINEAR(-1) against RESIDNONLINEAR, and RESIDNONLINEAR(-4) against RESIDNONLINEAR]
In addition to the scatterplot of E(t) against E(t-1), the scatterplot of E(t-4) against E(t) has been used, as we are
using quarterly data (error terms in quarterly data are often correlated with the error term of the same quarter of the
previous year, E(t-4), as a result of seasonal effects). The errors appear to be autocorrelated to a certain extent:
this is more apparent in the first graph, E(t-1), where the points are clustered around the straight line.
[Figure: Y Residuals, plotted by observation number]
The residual graph seems to confirm this possible autocorrelation: instead of being successively unrelated positive
and negative values, the error terms tend to be correlated with the previous one (see graph).
Although the scatterplots and the residual graph give a first indication of possible autocorrelation, this
needs to be checked further through a formal test.
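A quick numerical check is already available in the regression output: the Durbin-Watson statistic, which is approximately 2(1 − ρ), where ρ is the first-order autocorrelation of the residuals. The sketch below shows how the statistic is computed from a residual series and what ρ the reported value of 1.27 implies. The helper names and the toy residual series are illustrative assumptions; the approximation is only a rough guide and does not replace the formal tests that follow.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def implied_rho(dw):
    """First-order autocorrelation implied by the approximation DW = 2 * (1 - rho)."""
    return 1.0 - dw / 2.0


# Toy series: perfectly persistent residuals give DW = 0,
# perfectly alternating residuals give a value well above 2.
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

rho_nonlinear = implied_rho(1.270721)  # DW of non-linear regression No. 1
print(f"Implied first-order autocorrelation: {rho_nonlinear:.3f}")
```

A value of the statistic well below 2, as here, points to positive first-order autocorrelation.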
[Graph annotation: successive negative error terms]
Autocorrelation Function (ACF)
The Autocorrelation Function correlogram tests for autocorrelation of the residuals up to a given order. The two bands
represent the confidence interval, which determines whether the autocorrelation at a given order is significant, i.e.
whether the null hypothesis (H0: autocorrelation of order k = 0) is rejected.
Here we can clearly see that there is autocorrelation of order 1 and 2, as the autocorrelation exceeds the confidence
interval at these two lags.
LM Test
In order to have a more precise idea of the significance of the autocorrelation, we have run the LM test. This test
regresses the residuals on the original regressors and on the “lagged” residuals, and gives us the p-value for the
significance of each lagged residual in explaining the current one.
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 12/08/13 Time: 15:56
Sample: 1 51
Included observations: 51
Presample missing value lagged residuals set to zero.
Variable Coefficient Std. Error t-Statistic Prob.
C -0.037652 0.124674 -0.302006 0.7645
X1^3 0.010583 0.027250 0.388386 0.7002
X2^3 -17.19601 13.60013 -1.264400 0.2147
X3^3 1.803476 5.835312 0.309062 0.7592
LOG(X4) 0.001848 0.082724 0.022341 0.9823
RESID(-1) 0.406643 0.187369 2.170283 0.0371
RESID(-2) 0.347684 0.174126 1.996737 0.0539
RESID(-3) -0.161794 0.184243 -0.878158 0.3860
RESID(-4) -0.229345 0.189469 -1.210459 0.2345
RESID(-5) -0.006312 0.196980 -0.032042 0.9746
RESID(-6) 0.229713 0.199338 1.152380 0.2572
RESID(-7) -0.174142 0.194161 -0.896891 0.3761
RESID(-8) -0.099336 0.215218 -0.461559 0.6473
RESID(-9) 0.126907 0.200615 0.632590 0.5312
RESID(-10) 0.007037 0.202569 0.034741 0.9725
RESID(-11) -0.118916 0.194220 -0.612273 0.5444
RESID(-12) 0.080295 0.190656 0.421152 0.6763
RESID(-1), or E(t-1), is significant, as the null hypothesis is rejected at the 5% and 10% levels (p-value = 0.0371):
there is clear evidence of autocorrelation of order 1. The autocorrelation of order 2 is only significant at the 10%
level (p-value = 0.0539) but still needs to be corrected for in our model. It is interesting to note that the two lagged
error terms are positively correlated with the error term, with coefficients of 0.41 and 0.35. This coincides with what
we have seen in the scatterplot of E(t) and E(t-1), where E(t) tends to be higher when E(t-1) is higher.
Possible causes of the spotted Autocorrelation
There are many possible reasons for the autocorrelation: a wrong functional form, omitted variables,
inertia, overlapping effects…
In our case, positive autocorrelation is not uncommon, as our dependent variable Y is GDP growth; the general
consensus for this kind of macroeconomic variable is that business cycles are often present, which
creates inertia and hence error autocorrelation. In addition, our data may be affected by seasonality, which would de
facto create autocorrelation in the errors even if there were no omitted variable, functional form issue and so forth.
Furthermore, it is likely that we have solved (at least partially) our functional form issue, as the fit of the
non-linear model is better (see the non-linear regression section). This would imply that the autocorrelation is more
likely to come from omitted variables or from true autocorrelation.
Correcting our model (Autocorrelation issue)
There are mainly two ways to account for autocorrelation in our model: GLS (Generalised Least Squares) or
Newey-West robust standard errors. In our case, given the low number of observations we have
(51), it is better to use Generalised Least Squares than to keep our model and adjust only the standard errors. This
means that the coefficients will also be corrected for autocorrelation.
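The idea behind the GLS correction for AR(p) errors is to regress quasi-differenced variables, y(t) − ρ1·y(t−1) − … − ρp·y(t−p), on the same transformation of the regressors, so that the transformed errors are no longer autocorrelated. The sketch below only illustrates this transformation for given values of ρ; it is an assumption-laden simplification that does not reproduce the iterative estimation of the AR terms performed by our software, and the function name is ours.

```python
def quasi_difference(series, rhos):
    """Return y_t - sum_i rho_i * y_{t-i}, dropping the first len(rhos) observations."""
    p = len(rhos)
    transformed = []
    for t in range(p, len(series)):
        lagged_part = sum(rho * series[t - lag] for lag, rho in enumerate(rhos, start=1))
        transformed.append(series[t] - lagged_part)
    return transformed


# Small worked example with two autoregressive coefficients (illustrative values).
example = quasi_difference([1.0, 2.0, 3.0, 4.0], [0.5, 0.25])
print(example)

# With two AR terms, a 51-observation sample loses its first two observations.
n_remaining = len(quasi_difference([float(t) for t in range(51)], [0.5, 0.25]))
print(f"Observations left after the AR(2) transformation: {n_remaining}")
```

This loss of the first two observations is why an AR(2) correction on a 51-observation sample is estimated on 49 observations.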
Before adjusting for autocorrelation
Variable Coefficient Std. Error t-Statistic Prob.
C 0.551001 0.124268 4.433962 0.0001
X1^3 -0.085530 0.024040 -3.557864 0.0009
X2^3 16.57297 12.15265 1.363733 0.1793
X3^3 -19.69698 5.552571 -3.547363 0.0009
LOGX4 0.164905 0.081524 2.022776 0.0489
R-squared 0.463842 Mean dependent var 0.368627
Adjusted R-squared 0.417220 S.D. dependent var 0.790314
S.E. of regression 0.603326 Akaike info criterion 1.920176
Sum squared resid 16.74411 Schwarz criterion 2.109571
Log likelihood -43.96450 Hannan-Quinn criter. 1.992550
F-statistic 9.948903 Durbin-Watson stat 1.270721
Prob(F-statistic) 0.000007
After adjusting for autocorrelation (GLS)
Dependent Variable: Y
Method: Least Squares
Date: 12/08/13 Time: 16:45
Sample (adjusted): 3 51
Included observations: 49 after adjustments
Convergence achieved after 42 iterations
Variable Coefficient Std. Error t-Statistic Prob.
C 0.351675 0.280397 1.254204 0.2167
X1^3 -0.039173 0.018705 -2.094249 0.0423
X2^3 10.21791 12.45011 0.820708 0.4164
X3^3 -12.78975 4.693083 -2.725235 0.0093
LOG(X4) 0.206902 0.193196 1.070945 0.2903
AR(1) 0.490130 0.167984 2.917728 0.0056
AR(2) 0.146960 0.155062 0.947751 0.3487
R-squared 0.630049 Mean dependent var 0.353061
Adjusted R-squared 0.577199 S.D. dependent var 0.802626
S.E. of regression 0.521893 Akaike info criterion 1.668854
Sum squared resid 11.43962 Schwarz criterion 1.939114
Log likelihood -33.88693 Hannan-Quinn criter. 1.771391
F-statistic 11.92145 Durbin-Watson stat 1.885022
Prob(F-statistic) 0.000000
After using GLS, the coefficients and the standard errors of our variables change: this is because our regressors’
t-statistics were artificially inflated by the autocorrelation of order 1 and 2. After taking this into account, we
can see that X1 is still significant at the 5% and 10% levels (p-value = 0.0423), and X3 at the 1% level
(p-value = 0.0093). However, the interest rate (X4) is no longer significant.
The most important insight the GLS model gives us is that the United Kingdom base rate may not be
useful in our model. Had we not accounted for autocorrelation, we could have kept this “irrelevant variable”.
Hence our final model will only feature X1 and X3, as irrelevant variables such as X2 and X4 may distort
the t-statistics of our other variables.
Final model chosen, adjusted for Auto Correlation (GLS)
Variable Coefficient Std. Error t-Statistic Prob.
C 0.536207 0.262746 2.040778 0.0473
X1^3 -0.035361 0.017981 -1.966591 0.0556
X3^3 -13.94418 4.366606 -3.193368 0.0026
AR(1) 0.536202 0.152236 3.522171 0.0010
AR(2) 0.176200 0.151127 1.165910 0.2499
The R-squared and the adjusted R-squared of the regression rise sharply in the autocorrelation-adjusted (GLS)
regression. This is to be expected, as the autoregressive terms are added to the model. It does not mean that our
model has really improved, as we are only interested in the real impact of X1, X2, X3 and X4 on Y and not in the
explanatory power of the autoregressive terms.
Hypothesis Testing
Interaction models
We have seen in the previous section that the interest rate factor seems to be insignificant after taking
autocorrelation into account. This is quite unexpected, as the literature shows that interest rates have a significant
effect on M&A levels (Reed & Babool, 2003, “Factors affecting international mergers and acquisitions”). An interaction
model capturing the impact of the interest rate on the average premium paid could therefore be worth estimating. The
regression would take the following form: Y: C X1^3 X3^3 (X4*X3^3) AR(1) AR(2) (with X3 being the average
premium paid and X4 the United Kingdom base rate).
Likewise, an interaction model capturing the possible relationship between the premium paid and
aggregate M&A volumes could intuitively prove valuable: Y: C X1^3 X3^3 (X1^3)*(X3^3)
Variable Coefficient Std. Error t-Statistic Prob.
X4*X3^3 5.464 2.51 2.17 0.0353
X1^3*X3^3 2.30 3.15 0.73 0.46
It appears that the first interaction model, using the relationship between interest rates and the average premium
paid, improves our model, as the added variable (X4*X3^3) is significant at the 5% and 10% levels (p-value = 0.0353).
In turn, it slightly improves the adjusted R^2.
On the other hand, the second interaction term proves to be insignificant after running the
relevant regression (the X1^3*X3^3 coefficient is insignificant at the 1%, 5% and 10% levels, with a p-value of 0.46).
Checking the normality of the Error term
Variable Coefficient Std. Error t-Statistic Prob.
C 0.551001 0.124268 4.433962 0.0001
X1^3 -0.085530 0.024040 -3.557864 0.0009
X2^3 16.57297 12.15265 1.363733 0.1793
X3^3 -19.69698 5.552571 -3.547363 0.0009
LOGX4 0.164905 0.081524 2.022776 0.0489
R-squared 0.463842 Mean dependent var 0.368627
Adjusted R-squared 0.417220 S.D. dependent var 0.790314
S.E. of regression 0.603326 Akaike info criterion 1.920176
Sum squared resid 16.74411 Schwarz criterion 2.109571
Log likelihood -43.96450 Hannan-Quinn criter. 1.992550
F-statistic 9.948903 Durbin-Watson stat 1.270721
Prob(F-statistic) 0.000007
[Figure: histogram of the residuals]
Series: Residuals
Sample: 1 51
Observations: 51
Mean 6.59e-17
Median -0.004087
Maximum 1.167325
Minimum -2.024367
Std. Dev. 0.578690
Skewness -0.563006
Kurtosis 4.529200
Jarque-Bera 7.663500
Probability 0.021672
Non-normality of Ԑt poses no problem for large T, because in that case the t-statistic approximately follows N(0, 1). However, we only have 51 observations, which is a relatively small T. Let’s test the normality of Ԑt:
H0: Ԑt follows a normal distribution
H1: Ԑt does not follow a normal distribution
From the histogram of Ԑt, we can see that it is not bell-shaped. Additionally, the Jarque-Bera statistic equals 7.6635,
with a p-value of 0.0217. We therefore reject the null hypothesis at the 5% level (though not at the 1% level): Ԑt does
not appear to follow a normal distribution, which matters here because T is relatively small.
Dummy Variables
Binary variable in the explanatory variable
Before using any dummy variable, it is important to recall what the initial data used comprises:
Y: Real GDP growth (QoQ, %, United Kingdom)
X1: M&A Volumes in the United Kingdom (% Change in sterling volume, QoQ) (used in the final model)
X2: Deal count (% Change QoQ)
X3: Average premium paid over the period (%, Quarterly) (used in the final model)
X4: Bank of England base rate (%)
An interesting dummy variable we could have used is one tied to a specific industry. For
example, we could have used a dummy variable taking the value 1 if the average premium paid over the period in the
financial industry (or any other industry) had increased, and 0 if it had not. This way we could have measured whether
any industry has a higher correlation with GDP growth than another. However, we were unable to do so due to a lack of
data.
The dummy variable we have created (DummyCrisis) is a structural break dummy variable taking the value 1 if the
values recorded in the sample are after quarter 1 2007 (Observation 23, beginning of the Global financial crisis) and
0 if the values have been recorded before quarter 1 2007 (the remaining observations). This variable will be useful to
assess whether the relationship between M&A premiums/volumes and GDP growth in the United Kingdom has
significantly changed after the crisis (Q1 2007).
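A minimal sketch of how such a structural break dummy and the associated interaction regressor could be built is given below. It assumes that observation 23 itself is coded 1 (the text is not explicit on this point) and that the series are held in plain lists; the function names are hypothetical.

```python
def crisis_dummy(n_obs, break_obs=23):
    """1 from observation `break_obs` onward (1-indexed), 0 before.

    Assumption: the break observation itself belongs to the crisis period.
    """
    return [1 if t >= break_obs else 0 for t in range(1, n_obs + 1)]


def interaction_term(dummy, x1, x3, x4):
    """DummyCrisis * (X1^3 + X3^3 + X4*X3^3), observation by observation."""
    return [
        d * (a ** 3 + b ** 3 + c * b ** 3)
        for d, a, b, c in zip(dummy, x1, x3, x4)
    ]


dummy = crisis_dummy(51)
print(sum(dummy))  # number of observations coded as crisis period
```

Multiplying the dummy by the regressors lets the slope, and not only the level, differ between the two sub-periods.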
Dependent Variable: Y
Method: Least Squares
Date: 12/08/13 Time: 20:03
Sample (adjusted): 1 51
Included observations: 51 after adjustments

Variable                           Coefficient   Std. Error   t-Statistic   Prob.
DUMMYCRISIS                           0.187783     0.189022      0.993444   0.3257
X1^3                                  0.162804     0.131406      1.238943   0.2217
X3^3                                -23.37007      8.561167     -2.729776   0.0090
X4*X3^3                              11.62079      3.395521      3.422389   0.0013
DUMMYCRISIS*(X1^3+X3^3+X4*X3^3)      -0.253133     0.134286     -1.885030   0.0658

R-squared            0.176834    Mean dependent var      0.368627
Adjusted R-squared   0.105255    S.D. dependent var      0.790314
S.E. of regression   0.747566    Akaike info criterion   2.348905
Sum squared resid    25.70730    Schwarz criterion       2.538300
Log likelihood      -54.89708    Hannan-Quinn criter.    2.421278
Durbin-Watson stat   0.826232
In order to check whether the crisis changed the relationship between M&A and GDP growth in the United
Kingdom, we test the joint hypothesis H0: B (DummyCrisis) = B (DummyCrisis*(X1^3+X3^3+X4*X3^3)) = 0.
In our case, the coefficient on the interaction term DummyCrisis*(X1^3+X3^3+X4*X3^3) is significantly different from
zero at the 10% level (p-value = 0.0658), although not at the 5% level, while the coefficient on DummyCrisis alone is not
significant (p-value = 0.3257). This gives some evidence, at the 10% level, that the global financial crisis starting in
2007 changed the relationship between M&A volumes/acquisition premiums and GDP growth in the United Kingdom.
A formal conclusion on the joint hypothesis would require an F-test of both restrictions together.
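The individual t-statistic can be reproduced from the reported coefficient and standard error, and a joint hypothesis of this kind is usually assessed with an F-test comparing restricted and unrestricted residual sums of squares. The sketch below shows both calculations. The restricted sum of squared residuals (the model re-estimated without the two dummy terms) is not reported in the output, so it is left as an input; the function name is illustrative.

```python
def f_statistic(rss_restricted, rss_unrestricted, q, n_obs, k_unrestricted):
    """F = ((RSS_R - RSS_U) / q) / (RSS_U / (T - k)).

    q: number of restrictions under H0
    k_unrestricted: number of estimated coefficients in the unrestricted model
    """
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n_obs - k_unrestricted)
    return numerator / denominator


# Values taken from the regression output
coef_interaction = -0.253133
se_interaction = 0.134286
t_interaction = coef_interaction / se_interaction  # t-statistic of the interaction term

rss_unrestricted = 25.70730  # "Sum squared resid" of the model with the dummy terms
n_obs, k_unrestricted, q = 51, 5, 2

print(round(t_interaction, 4))
```

With the restricted sum of squared residuals in hand, `f_statistic(rss_restricted, rss_unrestricted, q, n_obs, k_unrestricted)` would be compared with the F(2, 46) critical value.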
Binary variable expressed in the dependent variable
The second dummy variable used is derived from the original Y (real GDP growth in the United Kingdom). It
takes the value 1 if there is positive GDP growth in the United Kingdom and 0 if there is negative GDP
growth. This way we are able to see whether the model can explain the sign of GDP growth.
Dependent Variable: DUMMY_GDP_GROWTH
Method: Least Squares
Date: 12/08/13 Time: 19:33
Sample (adjusted): 1 51
Included observations: 51 after adjustments

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              0.853358     0.071067      12.00771   0.0000
X1^3          -0.027651     0.015500     -1.783992   0.0809
X3^3          -9.851929     4.031901     -2.443495   0.0184
X4*X3^3        2.417120     1.840633      1.313201   0.1955

R-squared            0.175018    Mean dependent var      0.784314
Adjusted R-squared   0.122360    S.D. dependent var      0.415390
S.E. of regression   0.389148    Akaike info criterion   1.025469
Sum squared resid    7.117490    Schwarz criterion       1.176985
Log likelihood      -22.14947    Hannan-Quinn criter.    1.083368
F-statistic          3.323651    Durbin-Watson stat      1.344303
Prob(F-statistic)    0.027532
Dependent Variable: DUMMY_GDP_GROWTH
Method: ML - Binary Logit (Quadratic hill climbing)
Date: 12/08/13 Time: 19:35
Sample (adjusted): 1 51
Included observations: 51 after adjustments
Convergence achieved after 5 iterations
Covariance matrix computed using second derivatives

Variable    Coefficient   Std. Error   z-Statistic   Prob.
C              1.795447     0.530267      3.385928   0.0007
X1^3          -0.185624     0.226831     -0.818334   0.4132
X3^3          -64.68640     44.73894     -1.445864   0.1482
X4*X3^3        15.57773     14.18400      1.098260   0.2721

McFadden R-squared      0.144657    Mean dependent var      0.784314
S.D. dependent var      0.415390    S.E. of regression      0.389180
Akaike info criterion   1.048804    Sum squared resid       7.118656
Schwarz criterion       1.200320    Log likelihood         -22.74451
Hannan-Quinn criter.    1.106703    Restr. log likelihood  -26.59108
LR statistic            7.693147    Avg. log likelihood    -0.445971
Prob(LR statistic)      0.052798
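The McFadden R-squared and the LR statistic in the logit output follow directly from the two reported log-likelihoods. The sketch below recomputes them as a consistency check, using only values from the output; the variable and function names are illustrative. The p-value uses the closed-form upper tail of a chi-square distribution with 3 degrees of freedom, one per slope coefficient.

```python
import math

# Values from the logit output
log_lik = -22.74451        # unrestricted log likelihood
restr_log_lik = -26.59108  # log likelihood with a constant only

mcfadden_r2 = 1.0 - log_lik / restr_log_lik
lr_stat = 2.0 * (log_lik - restr_log_lik)


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


lr_p_value = chi2_sf_3df(lr_stat)
print(round(mcfadden_r2, 6), round(lr_stat, 4), round(lr_p_value, 3))
```

On these numbers the three slope coefficients are jointly significant at the 10% level but narrowly fail at the 5% level.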
In this case the dummy variable is used as the dependent variable. In order to assess how well our model explains
the periods of positive or negative GDP growth, we use the hit rate and pseudo R^2 values for both the linear
model and the logit model:
Linear model
Pseudo R^2 = 0.175
HIT Rate = 0.807
Logit model
Pseudo R^2 = 0.174
HIT Rate = 0.807
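For reference, the hit rate is the share of observations for which the fitted value falls on the same side of a threshold as the actual outcome. A minimal sketch follows, assuming a 0.5 threshold; the fitted values from the two models are not reproduced in this document, so the example uses made-up numbers purely for illustration.

```python
def hit_rate(actual, fitted, threshold=0.5):
    """Share of observations whose predicted class matches the actual 0/1 outcome."""
    hits = 0
    for y, p in zip(actual, fitted):
        predicted = 1 if p > threshold else 0
        if predicted == y:
            hits += 1
    return hits / len(actual)


# Illustrative values only, not the series used in the report
example_actual = [1, 1, 0, 0]
example_fitted = [0.9, 0.4, 0.2, 0.6]
print(hit_rate(example_actual, example_fitted))
```

A useful benchmark is the mean of the dependent variable (0.784314 in the output above): a rule that always predicts positive growth would be right for about 78.4% of the observations, so a hit rate of 0.807 is only a modest improvement.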
The first observation is that the hit rate and the pseudo R^2 are very different. This means that,
even when both the estimated and the actual data indicate positive (negative) GDP growth (i.e. a value above
(below) 0.5), the gap between the actual and the estimated values is quite large.
In other words, when there is actual positive (negative) growth, the estimated probability of positive GDP growth
is only slightly above (below) 0.5 and not close to 1 (0).
We can also notice that the difference between the logit and the linear model is very small, even negligible.
This is because the fitted values of the linear model fall outside the [0, 1] interval for only one observation, so
in this respect the logit model is not significantly different from our linear model.
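The comparison between the two specifications can be sketched by evaluating both fitted probabilities at the same regressor values, using the coefficients reported in the two outputs. The linear model can return values outside [0, 1], whereas the logit transform cannot. The units in which X1, X3 and X4 enter the regressions are an assumption here (the data are not shown), so the example only evaluates the point where all regressors are zero.

```python
import math

# Coefficients as reported in the least squares and logit outputs
LPM = {"C": 0.853358, "X1^3": -0.027651, "X3^3": -9.851929, "X4*X3^3": 2.417120}
LOGIT = {"C": 1.795447, "X1^3": -0.185624, "X3^3": -64.68640, "X4*X3^3": 15.57773}


def linear_index(coefs, x1, x3, x4):
    """C + b1*X1^3 + b2*X3^3 + b3*X4*X3^3 for one observation."""
    return (coefs["C"]
            + coefs["X1^3"] * x1 ** 3
            + coefs["X3^3"] * x3 ** 3
            + coefs["X4*X3^3"] * x4 * x3 ** 3)


def lpm_probability(x1, x3, x4):
    """Linear probability model: the index itself, not bounded to [0, 1]."""
    return linear_index(LPM, x1, x3, x4)


def logit_probability(x1, x3, x4):
    """Logit model: logistic transform of the index, always inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_index(LOGIT, x1, x3, x4)))


p_lpm = lpm_probability(0.0, 0.0, 0.0)
p_logit = logit_probability(0.0, 0.0, 0.0)
print(round(p_lpm, 4), round(p_logit, 4))
```

At this point the two models give very similar probabilities of positive growth, in line with the near-identical hit rates.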
Conclusion
The relationship between mergers & acquisitions and GDP growth seems to be in accordance with the literature, as some
of the variables we have used (average premium paid over the period and change in M&A volumes) are significant in
explaining GDP growth. Interestingly, the most significant explanatory variable (the premium of M&A deals) is
negatively related to GDP growth (the coefficient in our linear regression is -3.67). This
would mean that when the average premium paid over the period grows by 1%, GDP growth reacts negatively
and decreases by 3.67%. One possible interpretation is that mergers and acquisitions mainly increase the market control
of the acquiring company and drive smaller firms out of the industry, which in turn leads to reduced innovation
and R&D spending, negatively impacting GDP growth.
This conclusion seems logical given our data, especially after having taken into account the possible biases affecting the
model, yet there are many other aspects which still have to be taken into account.
Even if there is an apparent relationship (R^2, the explanatory power of our model, is greater than 0), we should not
forget that this only implies that the explanatory variables and the dependent variable move together (in the same or
in opposite directions) over the period used in the sample. It does not establish any causality between our explanatory
variables and the dependent variable. In other words, the explanatory power of our model does not mean in any way
that mergers and acquisitions cause GDP growth.
In addition, it is possible that some biases remain in our analysis. As discussed earlier in the
project, omitted variables could artificially inflate the apparent fit of our model and lead us to retain some irrelevant
variables to explain GDP growth.