1 | G a n e s h M a n j h i
Workshop on Econometric packages
at Indian Council for Research on
International Economic Relations
EVIEWS
29 May 2008
Programme: Eviews Training by Ganesh Manjhi
Ph.D, JNU, New Delhi-67
OUTLINE OF CONTENTS:
I. INTRODUCTORY SESSION
II. TIME SERIES ANALYSIS & FORECASTING
III. ECONOMETRIC METHODS WITH EViews
I. INTRODUCTORY SESSION
Target: This session will equip one to take decisions based on time series data. The regression techniques covered in this session will be particularly useful for people interested in forecasting and relating/predicting a variable to/from a single or a set of explanatory variables. It covers the basic elements of ordinary least squares (OLS) models as well as time series econometrics and forecasting.
Prerequisite: Attendees are familiar with MS Office especially MS Excel, and have basic knowledge of statistics - descriptive statistics, both numerical (mean, standard deviation, standard error, etc.) and graphical (histogram, scatter plot, etc.), hypothesis testing and confidence interval.
Sub Contents:
Workfile basics: This involves creating a new workfile or loading an existing one into memory, along with an explanation of data management.
Basic Data Analysis: a brief description of statistical graphs from series and groups and descriptive statistics.
The classical Linear Regression Model: this will involve methods of estimation (least squares, maximum likelihood), dummy variables, autocorrelation and heteroskedasticity.
Duration: 5-6 pm, 1 hour, 29th May
Refer Handout 1
II TIME SERIES ANALYSIS & FORECASTING
Target: Providing basic knowledge of Structural Time Series Models using EViews. Attendees will be able to estimate and analyze econometric models using EViews.
Prerequisite: It entails the knowledge of contents mentioned below and things covered in Session-I.
Sub Contents:
Univariate Methods: Exponential Smoothing, Decomposition, ARIMA Modeling
Multivariate Methods: Single / Multiple Regression Model, 2SLS
Duration: 1 Hour, date has yet to be decided
III ECONOMETRIC METHODS WITH EViews
Sub Contents:
Dummy variable
Granger Causality Test
Maximum Likelihood Estimation
Cointegration analysis and vector autoregression (VAR) modeling
Handout
(A)Window Basics
Create workfile and data entering.
Plotting the series, run OLS regressions and computing descriptive statistics
***********************************************************************
To create a workfile, click File/New/Workfile, as shown in the following dialogue
box
The workfile will come as shown below:
As the data is a time series, click on dated regular frequency, choose the frequency, and enter the Start date and End date. Click OK.
Next, to enter data by import: save the Excel sheet as a text file, then click on
File/Import/Read Text-Lotus-Excel
In the field “Names for series or Number if named in file”, enter the series names, or the number of series if the names are in the file; in our case we can write “9”.
The imported data will appear in the workfile as follows:
(B)Multivariate Methods
o We have the following variables in the workfile:
Cons- Cement Consumption Demand
gdp – Gross domestic product
cngdp – Construction share in GDP
wpi – wholesale price index
plr – Prime lending rate as proxy for interest rate
prdn – Total production
capin – capacity installed
gdcf – gross domestic capital formation
To save the workfile click File/Save As on the main tool bar
To plot the graph
Highlight the cons and plr on the workfile and double click
Select Open Group
Click on View/Graph/Line on the Group Spreadsheet
Similarly, you can plot the scatter diagram. To plot the regression line
along with scatter plot: click View/Graph/Scatter/Scatter with regression
For the descriptive Statistics:
Highlight the two series on the workfile and double click.
Click Open Group
Click View/Descriptive statistics in the Group spreadsheet
To run the OLS regression:
Click on Objects/New Object/Equation or Quick/Estimate Equation on the
main toolbar. The following dialogue box will appear. Type in the
dependent variable on the LHS (cons) and then the constant (c) and the
independent variable (gdp, wpi, plr) and click OK.
Cons =f (c, gdp, wpi, plr)
The Estimation Output View of the equation would appear
The fitted values can also be viewed along with the actual values of cons and the
residual plot by clicking View/Actual, Fitted, Residual Table.
(i) Estimation of Regression Models
o Two variables Regression Models.
o Multiple Regression Models.
Obtain the descriptive Statistics for the series
To regress cons on a constant and plr [Cons = f(c, plr)]:
The F-statistic at the bottom right of the table tests the joint significance of the
coefficients (excluding the constant) in the regression. Since there is only one
slope coefficient, the F-statistic is equal to the square of the t-statistic of plr.
The Durbin-Watson statistic (0.859) is well below 2, which points to positive first-order serial correlation in the residuals.
R-squared is the coefficient of determination; it shows the goodness of fit of the model.
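The relation between the F-statistic and the t-statistic in a two-variable regression can be checked numerically. The following is a minimal Python sketch, not EViews code; the function name simple_ols and the demo data are made up for illustration and are not the workshop's cement data.

```python
import math

def simple_ols(y, x):
    """OLS of y on a constant and one regressor x (illustrative helper)."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    ssr = sum(e * e for e in resid)            # residual sum of squares
    sst = sum((yi - my) ** 2 for yi in y)      # total sum of squares
    s2 = ssr / (n - 2)                         # residual variance
    t_slope = slope / math.sqrt(s2 / sxx)      # t-statistic of the slope
    r2 = 1 - ssr / sst                         # coefficient of determination
    f_stat = r2 / ((1 - r2) / (n - 2))         # F-statistic with one slope
    # Durbin-Watson: squared changes in residuals over squared residuals
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / ssr
    return {"slope": slope, "t_slope": t_slope, "r2": r2,
            "f_stat": f_stat, "dw": dw}

# Made-up demo data (hypothetical values)
plr_demo = [14.0, 13.5, 13.0, 12.5, 12.0, 11.5, 11.0, 10.5]
cons_demo = [20.1, 21.3, 21.0, 23.2, 23.9, 24.1, 26.0, 26.4]
result = simple_ols(cons_demo, plr_demo)
print(result["f_stat"], result["t_slope"] ** 2, result["dw"])
```

The two printed numbers for F and t squared agree up to rounding, and the Durbin-Watson value always lies between 0 and 4.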
The Multiple Regression Model:
To generate the first difference, lag series and growth series: Click the Genr
button on the workfile toolbar and type:
wpi1=wpi(-1)
gdpgr=(gdp-gdp(-1))/gdp(-1)
cngdp1=cngdp(-1)
To regress cons on the constant gdp plr wpi [cons = f(c, gdp, plr, wpi)]:
Click Quick /Estimate Equation and specify the equation in the dialog box.
Click OK to get the estimation output
Alternatively, special functions can be used directly in the equation. For instance,
to get the percentage change (growth) in GDP, the @PCH(gdp) can be used
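What the lag, first-difference and growth transformations compute can be sketched outside EViews. This is a small Python illustration under the assumption that a series is a plain list and missing values are represented by None; the helper names lag, diff and pct_change are hypothetical.

```python
def lag(series, k=1):
    """Shift a series back by k periods, like wpi(-1); first k values are missing."""
    return [None] * k + series[:-k]

def diff(series):
    """First difference x(t) - x(t-1); the first value is missing."""
    return [None] + [cur - prev for prev, cur in zip(series, series[1:])]

def pct_change(series):
    """Growth rate (x(t) - x(t-1)) / x(t-1), the same formula as gdpgr above."""
    return [None] + [(cur - prev) / prev for prev, cur in zip(series, series[1:])]

# Made-up demo values (hypothetical)
gdp_demo = [100.0, 110.0, 99.0]
print(lag(gdp_demo))
print(diff(gdp_demo))
print(pct_change(gdp_demo))
```

The growth series loses its first observation, which is why regressions that use lagged or growth variables start one period later than the workfile range.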
Click View/Actual, Fitted, Residual/Graph to view the regression graphically
To test whether two or more variables are jointly significant,
we do the Wald F-test as follows:
Click View/Coefficient Test/Wald coefficient restrictions
The Wald F-statistic will look as shown below. Observing the p-value, we
see that wpi and the growth rate of gdp are jointly significant.
(ii) Multicollinearity
How to deal with the problem of multicollinearity.
*****************************************************************
o Click Quick/Estimate Equation to run the following regression
Cons = f(gdp, cngdp, gdcf, wpi, plr)
In the regression result we see that only cngdp is significant. The R2 is
very high but most of the variables are not individually significant. Furthermore,
the highly significant overall F-statistic together with low individual t-statistics
indicates collinearity. To check this, look at the correlation matrix of all the
variables: highlight all the variables considered in the above
equation in the workfile, Open Group, View/Correlations.
The high correlation coefficient of 0.98 between cngdp and gdp makes the regression
unable to identify the effects of each of these variables separately
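Each entry of the correlation matrix is a pairwise correlation coefficient. The following Python sketch shows the calculation for one pair; the function name pearson and the demo numbers are hypothetical and are not the workshop data.

```python
import math

def pearson(x, y):
    """Sample correlation coefficient between two equally long series."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up demo values: two series that move closely together
gdp_demo = [100.0, 104.0, 109.0, 115.0, 120.0]
cngdp_demo = [5.0, 5.3, 5.5, 5.9, 6.1]
print(pearson(gdp_demo, cngdp_demo))
```

A value close to 1 (or to -1) for a pair of regressors is the warning sign discussed above: the regression cannot separate their individual effects.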
Drop gdp and re-run the regression. The results thus obtained are:
We see that the results improve overall. The Adjusted R2, AIC, Schwarz criterion,
etc. now give better results.
We can further fit this model in a better way after observing the serial
correlation and heteroskedasticity.
(iii) Serial Correlation
o How to deal with issues related to serial correlation
***********************************************************************
Proceeding further with the same result as above
Since the Durbin-Watson statistic is close to 2, it indicates no first-order serial
correlation, but there can still be higher-order serial correlation. To check for it,
we plot the residuals and also run the LM test at various lags.
Observing the F-statistic, we reject the null hypothesis of no serial correlation, so
we have 4th-order serial correlation. To correct the 4th-order serial correlation we
add ar(4).
Observing the F-statistic after adding ar(4), we cannot reject the null of no serial
correlation at the 1% level of significance.
We can also check the residuals to see whether the result has improved.
Serial Correlation with Lagged dependent variable as Regressors
The estimation output of the equation cons=f(c, cngdp, cngdp1, gdp1, plr, wpi,
gdcf) appears as:
In the presence of the lagged dependent variable as one of the regressors, the
Durbin h statistic is used to test for serial correlation.
Create a coefficient result vector. For this, type coef(10) result in the
command window.
To store the h statistics in the first row of the result vector, use the
following command:
coef result(1)=(1-@dw/2)*(@regobs/(1-@regobs*@covariance(7,7)))^.5
The h-statistic, -0.91, is smaller in absolute value than the 5% critical value of the standard normal distribution, 1.96.
Hence, the null hypothesis of no serial correlation is not rejected.
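The command above evaluates h = (1 - d/2) * sqrt(T / (1 - T*V)), where d is the Durbin-Watson statistic, T the number of observations and V the estimated variance of the coefficient on the lagged dependent variable. A minimal Python sketch of the same arithmetic follows; the function name durbin_h and the input numbers are assumptions for illustration, not output from the workshop equation.

```python
import math

def durbin_h(dw, n_obs, var_lag_coef):
    """Durbin h statistic; returns None when the square-root term is not positive."""
    denom = 1 - n_obs * var_lag_coef
    if denom <= 0:
        return None  # h cannot be computed; use an auxiliary regression instead
    return (1 - dw / 2) * math.sqrt(n_obs / denom)

# Hypothetical inputs
print(durbin_h(1.6, 36, 0.01))   # usable case
print(durbin_h(1.8, 40, 0.05))   # T*V > 1, so h is undefined
```

The h-statistic is compared with the standard normal critical values, so in absolute value it has to exceed 1.96 to reject no serial correlation at the 5% level.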
In case the term inside the square root becomes negative, it is not possible to use
the h-statistic to test for serial correlation. Alternatively:
In the command window type genr res=resid
Run the following regression
res c res(-1) cngdp cngdp1 gdp1 plr wpi gdcf
For no serial correlation, the coefficient of res(-1) should not be
significantly different from zero. Observing the p-value, the hypothesis
is not rejected.
(iv) Heteroskedasticity
o How to deal with heteroskedasticity
************************************************************************
Select the variables gdp, cngdp, cons, gdcf, plr, wpi and Open as Group to view
the graph of the selected series.
To include a time trend in the cement consumption function, use the special
function @trend( ). For example, @trend(92.01) is a series with value 0 in
1992:01, value 1 in 1992:02, and so on.
Click Quick/Estimate Equation and specify the following in the dialogue box:
Cons c @trend(92.01) cngdp gdcf wpi plr
This specification estimates the cons function with trend.
The estimation output appears as:
To view the regression graphically, click View/Actual, Fitted, Residual Graph.
The following graph appears:
Observation: Heteroskedasticity can be traced by observing that the residuals
fluctuate more widely in the middle periods.
To carry out White's test without specifying the form of heteroskedasticity,
click View/Residual Test/White Heteroskedasticity (cross term). The following
result appears:
The upper window gives the test statistics under the null hypothesis of
homoskedasticity and the associated p-values. The F-statistic is the Wald version of
the test and Obs*R-squared is the Lagrange multiplier (LM) version of the test.
Observing the p-values, the null hypothesis of homoskedasticity is not rejected.
In the presence of the heteroskedasticity the standard errors from the OLS are
incorrect.
To get consistent standard errors, re-estimate the cons function and
click on Options. Therein click on Heteroskedasticity Consistent
Covariance.
Click OK; the following estimation output appears:
Observing the results, we see that, as expected, the standard errors have changed.
To generate the weighting series for the efficient cons estimation , use the
following command in the command window:
Genr w= 1/ (@trend (92.1)) ^0.5
Re-estimate the cons function and at the same time click Options and mark the
weighted LS/TSLS icon and specify the weight as w.
The estimation Output appears as:
Observe that EViews gives the results for both the weighted and unweighted
statistics. Here we get a better result in the case of WLS.
(v) Granger Causality Test
Select the variables, Open as Group, then click View/Granger Causality and select the
lag as shown below in the Lag Specification.
You get the output as follows:
(vi) Forecasting with a Single-Equation Regression Model
Change the sample as 1992:Q2 to 2001:Q4
Click Quick/Estimate Equation to estimate the following equation:
Demand: DDt = c0 + c1WPIt+ c2 GDPt-1 + c3 CNGDPt + c4 PLRt+ et
Check for the serial correlation, heteroskedasticity to get the efficient estimates.
However multicollinearity is not the problem for forecasting.
Forecast Option
To forecast from the model:
DDt = c0 + c1WPIt+ c2 GDPt-1 + c3 CNGDPt + c4 PLRt+ et
Click on the Forecast button:
Give the sample range for forecast as 2002:Q1 to 2002:Q4
Give the name for the forecast series, say consf1
Click OK. Note that the forecasting method used in this case is the Static
method, as there are no lagged dependent variables on the RHS of the
equation (EViews User's Guide, Chapter 15).
The following table of the Forecast Evaluations would appear:
To plot the forecasted series together with the actual series:
Set the sample range to the forecasting period i.e. 2000:01 to 2002:04
Highlight the two series in the workfile.
Open the two series as Open/Group and View/Graph. The graph shows that the
forecast initially over-predicts but after 2002:02 it under-predicts. That
can also be observed from the bias proportion (0.15) in the box above.
To plot the actual and the forecasted series along with the forecast interval:
Change the sample range to 2001:01 to 2002:04
Generate the upper and lower bounds of the forecasts interval by the
following commands:
Genr up=consf+2*se
Genr low =consf-2*se
Change the sample range by typing smpl 2000:01 2002:04
Highlight cons, consf, up, low, Open Group and View/Graph
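The forecast evaluation measures and the two-standard-error band can be reproduced by hand. The following Python sketch is an illustration only: the function names and the demo numbers are hypothetical, and the bias proportion is computed with the usual Theil decomposition formula, (mean forecast - mean actual)^2 divided by the mean squared error.

```python
import math

def forecast_evaluation(actual, forecast):
    """Return (RMSE, bias proportion) for a forecast against the actual series."""
    n = len(actual)
    mse = sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n
    mean_gap = sum(forecast) / n - sum(actual) / n
    return math.sqrt(mse), mean_gap ** 2 / mse

def forecast_band(forecast, se, width=2.0):
    """Upper and lower bounds, as in up = consf + 2*se and low = consf - 2*se."""
    up = [f + width * s for f, s in zip(forecast, se)]
    low = [f - width * s for f, s in zip(forecast, se)]
    return up, low

# Made-up demo values (hypothetical): the forecast is always one unit too high
cons_demo = [10.0, 12.0, 11.0, 13.0]
consf_demo = [11.0, 13.0, 12.0, 14.0]
se_demo = [0.5, 0.5, 0.5, 0.5]

rmse, bias = forecast_evaluation(cons_demo, consf_demo)
up, low = forecast_band(consf_demo, se_demo)
print(rmse, bias, up, low)
```

In this demo the whole error is systematic, so the bias proportion equals 1; a value near 0 means the forecast mean tracks the actual mean.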
(vii)Simultaneous Equation Estimation
How to estimate a system of equation
How to carry Hausman test
How to do 2SLS
***********************************************************************
Demand: DDt = c0 + c1WPIt+ c2 GDPt-1 + c3 CONTNt + c4 PLRt+ et
Supply: SSt= a0 + a1WPIt+ a2 WPIt-1 + a3 CONTNt+ a4 CAPt+ ut
Equilibrium: DD=SS
This is an example of a simultaneous equation model. Here we take WPI, DD and SS
as endogenous variables, and the rest of the variables are either exogenous
or predetermined. Here, DD = Cement Consumption Demand, SS = Cement Supply.
Further to find the equilibrium price level, the WPI equation can be written as follows:
WPI = f (WPI (-1), GDP, GDPt-1, CONTNt, PLRt, CAPt)
To test for the endogeneity and simultaneity with respect to WPI variable
First regress WPI on WPI(-1), GDP, GDPt-1, CONTNt, PLRt, c
The estimation output appears as:
Click Genr in the workfile window and type res=resid. Alternatively, type genr
res=resid in the command window. This is done to carry out the Hausman Test by
an auxiliary regression.
Next, click Quick/Estimate Equation, to estimate the following equation:
DD=f(WPI, GDPt-1, CONTNt, PLRt, res)
The estimation output appears as:
RES in the regression should not be significantly different from zero under the
null hypothesis that WPI is exogenous. Observe from the p-value that the coefficient
of res is significantly different from zero at the 5% level of significance. It means
WPI is endogenous at the 5% level of significance. If the null hypothesis is rejected,
then the OLS estimates of the Cement Demand equation are biased and inconsistent.
o How to do 2SLS?
Regress WPI on all the exogenous variables in the system. This has been
done above in estimating the first regression
Next obtain the fitted values from this regression. For this type the
following in the command window
GENR WPIHAT=WPI-RES
Estimate the following equation:
DD C WPIHAT GDPt-1 CONTNt PLRt
The results of the estimation are given as:
Compare these estimation results with those obtained when the Hausman test was
carried out by auxiliary regression. Note that the standard errors from this
regression are not correct.
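The two manual stages (regress the endogenous variable on the instruments, form the fitted values as actual minus residual, then use the fitted values in the main equation) can be sketched in Python. This is a simplified illustration with one endogenous regressor and one instrument; the function names and the data are hypothetical and much smaller than the cement system above.

```python
def ols(y, x):
    """Intercept and slope from regressing y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage(y, x, z):
    """2SLS by hand: stage 1 regresses x on z, stage 2 regresses y on fitted x."""
    a, b = ols(x, z)
    res = [xi - (a + b * zi) for xi, zi in zip(x, z)]   # like genr res=resid
    xhat = [xi - ri for xi, ri in zip(x, res)]          # like WPIHAT=WPI-RES
    return ols(y, xhat)

def iv_slope(y, x, z):
    """Direct instrumental-variable slope cov(z, y) / cov(z, x) for comparison."""
    n = len(y)
    mz, mx, my = sum(z) / n, sum(x) / n, sum(y) / n
    czy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    czx = sum((c - mz) * (a - mx) for c, a in zip(z, x))
    return czy / czx

# Made-up demo values (hypothetical)
z_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_demo = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
y_demo = [3.0, 4.1, 5.2, 5.8, 7.1, 8.3]
intercept_2sls, slope_2sls = two_stage(y_demo, x_demo, z_demo)
print(slope_2sls, iv_slope(y_demo, x_demo, z_demo))
```

The two printed slopes coincide, which confirms that the manual second stage gives the 2SLS coefficient. Its OLS standard errors are still wrong, because they are based on the fitted values rather than on the actual endogenous variable.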
o To Obtain the correct standard errors of the 2SLS estimates,
Click Estimate in the equation window.
In the estimation settings, give the method as 2SLS.
Specify all the exogenous variables in the system including the constant in
the instrument list.
In the present case these are
C GDPt-1, CONTNt, PLRt, CAPt
The following window will appear
Click OK and get the following results:
Here we get the correctly calculated standard error as final result. From here we can
do the static recursive forecasting as we have done for the single equation model and
calculate the 95% confidence band etc.
(C)Univariate Methods
(i)Decomposition
o Modelling and Forecasting with the Classical Decomposition(Multiplicative)
Method
********************************************************************
To open the series, Double Click on the “cons”
Next, click “proc”, then “seasonal adjustment”, and “OK”. Then, select “ratio to
moving average-Multiplicative”. Write “cons” for the adjusted series
(deseasonalised series) and then click “ok”.
The output result will look like
Since the sum of the “scaling factors” is 4.0079, adjust it to 4, i.e., equal to the
number of seasons in a year. For this write
scalar c1a = (1.047819/4.0079)*4
scalar c2a = (0.921718/4.0079)*4
scalar c3a = (0.961611/4.0079)*4
scalar c4a = (1.076753/4.0079)*4
in the command window. This will give you the “adjusted scaling factors”, where
c1a = 1.045754, c2a = 0.919901, c3a = 0.959716, c4a = 1.074631
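The rescaling of the four scaling factors can be verified with a few lines of Python. This sketch uses the factors quoted above; the variable names are hypothetical, and it divides by the exact sum of the factors instead of the rounded 4.0079, which changes the results only beyond the sixth decimal place.

```python
# Scaling factors reported by the seasonal adjustment output above
raw_factors = [1.047819, 0.921718, 0.961611, 1.076753]

total = sum(raw_factors)                 # about 4.0079
n_seasons = len(raw_factors)             # 4 quarters

# Rescale so that the factors sum exactly to the number of seasons
adjusted_factors = [f / total * n_seasons for f in raw_factors]

for name, value in zip(["c1a", "c2a", "c3a", "c4a"], adjusted_factors):
    print(name, round(value, 6))
print("sum:", sum(adjusted_factors))
```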
To get the seasonal indexes, first generate a dummy seasonal factor for each
season in the following way:
To Calculate seasonal index, click on “genr”
The seasonalindex series will look like
To deseasonalize the time series by dividing it by the adjusted seasonal indexes,
write “consd = cons/seasonalindex”.
Estimate the trend-cycle regression equation using the deseasonalised data
(consd). Before running the regression, we need to generate a trend variable:
choose “GENR” and type trend=@trend(1992.02) in the dialogue
box. To get a trend value of 1 for quarter 2 of year 1992, generate
another series by writing trend1 = trend + 1 in the dialogue box.
Fitted Trend (= a + bt) can be calculated by writing the following in the dialogue
box “fitted_trend = Consd-resid”
Multiply the fitted trend values by their appropriate seasonal factors to
compute the fitted value, that is, write “Fittedvalue =
fitted_trend*seasonalindex” in the generate window.
Write “residual = cons-fittedvalue” in the generate window to calculate errors
and to measure the accuracy of fit.
To measure the accuracy of fit, i.e. Root Mean Squared Error, double click
“residual”, then click “view” and select “descriptive statistics”, “Histogram
and stats”. RMSE will be standard deviation of the residual series.
To get Adjusted R-square, first get the standard deviation of the “cons” series
by following the same steps given for the residuals. The value of the standard
deviation will be 5.403. Then write
“Scalar RBAR2 = 1-(0.879528/5.402913)^2”
in the command window. Adjusted R-square is equal to 0.97350015, which is
quite high. Of the original variance of the “cons” series = (5.402913)^2, more than
97% has been removed by decomposing it into seasonal and trend components.
The RMSE is 0.879528, which shows this is a very good fit.
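The accuracy calculation can be reproduced in Python. The sketch below follows the handout's approach of using the standard deviation of the residual series as the RMSE and comparing it with the standard deviation of the original series; the function name fit_accuracy and the small demo lists are hypothetical.

```python
import statistics

def fit_accuracy(residual, series):
    """Return (std. dev. of residual, 1 - (s_residual / s_series)^2)."""
    s_res = statistics.stdev(residual)   # sample standard deviation
    s_y = statistics.stdev(series)
    return s_res, 1 - (s_res / s_y) ** 2

# Numbers quoted in the handout
rbar2_handout = 1 - (0.879528 / 5.402913) ** 2
print(round(rbar2_handout, 4))

# Made-up demo series: residuals half as variable as the series itself
s_res_demo, rbar2_demo = fit_accuracy([1.0, -1.0, 1.0, -1.0],
                                      [2.0, -2.0, 2.0, -2.0])
print(s_res_demo, rbar2_demo)
```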
Tests of Stationarity and Cointegration
o How to carry unit root tests
o How to check for Cointegration using Johansen Methods Procedures
************************************************************************
To carry out the ADF(Augmented Dickey fuller) and PP (Phillips Perron) test
consider three different regression equations:
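$$\Delta y_t = a_0 + \gamma y_{t-1} + a_2 t + \sum_{i=2}^{p} \beta_i \Delta y_{t-i+1} + \varepsilon_t \qquad (1)$$
$$\Delta y_t = a_0 + \gamma y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta y_{t-i+1} + \varepsilon_t \qquad (2)$$
$$\Delta y_t = \gamma y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta y_{t-i+1} + \varepsilon_t \qquad (3)$$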
For sample size of 100 the complete set of test statistics are as follows:
Summary of the Dickey-Fuller Tests
Model     | Hypothesis        | Test Statistic | Critical Values at 95% and 99% confidence levels
Model (1) | γ = 0             | τ_τ            | -3.45 and -4.04
Model (1) | γ = a_2 = 0       | φ_3            | 6.49 and 8.73
Model (1) | a_0 = γ = a_2 = 0 | φ_2            | 4.88 and 6.50
Model (2) | γ = 0             | τ_μ            | -2.89 and -3.51
Model (2) | a_0 = γ = 0       | φ_1            | 4.71 and 6.70
Model (3) | γ = 0             | τ              | -1.95 and -2.60
To carry out the unit root tests (Augmented Dickey-Fuller Tests) for cement
consumption demand cons, double click on cons and choose View/Unit Root
Test. We will start from the most general model, i.e. Model (1), and thus include a
constant and trend in the ADF test equation with the optimally chosen lag values:
The ADF test statistic reported at the top of the window is the t-statistic of cons(-1)
in the test regression. Under the null of a unit root this t-statistic does not
have a normal distribution, so it is compared with simulated critical values. The unit
root test is a one-tailed test with the null hypothesis of a unit root against the
alternative of a stationary process (i.e. a root less than unity). We see that for cons,
we cannot reject the null hypothesis of a unit root even at the 10% level of significance.
You have to carry out the same exercise for Model (2) (i.e. with intercept but without
trend) and Model (3) (i.e. without intercept and without trend) until you reject the null of
a unit root. If you are not able to reject the null even in Model (3), then you have a
non-stationary series; but if you reject the null of a unit root in Model (1), then you
stop there and declare the series stationary, and so on.
To test for the joint significance of unit root and the trend, we carry out a random
walk test. This test is more stringent than unit root test.
Run the test equation by Quick/Estimate Equation and specify the test
equation
The result which is similar to ADF test equation is
To test for the joint significance of @trend (1992:01) and cons (-1), Click
View/Coefficient Tests/Redundant Variables and type in the variables under
the test in the dialog box.
The result is
The F-statistic (φ_3) under the null of a random walk does not follow the
standard F-distribution, and the p-value reported is not applicable. From the
table above, the 5% critical value is 6.49 and the
estimated F-statistic is 4.91, so we cannot reject the null hypothesis of the
presence of a stochastic trend (unit root) and thus conclude that the series is
non-stationary.
Similarly, to test sequentially (ADF) we proceed to carry out the remaining tests (τ_μ, φ_1, τ):
Double click on cons and View/Unit Root Test. Choose the option intercept
(i.e. Model (2)). The default lag length chosen by EViews for this series is 4.
The following result can thus be obtained:
We see that the null of a unit root in the absence of a deterministic trend cannot be
rejected even at the 10% level; for reference, the 5% and 1% critical values from the
table above are -2.89 and -3.51 (for a sample size of 100).
And so on…….!!!
Similarly you can do the Phillips-Perron (PP) test, taking the same critical values as in the ADF case.
Some more unit root test options available in EViews are: KPSS (Kwiatkowski-
Phillips-Schmidt-Shin), DF-GLS, ERS Point-Optimal (Elliott-Rothenberg-Stock),
etc.
Repeat the same unit root test process for the other variables considered, such as gdp,
plr, wpi. If all the variables considered are non-stationary in levels, then we expect
that the series will be stationary in first differences, so the order of integration will be one,
and hence we can carry out the cointegration test to get the long-run relationship
among all the variables. If some of the series are stationary in levels, then those
variables will be introduced as exogenous variables.
o VAR Model:
Highlight the variables cons, gdp, plr, wpi Open/as VAR the dialog box will
appear as follows:
To choose the lag interval for endogenous variables we take the maximum lag and
observe the values for AIC and Schwarz SC. The lag values with minimum AIC
and Schwarz SC give the optimal lag interval. However we can select the lag
interval as 1-4 2-4 3-6 depending on your requirements.
After selecting the lag values and correctly specifying the variables Click OK ,
results will appear as follows:
Diagnostics Views:
Once you estimated the VAR equation, a set of Diagnostics views are provided
under the menu View/Lag Structure and View/Residual Tests in the VAR window
In the VAR results window Click on View/Lag Structure/AR roots table,
following result appears:
The AR roots table reports the inverse roots of the characteristic AR polynomial. The estimated
VAR is stable (stationary) if all roots have modulus less than one and lie inside the
unit circle. In our case one root is outside the unit circle, so the VAR is not stable. If the
VAR is not stable, certain results (such as impulse response standard errors) are not
valid. There will be kp roots, where k is the number of endogenous variables and p
is the largest lag. If you estimate a VEC with r cointegrating relations, k-r roots
should be equal to unity.
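The stability condition can be illustrated in the simplest possible case, a single-equation AR(2) process y(t) = phi1*y(t-1) + phi2*y(t-2) + e(t), whose inverse roots solve z^2 - phi1*z - phi2 = 0. The Python sketch below is a simplification of what the AR roots table reports for a VAR (which needs the eigenvalues of a companion matrix); the function names and coefficient values are hypothetical.

```python
import cmath

def ar2_inverse_roots(phi1, phi2):
    """Inverse roots of the AR(2) characteristic polynomial (may be complex)."""
    disc = cmath.sqrt(phi1 * phi1 + 4 * phi2)
    return [(phi1 + disc) / 2, (phi1 - disc) / 2]

def is_stable(roots):
    """Stable when every inverse root lies strictly inside the unit circle."""
    return all(abs(r) < 1 for r in roots)

# Hypothetical coefficient pairs
for phi1, phi2 in [(0.5, 0.3), (1.0, 0.2), (0.5, -0.5)]:
    roots = ar2_inverse_roots(phi1, phi2)
    print(phi1, phi2, [round(abs(r), 3) for r in roots], is_stable(roots))
```

The second pair has one root with modulus above one, which is the situation described above where the model is not stable.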
To carry out the pairwise Granger Causality Test, which tests whether endogenous variables
can be treated as exogenous, look at the Wald F-statistics in the result window for
each equation.
Alternatively, one can do this test by highlighting the variables cons, gdp, plr,
wpi, Right Click/Open as Group/Granger Causality, specifying the same lag value as
in the VAR, and observing from the result whether an endogenous variable can be treated as
exogenous.
To check for serial correlation, normality and heteroskedasticity click View/Residual
Tests, and observe from the results whether these problems are present. If serial
correlation or heteroskedasticity is present, then remove it by re-checking the lag
values and estimating by GLS/WLS respectively.
The representation of the VAR results can be done by View/Representations,
following result appears:
Impulse Response Graph:
Click on Impulse; the following dialog box appears. Select the required options.
Click on the impulse definitions and choose the appropriate options in the
dialog box shown below:
Generalized Impulses, as described by Pesaran and Shin (1998), construct a set of
innovations that does not depend on the VAR ordering. The Cholesky options, by
contrast, give a set of innovations that depends on the VAR ordering. (See
page no. 730-731, EViews 5 User's Guide, for an explanation of the dof adjustment
and no adjustment options.)
While impulse response functions trace the effects of a shock to one endogenous
variable on to the other variables in the VAR, variance decomposition separates the
variation in an endogenous variable into the component shocks to the VAR. Thus, the
variance decomposition provides information about the relative importance of each
random innovation in affecting the variables in the VAR.
To obtain the variance decomposition, select View/Variance Decomposition... from
the VAR object toolbar. You should provide the same information as for impulse
responses above. Here also we get Generalized and Orthogonalized Variance
Decomposition. Cholesky gives the orthogonalised variance decomposition, where the
ordering of the variables is important.
o Cointegration Test:
To carry out the cointegration test, highlight the variables cons, gdp, plr, wpi, then
Open as Group/View/Cointegration test. The following dialog box appears;
click on the Summary option to get all the results together.
We get one cointegrating vector for the option “No Intercept No Trend” and two
cointegrating vectors for “Intercept No Trend”. To check whether the summary
table gives the correct result, confirm it from the individual
cointegration test by clicking on a single option, such as “No
Intercept No Trend”. The results appear as follows:
From the different options of Johansen Cointegration Test we can select our
appropriate cointegrating vector on the basis of expected sign. Suppose we select
option one then we can carry similar exercise for cointegration as in VAR.
(ii) ARIMA
o How to plot autocorrelation functions and to determine the presence of a unit root.
o How to determine the order of the ARIMA models using sample autocorrelation and
partial autocorrelation functions
o How to estimate ARIMA models and to use them for forecasting.
***********************************************************************
Set the sample size to 1992:02 to 2002:04
To plot graph highlight cons right click -View/Line Graph
Click View/Correlogram to see the correlogram in levels. (The correlogram is shown
only up to 20 lags.) We see the correlogram of cons:
To plot and compute the autocorrelations of the first difference of cons, generate the
first difference of the series by clicking on genr and name it dcons (dcons = d(cons)). Open dcons
and click View/Line Graph. We see that the mean of the series appears to be constant, although the
variance is unusually high during 1997-99.
To view the correlogram in first difference, open cons, click View/Correlogram and select
the option first difference.
Check the size of the ACF and PACF at various lag lengths. We see that the sample
autocorrelation function is much smaller in magnitude, and at lags 1, 3, 5, 7 it shows a
pattern of changing signs. It declines slowly and is loosely consistent with a stationary
series.
Similarly, plot the autocorrelations of the second difference. The results do not
appear to be qualitatively different from those for d(cons), thus indicating over-
differencing and suggesting that the order of integration is d=1.
Observe the autocorrelation function for d(cons). We see that it begins decaying
after k=1 (value of -0.752), thus exhibiting moving average properties that are of second or third
order or more, as the ACF gives significantly different autocorrelations. On the
other hand, the partial autocorrelations remain significantly different from zero for only the first few
lag (k) values. Hence few autoregressive terms would be sufficient, but more
moving average terms are required. We can start with ARIMA (2, 1, 2), ARIMA (3, 1, 2)
and so on.
To fit ARIMA model
Change the sample size to 1992:02 to 2001:04
Click Quick/Estimate Equation and specify the model (ARIMA(2, 1, 2)) with
each term separately in the dialog box as
d(cons) c ar(1) ar(2) ma(1) ma(2)
The result is:
For the diagnostic check, view the model fit by View/Actual, Fitted,
Residual/Graph
Check the residual autocorrelation function by View/Residual Tests/Correlogram-
Q-Statistics. For a correctly specified model the residuals should be white noise.
This implies that the autocorrelations and partial autocorrelations should be all
zero. To check this Eviews gives the Ljung-Box Q-Statistics which follows a Chi
Square Distribution, given by
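$$Q = T(T+2) \sum_{k=1}^{p} \frac{r_k^2}{T-k}$$
where T is the number of observations, r_k is the k-th residual autocorrelation and p is the number of lags.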
The Q-Statistic for the null hypothesis that there is no serial correlation up to
order 20 is 48.82 with a P-value of zero indicating serial correlation in error terms
and misspecification.
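The Ljung-Box Q-statistic, T(T+2) times the sum over lags k of the squared autocorrelation r_k divided by (T-k), can be computed directly from a residual series. The following Python sketch is an illustration only; the function names acf and ljung_box are hypothetical, and the demo series is an artificial alternating sequence, not the ARIMA residuals.

```python
def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series x."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return [sum((x[t] - m) * (x[t - k] - m) for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def ljung_box(x, max_lag):
    """Ljung-Box Q-statistic using lags 1 .. max_lag."""
    n = len(x)
    r = acf(x, max_lag)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k)
                             for k in range(1, max_lag + 1))

# Artificial demo series with strong negative first-order autocorrelation
resid_demo = [1.0, -1.0] * 5
print(acf(resid_demo, 2))
print(ljung_box(resid_demo, 1))
```

Under the null of no serial correlation Q is compared with a chi-square distribution, so a large Q with a small p-value, as in the ARIMA (2, 1, 2) case above, signals remaining autocorrelation.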
Try out ARIMA models with different lag length specifications like (2, 1, 4), (2, 1, 6),
(3, 1, 2), (3, 1, 4), (3, 1, 6), (4, 1, 4), etc. After some experimentation we opt for the
ARIMA (4, 1, 4) model to generate ex post forecasts over various horizons. To estimate this
model, click Quick/Estimate Equation and type:
d(cons) c ar(1) ar(2) ar(3) ar(4) ma(1) ma(2) ma(3) ma(4)
Note that the 4th-order AR and MA terms are significant.
For the diagnostic check, view the model fit by View/Actual, Fitted, and
Residual/Graph.
Check the residual autocorrelation function by View/Residual
Tests/Correlogram-Q-Statistics. The Q-Statistic for up to 20 lags is
approximately 13.55, which is smaller than that from the ARIMA (2, 1, 2) model.
As the figure shows, none of the autocorrelations and partial autocorrelations is
individually significant (except lag 7), nor is the sum of the 20 autocorrelations, as
shown by the Q-Statistic. In other words, the correlograms of both autocorrelations
and partial autocorrelation give the impression that the residuals are purely random.
Hence there is no need to look for any other ARIMA model, so we use the ARIMA
(4, 1, 4) model for forecasting purposes.
To generate ex post forecasts over horizons using ARIMA(4, 1, 4)
To obtain up to 4th-quarter-ahead forecasts:
Click Forecast in the ARIMA(4, 1, 4) window
Give the forecasts sample range 2002:01 to 2002:04
The forecast obtained (call it consf) is for cons and not for d(cons), and the
forecasting method used is DYNAMIC: it is a multi-step forecast in which previously
forecasted values of the dependent variable are used to compute the forecasts for the
quarters after the first.
The bias proportion 0.04 indicates that the forecasts consistently track the actual
series. This can be seen graphically by plotting cons and consf. Change the
sample size 2001:01 to 2002:04, highlight cons and consf Open as Group and
View/Graph line
The plot shows that the model over predicts in the 1st two quarters but under
predicts in the last two quarters for the forecasting range
Similarly we can obtain the 2nd, 3rd and 4th quarter-ahead forecasts and get
the accuracy measures for each period to compare across the models.
(ii) Binary/Dummy Variable
o How to analyze the result of a variable having qualitative in nature such as gender,
educational status etc. (ANOVA)
o How to analyze the qualitative and quantitative variables together.( ANCOVA)
***********************************************************************
We have considered the following variables:
Variable     | Dummy = 1                           | 0 otherwise
gender       | If male                             | If female
Race_White   | If White                            | Otherwise
Race_Black   | If Black                            | Otherwise
Race_Asian   | If Asian                            | Otherwise
Experience   | If Experience>=6                    | Otherwise
Experience_1 | If Experience>=5 and Experience<=7  | Otherwise
Experience_2 | If Experience>=8                    | Otherwise
Age          | If Age>=25                          | Otherwise
Age_1        | If Age>=25 and Age<=30              | Otherwise
The objective of the study is to analyze relationship between earnings and
characteristics of employees. Although we can have many more characteristics but
here we use only years of experience, gender, race, age and the age squares. The
basic model is
Earnings = β1 + β2*gdum + β3*race + β4*age + β5*experience + εt
To see whether earnings depend on experience, highlight earning and experience
and Open as Group. Quick/Estimate Equation; the following result appears:
Results show that the coefficient of experience is significantly different from zero.
Hence earnings do depend on experience.
Similarly, we can see whether earnings depend on gender, race, or both. Here we take female
as the control group; after including the race dummies the control group will be black
female, and so on. Highlight earning and gdum and Open as Equation; the following
result appears. Observing the result, we see that male earnings are not significantly
different from female earnings.
In the first box above we see that male earnings are not significantly different from
those of their female counterparts. Similarly, even after including the race dummies, the
individual coefficients are not significantly different from the earnings of black females
in the second result box.
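How a dummy coefficient is read against the control group can be shown with a small Python sketch: in a regression of earnings on a constant and a single 0/1 dummy, the intercept equals the mean of the control group and the dummy coefficient equals the difference in group means. The function name ols and the earnings figures below are made up for illustration and are not the workshop data.

```python
def ols(y, x):
    """Intercept and slope from regressing y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Made-up demo data: gdum = 1 for male, 0 for female (control group)
gdum_demo = [1, 1, 1, 0, 0, 0, 0]
earnings_demo = [12.0, 15.0, 13.5, 11.0, 14.0, 12.5, 13.5]

intercept, gender_effect = ols(earnings_demo, gdum_demo)

female = [e for e, d in zip(earnings_demo, gdum_demo) if d == 0]
male = [e for e, d in zip(earnings_demo, gdum_demo) if d == 1]
print(intercept, sum(female) / len(female))
print(gender_effect, sum(male) / len(male) - sum(female) / len(female))
```

Whether the difference is statistically significant is judged from the t-statistic of the dummy coefficient in the estimation output, as described above.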