krolzig markov-switching vector autoregressions_ modelling, statistical inference, and application...

Upload: sephard

Post on 06-Jul-2018


  • 8/18/2019 Krolzig Markov-Switching Vector Autoregressions_ Modelling, Statistical Inference, And Application to Business Cycle A


    Hans–Martin Krolzig

    Markov–Switching

    Vector Autoregressions

    Modelling, Statistical Inference, and Application to Business Cycle Analysis


    To my parents, Grete and Walter 


    Preface 

    This book contributes to recent developments on the statistical analysis of multiple time series in the presence of regime shifts. Markov-switching models have become popular for modelling non-linearities and regime shifts, mainly in univariate economic time series. This study is intended to provide a systematic and operational approach to the econometric modelling of dynamic systems subject to shifts in regime, based on the Markov-switching vector autoregressive model. The study presents a comprehensive analysis of the theoretical properties of Markov-switching vector autoregressive processes and the related statistical methods. The statistical concepts are illustrated with applications to empirical business cycle research.

    This monograph is a revised version of my dissertation, which was accepted by the Economics Department of the Humboldt-University of Berlin in 1996. It consists mainly of unpublished material which has been presented in recent years at conferences and in seminars. The major parts of this study were written while I was supported by the Deutsche Forschungsgemeinschaft (DFG), Berliner Graduiertenkolleg Angewandte Mikroökonomik and Sonderforschungsbereich 373 at the Free University and Humboldt-University of Berlin. Work was finally completed in the project The Econometrics of Macroeconomic Forecasting funded by the Economic and Social Research Council (ESRC) at the Institute of Economics and Statistics, University of Oxford. It is a pleasure to record my thanks to these institutions for their support of my research embodied in this study.

    The author is indebted to numerous individuals for help in the preparation of this study. Primarily, I owe a great debt to Helmut Lütkepohl, who inspired my interest in multiple time series econometrics, suggested the subject, and advised and encouraged my research. The many hours Helmut Lütkepohl and Jürgen Wolters spent in discussing the issues of this study have been an immeasurable help.

    The results obtained and their presentation have been profoundly affected by the inspiration of and interaction with numerous colleagues in Berlin and Oxford. Of the many researchers from whom I have benefited by discussing with them various aspects of the work presented here, I would like especially to thank Ralph Friedmann, David Hendry and D.S. Poskitt.

    I wish to express my sincere appreciation of the helpful discussions, suggestions and comments of the audiences at the 7th World Congress of the Econometric Society, the SEDC 1996 Annual Meeting, the ESEM96, the American Wintermeeting of the Econometric Society 1997, the 11th Annual Congress of the European Economic Association, the Workshop Zeitreihenanalyse und stochastische Prozesse and the Pfingsttreffen 1996 of the Deutsche Statistische Gesellschaft, the Jahrestagungen 1995 and 1996 of the Verein für Socialpolitik, and in seminars at the Free University Berlin, the Humboldt-University of Berlin, University College London and Nuffield College, Oxford.

    Many people have helped with the reading of the manuscript. Special thanks go to Paul Houseman, Marianne Sensier, Dirk Soyka and Don Indra Asoka Wijewickrama; they pointed out numerous errors and provided helpful suggestions.

    I am very grateful to all of them, but they are, of course, absolved from any responsibility for the views expressed in the book. Any errors that may remain are my own.

    Finally, I am greatly indebted to my parents and friends for their support and encouragement while I was struggling with the writing of the thesis.

    Hans-Martin Krolzig

    Oxford, March 1997


    Contents

    Prologue 1

    1 The Markov–Switching Vector Autoregressive Model 6

    1.1 General Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

    1.2 Markov-Switching Vector Autoregressions . . . . . . . . . . . . . . . . . . . . 10

    1.2.1 The Vector Autoregression . . . . . . . . . . . . . . . . . . . . . . . . . . 10

    1.2.2 Particular MS–VAR Processes . . . . . . . . . . . . . . . . . . . . . . . 13

    1.2.3 The Regime Shift Function . . . . . . . . . . . . . . . . . . . . . . . . . . 14

    1.2.4 The Hidden Markov Chain . . . . . . . . . . . . . . . . . . . . . . . . . . 16

    1.3 The Data Generating Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

    1.4 Features of MS-VAR Processes and Their Relation to Other Non-linear Models . . . . . . . . . . . . . . . . . . . . 20

    1.4.1 Non-Normality of the Distribution of the Observed Time

    Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

    1.4.2 Regime-dependent Variances and Conditional Heteroskedasticity . . . . . . . . . . . . . . . . . . . . 23

    1.4.3 Regime-dependent Autoregressive Parameters: ARCH and Stochastic Unit Roots . . . . . . . . . . . . . . . . . . . . 24

    1.5 Conclusion and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

    1.A Appendix: A Note on the Relation of SETAR to MS-AR Processes 27

    2 The State-Space Representation 29

    2.1 A Dynamic Linear State-Space Representation for MS-VAR Processes . . . . . . . . . . . . . . . . . . . . 30

    2.1.1 The Gaussian Measurement Equation . . . . . . . . . . . . . . . . . 33

    2.1.2 The Non–Normal VAR(1)–Representation of the Hidden

    Markov Chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    2.1.3 Linearity of the State-Space Representation . . . . . . . . . . . . 34


    2.1.4 Markov Property of the State-Space Representation . . . . . . 35

    2.2 Specification of the State–Space Representation . . . . . . . . . . . . . . . . 38

    2.3 An Unrestricted State-Space Representation . . . . . . . . . . . . . . . . . . . 41

    2.4 Prediction-Error Decomposition and the Innovation State-Space Form . . . . . . . . . . . . . . . . . . . . 42

    2.5 The MS-VAR Model and Time–Varying Coefficient Models . . . . . 45

    3 VARMA-Representation of MSI-VAR and MSM-VAR Processes 49

    3.1 Linearly Transformed Finite Order VAR Representations . . . . . . . . 50

    3.2 VARMA Representation Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . 55

    3.2.1 VARMA Representation of Linearly Transformed Finite

    Order VAR Representations . . . . . . . . . . . . . . . . . . . . . . . . . 55

    3.2.2 ARMA Representation of a Hidden Markov Chain . . . . . . 56

    3.2.3 VARMA Representations of MSI(M )–VAR(0) Processes . 56

    3.2.4 VARMA Representations of MSI(M )–VAR( p) Processes . 57

    3.2.5 VARMA Representations of MSM(M )–VAR( p) Processes 58

    3.3 The Autocovariance Function of MSI–VAR and MSM-VAR Processes . . . . . . . . . . . . . . . . . . . . 59

    3.3.1 The ACF of the Regime Generating Process . . . . . . . . . . . . 60

    3.3.2 The ACF of a Hidden Markov Chain Process . . . . . . . . . . . 61

    3.3.3 The ACF of MSM–VAR Processes . . . . . . . . . . . . . . . . . . . 62

    3.3.4 The ACF of MSI-VAR Processes . . . . . . . . . . . . . . . . . . . . . 64

    3.4 Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

    4 Forecasting MS–VAR Processes 67

    4.1 MSPE-Optimal Predictors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

    4.2 Forecasting MSM–VAR Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

    4.3 Forecasting MSI–VAR Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

    4.4 Forecasting MSA–VAR Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

    4.5 Summary and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

    5 The BLHK Filter 79

    5.1 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

    5.2 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

    5.A Supplements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

    5.A.1 Conditional Moments of Regime . . . . . . . . . . . . . . . . . . . . . 89

    5.A.2 A Technical Remark on Hidden Markov-Chains: The

    MSI/MSIH(M )-VAR(0) Model . . . . . . . . . . . . . . . . . . . . . . 90


    6 Maximum Likelihood Estimation 91

    6.1 The Likelihood Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

    6.2 The Identification Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

    6.3 Normal Equations of the ML Estimator . . . . . . . . . . . . . . . . . . . . . . . 97

    6.3.1 Derivatives with Respect to the VAR Parameters . . . . . . . . 98

    6.3.2 Derivatives with Respect to the Hidden Markov-Chain Parameters . . . . . . . . . . . . . . . . . . . . 99

    6.3.3 Initial State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

    6.4 The EM Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

    6.4.1 Estimation of  γ   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

    6.4.2 Estimation of  σ  under Homoskedasticity . . . . . . . . . . . . . . . 109

    6.4.3 Estimation of  σ  under Heteroskedasticity . . . . . . . . . . . . . . 110

    6.4.4 Convergence Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

    6.5 Extensions and Alternatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

    6.5.1 The Scoring Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

    6.5.2 An Adaptive EM Algorithm (Recursive Maximum Likelihood Estimation) . . . . . . . . . . . . . . . . . . . . 115

    6.5.3 Incorporating Bayesian Priors . . . . . . . . . . . . . . . . . . . . . . . . 117

    6.5.4 Extension to General State-Space Models with Markovian

    Regime Shifts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

    6.6 Asymptotic Properties of the Maximum Likelihood Estimator . . . . 120

    6.6.1 Asymptotic Normal Distribution of the ML Estimator . . . . 120

    6.6.2 Estimation of the Asymptotic Variance–Covariance Matrix 122

    6.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

    7 Model Selection and Model Checking 125

    7.1 A Bottom-up Strategy for the Specification of MS–VAR Models . . 126

    7.2 ARMA Representation Based Model Selection . . . . . . . . . . . . . . . . 132

    7.3 Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

    7.3.1 Residual Based Model Checking . . . . . . . . . . . . . . . . . . . . . 135

    7.3.2 The Coefficient of Determination . . . . . . . . . . . . . . . . . . . . . 137

    7.4 Specification Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

    7.4.1 Likelihood Ratio Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

    7.4.2 Lagrange Multiplier Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

    7.4.3 Wald Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

    7.4.4 Newey-Tauchen-White Test for Dynamic Misspecification 142

    7.5 Determination of the Number of Regimes . . . . . . . . . . . . . . . . . . . . . 144


    7.6 Some Critical Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

    8 Multi-Move Gibbs Sampling 148

    8.1 Bayesian Analysis via the Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . 150

    8.2 Bayesian Analysis of Linear Markov-Switching Regression Models 152

    8.3 Multi–Move Gibbs Sampling of Regimes . . . . . . . . . . . . . . . . . . . . . 155

    8.3.1 Filtering and Smoothing Step . . . . . . . . . . . . . . . . . . . . . . . . 156

    8.3.2 Stationary Probability Distribution and Initial Regimes . . . 157

    8.4 Parameter Estimation via Gibbs Sampling . . . . . . . . . . . . . . . . . . . . 158

    8.4.1 Hidden Markov Chain Step . . . . . . . . . . . . . . . . . . . . . . . . . 158

    8.4.2 Inverted Wishart Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

    8.4.3 Regression Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

    8.5 Forecasting via Gibbs Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

    8.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

    9 Comparative Analysis of Parameter Estimation in Particular MS-VAR Models 170

    9.1 Analysis of Regimes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

    9.2 Comparison of the Gibbs Sampler with the EM Algorithm . . . . . . . 174

    9.3 Estimation of VAR Parameters for Given Regimes. . . . . . . . . . . . . . 175

    9.3.1 The Set of Regression Equations . . . . . . . . . . . . . . . . . . . . . 175

    9.3.2 Maximization Step of the EM Algorithm . . . . . . . . . . . . . . . 177

    9.3.3 Regression Step of the Gibbs Sampler . . . . . . . . . . . . . . . . . 180

    9.3.4 MSI Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

    9.3.5 MSM Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

    9.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

    9.A Appendix: Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

    10 Extensions of the Basic MS-VAR Model 202

    10.1 Systems with Exogenous Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 202

    10.2 Distributed Lags in the Regime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

    10.2.1 The MSI(M, q )-VAR( p) Model . . . . . . . . . . . . . . . . . . . . . . 205

    10.2.2 VARMA Representations of MSI(M, q )–VAR( p) Processes 206

    10.2.3 Filtering and Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

    10.3 The Endogenous Markov-Switching Vector Autoregressive Model 208

    10.3.1 Models with Time-Varying Transition Probabilities . . . . . . 208

    10.3.2 Endogenous Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

    10.3.3 Filtering and Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212


    10.3.4 A Modified EM Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 213

    10.4 Summary and Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214

    11 Markov-Switching Models of the German Business Cycle 215

    11.1 MS-AR Processes as Stochastic Business Cycle Models . . . . . . . . . 218

    11.2 Preliminary Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

    11.2.1 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

    11.2.2 Traditional Turning Point Dating . . . . . . . . . . . . . . . . . . . . . 221

    11.2.3 ARMA Representation Based Model Pre-Selection . . . . . . 222

    11.3 The Hamilton Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

    11.3.1 Estimation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

    11.3.2 Contribution to the Business Cycle Characterization . . . . . 226

    11.3.3 Impulse Response Analysis. . . . . . . . . . . . . . . . . . . . . . . . . . 229

    11.3.4 Asymmetries of the Business Cycle . . . . . . . . . . . . . . . . . . . 230

    11.3.5 Kernel Density Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . 231

    11.4 Models with Markov-Switching Intercepts . . . . . . . . . . . . . . . . . . . . 233

    11.5 Regime-Dependent and Conditional Heteroskedasticity . . . . . . . . . 237

    11.6 Markov-Switching Models with Multiple Regimes . . . . . . . . . . . . . 243

    11.6.1 Outliers in a Three-Regime Model . . . . . . . . . . . . . . . . . . . . 243

    11.6.2 Outliers and the Business Cycle . . . . . . . . . . . . . . . . . . . . . . 245

    11.6.3 A Hidden Markov-Chain Model of the Business Cycle . . . 246

    11.6.4 A Highly Parameterized Model . . . . . . . . . . . . . . . . . . . . . . 248

    11.6.5 Some Remarks on Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . 250

    11.7 MS-AR Models with Regime-Dependent Autoregressive Parameters 250

    11.8 An MSMH(3)-AR(4) Business Cycle Model . . . . . . . . . . . . . . . . . . 253

    11.9 Forecasting Performance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255

    11.10 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258

    11.A Appendix: Business Cycle Analysis with the Hodrick-Prescott Filter 260

    12 Markov–Switching Models of Global and International Business

    Cycles 262

    12.1 Univariate Markov-Switching Models . . . . . . . . . . . . . . . . . . . . . . . . 263

    12.1.1 USA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266

    12.1.2 Canada . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

    12.1.3 United Kingdom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268

    12.1.4 Germany . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

    12.1.5 Japan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270

    12.1.6 Australia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276


    12.1.7 Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

    12.2 Multi-Country Growth Models with Markov-Switching Regimes . . 282

    12.2.1 Common Regime Shifts in the Joint Stochastic Process of 

    Economic Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282

    12.2.2 Structural Breaks and the End of the Golden Age . . . . . . . . 283

    12.2.3 Global Business Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

    12.2.4 Rapid Growth Episodes and Recessions . . . . . . . . . . . . . . . 289

    12.3 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293

    12.A Appendix: Estimated MS-DVAR Models . . . . . . . . . . . . . . . . . . . . . 295

    13 Cointegration Analysis of VAR Models with Markovian Shifts in Regime 302

    13.1 Cointegrated VAR Processes with Markov-Switching Regimes . . . 303

    13.1.1 Cointegration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303

    13.1.2 The MSCI-VAR Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

    13.1.3 A State-Space Representation for MSCI-VAR Processes . . 307

    13.2 A Cointegrated VARMA Representation for MSCI-VAR Processes 311

    13.3 A Two-Stage Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314

    13.3.1 Cointegration Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315

    13.3.2 EM Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317

    13.4 Global and International Business Cycles . . . . . . . . . . . . . . . . . . . . . 317

    13.4.1 VAR Order Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

    13.4.2 Cointegration Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

    13.4.3 Granger Causality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322

    13.4.4 Forecast Error Decomposition . . . . . . . . . . . . . . . . . . . . . . . 324

    13.5 Global Business Cycles in a Cointegrated System . . . . . . . . . . . . . . 325

    13.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

    13.A Appendix: Estimated CI-VAR and MSCI-VAR Models . . . . . . . . . . 331

    Epilogue 335

    References 337

    Tables 353

    Figures 357

    List of Notation 359


    Prologue 

    Objective of the Study

    In the last decade time series econometrics has changed dramatically. One increasingly prominent field has become the treatment of regime shifts and non-linear modelling strategies. While the importance of regime shifts, particularly in macroeconometric systems, seems to be generally accepted, there is no established theory suggesting a unique approach for specifying econometric models that embed changes in regime.

    Structural changes such as the oil price shocks, the introduction of the European Monetary System, the German reunification, the European Monetary Union and Eastern European economies in transition are often incorporated into a dynamic system in a deterministic fashion. A time-varying process poses problems for estimation and forecasting when a shift in parameters occurs. The degradation of performance of structural macroeconomic models seems at least partly due to regime shifts. Increasingly, regime shifts are not considered as singular deterministic events, but the unobservable regime is assumed to be governed by an exogenous stochastic process. Thus regime shifts of the past are expected to occur in the future in a similar fashion.

    The main aim of this study is to construct a general econometric framework for the statistical analysis of multiple time series when the mechanism which generated the data is subject to regime shifts. We build up a stationary model where a stable vector autoregression is defined conditional on the regime and where the regime generating process is given by an irreducible ergodic Markov chain.
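    To make the two building blocks concrete, the following is a minimal sketch, not the book's specification: a univariate autoregression of order one whose intercept depends on a two-state Markov chain. The function name `simulate_msi_ar1`, the parameter values and the transition matrix `P` are illustrative assumptions; the general model in this study is vector-valued and allows other parameters to switch as well.

```python
import random

def simulate_msi_ar1(n, intercepts, phi, sigma, P, seed=0):
    """Simulate y_t = nu(s_t) + phi * y_{t-1} + sigma * e_t,
    where s_t follows a Markov chain with transition matrix P
    (P[i][j] = Pr(s_t = j | s_{t-1} = i)) and e_t is standard normal."""
    rng = random.Random(seed)
    s = 0      # initial regime: an arbitrary choice for this sketch
    y = 0.0    # initial observation: also arbitrary
    regimes, series = [], []
    for _ in range(n):
        # Draw the next regime from row s of the transition matrix.
        s = rng.choices(range(len(P)), weights=P[s])[0]
        # Conditional on the regime, y follows a stable AR(1) since |phi| < 1.
        y = intercepts[s] + phi * y + sigma * rng.gauss(0.0, 1.0)
        regimes.append(s)
        series.append(y)
    return regimes, series

# Hypothetical parameters: a persistent "expansion" regime 0 and "contraction" regime 1.
P = [[0.95, 0.05],
     [0.10, 0.90]]
regimes, series = simulate_msi_ar1(500, intercepts=[1.0, -0.5], phi=0.5, sigma=1.0, P=P)
```

    All entries of `P` are positive, so the chain is irreducible and ergodic, and `|phi| < 1` keeps the autoregression stable within each regime.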

    The primary advantage of the Markov-switching vector autoregressive model is to provide a systematic approach to deliver statistical methods for: (i.) extracting the information in the data about regime shifts in the past, (ii.) estimating consistently and efficiently the parameters of the model, (iii.) detecting recent regime shifts, (iv.) correcting the vector autoregressive model at times when the regime alters, and finally (v.) incorporating the probability of future regime shifts into forecasts.

    This Markov-switching vector autoregressive model represents a very general class which encompasses some alternative non-linear and time-varying models. In general, the model generates conditional heteroskedasticity and non-normality; prediction intervals are asymmetric and reflect the prevailing uncertainty about the regime.
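    How the probability of future regime shifts can enter a forecast is easiest to see in the simplest case. The sketch below assumes a model with a regime-dependent mean and no autoregressive dynamics, so the forecast is the regime means weighted by the predicted regime probabilities. The function names and numbers are hypothetical and only illustrate the idea; the optimal predictors for MS-VAR processes are derived in Chapter 4.

```python
def predict_regime_probs(prob_now, P, h):
    """Propagate regime probabilities h steps ahead by repeated
    multiplication with the transition matrix P (rows sum to one)."""
    prob = list(prob_now)
    M = len(prob)
    for _ in range(h):
        prob = [sum(prob[i] * P[i][j] for i in range(M)) for j in range(M)]
    return prob

def forecast_mean(prob_now, P, means, h):
    """h-step forecast for a switching-mean model without AR terms:
    the regime means weighted by the predicted regime probabilities."""
    probs = predict_regime_probs(prob_now, P, h)
    return sum(p * m for p, m in zip(probs, means))

P = [[0.95, 0.05],
     [0.10, 0.90]]
means = [1.0, -0.5]
one_step = forecast_mean([1.0, 0.0], P, means, h=1)      # 0.95 * 1.0 + 0.05 * (-0.5)
long_run = forecast_mean([1.0, 0.0], P, means, h=500)    # approaches the unconditional mean
```

    For this transition matrix the ergodic regime probabilities are 2/3 and 1/3, so the long-horizon forecast converges to 0.5 regardless of the current regime.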

    We will investigate the issues of detecting multiple breaks in multiple time series, modelling, specification, estimation, testing and forecasting. En route, we discuss the relation to alternative non-linear models and models with time-varying parameters. In the course of this study we will also propose new directions to generalize the MS-VAR model. Although some methodological and technical ideas are discussed in detail, the focus is on modelling, specification and estimation of suitable models.

    The previous literature on this topic is often characterized by imprecise generalities or the restriction of empirical analysis to a very specific model whose specification is motivated neither statistically nor theoretically. These limitations have to be overcome. Therefore, the strategy of this study has to be twofold: (i.) to provide a general approach to model building and (ii.) to offer concrete solutions for special problems. This strategy implies an increase in the number of models as well as in the complexity of the analysis. We believe, however, that this price will be proven in practice to be offset by the increased flexibility for empirical research.

    Survey of the Study

    The first part of the book gives a comprehensive mathematical and statistical analysis of the Markov-switching vector autoregressive model. In the first chapters, Markov-switching vector autoregressive (MS-VAR) processes are introduced and their basic properties are investigated. We discuss the relation of the MS-VAR model to the time invariant vector autoregressive model and to alternative non-linear time series models. The preliminary considerations of Chapter 1 are formalized in the state-space representation given in Chapter 2, which will be the framework for analyzing the stochastic properties of MS-VAR processes and for developing statistical techniques for the specification and estimation of MS-VAR models to fit data which exhibit regime shifts in a stationary manner. In Chapter 3, vector autoregressive moving average (VARMA) representation theorems for VAR models with Markov-switching means or intercepts are given.

    In Chapter 4 and Chapter 5, the statistical analysis of MS-VAR models is considered for known parameters. In Chapter 4, optimal predictors for MS-VAR processes are derived. Chapter 5 is devoted to an intensive discussion of the filtering and smoothing techniques for MS-VAR processes on which the subsequent statistical analysis is based. These statistical tools produce an inference for the time paths of unobserved regimes under alternative information sets and given parameters. It is shown that a modification of the model by introducing time-varying transition probabilities can be analyzed with only slight modifications within our framework.
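    The idea of inferring unobserved regimes for given parameters can be sketched with a forward filter of the Hamilton type. The example below is a simplified illustration under stated assumptions, not the filter developed in Chapter 5: it treats a univariate Gaussian model with a regime-dependent mean, a common variance and no autoregressive terms. The names `normal_pdf` and `filter_regimes` and the data are hypothetical.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def filter_regimes(y, means, sigma, P, init):
    """Return filtered probabilities Pr(s_t = j | y_1, ..., y_t) and the
    log-likelihood. `init` holds the regime probabilities before the
    first observation; P[i][j] = Pr(s_t = j | s_{t-1} = i)."""
    M = len(means)
    prob = list(init)
    filtered, loglik = [], 0.0
    for obs in y:
        # Prediction step: regime probabilities given data up to t-1.
        pred = [sum(prob[i] * P[i][j] for i in range(M)) for j in range(M)]
        # Update step: weight by the regime-conditional density and normalize.
        joint = [pred[j] * normal_pdf(obs, means[j], sigma) for j in range(M)]
        total = sum(joint)          # conditional density of obs given the past
        prob = [v / total for v in joint]
        filtered.append(prob)
        loglik += math.log(total)
    return filtered, loglik

P = [[0.95, 0.05],
     [0.10, 0.90]]
y = [2.1, 1.8, 2.3, -1.9, -2.2, -1.7]
filtered, loglik = filter_regimes(y, means=[2.0, -2.0], sigma=1.0, P=P, init=[0.5, 0.5])
```

    With these illustrative numbers the filtered probabilities put almost all mass on the first regime for the first three observations and switch to the second regime afterwards. The sum of the log normalizing constants is the log-likelihood, which is what an estimation routine would maximize.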

    The main part of this study (Chapters 6 – 10) is devoted to the discussion of parameter estimation for this class of models. The classical method of maximum likelihood estimation is considered in Chapter 6, where, due to the nonlinearity of the model, iterative procedures have to be introduced. While various approaches are discussed, major attention is given to the EM algorithm, for which the limitation in the previous literature of using special MS-VAR models is overcome. The issues of identifiability and consistency of the maximum likelihood (ML) estimation are investigated. Techniques for the calculation of the asymptotic variance-covariance matrix of ML estimates are presented.

    In Chapter 7 the issue of model selection and model checking is investigated. The focus is maintained on the specification of MS-VAR models. A strategy for simultaneously selecting the number of regimes and the order of the autoregression in Markov-switching time series models based on ARMA representations is proposed and combined with classical specification testing procedures.

    Chapter 8 introduces a multi-move Gibbs sampler for multiple time series subject to regime shifts. Even for univariate time series analysis, an improvement over the approaches described in the literature is achieved by an increased convergence due to the simultaneous sampling of the regimes from their joint posterior distribution using the methods introduced in Chapter 5. Here again, a thorough analysis of various MS-VAR specifications allows for a greater flexibility in empirical research. The main advantage of the Gibbs sampler is that (by invoking Bayesian theory) this simulation technique enables us to gain new insights into the unknown parameters. Without informative priors, the Gibbs sampler reproduces the ML estimator as mode


    a multiple time series framework. Chapter 12 contributes to the research of international and global business cycles by analyzing a six-dimensional system for the USA, Japan, West Germany, the UK, Canada, and Australia. The considerations formulated in Chapter 13 suggest a new methodological approach to the analysis of cointegrated linear systems with shifts in regime. This methodology is then illustrated with a reconsideration of international and global business cycles. The study concludes with a brief discussion of our major findings and remaining problems.

    The study has a modular structure. Given the notation and basic structures intro-

    duced in the first two chapters, most of the following chapters can stand alone.

    Hence, the reader who is primarily interested in empirical applications and less in

    statistical techniques can first read the fundamental  Chapters 1  and  2 ,

    then Chapter 5  and Chapter 6 , followed by the empirical analyses in Chapters 11 and

    12  alongside the more technically demanding Chapter 13, and decide afterwards

    which of the remaining chapters will be of interest to him or her.

    Although it is not necessary for the reader to be familiar with all fundamental meth-

    ods of multiple time series analysis, the subject of interest requires the application

    of some formal techniques. A number of references to standard results are given

    throughout the study, while to simplify things for the reader we have remained as

    close as possible to the notation used in LÜTKEPOHL [1991]. In order to achieve

    compactness in our presentation, we have dispensed with a more general introduction

    to the topic since such introductions are already available in HAMILTON [1993], [1994b,

    ch. 22] and KROLZIG AND LÜTKEPOHL [1995].


    Chapter 1

    The Markov–Switching 

    Vector Autoregressive Model 

    This first chapter is devoted to a general introduction into the Markov–switching

    vector autoregressive (MS-VAR) time series model. In  Section 1.2  we present the

    fundamental assumptions constituting this class of models. The discussion of the

    two components of MS-VAR processes will clarify their relation to time invariant vector

    autoregressive and Markov-chain models. Some basic stochastic properties of MS-

    VAR processes are presented in Section 1.3. Finally, MS-VAR models are compared

    to alternative non-normal and non-linear time series models proposed in the literat-

    ure. As most non-linear models have been developed for univariate time series, this

    discussion is restricted to this case. However, generalizations to the vector case are

    also considered.

    1.1 General Introduction

    Reduced form vector autoregressive (VAR) models have become a dominant

    research strategy in empirical macroeconomics since SIMS [1980]. In this study we

    will consider VAR models with changes in regime; most results will carry over to

    structural dynamic econometric models by treating them as restricted VAR models.

    When the system is subject to regime shifts, the parameters  θ  of the VAR process

    will be time-varying. But the process might be time-invariant conditional on an

    unobservable regime variable s_t which indicates the regime prevailing at time t.

    Let M denote the number of feasible regimes, so that s_t ∈ {1, . . . , M}. Then the


    conditional probability density of the observed time series vector y_t is given by

    \[
    p(y_t \mid Y_{t-1}, s_t) =
    \begin{cases}
    f(y_t \mid Y_{t-1}, \theta_1) & \text{if } s_t = 1 \\
    \quad \vdots \\
    f(y_t \mid Y_{t-1}, \theta_M) & \text{if } s_t = M,
    \end{cases}
    \tag{1.1}
    \]

    where θ_m is the VAR parameter vector in regime m = 1, . . . , M and Y_{t-1} are the observations {y_{t-j}}_{j=1}^∞.

    Thus, for a given regime s_t, the time series vector y_t is generated by a vector autoregressive process of order p (VAR(p) model) such that

    \[ E[y_t \mid Y_{t-1}, s_t] = \nu(s_t) + \sum_{j=1}^{p} A_j(s_t)\, y_{t-j}, \]

    where u_t is an innovation term,

    \[ u_t = y_t - E[y_t \mid Y_{t-1}, s_t]. \]

    The innovation process u_t is a zero-mean white noise process with a variance-covariance matrix Σ(s_t), which is assumed to be Gaussian:

    \[ u_t \sim \mathrm{NID}(0, \Sigma(s_t)). \]

    If the VAR process is defined conditionally upon an unobservable regime as in equa-

    tion (1.1), the description of the data generating mechanism has to be completed by

    assumptions regarding the regime generating process. In Markov-switching vector

    autoregressive (MS-VAR) models – the subject of this study – it is assumed that the regime s_t is generated by a discrete-state homogeneous Markov chain:1

    \[ \Pr(s_t \mid \{s_{t-j}\}_{j=1}^{\infty}, \{y_{t-j}\}_{j=1}^{\infty}) = \Pr(s_t \mid s_{t-1}; \rho), \]

    where ρ denotes the vector of parameters of the regime generating process.

    The vector autoregressive model with Markov-switching regimes is founded

    on at least three traditions. The first is the linear time-invariant   vector auto-

    regressive model, which is the framework for the analysis of the relation of 

    the variables of the system, the dynamic propagation of innovations to the

    1The notation Pr(·) refers to a discrete probability measure, while p(·) denotes a probability density

    function.


    system, and the effects of changes in regime. Secondly, the basic statistical

    techniques have been introduced by BAUM AND   PETRIE   [1966] and BAUM

    et al.   [1970] for  probabilistic functions of Markov chains, while the MS-VAR

    model also encompasses older concepts such as the   mixture of normal distributions

    model   attributed to PEARSON   [1894] and the   hidden Markov-chain model

    traced back to BLACKWELL AND   KOOPMANS   [1975] and HELLER   [1965].

    Thirdly, in econometrics, the first attempts to create Markov-switching regression

    models were undertaken by GOLDFELD AND   QUANDT   [1973], which remained,

    however, rather rudimentary. The first comprehensive approach to the statistical

    analysis of Markov-switching regression models has been proposed by LINDGREN

    [1978] which is based on the ideas of BAUM  et al.  [1970]. In time series analysis,

    the introduction of the Markov-switching model is due to HAMILTON   [1988],

    [1989] on which most recent contributions (as well as this study) are founded.

    Finally, our consideration of MS-VAR models as a Gaussian vector autoregressive

    process conditioned on an exogenous regime generating process is closely related to

    state space models as well as the concept of doubly stochastic processes introduced

    by TJØSTHEIM  [1986b].

    The MS-VAR model belongs to a more general class of models that characterize a

    non-linear data generating process as piecewise linear by restricting the process to

    be linear in each regime, where the regime on which the process is conditioned is unobservable, and only

    a discrete number of regimes are feasible.2 These models differ in their assumptions

    concerning the stochastic process generating the regime:

    (i.) The mixture of normal distributions model is characterized by serially independently distributed regimes:

    \[ \Pr(s_t \mid \{s_{t-j}\}_{j=1}^{\infty}, \{y_{t-j}\}_{j=1}^{\infty}) = \Pr(s_t; \rho). \]

    In contrast to MS-VAR models, the transition probabilities are independent of the history of the regime. Thus the conditional probability distribution of y_t is independent of s_{t-1},

    \[ p(y_t \mid Y_{t-1}, s_{t-1}) = p(y_t \mid Y_{t-1}), \]

    2In the case of two regimes, P OTTER   [1990],[1993] proposed to call this class of non-linear, non-

    normal models the  single index generalized multivariate autoregressive (SIGMA) model.


    and the conditional mean E[y_t|Y_{t-1}, s_{t-1}] is given by E[y_t|Y_{t-1}].3 Even so, this model can be considered as a restricted MS-VAR model where the transition matrix has rank one. Moreover, if only the intercept term is regime-dependent, MS(M)-VAR(p) processes with Gaussian errors and i.i.d. switching regimes are observationally equivalent to time-invariant VAR(p) processes with non-normal errors. Hence, the modelling scope of this kind of model is very limited.

    (ii.) In the   self-exciting threshold autoregressive   SETAR( p, d, r) model, the

    regime-generating process is not assumed to be exogenous but directly linked

    to the lagged endogenous variable y_{t-d}.4 For a given but unknown threshold

    r, the ‘probability’ of the unobservable regime s_t = 1 is given by

    \[
    \Pr(s_t = 1 \mid \{s_{t-j}\}_{j=1}^{\infty}, \{y_{t-j}\}_{j=1}^{\infty}) = I(y_{t-d} \le r) =
    \begin{cases}
    1 & \text{if } y_{t-d} \le r \\
    0 & \text{if } y_{t-d} > r.
    \end{cases}
    \]

    While the presumptions of the SETAR and the MS-AR model seem to be

    quite different, the relation between both model alternatives is rather close.

    This is also illustrated in the appendix which gives an example showing that

    SETAR and MS-VAR models can be observationally equivalent.

    (iii.) In the smooth transition autoregressive (STAR) model popularized by GRANGER AND TERÄSVIRTA [1993], exogenous variables are mostly employed to model the weights of the regimes, but the regime switching rule can also be dependent on the history of the observed variables, i.e. y_{t-d}:

    \[ \Pr(s_t = 1 \mid \{s_{t-j}\}_{j=1}^{\infty}, \{y_{t-j}\}_{j=1}^{\infty}) = F(y_{t-d}'\delta - r), \]

    where F(y_{t-d}'δ − r) is a continuous function determining the weight of regime 1. For example, TERÄSVIRTA AND ANDERSON [1992] use the logistic distribution function in their analysis of the U.S. business cycle. 5

    3The likelihood function is given by

    \[ p(Y_T \mid Y_0; \theta, \bar\xi) = \prod_{t=1}^{T} \sum_{m=1}^{M} \bar\xi_m \, p(y_t \mid Y_{t-1}, \theta_m), \]

    where θ = (θ_1', . . . , θ_M')' collects the VAR parameters and ξ̄_m is the ergodic probability of regime m.

    4In threshold autoregressive (TAR) processes, the indicator function is defined in a switching variable z_{t-d}, d ≥ 0. In addition, indicator variables can be introduced and treated with error-in-variables techniques. Refer for example to COSSLETT AND LEE [1985] and KAMINSKY [1993].

    (iv.) All the previously mentioned models are special cases of an endogenous selection Markov-switching vector autoregressive model. In an EMS(M, d)-VAR(p) model the transition probabilities p_{ij}(·) are functions of the observed time series vector y_{t-d}:

    \[ \Pr(s_t = m \mid s_{t-1} = i, y_{t-d}) = p_{im}(y_{t-d}'\delta). \]

    Thus the observed variables contain additional information on the conditional probability distribution of the states:

    \[ \Pr(s_t \mid \{s_{t-j}\}_{j=1}^{\infty}) \overset{a.e.}{\neq} \Pr(s_t \mid \{s_{t-j}\}_{j=1}^{\infty}, \{y_{t-j}\}_{j=1}^{\infty}). \]

    Thus the regime generating process is no longer Markovian. In contrast to the SETAR and the STAR model, EMS-VAR models include the possibility that the threshold depends on the last regime, e.g. that the threshold for staying in regime 2 is different from the threshold for switching from regime 1 to regime 2. The EMS(M, d)-VAR(p) model will be presented in Section 10.3.

    It is shown that the methods developed in this study for MS-VAR processes

    can easily be extended to capture EMS-VAR processes.
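    The regime rules in (ii.) and (iii.) can be contrasted in a few lines of code. The following Python sketch is illustrative only: it assumes a univariate series, a known threshold r, and a logistic choice of F; the function names are hypothetical and not taken from the study. Note that, as the formulas are stated, the SETAR indicator is one below the threshold while the logistic weight increases in y_{t-d}.

```python
import math

def setar_indicator(y_lag: float, r: float) -> float:
    """SETAR rule: I(y_{t-d} <= r), so regime 1 has 'probability' 0 or 1."""
    return 1.0 if y_lag <= r else 0.0

def logistic_star_weight(y_lag: float, r: float, delta: float = 1.0) -> float:
    """STAR rule with logistic F: weight F(y_{t-d} * delta - r) of regime 1."""
    return 1.0 / (1.0 + math.exp(-(y_lag * delta - r)))

# Compare the hard threshold with the smooth transition around r = 0.
for y_lag in (-2.0, 0.0, 2.0):
    print(y_lag, setar_indicator(y_lag, r=0.0), logistic_star_weight(y_lag, r=0.0))
```

    The SETAR rule switches abruptly at the threshold, whereas the logistic weight moves continuously between 0 and 1; in the MS-VAR model neither rule applies, since the regime follows an exogenous Markov chain.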

    In this study, it will be shown that the MS-VAR model can encompass a wide spec-

    trum of non-linear modifications of the VAR model proposed in the literature.

    1.2 Markov-Switching Vector Autoregressions

    1.2.1 The Vector Autoregression

    Markov-switching vector autoregressions can be considered as generalizations of 

    the basic finite order VAR model of order p. Consider the p-th order autoregression

    for the K-dimensional time series vector y_t = (y_{1t}, . . . , y_{Kt})', t = 1, . . . , T,

    \[ y_t = \nu + A_1 y_{t-1} + \ldots + A_p y_{t-p} + u_t, \tag{1.2} \]

    5If F(·) is even, e.g. F(y_{t-d} − r) = 1 − exp{−(y_{t-d} − r)^2}, a generalized exponential autoregressive model as proposed by OZAKI [1980] and HAGGAN AND OZAKI [1981] ensues.


    where u_t ∼ IID(0, Σ) and y_0, . . . , y_{1-p} are fixed. Denoting A(L) = I_K − A_1 L − . . . − A_p L^p as the (K × K) dimensional lag polynomial, we assume that there are no roots on or inside the unit circle, |A(z)| ≠ 0 for |z| ≤ 1, where L is the lag operator, so that y_{t-j} = L^j y_t. If a normal distribution of the error is assumed, u_t ∼ NID(0, Σ), equation (1.2) is known as the intercept form of a stable Gaussian VAR(p) model. This can be reparametrized as the mean adjusted form of a VAR model:

    \[ y_t - \mu = A_1 (y_{t-1} - \mu) + \ldots + A_p (y_{t-p} - \mu) + u_t, \tag{1.3} \]

    where µ = (I_K − Σ_{j=1}^p A_j)^{-1} ν is the (K × 1) dimensional mean of y_t.

    If the time series are subject to shifts in regime, the stable VAR model with its time

    invariant parameters might be inappropriate. Then, the MS–VAR model might be

    considered as a general regime-switching framework. The general idea behind this

    class of models is that the parameters of the underlying data generating process 6 of

    the  observed   time series vector  y_t  depend upon the  unobservable regime variable s_t, which represents the probability of being in a different state of the world.

    The main characteristic of the Markov-switching model is the assumption that the

    unobservable realization of the regime s_t ∈ {1, . . . , M} is governed by a discrete time, discrete state Markov stochastic process, which is defined by the transition probabilities

    \[ p_{ij} = \Pr(s_{t+1} = j \mid s_t = i), \qquad \sum_{j=1}^{M} p_{ij} = 1 \quad \forall\, i, j \in \{1, \ldots, M\}. \tag{1.4} \]

    More precisely, it is assumed that s_t follows an irreducible ergodic M state Markov process with the transition matrix P. This will be discussed in Section 1.2.4 in more detail.

    In generalization of the mean-adjusted VAR( p) model in equation (1.3) we would

    like to consider Markov-switching vector autoregressions of order p and M  regimes:

    \[ y_t - \mu(s_t) = A_1(s_t)\,(y_{t-1} - \mu(s_{t-1})) + \ldots + A_p(s_t)\,(y_{t-p} - \mu(s_{t-p})) + u_t, \tag{1.5} \]

    where u_t ∼ NID(0, Σ(s_t)) and µ(s_t), A_1(s_t), . . . , A_p(s_t), Σ(s_t) are parameter shift functions describing the dependence of the parameters 7 µ, A_1, . . . , A_p, Σ on the realized regime s_t, e.g.

    \[
    \mu(s_t) =
    \begin{cases}
    \mu_1 & \text{if } s_t = 1, \\
    \quad \vdots \\
    \mu_M & \text{if } s_t = M.
    \end{cases}
    \tag{1.6}
    \]

    6For reasons of simplicity in notation, we do not introduce a separate notation for the theoretical representation of the stochastic process and its actual realizations.

    7In the notation of state-space models, the varying parameters µ, ν, A_1, . . . , A_p, Σ become functions of the model’s hyper-parameters.

    In model (1.5), a change in regime causes an immediate one-time jump

    in the process mean. Occasionally, it may be more plausible to assume that the mean

    smoothly approaches a new level after the transition from one state to another. In

    such a situation the following model with a regime-dependent intercept term ν(s_t)

    may be used:

    \[ y_t = \nu(s_t) + A_1(s_t)\, y_{t-1} + \ldots + A_p(s_t)\, y_{t-p} + u_t. \tag{1.7} \]

    In contrast to the linear VAR model, the mean adjusted form (1.5) and the intercept

    form (1.7) of an MS(M )–VAR( p) model are not equivalent. In  Chapter 3 it will be

    seen that these forms imply different dynamic adjustments of the observed variables

    after a change in regime. While a permanent regime shift in the mean µ(s_t) causes

    an immediate jump of the observed time series vector onto its new level, the dynamic

    response to a once-and-for-all regime shift in the intercept term ν(s_t) is identical to

    an equivalent shock in the white noise series u_t.
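    The difference between the two forms can be made concrete with a small deterministic sketch. The following Python example is illustrative only: it assumes a univariate process with p = 1, a regime-invariant autoregressive coefficient a = 0.5, no innovations (u_t = 0), intercepts chosen as ν_m = (1 − a)µ_m, and a single permanent switch from the first to the second regime. The function names msm_path and msi_path are hypothetical, and regimes are coded 0 and 1.

```python
def msm_path(mu, a, regimes, y0):
    """Mean-adjusted form (1.5): y_t - mu(s_t) = a * (y_{t-1} - mu(s_{t-1}))."""
    y, prev_s, out = y0, regimes[0], []
    for s in regimes:
        y = mu[s] + a * (y - mu[prev_s])
        out.append(y)
        prev_s = s
    return out

def msi_path(nu, a, regimes, y0):
    """Intercept form (1.7): y_t = nu(s_t) + a * y_{t-1}."""
    y, out = y0, []
    for s in regimes:
        y = nu[s] + a * y
        out.append(y)
    return out

mu = [0.0, 1.0]                    # regime-dependent means (assumed values)
a = 0.5                            # regime-invariant AR coefficient (assumed)
nu = [(1.0 - a) * m for m in mu]   # intercepts implying the same long-run means
regimes = [0, 0, 0, 1, 1, 1, 1, 1] # permanent shift in the fourth period

msm = msm_path(mu, a, regimes, y0=mu[0])
msi = msi_path(nu, a, regimes, y0=mu[0])
for t, (y_m, y_i) in enumerate(zip(msm, msi), start=1):
    print(t, y_m, y_i)
```

    Under these assumptions the MSM path jumps to the new mean in the period of the shift, while the MSI path approaches the same level geometrically at rate a.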

    In the most general specification of an MS-VAR model, all parameters of the autoregression are conditioned on the state s_t of the Markov chain. We have assumed that each regime m possesses its VAR(p) representation with parameters ν_m (or µ_m), Σ_m, A_{1m}, . . . , A_{pm}, m = 1, . . . , M, such that

    \[
    y_t =
    \begin{cases}
    \nu_1 + A_{11} y_{t-1} + \ldots + A_{p1} y_{t-p} + \Sigma_1^{1/2} u_t, & \text{if } s_t = 1 \\
    \quad \vdots \\
    \nu_M + A_{1M} y_{t-1} + \ldots + A_{pM} y_{t-p} + \Sigma_M^{1/2} u_t, & \text{if } s_t = M
    \end{cases}
    \]

    where u_t ∼ NID(0, I_K).8
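    To illustrate the data generating mechanism described above, the following Python sketch simulates a univariate special case in which intercept, autoregressive coefficient and standard deviation all depend on the regime. It is a minimal example under stated assumptions: K = 1, p = 1, M = 2, regimes coded 0 and 1, an arbitrary starting regime and starting value, and invented parameter values; the function name simulate_ms_ar1 is hypothetical.

```python
import random

def simulate_ms_ar1(nu, a, sigma, P, T, seed=0):
    """Simulate y_t = nu[s_t] + a[s_t] * y_{t-1} + sigma[s_t] * u_t, u_t ~ N(0, 1).

    P[i][j] = Pr(s_{t+1} = j | s_t = i); the chain starts in regime 0 (assumed).
    """
    rng = random.Random(seed)
    s, y = 0, 0.0
    ys, regimes = [], []
    for _ in range(T):
        # Draw the next regime from row s of the transition matrix.
        s = rng.choices(range(len(P)), weights=P[s])[0]
        # Generate the observation from the VAR(1) of the realized regime.
        y = nu[s] + a[s] * y + sigma[s] * rng.gauss(0.0, 1.0)
        ys.append(y)
        regimes.append(s)
    return ys, regimes

P = [[0.9, 0.1],
     [0.2, 0.8]]
ys, regimes = simulate_ms_ar1(nu=[0.5, -0.5], a=[0.3, 0.7],
                              sigma=[1.0, 2.0], P=P, T=200)
print(regimes[:20])
```

    Restricting some of the lists to identical entries across regimes gives the more parsimonious specifications, in which only part of the parameter vector shifts.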

    However for empirical applications, it might be more helpful to use a model where

    only some parameters are conditioned on the state of the Markov chain, while the

    8Even at this early stage a complication arises if the mean adjusted form is considered. The conditional density of y_t depends not only on s_t but also on s_{t-1}, . . . , s_{t-p}, i.e. M^{p+1} different conditional


    other parameters are regime invariant. In  Section 1.2.2  some particular MS-VAR

    models will be introduced where the autoregressive parameters, the mean or the in-

    tercepts, are regime-dependent and where the error term is hetero- or homoskedastic.

    Estimating these particular MS-VAR models is discussed separately in Chapter 9 .

    1.2.2 Particular MS–VAR Processes

    The MS-VAR model allows for a great variety of specifications. In principle, it

    would be possible to (i.) make all parameters regime-dependent and (ii.) introduce

    separate regimes for each shifting parameter. But this would be no practicable

    solution as the number of parameters of the Markov chain grows quadratically in the

    number of regimes and at the same time shrinks the number of observations usable for

    the estimation of the regime-dependent parameter. For these reasons a specific-to-

    general approach may be preferred for the determination of the regime generating

    process by restricting the shifting parameters (i.) to a part of the parameter vector

    and (ii.) to have identical break-points.

    In empirical research, only some parameters will be conditioned on the state of 

    the Markov chain while the other parameters will be regime invariant. In order to

    establish a unique notation for each model, we specify with the general MS(M )

    term the regime-dependent parameters:

    M   Markov-switching mean,

    I   Markov-switching intercept term,

    A   Markov-switching autoregressive parameters,

    H   Markov-switching heteroskedasticity.

    To achieve a distinction of VAR models with time-invariant mean and intercept

    term, we denote the   mean adjusted form of a vector autoregression as MVAR( p).

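    means of y_t are to be distinguished:

    \[
    y_t =
    \begin{cases}
    \mu_1 + A_{11}(y_{t-1} - \mu_1) + \ldots + A_{p1}(y_{t-p} - \mu_1) + \Sigma_1^{1/2} u_t, & \text{if } s_t = 1, \ldots, s_{t-p} = 1 \\
    \mu_1 + A_{11}(y_{t-1} - \mu_1) + \ldots + A_{p1}(y_{t-p} - \mu_2) + \Sigma_1^{1/2} u_t, & \text{if } s_t = 1, \ldots, s_{t-p+1} = 1,\ s_{t-p} = 2 \\
    \quad \vdots \\
    \mu_1 + A_{11}(y_{t-1} - \mu_M) + \ldots + A_{p1}(y_{t-p} - \mu_M) + \Sigma_1^{1/2} u_t, & \text{if } s_t = 1,\ s_{t-1} = M, \ldots, s_{t-p} = M \\
    \quad \vdots \\
    \mu_M + A_{1M}(y_{t-1} - \mu_1) + \ldots + A_{pM}(y_{t-p} - \mu_1) + \Sigma_M^{1/2} u_t, & \text{if } s_t = M,\ s_{t-1} = 1, \ldots, s_{t-p} = 1 \\
    \quad \vdots \\
    \mu_M + A_{1M}(y_{t-1} - \mu_M) + \ldots + A_{pM}(y_{t-p} - \mu_{M-1}) + \Sigma_M^{1/2} u_t, & \text{if } s_t = M, \ldots, s_{t-p+1} = M,\ s_{t-p} = M-1 \\
    \mu_M + A_{1M}(y_{t-1} - \mu_M) + \ldots + A_{pM}(y_{t-p} - \mu_M) + \Sigma_M^{1/2} u_t, & \text{if } s_t = M, \ldots, s_{t-p} = M
    \end{cases}
    \]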


    Table 1.1: Special Markov Switching Vector Autoregressive Models

                                          MSM                          MSI Specification
                                   µ varying     µ invariant     ν varying     ν invariant

    A_j invariant   Σ invariant    MSM–VAR       linear MVAR     MSI–VAR       linear VAR
    A_j invariant   Σ varying      MSMH–VAR      MSH–MVAR        MSIH–VAR      MSH–VAR
    A_j varying     Σ invariant    MSMA–VAR      MSA–MVAR        MSIA–VAR      MSA–VAR
    A_j varying     Σ varying      MSMAH–VAR     MSAH–MVAR       MSIAH–VAR     MSAH–VAR

    An overview is given in Table 1.1. Obviously the MSI and the MSM specifications

    are equivalent if the order of the autoregression is zero. For this so-called hidden

    Markov-chain model, we prefer the notation MSI(M )-VAR(0). As it will be seen

    later on, the MSI(M )-VAR(0) model and MSI(M )-VAR( p) models with p >  0  are

    isomorphic concerning their statistical analysis. In  Section 10.3   we will further

    extend the class of models under consideration.

    The MS-VAR model provides a very flexible framework which allows for hetero-

    skedasticity, occasional shifts, reversing trends, and forecasts performed in a non-

    linear manner. In the following sections the focus is on models where the mean

    (MSM(M )–VAR( p) models) or the intercept term (MSI(M )–VAR( p) models) are subject to occasional discrete shifts; regime-dependent covariance structures of the

    process are considered as additional features.

    1.2.3 The Regime Shift Function

    At this stage it is useful to define the parameter shifts more clearly by formulating

    the system as a single equation by introducing “dummy” (or more precisely) indicator variables:

    \[
    I(s_t = m) =
    \begin{cases}
    1 & \text{if } s_t = m \\
    0 & \text{otherwise,}
    \end{cases}
    \]


    where m  = 1, . . . , M  . In the course of the following chapters it will prove helpful

    to collect all the information about the realization of the Markov chain in the vector

    ξ_t as

    \[
    \xi_t = \begin{bmatrix} I(s_t = 1) \\ \vdots \\ I(s_t = M) \end{bmatrix}.
    \]

    Thus, ξ_t denotes the unobserved state of the system. Since ξ_t consists of binary

    variables, it has some particular properties:

    \[
    E[\xi_t] = \begin{bmatrix} \Pr(s_t = 1) \\ \vdots \\ \Pr(s_t = M) \end{bmatrix}
             = \begin{bmatrix} \Pr(\xi_t = \iota_1) \\ \vdots \\ \Pr(\xi_t = \iota_M) \end{bmatrix},
    \]

    where ι_m is the m-th column of the identity matrix. Thus E[ξ_t], or a well defined conditional expectation, represents the probability distribution of s_t. It is easily verified that 1_M'ξ_t = 1 as well as ξ_t'ξ_t = 1 and ξ_t ξ_t' = diag(ξ_t), where 1_M = (1, . . . , 1)' is an (M × 1) vector.

    For example, we can now rewrite the mean shift function (1.6) as

    \[ \mu(s_t) = \sum_{m=1}^{M} \mu_m I(s_t = m). \]

    In addition, we can use matrix notation to derive

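    \[ \mu(s_t) = \mathbf{M}\xi_t, \]

    where \mathbf{M} is a (K × M) matrix containing the means,

    \[ \mathbf{M} = \begin{bmatrix} \mu_1 & \ldots & \mu_M \end{bmatrix}, \qquad \mu = \operatorname{vec}(\mathbf{M}). \]

    We will occasionally use the following notation for the variance parameters:

    \[
    \underset{(K \times MK)}{\Sigma} = \begin{bmatrix} \Sigma_1 & \ldots & \Sigma_M \end{bmatrix}, \qquad
    \underset{\left(\frac{K(K+1)}{2} \times 1\right)}{\sigma_m} = \operatorname{vech}(\Sigma_m), \qquad
    \sigma = (\sigma_1', \ldots, \sigma_M')',
    \]

    such that

    \[ \Sigma_t = \Sigma(s_t) = \Sigma\,(\xi_t \otimes I_K) \]

    is a (K × K) matrix.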
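    The selection mechanism behind this notation can be checked numerically. The following Python sketch uses plain nested lists and assumes K = 2, M = 2 and invented parameter values; the helper names indicator, matmul and kron are hypothetical. It shows that multiplying the mean matrix by ξ_t picks out the column of the prevailing regime, and that Σ(ξ_t ⊗ I_K) picks out the corresponding covariance block.

```python
def indicator(s, M):
    """Column vector xi_t with a one in position s (regimes coded 1..M)."""
    return [[1.0 if m == s else 0.0] for m in range(1, M + 1)]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

K, M = 2, 2
means = [[1.0, -1.0],            # columns are mu_1 = (1, 2)' and mu_2 = (-1, -2)'
         [2.0, -2.0]]
sigma_blocks = [[1.0, 0.0, 4.0, 1.0],   # [Sigma_1  Sigma_2], a (K x MK) matrix
                [0.0, 1.0, 1.0, 9.0]]
identity_K = [[1.0, 0.0],
              [0.0, 1.0]]

xi = indicator(2, M)                                  # regime 2 prevails
mu_t = matmul(means, xi)                              # mu(s_t) = M xi_t
sigma_t = matmul(sigma_blocks, kron(xi, identity_K))  # Sigma (xi_t kron I_K)
print(mu_t, sigma_t)
```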


    1.2.4 The Hidden Markov Chain

    The description of the data-generating process is not completed by the observational

    equations (1.5) or (1.7). A model for the parameter generating process has to be

    formulated. If the parameters depend on a regime which is assumed to be stochastic and unobservable, a generating process for the states s_t must be postulated. Using

    this law, the evolution of regimes then might be inferred from the data. In the MS-

    VAR model the state process is an ergodic Markov chain with a finite number of 

    states st  = 1, . . . , M   and transition probabilities pij .

    It is convenient to collect the transition probabilities in the transition matrix P,

    \[
    \mathbf{P} = \begin{bmatrix}
    p_{11} & p_{12} & \cdots & p_{1M} \\
    p_{21} & p_{22} & \cdots & p_{2M} \\
    \vdots & \vdots & \ddots & \vdots \\
    p_{M1} & p_{M2} & \cdots & p_{MM}
    \end{bmatrix},
    \tag{1.8}
    \]

    where p_{iM} = 1 − p_{i1} − . . . − p_{i,M-1} for i = 1, . . . , M. To be more precise, all relevant information about the future of the Markovian process is included in the present state ξ_t,

    \[ \Pr(\xi_{t+1} \mid \xi_t, \xi_{t-1}, \ldots; y_t, y_{t-1}, \ldots) = \Pr(\xi_{t+1} \mid \xi_t), \]

    where the past and additional variables such as y_t reveal no relevant information beyond that of the actual state. The assumption of a first-order Markov process is not especially restrictive, since each Markov chain of an order greater than one can be reparametrized as a higher dimensional first-order Markov process (cf. FRIEDMANN [1994]). A comprehensive discussion of the theory of Markov chains with

    application to Markov-switching models is given by HAMILTON  [1994b, ch. 22.2].

    We will just give a brief introduction to some basic concepts related to MS-VAR

    models, in particular to the state-space form and the filter.

    It is usually assumed that the Markov process is ergodic. A Markov chain is said

    to be  ergodic  if exactly one of the eigenvalues of the transition matrix   P  is unity

    and all other eigenvalues are inside the unit circle. Under this condition there exists

    a stationary or unconditional probability distribution of the regimes. The ergodic

    probabilities are denoted by ξ̄ = E[ξ_t]. They are determined by the stationarity

    restriction P'ξ̄ = ξ̄ and the adding up restriction 1_M'ξ̄ = 1, from which it follows


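    that

    \[
    \bar\xi = \begin{bmatrix}
    I_{M-1} - \mathbf{P}'_{1.M-1,\,1.M-1} & -\mathbf{P}'_{1.M-1,\,M} \\
    \mathbf{1}_{M-1}' & 1
    \end{bmatrix}^{-1}
    \begin{bmatrix} 0_{M-1} \\ 1 \end{bmatrix},
    \tag{1.9}
    \]

    where P'_{1.M-1,1.M-1} is the upper-left (M−1 × M−1) block of P' and P'_{1.M-1,M} contains the first M−1 elements of the last column of P'.

    If ξ̄ is strictly positive, such that all regimes have a positive unconditional probability ξ̄_i > 0, i = 1, . . . , M, the process is called irreducible. The assumptions of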

    ergodicity and irreducibility are essential for the theoretical properties of MS-VAR

    models, e.g. their property of being stationary. The estimation procedures, which will

    be introduced in  Chapter 6  and  Chapter 8, are flexible enough to capture even

    degenerate cases,   e.g.  when there is a single jump (“structural break”) into the

    absorbing state that prevails until the end of the observation period.
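    For a concrete transition matrix the ergodic probabilities are easily computed. The following Python sketch is illustrative only: it assumes an ergodic two-regime chain with invented transition probabilities and approximates ξ̄ by repeatedly applying the stationarity restriction (a simple power iteration, chosen here for brevity rather than taken from the study). The function name ergodic_probs is hypothetical.

```python
def ergodic_probs(P, iters=2000):
    """Approximate the ergodic distribution by iterating xi <- P' xi.

    P[i][j] = Pr(s_{t+1} = j | s_t = i); rows of P sum to one.
    """
    M = len(P)
    xi = [1.0 / M] * M                     # any starting distribution will do
    for _ in range(iters):
        xi = [sum(xi[i] * P[i][j] for i in range(M)) for j in range(M)]
    return xi

P = [[0.9, 0.1],
     [0.2, 0.8]]
xi_bar = ergodic_probs(P)

# For M = 2 the restrictions can be solved by hand:
# xi_bar_1 = p21 / (p12 + p21), xi_bar_2 = p12 / (p12 + p21).
closed_form = [P[1][0] / (P[0][1] + P[1][0]), P[0][1] / (P[0][1] + P[1][0])]
print(xi_bar, closed_form)
```

    If a regime is absorbing, the iteration still converges, but all probability mass ends up in the absorbing regime, which is the degenerate case mentioned above.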

    1.3 The Data Generating Process

    After this introduction of the two components of MS-VAR models, (i.) the Gaussian

    VAR model as the conditional data generating process and (ii.) the Markov chain

    as the regime generating process, we will briefly discuss their main implications for

    the data generating process.

    For given states ξ_t and lagged endogenous variables Y_{t-1} = (y_{t-1}', y_{t-2}', . . . , y_1', y_0', . . . , y_{1-p}')' the conditional probability density function of y_t is denoted by p(y_t|ξ_t, Y_{t-1}). It is convenient to assume in (1.5) and (1.7) a normal distribution of the error term u_t, so that

    \[
    p(y_t \mid \xi_t = \iota_m, Y_{t-1}) = (2\pi)^{-K/2}\, |\Sigma_m|^{-1/2} \exp\left\{ -\tfrac{1}{2} (y_t - \bar y_{mt})' \Sigma_m^{-1} (y_t - \bar y_{mt}) \right\},
    \tag{1.10}
    \]

    where ȳ_{mt} = E[y_t|ξ_t = ι_m, Y_{t-1}] is the conditional expectation of y_t in regime m. Thus the conditional density of y_t for a given regime ξ_t is normal as in the VAR model defined in equation (1.2). Thus:

    \[
    \begin{aligned}
    y_t \mid \xi_t = \iota_m, Y_{t-1} &\sim \mathrm{NID}(\bar y_{mt}, \Sigma_m), \\
    &\sim \mathrm{NID}(\bar y_t \xi_t, \Sigma(\xi_t \otimes I_K)),
    \end{aligned}
    \tag{1.11}
    \]


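    where the conditional means ȳ_{mt} are summarized in the vector ȳ_t which is, e.g., in MSI specifications of the form

    \[
    \bar y_t = \begin{bmatrix} \bar y_{1t} \\ \vdots \\ \bar y_{Mt} \end{bmatrix}
             = \begin{bmatrix} \nu_1 + \sum_{j=1}^{p} A_{j1}\, y_{t-j} \\ \vdots \\ \nu_M + \sum_{j=1}^{p} A_{jM}\, y_{t-j} \end{bmatrix}.
    \]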

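    Assuming that the information set available at time t − 1 consists only of the sample observations and the pre-sample values collected in Y_{t-1} and the states of the Markov chain up to ξ_{t-1}, the conditional density of y_t is a mixture of normals9:

    \[
    \begin{aligned}
    p(y_t \mid \xi_{t-1} = \iota_i, Y_{t-1})
    &= \sum_{m=1}^{M} p(y_t \mid \xi_t = \iota_m, Y_{t-1}) \Pr(\xi_t = \iota_m \mid \xi_{t-1} = \iota_i) \\
    &= \sum_{m=1}^{M} p_{im}\, (2\pi)^{-K/2}\, |\Sigma_m|^{-1/2} \exp\left\{ -\tfrac{1}{2} (y_t - \bar y_{mt})' \Sigma_m^{-1} (y_t - \bar y_{mt}) \right\}.
    \end{aligned}
    \tag{1.12}
    \]

    If the densities of y_t conditional on ξ_t and Y_{t-1} are collected in the vector η_t as

    \[
    \eta_t = \begin{bmatrix} p(y_t \mid \xi_t = \iota_1, Y_{t-1}) \\ \vdots \\ p(y_t \mid \xi_t = \iota_M, Y_{t-1}) \end{bmatrix},
    \tag{1.13}
    \]

    equation (1.12) can be written as

    \[ p(y_t \mid \xi_{t-1}, Y_{t-1}) = \eta_t' \mathbf{P}' \xi_{t-1}. \tag{1.14} \]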
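    The mixture structure of this predictive density can be sketched in a few lines. The following Python example is illustrative only: it assumes the univariate case K = 1, so that the regime-specific densities are scalar normal densities, and it uses invented conditional means, variances and transition probabilities. The function names normal_pdf and predictive_density are hypothetical, and regimes are coded 0 and 1.

```python
import math

def normal_pdf(y, mean, var):
    """Univariate normal density, the K = 1 case of the regime-specific density."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def predictive_density(y, means, variances, P, i):
    """p(y_t | s_{t-1} = i, Y_{t-1}) = sum_m p_im * N(y_t; ybar_mt, sigma_m^2)."""
    eta = [normal_pdf(y, m, v) for m, v in zip(means, variances)]  # vector eta_t
    return sum(P[i][m] * eta[m] for m in range(len(eta)))          # eta_t' P' iota_i

P = [[0.9, 0.1],
     [0.2, 0.8]]
means, variances = [0.0, 3.0], [1.0, 4.0]   # assumed conditional means and variances

for y in (-1.0, 0.0, 3.0):
    print(y, predictive_density(y, means, variances, P, i=0),
          predictive_density(y, means, variances, P, i=1))
```

    The weights of the two normal components are given by the row of the transition matrix that belongs to the previous regime, so the same observation is more or less likely depending on where the chain was one period earlier.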

    9The reader is referred to HAMILTON [1994a] for an excellent introduction into the major concepts of Markov chains and to TITTERINGTON, SMITH & MAKOV [1985] for the statistical properties of mixtures of normals.

    Since the regime is assumed to be unobservable, the relevant information set available at time t − 1 consists only of the observed time series until time t − 1, and the unobserved regime vector ξ_t has to be replaced by the inference Pr(ξ_t|Y_τ). These probabilities of being in regime m given an information set Y_τ are denoted ξ̂_{mt|τ} and collected in the vector ξ̂_{t|τ} as

    \[
    \hat\xi_{t|\tau} = \begin{bmatrix} \Pr(\xi_t = \iota_1 \mid Y_\tau) \\ \vdots \\ \Pr(\xi_t = \iota_M \mid Y_\tau) \end{bmatrix},
    \]


    which allows two different interpretations. First,   ξ̂ t|τ  denotes the discrete condi-

    tional probability distribution of  ξ t   given  Y τ . Secondly,  ξ̂ t|τ   is equivalent to the

    conditional mean of  ξ t  given Y τ . This is due to the binarity of the elements of  ξ t,

    which implies that  E[ξ mt] = Pr(ξ mt  = 1) = Pr(st   =  m). Thus, the conditional

    probability density of  yt based upon Y t−1 is given by

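    \[
    \begin{aligned}
    p(y_t \mid Y_{t-1}) &= \sum_{m=1}^{M} p(y_t, \xi_{t-1} = \iota_m \mid Y_{t-1}) \\
    &= \sum_{m=1}^{M} p(y_t \mid \xi_{t-1} = \iota_m, Y_{t-1}) \Pr(\xi_{t-1} = \iota_m \mid Y_{t-1}) \\
    &= \eta_t' \mathbf{P}' \hat\xi_{t-1|t-1}.
    \end{aligned}
    \tag{1.15}
    \]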

    The conditional probability density of the sample can be derived analogously to

    that of a single observation y_t in (1.15). The techniques

    of setting up the likelihood function in practice are introduced in Section 6.1.

    Here we only sketch the basic approach.

    Assuming presample values Y_0 are given, the density of the sample Y ≡ Y_T for given states ξ is determined by

    \[ p(Y \mid \xi) = \prod_{t=1}^{T} p(y_t \mid \xi_t, Y_{t-1}). \tag{1.16} \]

    Hence, the joint probability distribution of observations and states can be calculated as

    \[
    \begin{aligned}
    p(Y, \xi) &= p(Y \mid \xi) \Pr(\xi) \\
    &= \prod_{t=1}^{T} p(y_t \mid \xi_t, Y_{t-1}) \prod_{t=2}^{T} \Pr(\xi_t \mid \xi_{t-1}) \Pr(\xi_1).
    \end{aligned}
    \tag{1.17}
    \]

    Thus, the unconditional density of Y is given by the marginal density

    \[ p(Y) = \int p(Y, \xi)\, d\xi, \tag{1.18} \]

    where ∫ f(x, ξ) dξ := Σ_{i_1=1}^M . . . Σ_{i_T=1}^M f(x, ξ_T = ι_{i_T}, . . . , ξ_1 = ι_{i_1}) denotes summation over all possible values of ξ = ξ_T ⊗ ξ_{T-1} ⊗ . . . ⊗ ξ_1 in equation (1.18).


    Finally, it follows by the definition of the conditional density that the conditional distribution of the total regime vector $\xi$ is given by

    $$ \Pr(\xi|Y) = \frac{p(Y, \xi)}{p(Y)}. $$

    Thus, the desired conditional regime probabilities $\Pr(\xi_t|Y)$ can be derived by marginalization of $\Pr(\xi|Y)$. In practice these cumbrous calculations can be simplified by a recursive algorithm, a matter which is discussed in Chapter 5.
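To make the sums in (1.16) to (1.18) concrete, the following minimal sketch enumerates all $M^T$ regime paths for a small model and marginalizes $\Pr(\xi|Y)$ to obtain $\Pr(s_t = m|Y)$. It assumes an MSI(M)-AR(0) model with a common innovation standard deviation; the function names, the parameter values and the argument `pi1` for $\Pr(\xi_1)$ are illustrative assumptions, not part of the text. The cost grows exponentially in $T$, which is why a recursive algorithm is used in practice.

```python
import itertools
import math


def normal_pdf(x, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def enumerate_paths(y, mu, sigma, P, pi1):
    """Brute-force evaluation of (1.16)-(1.18) for an MSI(M)-AR(0) model.

    y   : observations y_1, ..., y_T
    mu  : regime-dependent means (hypothetical values)
    P   : transition matrix, P[i][j] = Pr(s_t = j | s_{t-1} = i)
    pi1 : assumed initial distribution Pr(s_1 = m)
    Returns p(Y) and the probabilities Pr(s_t = m | Y) for every t.
    """
    M, T = len(mu), len(y)
    p_Y = 0.0
    joint = [[0.0] * M for _ in range(T)]  # accumulates p(Y, s_t = m)
    for path in itertools.product(range(M), repeat=T):
        # Pr(xi): initial probability times transition probabilities
        prob = pi1[path[0]]
        for t in range(1, T):
            prob *= P[path[t - 1]][path[t]]
        # p(Y | xi): product of regime-conditional densities, eq. (1.16)
        for t in range(T):
            prob *= normal_pdf(y[t], mu[path[t]], sigma)
        p_Y += prob  # summation over all paths, eq. (1.18)
        for t in range(T):
            joint[t][path[t]] += prob
    smoothed = [[joint[t][m] / p_Y for m in range(M)] for t in range(T)]
    return p_Y, smoothed


# Illustrative call with made-up data and parameters
p_Y, smoothed = enumerate_paths(
    y=[0.1, 1.9, 2.2, -0.3],
    mu=[0.0, 2.0],
    sigma=1.0,
    P=[[0.9, 0.1], [0.2, 0.8]],
    pi1=[0.5, 0.5],
)
print(p_Y, smoothed)
```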

    The regime probabilities for future periods follow from the exogenous stochastic process of $\xi_t$, more precisely the Markov property of regimes, $\Pr(\xi_{T+h}|\xi_T, Y) = \Pr(\xi_{T+h}|\xi_T)$:

    $$
    \begin{aligned}
    \Pr(\xi_{T+h}|Y) &= \sum_{\xi_T} \Pr(\xi_{T+h}|\xi_T, Y) \Pr(\xi_T|Y) \\
    &= \sum_{\xi_T} \Pr(\xi_{T+h}|\xi_T) \Pr(\xi_T|Y).
    \end{aligned}
    $$

    These calculations can be summarized in the simple forecasting rule:

    $$
    \begin{bmatrix} \Pr(s_{T+h} = 1|Y) \\ \vdots \\ \Pr(s_{T+h} = M|Y) \end{bmatrix}
    = [P']^h
    \begin{bmatrix} \Pr(s_T = 1|Y) \\ \vdots \\ \Pr(s_T = M|Y) \end{bmatrix},
    $$

    where $P$ is the transition matrix as in (1.8). Forecasting MS-VAR processes is discussed in full length in Chapter 4.
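The forecasting rule can be sketched in a few lines. The example below assumes the convention that row $i$ of the transition matrix holds $\Pr(s_{t+1} = j|s_t = i)$, so that each row sums to one; the function name and the numerical values are illustrative only.

```python
def forecast_regime_probs(P, probs_T, h):
    """Return Pr(s_{T+h} = m | Y) for m = 1, ..., M.

    P       : transition matrix with P[i][j] = Pr(s_{t+1} = j | s_t = i)
              (rows sum to one; convention assumed for this sketch)
    probs_T : regime probabilities Pr(s_T = m | Y)
    h       : forecast horizon, a non-negative integer
    """
    M = len(P)
    probs = list(probs_T)
    for _ in range(h):
        # One application of the transposed transition matrix:
        # new_j = sum_i P[i][j] * old_i
        probs = [sum(P[i][j] * probs[i] for i in range(M)) for j in range(M)]
    return probs


P = [[0.9, 0.1], [0.2, 0.8]]
for h in (1, 4, 50):
    print(h, forecast_regime_probs(P, [1.0, 0.0], h))
```

As the horizon grows, the forecasts approach the ergodic regime probabilities of the chain, regardless of the starting probabilities.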

    In this section we have given just a short introduction to some basic concepts related

    to MS-VAR models; the following chapters will provide broader analyses of the

    various topics.

    1.4 Features of MS-VAR Processes and Their Relation to Other Non-linear Models

    The Markov switching vector autoregressive model is a very general approach for

    modelling time series with changes in regime. In Chapter 3 it will be shown that MS-

    VAR processes with shifting means or intercepts but regime-invariant variances and


    autoregressive parameters can be represented as non-normal linear state space mod-

    els. Furthermore, MSM-VAR and MSI-VAR models possess linear representations.

    These processes may be better characterized as non-normal than as non-linear  time

    series models as the associated Wold representations coincide with those of linear

    models. While our primary research interest concerns the modelling of the condi-

    tional mean, we will exemplify the effects of Markovian switching regimes on the

    higher moments of the observed time series.

    For the sake of simplicity we restrict the following considerations mainly to univariate processes

    $$ y_t = \nu(s_t) + \sum_{j=1}^{p} \alpha_j(s_t) y_{t-j} + u_t, \qquad u_t \sim \mathrm{NID}(0, \sigma^2(s_t)). $$

    Most of the considerations are made for two regimes. Thus, the process generating $y_t$ can be rewritten as

    $$
    \begin{aligned}
    y_t &= [\nu_2 + (\nu_1 - \nu_2)\xi_{1t}] + \sum_{j=1}^{p} [\alpha_2 + (\alpha_1 - \alpha_2)\xi_{1t}]\, y_{t-j} + u_t, \\
    u_t &\sim \mathrm{NID}(0, [\sigma_2^2 + (\sigma_1^2 - \sigma_2^2)\xi_{1t}]).
    \end{aligned}
    $$

    If the regime $s_t$ is governed by a Markov chain, the MS(2)-AR($p$) model ensues. It will be shown that even such simple MS-AR models can encompass a wide spectrum of modifications of the time-invariant normal linear time series model.

    1.4.1 Non-Normality of the Distribution of the Observed Time Series

    As already seen, the conditional densities $p(y_t|Y_{t-1})$ are a mixture of $M$ normals $p(y_t|\xi_t, Y_{t-1})$ with weights $p(\xi_t|Y_{t-1})$:

    $$ p(y_t|Y_{t-1}) = \sum_{m=1}^{M} \hat{\xi}_{mt|t-1}\, \varphi\!\left(\sigma^{-1}(y_t - \bar{y}_{mt})\right), $$

    where $\varphi(\cdot)$ is a standard normal density and $\bar{y}_{mt} = E[y_t|\xi_t = \iota_m, Y_{t-1}]$. Therefore the distribution of the observed time series can be multi-modal. Relying on well-known results, cf. e.g. TITTERINGTON et al. [1985, p. 162], we can notice for $M = 2$:


    Example 1   An MS(2)-AR($p$) process with a homoskedastic Gaussian innovation process $u_t \sim \mathrm{NID}(0, \sigma^2)$ generates bimodality of the conditional density $p(y_t|Y_{t-1})$ if

    $$ \sigma^{-1}(\bar{y}_{1t} - \bar{y}_{2t}) > \Delta_{\bar{\xi}_1} \geq 2, $$

    where the critical value $\Delta_{\bar{\xi}_1}$ depends on the ergodic regime probability $\bar{\xi}_1$, e.g. $\Delta_{0.5} = 2$ and $\Delta_{0.1} = \Delta_{0.9} = 3$.
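The bimodality condition can be explored numerically. The following sketch counts the local maxima of a two-component normal mixture with common standard deviation on a fine grid; the function names, the grid settings and the chosen separations are assumptions made for illustration, and the grid search is only a rough numerical check, not a replacement for the analytical critical values.

```python
import math


def mixture_density(x, weight1, mean1, mean2, sigma):
    """Density of weight1 * N(mean1, sigma^2) + (1 - weight1) * N(mean2, sigma^2)."""
    def pdf(mean):
        z = (x - mean) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return weight1 * pdf(mean1) + (1.0 - weight1) * pdf(mean2)


def count_modes(weight1, separation, sigma=1.0, step=0.005, half_width=8.0):
    """Count local maxima of the mixture on a grid.

    separation is the distance between the component means in units of sigma.
    """
    mean1 = -0.5 * separation * sigma
    mean2 = 0.5 * separation * sigma
    n = int(round(2.0 * half_width / step))
    values = [
        mixture_density((-half_width + i * step) * sigma, weight1, mean1, mean2, sigma)
        for i in range(n + 1)
    ]
    modes = 0
    for i in range(1, n):
        # strictly above the left neighbour, not below the right one:
        # this counts a flat two-point peak only once
        if values[i] > values[i - 1] and values[i] >= values[i + 1]:
            modes += 1
    return modes


for weight1, separation in [(0.5, 1.5), (0.5, 3.0), (0.1, 2.5), (0.1, 3.5)]:
    print(weight1, separation, count_modes(weight1, separation))
```

With equal weights the count switches from one to two modes as the separation passes two standard deviations; with unequal weights a larger separation is needed.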

    In contrast to Gaussian VAR processes, MS-VAR models can produce skewness

    (non-zero third-order cross-moments) and leptokurtosis (fat tails) in the distribution

    of the observed time series. A simple model that generates leptokurtosis in the

    distribution of the observed time series  y t is provided by the MSH(2)-AR(0) model:

    Example 2   Let $y_t$ be an MSH(2)-AR(0) process,

    $$ y_t - \mu = u_t, \qquad u_t \sim \mathrm{NID}(0, \sigma_1^2 I(s_t = 1) + \sigma_2^2 I(s_t = 2)). $$

    Then it can be shown that the excess kurtosis is given by

    $$ \frac{E[(y_t - \mu)^4]}{E[(y_t - \mu)^2]^2} - 3 = \frac{3\bar{\xi}_1\bar{\xi}_2(\sigma_1^2 - \sigma_2^2)^2}{(\bar{\xi}_1\sigma_1^2 + \bar{\xi}_2\sigma_2^2)^2}. $$

    Thus, the excess kurtosis is different from zero if $\sigma_1^2 \neq \sigma_2^2$ and $0 < \bar{\xi}_1 < 1$.
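As a quick plausibility check of Example 2, the closed form can be compared with the excess kurtosis computed directly from the second and fourth moments of a scale mixture of two zero-mean normals, using $E[u_t^4|s_t = m] = 3\sigma_m^4$. The function names and the numerical values below are illustrative assumptions.

```python
def msh_excess_kurtosis(p1, var1, var2):
    """Closed-form excess kurtosis of the MSH(2)-AR(0) process in Example 2.

    p1 is the ergodic probability of regime 1; var1, var2 are the variances.
    """
    p2 = 1.0 - p1
    return 3.0 * p1 * p2 * (var1 - var2) ** 2 / (p1 * var1 + p2 * var2) ** 2


def msh_excess_kurtosis_from_moments(p1, var1, var2):
    """Same quantity from the mixture moments E[u^2] and E[u^4]."""
    p2 = 1.0 - p1
    m2 = p1 * var1 + p2 * var2
    # a zero-mean normal with variance v has fourth moment 3 * v^2
    m4 = 3.0 * (p1 * var1 ** 2 + p2 * var2 ** 2)
    return m4 / m2 ** 2 - 3.0


print(msh_excess_kurtosis(0.3, 1.0, 4.0))
print(msh_excess_kurtosis_from_moments(0.3, 1.0, 4.0))
```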

    BOX AND TIAO [1968] have used such a model for the detection of outliers. To generate skewness and excess kurtosis it is sufficient, for example, to assume an MSM(2)-AR(0) model, which in the absence of autoregressive terms coincides with the MSI(2)-AR(0) model:

    Example 3   Let $y_t$ be generated by an MSM(2)-AR(0) process:

    $$ y_t - \mu = (\mu_1 - \mu) I(s_t = 1) + (\mu_2 - \mu) I(s_t = 2) + u_t, \qquad u_t \sim \mathrm{NID}(0, \sigma^2), $$

    so that

    $$ y_t - \mu = (\mu_2 - \mu) + (\mu_1 - \mu_2)\xi_{1t} + u_t. $$

    Then it can be shown that the normalized third moment of $y_t$ is given by the skewness

    $$ \frac{E[(y_t - \mu)^3]}{E[(y_t - \mu)^2]^{3/2}} = \frac{(\mu_1 - \mu_2)^3 (1 - 2\bar{\xi}_1)\bar{\xi}_1(1 - \bar{\xi}_1)}{\left[\sigma^2 + (\mu_1 - \mu_2)^2 \bar{\xi}_1(1 - \bar{\xi}_1)\right]^{3/2}}. $$

    If the regime $i$ with the highest conditional mean $\mu_i > \mu_j$ is less likely than the other regime, $\bar{\xi}_i < \bar{\xi}_j$, then the observed variable is more likely to be far above the mean than it is to be far below the mean.


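    Furthermore, the normalized fourth moment of $y_t$ is given by the excess kurtosis

    $$ \frac{E[(y_t - \mu)^4]}{E[(y_t - \mu)^2]^2} - 3 = \frac{(\mu_1 - \mu_2)^4 \bar{\xi}_1(1 - \bar{\xi}_1)\left[1 - 6\bar{\xi}_1(1 - \bar{\xi}_1)\right]}{\left[\sigma^2 + (\mu_1 - \mu_2)^2 \bar{\xi}_1\bar{\xi}_2\right]^2}. $$

    The factor $1 - 6\bar{\xi}_1(1 - \bar{\xi}_1)$ follows from $E[(\xi_{1t} - \bar{\xi}_1)^4] - 3E[(\xi_{1t} - \bar{\xi}_1)^2]^2 = \bar{\xi}_1\bar{\xi}_2(1 - 6\bar{\xi}_1\bar{\xi}_2)$. Since $\max_{\bar{\xi}_1 \in [0,1]}\{\bar{\xi}_1(1 - \bar{\xi}_1)\} = \frac{1}{4} > \frac{1}{6}$, the sign of the excess kurtosis depends on the regime probabilities. It is positive, i.e. the distribution of $y_t$ has more mass in the tails than a Gaussian distribution with the same variance, if $\bar{\xi}_1(1 - \bar{\xi}_1) < \frac{1}{6}$, that is, if one regime is sufficiently rare; it is negative if the two regimes are nearly equally likely.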
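The moments in Example 3 can be checked against the exact central moments of a two-component normal mixture with common variance. The sketch below is an illustration under assumed parameter values; the function names are hypothetical. It uses the standard moments of a shifted normal, $E[(d+u)^3] = d^3 + 3d\sigma^2$ and $E[(d+u)^4] = d^4 + 6d^2\sigma^2 + 3\sigma^4$.

```python
def msm_moments(p1, mu1, mu2, sigma):
    """Exact skewness and excess kurtosis of y_t in an MSM(2)-AR(0) model.

    p1 is the ergodic probability of regime 1, mu1 and mu2 are the regime
    means, sigma is the common innovation standard deviation.
    """
    p2 = 1.0 - p1
    mean = p1 * mu1 + p2 * mu2
    s2 = sigma ** 2
    components = [(p1, mu1 - mean), (p2, mu2 - mean)]
    # central moments of the mixture, component by component
    m2 = sum(p * (d ** 2 + s2) for p, d in components)
    m3 = sum(p * (d ** 3 + 3.0 * d * s2) for p, d in components)
    m4 = sum(p * (d ** 4 + 6.0 * d ** 2 * s2 + 3.0 * s2 ** 2) for p, d in components)
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def skewness_closed_form(p1, mu1, mu2, sigma):
    """Skewness formula stated in Example 3."""
    q = p1 * (1.0 - p1)
    d = mu1 - mu2
    return d ** 3 * (1.0 - 2.0 * p1) * q / (sigma ** 2 + d ** 2 * q) ** 1.5


skew, excess_kurtosis = msm_moments(0.2, 3.0, 0.0, 1.0)
print(skew, skewness_closed_form(0.2, 3.0, 0.0, 1.0), excess_kurtosis)
```

In this illustration the high-mean regime is the rare one, so the computed skewness is positive, in line with the interpretation given in the example.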

    The combination of regime switching means and variances in an MSIH(2)-AR(0) process (cf. Example 4) is given in SOLA AND TIMMERMANN [1995]. The implications for option pricing are discussed in KÄHLER AND MARNET [1994b]. For an MSMH(2)-AR(4) model, the conditional variance of the one-step prediction error is given by SCHWERT [1989] and PAGAN AND SCHWERT [1990].

    1.4.2 Regime-dependent Variances and Conditional Heteroskedasticity

    An MS($M$)-AR($p$) process is called conditional heteroskedastic if the conditional variance of the prediction error $y_t - E[y_t|Y_{t-1}]$,

    $$ \mathrm{Var}[y_t|Y_{t-1}] = E\left[(y_t - E[y_t|Y_{t-1}])^2 \,|\, Y_{t-1}\right], $$

    is a function of the information set $Y_{t-1}$. Conditional heteroskedasticity can be induced by regime-dependent variances, autoregressive parameters or means.

    In MS-AR models with regime-invariant autoregressive parameters, conditional heteroskedasticity implies that the conditional variance of the prediction error $y_t - E[y_t|Y_{t-1}]$ is a function of the filtered regime vector $\hat{\xi}_{t-1|t-1}$. In general, an MS-AR process is called regime-conditional heteroskedastic if

    $$ \mathrm{Var}[y_t|\xi_{t-1}, Y_{t-1}] = E\left[(y_t - E[y_t|\xi_{t-1}, Y_{t-1}])^2 \,|\, \xi_{t-1}, Y_{t-1}\right] $$

    is a function of $\xi_{t-1}$. Interestingly, regime-dependent variances are neither necessary nor sufficient for conditional heteroskedasticity. As stated in Chapter 3, a necessary and sufficient condition for conditional heteroskedasticity in MS-VAR models with regime-invariant autoregressive parameters is the serial dependence of regimes.

    On the other hand, even if the white noise process $u_t$ is homoskedastic, $\sigma^2(s_t) = \sigma^2$, the observed process $y_t$ can be heteroskedastic. Consider the following example:


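    Example 4   Let $y_t$ be an MSI(2)-AR(0) process

    $$ y_t - \mu = (\mu_1 - \mu) I(s_t = 1) + (\mu_2 - \mu) I(s_t = 2) + u_t, $$

    with $u_t \sim \mathrm{NID}(0, \sigma^2)$ and serial correlation in the regimes according to the transition matrix $P$. Employing the ergodic regime probability $\bar{\xi}_1$, $y_t$ can be written as

    $$ y_t - \mu = (\mu_1 - \mu_2)(\xi_{1t} - \bar{\xi}_1) + u_t. $$

    Thus $E[y_t|Y_{t-1}] = \mu + (\mu_1 - \mu_2)(\hat{\xi}_{1t|t-1} - \bar{\xi}_1)$ and

    $$
    \begin{aligned}
    \mathrm{Var}[y_t|Y_{t-1}] &= \sigma^2 + (\mu_1 - \mu_2)^2 E\left[(\xi_{1t} - \hat{\xi}_{1t|t-1})^2 \,|\, Y_{t-1}\right] \\
    &= \sigma^2 + (\mu_1 - \mu_2)^2 \left[\hat{\xi}_{1t|t-1}(1 - \hat{\xi}_{1t|t-1})^2 + (1 - \hat{\xi}_{1t|t-1})(-\hat{\xi}_{1t|t-1})^2\right] \\
    &= \sigma^2 + (\mu_1 - \mu_2)^2 \hat{\xi}_{1t|t-1}(1 - \hat{\xi}_{1t|t-1}),
    \end{aligned}
    $$

    where $\hat{\xi}_{1t|t-1} = p_{11}\hat{\xi}_{1t-1|t-1} + p_{21}(1 - \hat{\xi}_{1t-1|t-1}) = (p_{11} + p_{22} - 1)\hat{\xi}_{1t-1|t-1} + (1 - p_{22})$ is the predicted regime probability $\Pr(s_t = 1|Y_{t-1})$. Thus $\{y_t\}$ is a regime-conditional heteroskedastic process.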
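The dependence of the conditional variance on the filtered regime probability can be illustrated with a small sketch. The function names and the parameter values are assumptions chosen for illustration; the formulas are those of Example 4.

```python
def predicted_prob(p11, p22, filtered_prev):
    """Predicted probability Pr(s_t = 1 | Y_{t-1}) from the filtered one."""
    return p11 * filtered_prev + (1.0 - p22) * (1.0 - filtered_prev)


def conditional_variance(mu1, mu2, sigma, p11, p22, filtered_prev):
    """Var[y_t | Y_{t-1}] of the MSI(2)-AR(0) process in Example 4."""
    pred = predicted_prob(p11, p22, filtered_prev)
    return sigma ** 2 + (mu1 - mu2) ** 2 * pred * (1.0 - pred)


# Persistent regimes: the variance moves with the filtered probability
for filtered in (0.0, 0.5, 1.0):
    print(filtered, conditional_variance(1.0, -1.0, 1.0, 0.9, 0.8, filtered))

# Serially independent regimes (p11 = 1 - p22): the variance is constant
for filtered in (0.0, 0.5, 1.0):
    print(filtered, conditional_variance(1.0, -1.0, 1.0, 0.3, 0.7, filtered))
```

The second loop reflects the statement that serial dependence of the regimes is needed for conditional heteroskedasticity when the autoregressive parameters are regime-invariant.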

    In contrast to ARCH models, the conditional variance in MS-VAR models (with

    time-invariant autoregressive parameters) is a non-linear function of past squared

    errors since the predicted regime probabilities generally are non-linear functions of 

    Y t−1.

    Recently some approaches have been made to consider Markovian regime shifts in variance generating processes. The class of autoregressive conditional heteroskedastic processes introduced by ENGLE [1982] is used to formulate the conditional process; our assumption of an i.i.d. error term is replaced by an ARCH process $u_t$, cf. inter alia HAMILTON AND LIN [1994], HAMILTON AND SUSMEL [1994], CAI [1994] and HALL AND SOLA [1993b]. ARCH effects can be generated by MSA-AR processes, which will be considered in the next section.

    1.4.3 Regime-dependent Autoregressive Parameters: ARCH and Stochastic Unit Roots

    Autoregressive conditional heteroskedasticity is known from random coefficient

    models. Therefore it is not very surprising that also MSA-VAR models may lead to


    ARCH. This effect will be considered in the following simple example.

    Example 5   Let $y_t$ be generated by an MSA(2)-MAR(1) process with i.i.d. regimes:

    $$ (y_t - \mu) = \alpha(s_t)(y_{t-1} - \mu) + u_t, \qquad u_t \sim \mathrm{NID}(0, \sigma^2). $$

    Serial independence of the regimes implies $p_{11} = 1 - p_{22} = \rho$; the regime-dependent autoregressive parameters $\alpha_1, \alpha_2$ are restricted such that $E[\alpha] = \alpha_1\rho + \alpha_2(1 - \rho) = 0$. Thus it can be shown that

    $$
    \begin{aligned}
    E[y_t|Y_{t-1}] &= \mu + (\alpha_1\rho + \alpha_2(1 - \rho))(y_{t-1} - \mu) = \mu, \\
    E[(y_t - \mu)^2|Y_{t-1}] &= \sigma^2 + (\alpha_1^2\rho + \alpha_2^2(1 - \rho))(y_{t-1} - \mu)^2.
    \end{aligned}
    $$

    Then $y_t$ possesses an ARCH representation $y_t = \mu + e_t$ with

    $$ e_t^2 = \sigma^2 + \gamma e_{t-1}^2 + \varepsilon_t, $$

    where $\gamma = -\alpha_1\alpha_2 > 0$ and $\varepsilon_t$ is white noise. Thus, ARCH(1) models can be interpreted as restricted MSA(2)-AR(1) models.
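The step from the coefficient of $(y_{t-1} - \mu)^2$ in the conditional second moment to $\gamma = -\alpha_1\alpha_2$ is not spelled out in the example. A short derivation, using only the restriction $\alpha_1\rho = -\alpha_2(1 - \rho)$, might read:

```latex
\begin{align*}
\gamma &= \alpha_1^2\rho + \alpha_2^2(1-\rho) \\
       &= \alpha_1\,(\alpha_1\rho) + \alpha_2\,\bigl(\alpha_2(1-\rho)\bigr) \\
       &= -\alpha_1\alpha_2(1-\rho) - \alpha_1\alpha_2\rho
          && \text{since } \alpha_1\rho = -\alpha_2(1-\rho) \\
       &= -\alpha_1\alpha_2 .
\end{align*}
```

For $0 < \rho < 1$ the restriction forces $\alpha_1$ and $\alpha_2$ to have opposite signs unless both are zero, which is why $\gamma$ is positive.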

    The theoretical foundations of MSA-VAR processes are laid in TJØSTHEIM [1986b]. Some independent theoretical results are provided by BRANDT [1986]. As pointed out by TJØSTHEIM [1986b], the dynamic properties of models with regime-dependent autoregressive parameters are quite complicated. Especially, if the process is stationary for some regimes and mildly explosive for others, the problems of stochastic unit root processes as introduced by GRANGER AND SWANSON [1994] are involved.¹⁰

    It is worth noting that the stability of each VAR sub-model and the ergodicity of the Markov chain are sufficient stability conditions; they are however not necessary to establish stability. Thus, the stability of MSA-AR models can be compatible with AR polynomials containing in some regimes roots greater than unity in absolute value and less than unity in others. Necessary and sufficient conditions for the stability of stochastic processes such as the MSA-VAR model have been derived in KARLSEN [1990a], [1990b]. However, in practice their application has been found to be rather complicated (cf. HOLST et al. [1994]).

    ¹⁰ Models where the regime is switching between deterministic and stochastic trends are considered by MCCULLOCH AND TSAY [1994a].

    In this study we will concentrate our analysis on modelling shifts in the (conditional)

    mean and the variance of VAR processes which simplifies the analysis.

    1.5 Conclusion and Outlook

    In the preceding discussion of this chapter MS($M$)-VAR($p$) processes have been introduced as doubly stochastic processes where the conditional stochastic process is a Gaussian VAR($p$) and the regime generating process is a Markov chain. As we have seen in the discussion of the relationship of the MS-VAR model to other non-linear models, the MS-VAR model can encompass many other time series models proposed in the literature or replicate at least some of their features. In the following chapter these considerations are formalized to state-space representations of MS-VAR models where the measurement equation corresponds to the conditional stochastic process and the transition equation reflects the regime generating process. In Section 2.5 the MS-VAR model will be compared to time-varying coefficient models with smooth variations in the parameters, i.e. an infinite number of regimes.


    1.A Appendix: A Note on the Relation of SETAR to MS-AR Processes

    While the presumptions of the SETAR and the MS-AR model seem to be quite different, the relation between both model alternatives is rather close. Indeed, both models can be observationally equivalent, as the following example demonstrates:

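    Example 6   Consider the SETAR model

    $$ y_t = \mu_2 + (\mu_1 - \mu_2) I(y_{t-d} \leq r) + u_t, \qquad u_t \sim \mathrm{NID}(0, \sigma^2). \tag{1.19} $$

    For $d = 1$ it has been shown by CARRASCO [1994, lemma 2.2] that (1.19) is a particular case of the Markov-switching model

    $$ y_t = \mu_2 + (\mu_1 - \mu_2) I(s_t = 1) + u_t, \qquad u_t \sim \mathrm{NID}(0, \sigma^2), $$

    which is an MSI(2)-AR(0) model. For an unknown $r$, define the unobserved regime variable $s_t$ as the binary variable

    $$ s_t = \begin{cases} 1 & \text{if } y_{t-1} \leq r, \\ 2 & \text{if } y_{t-1} > r, \end{cases} $$

    such that

    $$
    \begin{aligned}
    \Pr(s_t = 1|s_{t-1}, Y) &= \Pr(y_{t-1} \leq r|s_{t-1}, Y) \\
    &= \Pr(\mu_2 + (\mu_1 - \mu_2) I(s_{t-1} = 1) + u_{t-1} \leq r) \\
    &= \Pr(u_{t-1} \leq r - \mu_2 - (\mu_1 - \mu_2) I(s_{t-1} = 1)) \\
    &= \Phi\!\left(\frac{r - \mu_2 - (\mu_1 - \mu_2) I(s_{t-1} = 1)}{\sigma}\right) \\
    &= \Pr(s_t = 1|s_{t-1}).
    \end{aligned}
    $$

    Hence $s_t$ follows a first order Markov process where the transition matrix is defined as

    $$
    P = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}
    = \begin{bmatrix} \Phi\!\left(\frac{r - \mu_1}{\sigma}\right) & \Phi\!\left(\frac{\mu_1 - r}{\sigma}\right) \\ \Phi\!\left(\frac{r - \mu_2}{\sigma}\right) & \Phi\!\left(\frac{\mu_2 - r}{\sigma}\right) \end{bmatrix}.
    $$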
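The transition matrix implied by the SETAR parameters is easy to evaluate. The sketch below uses the error function from the standard library for the normal distribution function; the function names and the numerical values are illustrative assumptions.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def setar_transition_matrix(mu1, mu2, r, sigma):
    """Transition matrix of the regime variable in Example 6 (d = 1).

    Row i holds Pr(s_t = 1 | s_{t-1} = i) and Pr(s_t = 2 | s_{t-1} = i).
    """
    p11 = std_normal_cdf((r - mu1) / sigma)
    p21 = std_normal_cdf((r - mu2) / sigma)
    return [[p11, std_normal_cdf((mu1 - r) / sigma)],
            [p21, std_normal_cdf((mu2 - r) / sigma)]]


# Regime 1 (y below the threshold) has the lower mean, so both regimes persist
print(setar_transition_matrix(mu1=-1.0, mu2=1.0, r=0.0, sigma=1.0))
```

Because $\Phi(-x) = 1 - \Phi(x)$, each row sums to one, as required of a transition matrix.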


    If $d > 1$, the data can be considered as generated by $d$ independent series which are each particular Markov processes. A proof can be based on the property $\Pr(s_t|\{s_{t-j}\}_{j=1}^{\infty}, Y_T) = \Pr(s_t|s_{t-2}, Y_T)$; thus $s_t$ follows a second order Markov chain, which can be reparametrized as a higher dimensional first order Markov chain.


    Chapter 2 

    The State-Space Representation

    In the following chapters we will be concerned with the statistical analysis of MS($M$)-VAR($p$) models. As a formal framework for these investigations we employ the state-space model, which has proven useful for the study of time series with unobservable states. In order to motivate the introduction of state-space representations for MS($M$)-VAR($p$) models it might be helpful to sketch their use for the three main tasks of statistical inference:

    1.   Filtering & smoothing of regime probabilities: Given the conditional density function $p(y_t|Y_{t-1}, \xi_t)$, the discrete Markovian chain as regime generating process $\xi_t$, and some assumptions about the initial state $Y_0 = (y_0', \ldots, y_{1-p}')'$ of the observed variables and the unobservable initial state $\xi_0$ of the Markov chain, the complete density function $p(\xi, Y)$ is specified. The statistical tools to provide inference for $\xi_t$ given a specified observation set $Y_\tau$, $\tau \leq T$, are the filter and smoother recursions which reconstruct the time path of the regime, $\{\xi_t\}_{t=1}^{T}$, under alternative information sets:

         $\hat{\xi}_{t|\tau}$, $\tau < t$: predicted regime probabilities,
         $\hat{\xi}_{t|\tau}$, $\tau = t$: filtered regime probabilities,
         $\hat{\xi}_{t|\tau}$, $t < \tau \leq T$: smoothed regime probabilities.

         In the following, mainly the filtered regime probabilities, $\hat{\xi}_{t|t}$, and full-sample smoothed regime probabilities, $\hat{\xi}_{t|T}$, are considered. See Chapter 5.

    2.   Parameter estimation & testing: If the parameters of the model are unknown, classical Maximum Likelihood as well as Bayesian estimation methods are feasible. Here, the filter and smoother recursions provide the analytical tool to construct and evaluate the likelihood function. See Chapters 6 to 9.

    3.   Forecasting:   Given the state-space form, prediction of the system is a

    straightforward task. See Chapter 4  and Section 8.5 .
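The first two tasks can be illustrated with a minimal forward recursion. The sketch below assumes an MSI(M)-AR(0) model with a common innovation standard deviation and combines the prediction step $\hat{\xi}_{t|t-1} = P'\hat{\xi}_{t-1|t-1}$ with Bayes' rule and the density (1.15); the function names, the argument `pi0` for the assumed distribution of $\xi_0$ and the numerical values are illustrative. The full treatment of filtering and smoothing is the subject of Chapter 5.

```python
import math


def normal_pdf(x, mean, sigma):
    """Density of N(mean, sigma^2) evaluated at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def regime_filter(y, mu, sigma, P, pi0):
    """Predicted and filtered regime probabilities and the log-likelihood.

    y   : observations y_1, ..., y_T
    mu  : regime-dependent means (hypothetical values)
    P   : transition matrix, P[i][j] = Pr(s_t = j | s_{t-1} = i)
    pi0 : assumed distribution of the initial regime s_0
    """
    M = len(mu)
    previous = list(pi0)
    predicted, filtered = [], []
    loglik = 0.0
    for obs in y:
        # prediction step: apply the transposed transition matrix
        pred = [sum(P[i][j] * previous[i] for i in range(M)) for j in range(M)]
        # vector of regime-conditional densities of the observation
        eta = [normal_pdf(obs, mu[m], sigma) for m in range(M)]
        # p(y_t | Y_{t-1}) as in (1.15)
        density = sum(eta[m] * pred[m] for m in range(M))
        # Bayes' rule gives the filtered probabilities
        filt = [eta[m] * pred[m] / density for m in range(M)]
        predicted.append(pred)
        filtered.append(filt)
        loglik += math.log(density)
        previous = filt
    return predicted, filtered, loglik


predicted, filtered, loglik = regime_filter(
    y=[0.1, 1.9, 2.2, -0.3],
    mu=[0.0, 2.0],
    sigma=1.0,
    P=[[0.9, 0.1], [0.2, 0.8]],
    pi0=[0.5, 0.5],
)
print(filtered, loglik)
```

The log-likelihood returned as a by-product is the quantity that a maximum likelihood routine would maximize over the parameters.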

    The framework for the statistical analysis of MS-VAR models to be presented in the next chapters is the state-space form. The advantage of viewing MS-VAR models in this way is that general concepts can be introduced, such as the likelihood principle (Chapter 6) and a recursive filter algorithm (Chapter 5) which corresponds to the Kalman filter in Gaussian state-space models.

    For particular MS-VAR processes, a state-space representation with $\xi_t$ as the state vector has been introduced by HAMILTON [1994a].¹ In the following section we investigate some state-space representations of MS-VAR models. These representations are then used to work out general properties of MS-VAR processes: inter alia, we discuss the non-normality of the state-space form, we formulate conditions for the linearity of the state-space representation and we show that the joint process of observed variables and regimes, $(y_t, \xi_t)$, is Markovian. In Section 2.2 the specification of the state-space representation is discussed with regard to its adaptation to the particular MS-VAR models proposed in Chapter 1. In the remaining sections, three alternative state-space representations of MS-VAR processes are introduced which will create new insights into the theory of MS-VAR processes and will be used later on. In Section 2.3 the adding-up restriction on the state vector is eliminated by reducing its dimension. Section 2.4 formulates the state-space representation in the predicted state vector. Section 2.5 presents a state-space form in the vector of VAR coefficients which allows a comparison with other time-varying coefficient models.

    2.1 A Dynamic Linear State-Space Representation for MS-VAR Processes

    The state-space model given in   Table 2.1   consists of the set of   measurement 

    and  transition  equations. The measurement equation (2.1) describes the relation

    ¹ HAMILTON [1994a] considers MSIA($M$)-AR($p$) and MSM($M$)-AR($p$) models. A similar approach is taken in HALL AND SOLA [1993a], HALL AND SOLA [1993b] and FUNKE et al. [1994].


    2.1. A Dynamic Linear State-Space Representation for MS-V