System Identification
Torsten Soderstrom and Petre Stoica

Foreword to the 2001 edition
Since its publication in 1989, System Identification has become a standard reference source for researchers and a frequently used textbook for classroom and self-study. The book also appeared in paperback in 1994, and a Polish translation was published in 1997.
The original publisher has now declared the book out of print. Because we have received very positive feedback from many colleagues who miss the book, we have decided to arrange for a reprinting, and here it is.
We have chosen to let the text appear in the same form as the 1989 version. Over the years we have found only a few typing errors; they are listed below. We hope that you, our readers, will enjoy the book.
Uppsala, August 2001
1. p 12, eq. (2.7) should read

   ĥ(k) = [ (1/N) Σ_{t=k+1}^{N} y(t) u(t−k) ] / [ (1/N) Σ_{t=1}^{N} u²(t) ]

   (see the code sketch after this list)
2. p 113, line 9 from the bottom, should read 'even if u(t) is not'.
3. p 248. In (C7.6.4) change '=' to '} ='.
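The corrected estimate in item 1 is easy to verify numerically. Below is a minimal Python sketch (ours, not from the book; the function name and the assumption of a scalar, zero-mean input are ours) of the correlation-analysis impulse response estimate of eq. (2.7).

```python
import numpy as np

def impulse_response_estimate(y, u, kmax):
    """Correlation-analysis estimate h(k) per the corrected eq. (2.7):
    sample cross-covariance of y and u over the sample variance of u."""
    y, u = np.asarray(y, dtype=float), np.asarray(u, dtype=float)
    N = len(u)
    denom = np.sum(u ** 2) / N            # (1/N) sum_{t=1}^{N} u(t)^2
    h = np.empty(kmax + 1)
    for k in range(kmax + 1):
        # (1/N) sum_{t=k+1}^{N} y(t) u(t-k), written in 0-based indexing
        h[k] = np.sum(y[k:] * u[: N - k]) / N
    return h / denom
```

With a white-noise input u(t), this estimate converges to the impulse response of the system as the number of data points N grows.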
SYSTEM IDENTIFICATION
M.J. Grimble, Series Editor
BANKS, S.P., Mathematical Theories of Nonlinear Systems
BENNETT, S., Real-time Computer Control: an introduction
CEGRELL, T., Power Systems Control
COOK, P.A., Nonlinear Dynamical Systems
LUNZE, J., Robust Multivariable Feedback Control
PATTON, R., CLARK, R.N., FRANK, P.M. (editors), Fault Diagnosis in Dynamic Systems
SODERSTROM, T., STOICA, P., System Identification
WARWICK, K., Control Systems: an introduction
SYSTEM IDENTIFICATION
TORSTEN SODERSTROM
Automatic Control and Systems Analysis Group, Department of Technology, Uppsala University
Uppsala, Sweden
PETRE STOICA
Department of Automatic Control, Polytechnic Institute of Bucharest
Bucharest, Romania
First published 1989 by Prentice Hall International (UK) Ltd,
66 Wood Lane End, Hemel Hempstead, Hertfordshire, HP2 4RG
A division of Simon & Schuster International Group
© 1989 Prentice Hall International (UK) Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without the prior permission, in writing, from the publisher. For permission within the United
States of America contact Prentice Hall Inc., Englewood Cliffs, NJ 07632.
Printed and bound in Great Britain at the University Press, Cambridge.
Library of Congress Cataloging-in-Publication Data
Soderstrom, Torsten.
System identification / Torsten Soderstrom and Petre Stoica.
p. cm. - (Prentice Hall international series in systems and control engineering)
Bibliography: p.
Includes indexes.
ISBN 0-13-881236-5
1. System identification. I. Stoica, P. (Petre), 1949- . II. Title. III. Series.
QA402.S8933 1988 003 - dc19 87-29265
1 2 3 4 5 93 92 91 90 89
ISBN 0-13-881236-5
CONTENTS
1 INTRODUCTION
2 INTRODUCTORY EXAMPLES
2.1 The concepts 𝒮, ℳ, 𝒥, 𝒳
2.2 A basic example
2.3 Nonparametric methods
2.4 A parametric method
2.5 Bias, consistency and model approximation
2.6 A degenerate experimental condition
2.7 The influence of feedback
Summary and outlook
Problems
Bibliographical notes
3 NONPARAMETRIC METHODS
3.1 Introduction
3.2 Transient analysis
3.3 Frequency analysis
3.4 Correlation analysis
3.5 Spectral analysis
Summary
Problems
Bibliographical notes
Appendices
A3.1 Covariance functions, spectral densities and linear filtering
A3.2 Accuracy of correlation analysis
4 LINEAR REGRESSION
4.1 The least squares estimate
4.2 Analysis of the least squares estimate
4.3 The best linear unbiased estimate
4.4 Determining the model dimension
4.5 Computational aspects
Summary
Problems
Bibliographical notes
Complements
C4.1 Best linear unbiased estimation under linear constraints
C4.2 Updating the parameter estimates in linear regression models
C4.3 Best linear unbiased estimates for linear regression models with possibly singular residual covariance matrix
C4.4 Asymptotically best consistent estimation of certain nonlinear regression parameters
5 INPUT SIGNALS
5.1 Some commonly used input signals
5.2 Spectral characteristics
5.3 Lowpass filtering
5.4 Persistent excitation
Summary
Problems
Bibliographical notes
Appendix
A5.1 Spectral properties of periodic signals
Complements
C5.1 Difference equation models with persistently exciting inputs
C5.2 Condition number of the covariance matrix of filtered white noise
C5.3 Pseudorandom binary sequences of maximum length
6 MODEL PARAMETRIZATIONS
6.1 Model classifications
6.2 A general model structure
6.3 Uniqueness properties
6.4 Identifiability
Summary
Problems
Bibliographical notes
Appendix
A6.1 Spectral factorization
Complements
C6.1 Uniqueness of the full polynomial form model
C6.2 Uniqueness of the parametrization and the positive definiteness of the input-output covariance matrix
7 PREDICTION ERROR METHODS
7.1 The least squares method revisited
7.2 Description of prediction error methods
7.3 Optimal prediction
7.4 Relationships between prediction error methods and other identification methods
7.5 Theoretical analysis
7.6 Computational aspects
Summary
Problems
Bibliographical notes
Appendix
A7.1 Covariance matrix of PEM estimates for multivariable systems
Complements
C7.1 Approximation models depend on the loss function used in estimation
C7.2 Multistep prediction of ARMA processes
C7.3 Least squares estimation of the parameters of full polynomial form models
C7.4 The generalized least squares method
C7.5 The output error method
C7.6 Unimodality of the PEM loss function for ARMA processes
C7.7 Exact maximum likelihood estimation of AR and ARMA parameters
C7.8 ML estimation from noisy input-output data
8 INSTRUMENTAL VARIABLE METHODS
8.1 Description of instrumental variable methods
8.2 Theoretical analysis
8.3 Computational aspects
Summary
Problems
Bibliographical notes
Appendices
A8.1 Covariance matrix of IV estimates
A8.2 Comparison of optimal IV and prediction error estimates
Complements
C8.1 Yule-Walker equations
C8.2 The Levinson-Durbin algorithm
C8.3 A Levinson-type algorithm for solving nonsymmetric Yule-Walker systems of equations
C8.4 Min-max optimal IV method
C8.5 Optimally weighted extended IV method
C8.6 The Whittle-Wiggins-Robinson algorithm
9 RECURSIVE IDENTIFICATION METHODS
9.1 Introduction
9.2 The recursive least squares method
9.3 Real-time identification
9.4 The recursive instrumental variable method
9.5 The recursive prediction error method
9.6 Theoretical analysis
9.7 Practical aspects
Summary
Problems
Bibliographical notes
Complements
C9.1 The recursive extended instrumental variable method
C9.2 Fast least squares lattice algorithm for AR modeling
C9.3 Fast least squares lattice algorithm for multivariate regression models
10 IDENTIFICATION OF SYSTEMS OPERATING IN CLOSED LOOP
10.1 Introduction
10.2 Identifiability considerations
10.3 Direct identification
10.4 Indirect identification
10.5 Joint input-output identification
10.6 Accuracy aspects
Summary
Problems
Bibliographical notes
Appendix
A10.1 Analysis of the joint input-output identification
Complement
C10.1 Identifiability properties of the PEM applied to ARMAX systems operating under general linear feedback
11 MODEL VALIDATION AND MODEL STRUCTURE DETERMINATION
11.1 Introduction
11.2 Is a model flexible enough?
11.3 Is a model too complex?
11.4 The parsimony principle
11.5 Comparison of model structures
Summary
Problems
Bibliographical notes
Appendices
A11.1 Analysis of tests on covariance functions
A11.2 Asymptotic distribution of the relative decrease in the criterion function
Complement
C11.1 A general form of the parsimony principle
12 SOME PRACTICAL ASPECTS
12.1 Introduction
12.2 Design of the experimental condition 𝒳
12.3 Treating nonzero means and drifts in disturbances
12.4 Determination of the model structure ℳ
12.5 Time delays
12.6 Initial conditions
12.7 Choice of the identification method 𝒥
12.8 Local minima
12.9 Robustness
12.10 Model verification
12.11 Software aspects
12.12 Concluding remarks
Problems
Bibliographical notes
APPENDIX A SOME MATRIX RESULTS
A.1 Partitioned matrices
A.2 The least squares solution to linear equations, pseudoinverses and the singular value decomposition
A.3 The QR method
A.4 Matrix norms and numerical accuracy
A.5 Idempotent matrices
A.6 Sylvester matrices
A.7 Kronecker products
A.8 An optimization result for positive definite matrices
Bibliographical notes
APPENDIX B SOME RESULTS FROM PROBABILITY THEORY AND STATISTICS
B.1 Convergence of stochastic variables
B.2 The Gaussian and some related distributions
B.3 Maximum a posteriori and maximum likelihood parameter estimates
B.4 The Cramer-Rao lower bound
B.5 Minimum variance estimation
B.6 Conditional Gaussian distributions
B.7 The Kalman-Bucy filter
B.8 Asymptotic covariance matrices for sample correlation and covariance estimates
B.9 Accuracy of Monte Carlo analysis
Bibliographical notes
REFERENCES
ANSWERS AND FURTHER HINTS TO THE PROBLEMS
AUTHOR INDEX
SUBJECT INDEX
PREFACE AND ACKNOWLEDGMENTS
System identification is the field of mathematical modeling of systems from experimental data. It has acquired widespread applications in many areas. In control and systems engineering, system identification methods are used to obtain appropriate models for synthesis of a regulator, design of a prediction algorithm, or simulation. In signal processing applications (such as communications, geophysical engineering and mechanical engineering), models obtained by system identification are used for spectral analysis, fault detection, pattern recognition, adaptive filtering, linear prediction and other purposes. System identification techniques are also successfully used in nontechnical fields such as biology, environmental science and econometrics to develop models that increase scientific knowledge of the identified object, or for prediction and control.
This book is intended for senior undergraduate and graduate level courses on system identification. It provides the reader with a thorough understanding of the subject matter as well as the necessary background for performing research in the field. The book is primarily designed for classroom use but is equally well suited to self-study.
To reach its twofold goal of being both a basic and an advanced text on system identification, addressing both the student and the researcher, the book is organized as follows. Each chapter contains a main text that should fit the needs of graduate and advanced undergraduate courses. For most chapters, additional (often more detailed or more advanced) results are presented in extra sections called complements. In a short or undergraduate course many of the complements may be skipped; in other courses such material can be included at the instructor's discretion to provide a deeper treatment of specific methods or of algorithmic aspects of implementation. Throughout the book, important general results are set in solid boxes. In a few places, intermediate results that are essential to later developments are set in dashed boxes. More complicated derivations or calculations are placed in chapter appendices that immediately follow the chapter text. Several general background results from linear algebra, matrix theory, probability theory and statistics are collected in the general appendices A and B at the end of the book.
All chapters except the first include problems to serve as exercises for the reader. Some problems illustrate the results derived in the chapter and are rather simple, while others aim to give new results and insight and are often more complicated. The problem sections can thus provide appropriate homework exercises as well as challenges for more advanced readers. For each chapter, the simple problems are given before the more advanced ones. A separate solutions manual has been prepared which contains solutions to all the problems.
The book does not contain computer exercises. However, we find it very important that the students really apply some identification methods, preferably on real data. This will give a deeper understanding of the practical value of identification techniques that is hard to obtain from just reading a book. As we mention in Chapter 12, there are several good program packages available that are convenient to use.
Concerning the references in the text, our purpose has been to give some key references and hints for further reading. Any attempt to cover the whole range of references would be an enormous, and perhaps not particularly useful, task.
We assume that the reader has a background corresponding to at least a senior-level academic experience in electrical engineering. This would include a basic knowledge of introductory probability theory and statistical estimation, time series analysis (or stochastic processes in discrete time), and models for dynamic systems. However, in the text and the appendices we include many of the necessary background results.
The text has been used, in a preliminary form, in several different ways. These include regular graduate and undergraduate courses, intensive courses for graduate students and for people working in industry, as well as extra reading in graduate courses and independent studies. The text has been tested in these various ways at Uppsala University, the Polytechnic Institute of Bucharest, Lund Institute of Technology, the Royal Institute of Technology, Stockholm, Yale University, and INTEC, Santa Fe, Argentina. The experience gained has been very useful in preparing the final text.
In writing the text we have been helped in various ways by several persons, whom we would like to sincerely thank.
We acknowledge the influence on our research work of our colleagues Professor Karl Johan Astrom, Professor Pieter Eykhoff, Dr Ben Friedlander, Professor Lennart Ljung, Professor Arye Nehorai and Professor Mihai Tertisco who, directly or indirectly, have had a considerable impact on our writing.
The text has been read by a number of persons who have given many useful suggestions for improvements. In particular we would like to sincerely thank Professor Randy Moses, Professor Arye Nehorai, and Dr John Norton for many useful comments. We are also grateful to a number of students at Uppsala University, the Polytechnic Institute of Bucharest, INTEC at Santa Fe, and Yale University for several valuable proposals.
The first inspiration for writing this book is due to Dr Greg Meira, who invited the first author to give a short graduate course at INTEC, Santa Fe, in 1983. The material produced for that course has since then been extended and revised by us jointly before reaching its present form.
The preparation of the text has been a task extended over a considerable period of time. The often cumbersome job of typing and correcting the text has been done with patience and perseverance by Ylva Johansson, Ingrid Ringard, Maria Dahlin, Helena Jansson, Ann-Cristin Lundquist and Lis Timner. We are most grateful to them for their excellent work carried out over the years with great skill.
Several of the figures were originally prepared using the packages IDPAC (developed at Lund Institute of Technology) for some of the parameter estimations and BLAISE (developed at INRIA, France) for some of the general figures.
We have enjoyed the very pleasant collaboration with Prentice Hall International. We would like to thank Professor Mike Grimble, Andrew Binnie, Glen Murray and Ruth Freestone for their constant encouragement and support. Richard Shaw
deserves special thanks for the many useful comments made on the presentation. We acknowledge his help with gratitude.
Torsten Soderstrom, Uppsala
Petre Stoica, Bucharest
Notations
!!ii DTV,..4) E e(t) F(q-I) C(q-I) Ceq-I) H(q-I) f/(q-l) f I I" k;(t) log clI ,11(0) (min) JV JV(m, P) N n nu ny nO On O(x) p(xly)
i?l i?ln
yet) y(tlt - 1) z(t)
GLOSSARY
set of parameter vectors describing models with stable predictors set of models .11 describing the true system./ expectation operator white noise (a sequence of independent random variables) data prefilter transfer function operator estimated transfer function operator noise shaping filter estimated noise shaping filter identification method identity matrix (nln) identity matrix reflection coefficient natural logarithm model set, model structure model corresponding to the parameter vector 8 matrix dimension is m by n null space of a matrix normal (Gaussian) distribution of mean value m and covariance matrix P number of data points model order number of inputs number of outputs dimension of parameter vector (nln) matrix with zero elements O(x)/x is bounded when x -'!< 0 probability density function of x given y range (space) of a matrix Euclidean space true system transpose of the matrix A trace (of a matrix) time variable (integer-valued for discrete time models) input signal (vector of dimension nu) loss function a column vector formed by stacking the columns of the matrix A on top of each other experimental condition output signal (vector of dimension ny) optimal (one step) predictor vector of instrumental variables
xv
XVI
oCc) E(t, 8) 8 8 80
A ,/ A a <j>(w) <j>//( w) <j>y//( w) <pet) q)
len) 1.jJ(t) w
Glossary
gain sequence Kronecker delta (= 1 if s = t, else = 0) Dirac function prediction error corresponding to the parameter vector 8 parameter vector estimate of parameter vector true value of parameter vector covariance matrix of innovations variance of white noise forgetting factor variance or standard deviation of white noise spectral density spectra! density of the signal u(t) cross-spectral density between the signals yet) and u(t) vector formed by lagged input and output data regressor matrix X2 distribution with n degrees of freedom negative gradient of the prediction error E(t, 8) with respect to 8 angular frequency
Abbreviations

ABCE    asymptotically best consistent estimator
adj    adjoint (or adjugate) of a matrix, adj(A) ≜ A⁻¹ det A
AIC    Akaike's information criterion
AR    autoregressive
AR(n)    AR of order n
ARIMA    autoregressive integrated moving average
ARMA    autoregressive moving average
ARMA(n1, n2)    ARMA where AR and MA parts have order n1 and n2, respectively
ARMAX    autoregressive moving average with exogenous variables
ARX    autoregressive with exogenous variables
BLUE    best linear unbiased estimator
CARIMA    controlled autoregressive integrated moving average
cov    covariance matrix
dim    dimension
deg    degree
ELS    extended least squares
FIR    finite impulse response
FFT    fast Fourier transform
FPE    final prediction error
GLS    generalized least squares
iid    independent and identically distributed
IV    instrumental variables
LDA    Levinson-Durbin algorithm
LIP    linear in the parameters
LMS    least mean squares
LS    least squares
MA    moving average
MA(n)    MA of order n
MAP    maximum a posteriori
MFD    matrix fraction description
mgf    moment generating function
MIMO    multi input, multi output
ML    maximum likelihood
mse    mean square error
MVE    minimum variance estimator
ODE    ordinary differential equation
OEM    output error method
pdf    probability density function
pe    persistently exciting
PEM    prediction error method
PI    parameter identifiability
PLR    pseudolinear regression
PRBS    pseudorandom binary sequence
RIV    recursive instrumental variable
RLS    recursive least squares
RPEM    recursive prediction error method
SA    stochastic approximation
SI    system identifiability
SISO    single input, single output
SVD    singular value decomposition
var    variance
w.p.1    with probability one
w.r.t.    with respect to
WWRA    Whittle-Wiggins-Robinson algorithm
YW    Yule-Walker
Notational conventions

Q^{1/2}    matrix square root of a positive definite matrix Q: (Q^{1/2})ᵀ Q^{1/2} = Q
Q^{T/2}    [Q^{1/2}]ᵀ
‖x‖²_Q    xᵀQx, with Q a symmetric positive definite weighting matrix
[H(q⁻¹)]ᵀ, [φ(t)]ᵀ, [A⁻¹]ᵀ    transpose of the bracketed quantity
→ᵈ    convergence in distribution
A ≥ B    the difference matrix (A − B) is nonnegative definite (here A and B are nonnegative definite matrices)
A > B    the difference matrix (A − B) is positive definite
≜    defined as
:=    assignment operator
~    distributed as
⊗    Kronecker product
⊕    modulo 2 summation of binary variables; also direct sum of subspaces
V′    gradient of the loss function V
V″    Hessian (matrix of second order derivatives) of the loss function V
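As a quick sanity check of the square-root and weighted-norm conventions above, here is a short Python sketch (ours, not the book's). It uses the transposed Cholesky factor, which is one valid choice of Q^{1/2} under the convention (Q^{1/2})ᵀ Q^{1/2} = Q.

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])              # symmetric positive definite

# np.linalg.cholesky returns lower-triangular L with Q = L @ L.T,
# so Q_half = L.T satisfies Q_half.T @ Q_half = Q as required.
Q_half = np.linalg.cholesky(Q).T
assert np.allclose(Q_half.T @ Q_half, Q)

# Weighted norm ||x||_Q^2 = x^T Q x equals the squared 2-norm of Q_half @ x.
x = np.array([1.0, -2.0])
assert np.isclose(x @ Q @ x, np.linalg.norm(Q_half @ x) ** 2)
```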
EXAMPLES
1.1 A stirred tank
1.2 An industrial robot
1.3 Aircraft dynamics
1.4 Effect of a drug
1.5 Modeling a stirred tank

2.1 Transient analysis
2.2 Correlation analysis
2.3 A PRBS as input
2.4 A step function as input
2.5 Prediction accuracy
2.6 An impulse as input
2.7 A feedback signal as input
2.8 A feedback signal and an additional setpoint as input

3.1 Step response of a first-order system
3.2 Step response of a damped oscillator
3.3 Nonideal impulse response
3.4 Some lag windows
3.5 Effect of lag window on frequency resolution

4.1 A polynomial trend
4.2 A weighted sum of exponentials
4.3 Truncated weighting function
4.4 Estimation of a constant
4.5 Estimation of a constant (continued from Example 4.4)
4.6 Sensitivity of the normal equations

5.1 A step function
5.2 A pseudorandom binary sequence
5.3 An autoregressive moving average sequence
5.4 A…