
Logistic Regression: A Self-Learning Text, Third Edition by David G. Kleinbaum, Mitchel Klein

International Statistical Review (2011), 79, 2, 272–301 doi:10.1111/j.1751-5823.2011.00149.x

Short Book Reviews
Editor: Simo Puntanen

Bayesian Decision Analysis: Principles and Practice
Jim Q. Smith
Cambridge University Press, 2010, ix + 338 pages, £35.00/$65.00, hardcover
ISBN: 978-0-521-76454-4

Table of contents

Part I. Foundations of Decision Modeling
1. Introduction
2. Explanations of processes and trees
3. Utilities and rewards
4. Subjective probability and its elicitation
5. Bayesian inference for decision analysis
Part II. Multi-Dimensional Decision Modeling
6. Multiattribute utility theory
7. Bayesian networks
8. Graphs, decisions and causality
9. Multidimensional learning
10. Conclusions

Readership: Statistics graduate students and practitioners in Bayesian decision analysis and Bayesian networks.

In 1989, I wrote a JASA review about Decision Analysis: A Bayesian Approach, by J. Q. Smith. In retrospect, this early review was far too critical! While acknowledging that the book developed “concepts not usually dealt with in Bayesian classics”, I bemoaned the lack of connections with classical Bayesian decision theory, as exemplified by Berger (1985), and missing entries. While I remain attached to the approach adopted in Berger’s book, I now see much more clearly the point made in Smith’s 1989 book.

If we now consider Bayesian Decision Analysis, the book somehow covers the same ground of Bayesian decision analysis, as opposed to Bayesian inference, but at a deeper and more mature level. Jim Smith has been involved quite a lot in consulting experiences, in particular in connection with nuclear energy (hence the link on the unusual cover), and the expertise he gained from such experiences shows throughout the book. It mostly skips the traditional Bayesian inference with its use of parametrized models. Hence a logical lack of entry on computational aspects and on hierarchical models, except for Chapter 9, for Jim Smith considers tree models to be mostly superior to the latter, both in terms of versatility and of symmetries. Before moving to a brief description of the chapters, let me stress that the design and the printing of the book are both of the highest quality, numerous tree graphs appearing seamlessly at the right place [making captions superfluous], different fonts making parts more coherent, and so on. I spotted very few typos and I must only mention the one massacring Maurice Allais’ name into Allias: it looks as if the file was recomposed by CUP, as otherwise a typo turning a β into a 3 (page 77) would not make sense. (I must also point out that my own book is entitled The Bayesian Choice, not The Bayesian Case!)

© 2011 The Author. International Statistical Review © 2011 International Statistical Institute. Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.

The introduction of Bayesian Decision Analysis is very good, if only because it avoids jumping into a mathematization of the issues by sticking to a few coherent if classic examples. It stresses the fundamental difference with Bayesian inference from Section 1.0.2, namely that “Bayesian decision analysis is focused on solving a given problem.” The second chapter is a wonderful entry on trees, making their construction and the resulting optimal decision quite intuitive. This chapter also reminded me of the very enjoyable Raiffa (1968). Chapter 3 on utilities and rewards feels more traditional, in the spirit of DeGroot (1970), with a well-argued introduction of loss functions via a system of rational axioms. The following chapter on subjective probability and its elicitation actually steps away from classical textbooks by focusing on the finite universes covered by decision trees (an opportunity to point out the very nice distinction between analyst, decision maker, expert and auditor). The final chapter of the first part, on Bayesian inference, is maybe less necessary, even though I appreciate the part about mixtures, as well as the final section on the role of Bayesian inference in decision analysis, including counterfactuals.

The second part starts with a truly interesting chapter about multiple attribute utility theory, including an almost real-life Chernobyl illustration. The most developed case is obviously the additive type of utility function, but this seems almost unavoidable in real-life settings. Chapter 7 covers DAGs in a Lauritzen (1996) way, but also the elicitation of a Bayesian network in an almost-practical way (using a pipeline case as a reference example).

The next chapter is about influence diagrams and causality, that is, when prior modelling meets utility, connecting with earlier books by Shafer (1996) and Pearl (1988). Chapter 9 on multidimensional learning covers inference on probabilities in Bayesian networks, while the final chapter very nicely and honestly summarises the strengths and difficulties of Bayesian decision analysis. I thus hope it is obvious that I strongly recommend reading the book to all involved at any level of decision management! Or teaching it.

Christian P. Robert: [email protected]
Université Paris-Dauphine, Bureau B638
Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France

References

Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd ed. New York: Springer.
DeGroot, M. (1970). Optimal Statistical Decisions. New York: McGraw-Hill.
Lauritzen, S. (1996). Graphical Models. Oxford: Oxford University Press.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.
Raiffa, H. (1968). Decision Analysis: Introductory Lectures on Choices under Uncertainty. Reading, MA: Addison-Wesley.
Shafer, G. (1996). The Art of Causal Conjecture. Cambridge, MA: The MIT Press.

The Cambridge Dictionary of Statistics, Fourth Edition
B. S. Everitt, A. Skrondal
Cambridge University Press, 2010, ix + 468 pages, £40.00/$59.00, hardcover
ISBN: 978-0-521-76699-9

Readership: Those seeking the meaning of statistical terms.

The first item I looked up was response surface. The half-page account was fine, but the reference given was a 1987 book by Box and Draper, rather than the 2007 second edition; also, one of my initials was wrong. Thence to Normal distribution: a clear 5-line entry referred to STD Chapter 29; this brings one back to page ix, where the 3rd reference (of 9) is a book by Evans, Hastings and Peacock (2000). Its title Statistical Distributions gives rise to the STD short form. The 9 books, each with a three-symbol letter/number tag, provide a widened framework for those who want to see more than the basic dictionary entry. The publication dates of these nine backstops range from 1989 to 2003.

The entry “star plot” on p. 410 was interesting; however, the “four” in the last line should be “forty”. I stumbled upon the O. J. Simpson paradox on p. 310, under O. There is also a paragraph on Simpson’s paradox on pp. 394–395, under S. Both entries are clear, but neither has a cross-reference to avert possible confusion.
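Since the dictionary offers Simpson’s paradox twice, it is worth pinning the paradox down numerically. A minimal Python check (using the classic kidney-stone teaching counts, not anything taken from the dictionary itself) shows a treatment winning in every subgroup while losing in the aggregate:

```python
# Simpson's paradox in a few lines of arithmetic: treatment A has the higher
# success rate within each stone-size group, yet the lower rate overall.
# Counts are the classic kidney-stone teaching example: (successes, trials).
A = {"small": (81, 87), "large": (192, 263)}
B = {"small": (234, 270), "large": (55, 80)}

rate = lambda s, n: s / n

for size in ("small", "large"):
    assert rate(*A[size]) > rate(*B[size])   # A wins in each subgroup...

total = lambda d: tuple(map(sum, zip(*d.values())))
assert rate(*total(A)) < rate(*total(B))     # ...but loses in the aggregate
```

The reversal is driven by the unequal group sizes (treatment A is assigned mostly the hard cases), which is exactly the confusion a cross-reference between the two entries would help avert.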

The majority of technical entries in the book are very well described. There are also some mini-biographies of those who have died; R. A. Fisher has 18 lines, while R. C. Bose, F. N. David and H. Hotelling have 10 each, for example. Overall, I felt that the revision was well done, but there were some minor limitations that might affect the usefulness of this work.

Norman R. Draper: [email protected]
Department of Statistics, University of Wisconsin – Madison

1300 University Avenue, Madison, WI 53706–1532, USA

Statistical Methods for Disease Clustering
Toshiro Tango
Springer, 2010, x + 247 pages, €69.95/£59.99/$79.95, hardcover
ISBN: 978-1-4419-1571-9

Table of Contents

1. Introduction
2. Clustering and clusters
3. Disease mapping: visualization of spatial clustering
4. Tests for temporal clustering
5. General tests for spatial clustering: regional count data
6. General tests for spatial clustering: case-control point data
7. Tests for space-time clustering
8. Focused tests for spatial clustering
9. Space-time scan statistics
A. List of R functions

Readership: Students and researchers in biostatistics, epidemiology and environmental sciences.

There is an increasing awareness nowadays of environmental health risks, including bio-terrorism, together with the development of modern data collection systems. This combination provides a major impetus for the sorts of studies described in this book. The basic question addressed is: are the observed events (of disease) compatible with random occurrence in space and time, or is there some discernible pattern? We have all seen the headlines in recent years proclaiming the proximity to nuclear sites of outbreaks of childhood leukemia, or of various health issues in the vicinity of waste incinerators. However, it is often difficult to establish any link, partly because random events do sometimes cluster without any prompting. The book gives a survey of current statistical methods for detecting clustering, meaning that which is unlikely to be purely random, however the latter is defined. The methodology is almost all retrospective, in the sense of examining past records of disease.
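The basic question can be made concrete in a few lines of code. The following Python sketch (illustrative only; the book works in R, and none of these names come from it) runs the simplest Monte Carlo test, comparing the observed mean nearest-neighbour distance with its distribution under complete spatial randomness on the unit square:

```python
import math
import random

def mean_nn_dist(pts):
    """Mean distance from each point to its nearest neighbour."""
    return sum(
        min(math.dist(p, q) for q in pts if q is not p) for p in pts
    ) / len(pts)

def csr_pvalue(pts, n_sim=999, seed=0):
    """Monte Carlo p-value for clustering: how often does a uniform
    (completely spatially random) pattern of the same size look at
    least as clustered, i.e. has as small a mean NN distance?"""
    rng = random.Random(seed)
    obs = mean_nn_dist(pts)
    hits = sum(
        mean_nn_dist([(rng.random(), rng.random()) for _ in pts]) <= obs
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)

# Twenty cases packed into a tiny neighbourhood: clustering is obvious.
rng = random.Random(42)
clustered = [(0.5 + 0.02 * rng.random(), 0.5 + 0.02 * rng.random())
             for _ in range(20)]
assert csr_pvalue(clustered) < 0.05  # incompatible with pure randomness
```

A real analysis must condition on the underlying population at risk rather than on spatial uniformity; that is precisely what the regional-count and case-control methods of Chapters 5 and 6 are designed for.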

Chapter 1 is introductory, telling the reader what disease clustering is and how the book is organised. Particularly helpful here is Section 1.1, in which different types of disease clustering are defined and a passage is quoted from a relevant source illustrating a particular application for each. Chapters 2 and 3 cover the basic theory: spatial point processes, types of data, statistical models and inference. Chapters 4 to 9 conform to a fairly standard pattern. They begin with one or two examples, for which numerical data or geographical figures are presented. There follow the hypothesis to be tested, a historical review of statistical methods, a detailed description of selected methods, application to the data, and some further relevant discussion.

Overall, the book is very practically oriented. R code is given throughout for implementing the various analyses, together with a website containing a set of R functions, though not for downloading data sets as far as I can see. Prerequisites include some working knowledge of basic statistical methods, particularly in biostatistics and epidemiology. Exercises are not included, but one would be encouraged to try out the methods on any data to hand. The explanations of the methods are very clear and detailed and the discussion is informative. So, I believe that the book will be of good use to the novice coming into the subject and also to the initiated wishing to broaden their perspective.

Martin Crowder: [email protected]
Mathematics Department, Imperial College

London SW7 2AZ, UK

Bayesian Nonparametrics
Nils Lid Hjort, Chris Holmes, Peter Müller, Stephen G. Walker (Editors)
Cambridge University Press, 2010, viii + 299 pages, £35.00/$59.00, hardcover
ISBN: 978-0-521-51346-3

Table of contents
An invitation to Bayesian nonparametrics (Nils Lid Hjort, Chris Holmes, Peter Müller, Stephen G. Walker)
1. Bayesian nonparametric methods: motivation and ideas (Stephen G. Walker)
2. The Dirichlet process, related priors, and posterior asymptotics (Subhashis Ghosal)
3. Models beyond the Dirichlet process (Antonio Lijoi, Igor Prünster)
4. Further models and applications (Nils Lid Hjort)
5. Hierarchical Bayesian nonparametric models with applications (Yee Whye Teh, Michael I. Jordan)
6. Computational issues arising in Bayesian nonparametric hierarchical models (Jim Griffin, Chris Holmes)
7. Nonparametric Bayes applications to biostatistics (David B. Dunson)
8. More nonparametric Bayesian models for biostatistics (Peter Müller, Fernando Quintana)

Readership: Students and researchers in Statistics, Computing and Mathematics.

The book is an edited volume comprising a 20-page introductory essay by the editors plus eight chapters by a clutch of subject experts. The introductory part takes the form of a survey over a variety of aspects, including the following: the place of Bayesian nonparametrics in the Frequentist–Bayesian, parametric–nonparametric classification; a discussion of the pragmatism of the Frequentist who uses a Bayesian method when it suits; the timeliness of the approach, partly driven by developments in computing and widening application areas; some historical perspective; a short review of the literature; and computation. I must say that this compact account of “what it’s all about”, “where’s it come from”, and “where’s it going” is a thoroughly engaging and interesting read.

The book arises out of a four-week program at the Newton Institute in 2007 organized by the editors. Four experts each gave a tutorial lecture on a core theme of the subject. These became invited chapters in the book, authored or co-authored by the aforementioned, and along with each is a complementary chapter authored or co-authored by the editors. The outcome, as described by the editors, is “a broad text on modern Bayesian nonparametrics and its theory and methods”.


Enthusiasm and commitment shine out from these pages but stop short, thankfully, of the evangelism that sometimes has to be waded through. “It is a truth universally acknowledged that a statistician in possession of a substantial data set must be in want of a book like this”, to slightly misquote the editors, and someone else whose name escapes me. I would say that a serious reader needs a good grounding in probability and some stochastic process theory. Although the material is well organized and the chapters well written, there is no getting away from the mathematics: even the back cover mentions “a forbidding landscape”. Nevertheless, for the interested, this is an excellent gateway to the field and there is no bull.

Martin Crowder: [email protected]
Mathematics Department, Imperial College

London SW7 2AZ, UK

A Handbook of Statistical Analyses Using R, Second Edition
Brian S. Everitt, Torsten Hothorn
Chapman & Hall/CRC, 2010, xxv + 355 pages, £36.99/$57.95, softcover
ISBN: 978-1-4200-7933-3

Table of contents

1. An introduction to R
2. Data analysis using graphical displays
3. Simple inference
4. Conditional inference
5. Analysis of variance
6. Simple and multiple linear regression
7. Logistic regression and generalized linear models
8. Density estimation
9. Recursive partitioning
10. Smoothers and generalized additive models
11. Survival analysis
12. Analyzing longitudinal data I
13. Analyzing longitudinal data II
14. Simultaneous inference and multiple comparisons
15. Meta-analysis
16. Principal component analysis
17. Multidimensional scaling
18. Cluster analysis

Readership: Statistics students with some background in statistics, and researchers and practitioners looking for an introduction to statistical modelling via R.

This book is the second edition of a successful handbook that can benefit a wide audience interested in using R for data analysis. It covers most non-Bayesian statistical methods, with forays into exploratory data analysis with tools like principal components, clustering and bagging/boosting. As reflected in the list of chapters given above, the coverage is quite extensive, only missing more specialised statistical domains like time series (apart from longitudinal data), econometrics (except for generalised linear models), and signal processing. Besides the absence of a Bayesian perspective (only mentioned in connection with BIC and the mclust package, while it would be a natural tool for analysing mixed models), I miss some material on simulation, the only entry found in the book being the bootstrap (pp. 153–154).
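For readers wondering what that lone simulation entry amounts to, the percentile bootstrap fits in a dozen lines. Here is a generic sketch in Python (the handbook itself works in R, and nothing here comes from the HSAUR2 package):

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap: resample the data with replacement, recompute
    the statistic each time, and read the interval off empirical quantiles."""
    rng = random.Random(seed)
    reps = sorted(
        stat([rng.choice(data) for _ in data]) for _ in range(n_boot)
    )
    return reps[int(n_boot * alpha / 2)], reps[int(n_boot * (1 - alpha / 2))]

sample = [2.1, 2.4, 1.9, 2.8, 3.0, 2.2, 2.5, 2.7, 2.0, 2.6]
lo, hi = bootstrap_ci(sample)
assert lo < statistics.mean(sample) < hi  # interval straddles the sample mean
```

Resampling with replacement stands in for repeated sampling from the population, which is all the machinery a basic confidence interval needs.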

Given its title and emphasis on analyses, the book is logically associated with an R package, HSAUR2, and works according to a fixed pattern: each chapter starts with a description of a few datasets, summarises the main statistical issues in one or two pages, and then engages in an R analysis. As the complexity increases along the chapters, the authors rely more and more on specialised packages that need to be downloaded by the reader. I have no objection to this pedagogical choice, especially when considering that the packages are mostly recent. I would however have liked a bit more detail about the packages, or at least about their main functions, as the reader is left to experiment from the lines of code provided in the handbook. (In contrast, a few passages are a bit “geeky” and require a deeper understanding of R objects than casual readers master.) My only criticism of the book at this level is the puzzling insistence on including all the datasets used therein in the form of tables. I frankly fail to see the point in spending so many pages on those tables, given that they are all available from the HSAUR2 package. A page of further explanation, background or statistical theory would have been much more beneficial to any reader, in my opinion! The same criticism applies to the few exercises found at the end of each chapter.

In conclusion, I find the book by Everitt and Hothorn quite pleasant and bound to fit its purpose. The layout and presentation are nice (with a single noticeable mishap, on page 332, caused by Darwin’s tree of life). It should appeal to all readers, as it contains a wealth of information about the use of R for statistical analysis, seasoned R users included: when reading the first chapters, I found myself scribbling small light-bulbs in the margin to point out features of R I was not aware of. (In particular, the authors mention the option type = “n” for plot, which R-bloggers signalled as the most useful option for plotting.) In addition, the book is quite handy for a crash introduction to statistics for (well-enough motivated) nonstatisticians.

Christian P. Robert: [email protected]
Université Paris-Dauphine, Bureau B638
Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France

Time Series: Modeling, Computation, and Inference
Raquel Prado, Mike West
Chapman & Hall/CRC, 2010, xx + 253 pages, £57.99/$89.95, hardcover
ISBN: 978-1-4200-9336-0

Table of contents

1. Notation, definitions, and basic inference
2. Traditional time domain models
3. The frequency domain
4. Dynamic linear models
5. State-space time-varying autoregressive models
6. Sequential Monte Carlo methods for state-space models
7. Mixture models in time series
8. Topics and examples in multiple time series
9. Vector AR and ARMA models
10. Multivariate DLMs and covariance models

Readership: Statistics graduate students with a background in Bayesian statistics and stochastic processes, and researchers using time series modelling.

As a preliminary to this review, let me warn that I have a recurrent difficulty with most time-series textbooks. Maybe due to my French upbringing, I feel that, apart from Brockwell and Davis (2009), they do not provide enough mathematical basis for properly understanding notions that are foreign to iid settings, like stationarity, causality, and the spectrum. Hence my regret that the otherwise comprehensive Time Series by Prado and West follows the same assumption of a prior familiarity with stochastic processes, Fourier transforms and spectral analysis, and as a result ends up presenting those notions hastily: for instance, stochastic integration is dispatched on pages 99–100.

International Statistical Review (2011), 79, 2, 272–301C© 2011 The Author. International Statistical Review C© 2011 International Statistical Institute

Page 7: Logistic Regression: A Self-Learning Text, Third Edition by David G. Kleinbaum, Mitchel Klein

278 SHORT BOOK REVIEWS

Time Series aims at “present[ing], summariz[ing] and overview[ing] core models and methods,” plus “recent research developments.” While the book stresses the Bayesian aspects of inference in time-series models more than its competitors do, its first chapters are mostly standard. In the time-domain chapter (Chapter 2), Prado and West cover Bayesian estimation for AR, MA and ARMA models. The pace is fast, with a high “info-dump” rate, so most graduate students will likely need to consult some of the numerous references provided by the book to really master all the concepts and techniques. For instance, reversible jump MCMC (Green, 1995) is covered in only two paragraphs, and the reason why it is needed (namely that the priors on the roots include Dirac masses at zero) may escape the neophyte. At this stage, I would have welcomed a discussion of the contrasts between the conditional and unconditional representations of the models, and between complex and real roots. Given the informal level adopted in the book about spectral theory, I also wonder whether some parts of Chapter 3 on the frequency domain are truly required, apart from providing an entry to the literature.

The next group of chapters in Time Series covers topics that are more central to the authors’ interests, as expressed, e.g., in the earlier West and Harrison (1997), namely dynamic linear models (Chapter 4), state-space time-varying models (Chapter 5), and sequential Monte Carlo methods (Chapter 6). Dynamic linear models are highly versatile and adaptable models, hence capable of handling a large variety of stationary and non-stationary time series: Chapter 4 shows how many of the earlier introduced models fit within this framework. The authors mostly adopt the same perspective on the MCMC methodology required to analyse those models as Petris et al. (2009), who provide an R package called dlm. (A radically different but efficient approach to the non-sequential problem would be to use INLA, as in Ruiz-Cardenas et al., 2010.) Chapter 5 gives a detailed treatment of time-varying AR models, describing how the dynamic structure on the AR coefficients can be constructed, including some extensions subsequent to West and Harrison (1997). Chapter 6 covers in sufficient detail the more challenging problem of on-line or sequential processing of state-space models, discussing several SMC auxiliary particle algorithms (including the particle learning technique of Carvalho et al., 2010, discussed in Chopin et al., 2010). Chapter 7 is about another instance of dynamic models, namely mixtures and hidden Markov models like stochastic volatility models, centred on a 12-page study of electroencephalograms published in Prado (2010).
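For readers meeting dynamic linear models for the first time, the flavour of Chapter 4 is conveyed by its simplest member, the local-level model, for which the Kalman filter fits in a dozen lines. The following Python sketch is illustrative only; Prado and West, like the dlm package, handle far more general specifications:

```python
def local_level_filter(ys, V=1.0, W=0.1, m0=0.0, C0=1e6):
    """Kalman filter for the local-level DLM:
        y_t  = mu_t + v_t,      v_t ~ N(0, V)   (observation)
        mu_t = mu_{t-1} + w_t,  w_t ~ N(0, W)   (random-walk state)
    Returns the sequence of filtered state means E[mu_t | y_1..y_t]."""
    m, C = m0, C0                # prior mean and variance of mu_0
    means = []
    for y in ys:
        R = C + W                # state variance after the random-walk step
        Q = R + V                # one-step-ahead forecast variance of y_t
        K = R / Q                # Kalman gain
        m = m + K * (y - m)      # shrink the forecast error into the mean
        C = K * V                # posterior variance: R - K**2 * Q = R*V/Q
        means.append(m)
    return means

# A noisy constant series: the filtered mean settles near the true level 5.
filtered = local_level_filter([5.1, 4.9, 5.2, 5.0, 4.8, 5.1])
assert abs(filtered[-1] - 5.0) < 0.2
```

With the diffuse prior (large C0), the first update essentially accepts the first observation; thereafter the gain shrinks towards its steady-state value, trading off observation noise V against state evolution noise W.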

The third and final part of Time Series is logically about the extension of the above to multivariate time series, like VAR models. Chapter 8 is mostly an introduction to this extension, with new graphical representations for the estimated dynamic factor weights. Chapter 9 deals with vector AR and ARMA models, resorting to a corresponding state-space representation for drawing inference. Chapter 10 concludes by pointing out many recent references on multivariate dynamic linear models. (It also contains 15 pages of graphs with volatility estimates on European currencies before the euro, whose repetition somehow eludes me.)

In conclusion, Time Series constitutes a very modern entry to the field of time-series modelling, with a rich (17-page) reference list covering the current literature, including 85 references from 2008 and later. It is well written and I spotted very few typos. This textbook can undoubtedly work as a reference manual for anyone entering the field or looking for an update. Teaching in a place where students study stochastic calculus prior to time-series courses, I am not in a position to judge the adequacy of the book as a graduate textbook, although I am certain there is more than enough material within Time Series to fill an intense one-semester course.

Christian P. Robert: [email protected]
Université Paris-Dauphine, Bureau B638
Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France


References

Brockwell, P. & Davis, R. (2009). Time Series: Theory and Methods, 2nd ed. New York: Springer.
Carvalho, C., Johannes, M., Lopes, H. & Polson, N. (2010). Particle learning and smoothing. Stat. Science, 25, 88–106.
Chopin, N., Iacobucci, A., Marin, J.-M., Mengersen, K., Robert, C. P., Ryder, R. & Schafer, C. (2010). On particle learning. arXiv:1006.0554.
Green, P. (1995). Reversible jump MCMC computation and Bayesian model determination. Biometrika, 82, 711–732.
Petris, G., Petrone, S. & Campagnoli, P. (2009). Dynamic Linear Models with R. New York: Springer.
Prado, R. (2010). Multi-state models for mental fatigue. In The Handbook of Applied Bayesian Analysis, Eds. A. O’Hagan & M. West, pp. 845–874. Oxford: Oxford University Press.
Ruiz-Cardenas, R., Krainski, E. T. & Rue, H. (2010). Fitting dynamic models using integrated nested Laplace approximations (INLA). Technical Report 12, Department of Mathematical Sciences, Norwegian University of Science and Technology.
West, M. & Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. New York: Springer.

Regression with Linear Predictors
Per Kragh Andersen, Lene Theil Skovgaard
Springer, 2010, xi + 494 pages, €65.45/£58.99/$84.95, hardcover
ISBN: 978-1-4419-7169-2

Table of contents

1. Introduction
2. Statistical models
3. One categorical covariate
4. One quantitative covariate
5. Multiple regression, the linear predictor
6. Model building: from purpose to conclusion
7. Alternative outcome types and link functions
8. Further topics
Appendix A: Notation
Appendix B: Use of logarithms
Appendix C: Some recommendations
Appendix D: Programming in R, SAS and Stata

Readership: Researchers using statistics in fields such as medicine, public health, dentistry, agriculture, and so on.

Springer categorizes this volume as being in their series “Statistics for Biology and Health”. This specific orientation is emphasized as early as page 3, where three examples are discussed: one with a quantitative response, one with a binary response, and one with a survival time response. These examples will be “used as illustrations throughout the book”. Computations are made via R and SAS and are documented on Web pages. “It is our intention to supplement the Web pages with code in STATA as well but we have chosen not to include SPSS code because . . . most users . . . use the menu interface rather than writing program code”. This book is extremely well written, and the 171 excellent diagrams produced by Therese Graversen enhance it. The five-page Section 1.6 (“The scope of this book and how to read it”) is particularly thoughtful and helpful. In summary, this book is excellent and fully appropriate for the target audience.

Norman R. Draper: [email protected]
Department of Statistics, University of Wisconsin – Madison

1300 University Avenue, Madison, WI 53706–1532, USA


Measurements and their Uncertainties: A Practical Guide to Modern Error Analysis
Ifan G. Hughes, Thomas P. A. Hase
Oxford University Press, 2010, xiii + 136 pages, £39.95/$85.00, hardcover (also available as softcover)
ISBN: 978-0-19-956632-7

Table of contents

1. Errors in the physical sciences
2. Random errors in measurement
3. Uncertainties as probabilities
4. Error propagation
5. Data visualisation and reduction
6. Least-squares fitting of complex functions
7. Computer minimisation and the error matrix
8. Hypothesis testing – how good are our models?
9. Topics for further study

Readership: undergraduates in the physical sciences and engineering, graduate students, and professional scientists and engineers.

I am really pleased to see this book. It has always troubled me that physics texts almost never include any actual data. They seem to create a chasm between the packaged perfection of the theory and the fact that, at some point, those theories had been derived from data, with all its messiness. Life is never really that simple, and making no nod towards the path by which the final theory had been reached seems to me to give a misleading impression of what science is all about.

This book focuses squarely on the problems of moving from data to theory. It drags the treatment of uncertainties for practical physics courses into the twenty-first century. That means it assumes that the computer will do the number crunching, and that calculus-based approximations to errors and their propagation are replaced by a functional approach based on the use of standard packages such as spreadsheets. The aim was to make it sufficiently user-friendly that students would actually take it into the laboratory, and I think the authors have succeeded. The level is suitable for undergraduates right through to graduate level.
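The functional approach mentioned above can be sketched in a few lines. This is my own illustration in Python rather than the book's spreadsheet treatment: perturb each input by its quoted uncertainty, recompute the result, and combine the individual shifts in quadrature.

```python
import math

def propagate(f, values, errors):
    """Shift each input by its uncertainty, recompute f, and combine
    the individual shifts in quadrature."""
    base = f(*values)
    total = 0.0
    for i, dx in enumerate(errors):
        shifted = list(values)
        shifted[i] += dx
        total += (f(*shifted) - base) ** 2
    return math.sqrt(total)

# Example: Z = X * Y with X = 2.0 +/- 0.1 and Y = 5.0 +/- 0.2.
# For this product the functional result matches the calculus-based
# formula sqrt((y*dx)**2 + (x*dy)**2) = sqrt(0.41).
dz = propagate(lambda x, y: x * y, [2.0, 5.0], [0.1, 0.2])
```

The attraction for a teaching laboratory is that no derivatives are needed: the same recipe works for any formula the student can type in.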

In Section 1.2.4, on mistakes, it refers to the case of a Boeing 767 aircraft which ran out of fuel midflight in 1983 due, it turned out, to a misunderstanding between metric and imperial units of measurement. That reminded me of the time that I was on a flight to New York which ran out of fuel and was forced to land at a military airbase. I wonder how often this happens.

The book includes exercises, and also, in Chapter 9, under “topics for further study”, such things as uncertainties in both variables in regression, simulated annealing, MCMC methods, and Bayesian inference.

I have only one very minor quibble: there appears to be no discussion of digit preference or heaping of data, though this is quite a common phenomenon with analogue measuring instruments.

Overall, however, this is a rather beautiful little book.

David J. Hand: [email protected]
Mathematics Department, Imperial College

London SW7 2AZ, UK


Algebraic and Geometric Methods in Statistics
Paolo Gibilisco, Eva Riccomagno, Maria Piera Rogantin, Henry P. Wynn (Editors)
Cambridge University Press, 2009, xvi + 365 pages, £75.00/$118.00, hardcover
ISBN: 978-0-521-89619-1

Table of contents

1. Algebraic and geometric methods in statistics (Paolo Gibilisco, Eva Riccomagno, Maria Piera Rogantin, Henry P. Wynn)

Part I. Contingency Tables
2. Maximum likelihood estimation in latent class models (Stephen E. Fienberg, Patricia Hersh, Alessandro Rinaldo, Yi Zhou)
3. Algebraic geometry of 2 × 2 contingency tables (Aleksandra B. Slavkovic, Stephen E. Fienberg)
4. Model selection for contingency tables with algebraic statistics (Anne Krampe, Sonja Kuhnt)
5. Markov chains, quotient ideals, and connectivity (Yuguo Chen, Ian H. Dinwoodie, Ruriko Yoshida)
6. Algebraic category distinguishability (Enrico Carlini, Fabio Rapallo)
7. Algebraic complexity of MLE for bivariate missing data (Serkan Hosten, Seth Sullivant)
8. The generalized shuttle algorithm (Adrian Dobra, Stephen E. Fienberg)

Part II. Designed Experiments
9. Generalised design (Hugo Maruri-Aguilar, Henry P. Wynn)
10. Design of experiments and biochemical network inference (Reinhard Laubenbacher, Brandilyn Stigler)
11. Replicated measurements and algebraic statistics (Roberto Notari, Eva Riccomagno)
12. Indicator function and sudoku designs (Roberto Fontana, Maria Piera Rogantin)
13. Markov basis for design of experiments with three-level factors (Satoshi Aoki, Akimichi Takemura)

Part III. Information Geometry
14. Non-parametric estimation (Raymond F. Streater)
15. Banach manifold of quantum states (Raymond F. Streater)
16. On quantum information manifolds (Anna Jencova)
17. Axiomatic geometries for text documents (Guy Lebanon)
18. Exponential manifold by reproducing kernel Hilbert spaces (Kenji Fukumizu)
19. Extended exponential models (Daniele Imparato, Barbara Trivellato)
20. Quantum statistics and measures of quantum information (Frank Hansen)

Part IV. Information Geometry and Algebraic Statistics
21. Algebraic varieties vs differentiable manifolds (Giovanni Pistone)

Part V. On-Line Supplements
Coloured figures for Chapter 2
22. Maximum likelihood estimation in latent class models (Yi Zhou)
23. The generalized shuttle algorithm (Adrian Dobra, Stephen E. Fienberg)
24. Indicator function and sudoku designs (Roberto Fontana, Maria Piera Rogantin)
25. Replicated measurements and algebraic statistics (Roberto Notari, Eva Riccomagno)
26. Extended exponential models (Daniele Imparato, Barbara Trivellato)

Readership: The book is meant for mathematical statisticians and mathematicians interested in relatively recent applications of computational commutative algebra and differential geometry to Statistics, specifically to categorical data, design of experiments, and classical and quantum information geometry.

Sophisticated algebraic methods were introduced into Statistics by Wijsman (in the 1950s) and, a bit later, by Linnik and Kagan, to study similar regions and best unbiased estimators for families of densities that are not complete, specifically for algebraic exponential families. These are curved exponential families where the parameter space is defined by algebraic equations in the natural parameters. A famous example is the common mean problem for two normals with possibly different variances. A beautiful but somewhat esoteric result was the Kagan–Palamadov theorem characterizing all best unbiased estimators in such cases. Unni of ISI, Calcutta, showed this had the corollary that in the case of the common mean problem, there is no best unbiased estimator for any unbiasedly estimable parametric function. This settled a conjecture of the present reviewer.

The recent resurgence of algebraic methods addresses equally hard but more realistic problems in contingency tables, where models are often described through polynomial equations. A famous example of the use of algebraic methods in an important problem is the Diaconis–Sturmfels 1998 paper on algebraic algorithms for sampling from conditional distributions for contingency tables. This helped in constructing conditional tests. The construction was very sophisticated, using Gröbner bases.
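To give a flavour of the Diaconis–Sturmfels idea, here is a toy sketch of my own (not taken from the book or the paper): for 2 × 2 tables with fixed row and column sums, the Markov basis consists of the single move [[+1, −1], [−1, +1]] and its negative, and a random walk over these moves connects every table in the fibre. I omit the Metropolis weighting that would be needed to target the hypergeometric distribution.

```python
import random

rng = random.Random(0)

def markov_step(table):
    """One step of a random walk on 2x2 tables with fixed row and
    column sums.  The Markov basis is the single move
    [[+1, -1], [-1, +1]] (or its negative); proposals that would
    make a cell negative are rejected and the walk stays put."""
    eps = rng.choice([1, -1])
    new = [[table[0][0] + eps, table[0][1] - eps],
           [table[1][0] - eps, table[1][1] + eps]]
    if min(v for row in new for v in row) < 0:
        return table
    return new

# Walk from a starting table; every state visited has the same
# row sums (4, 6) and column sums (5, 5) as the start.
t = [[3, 1], [2, 4]]
for _ in range(100):
    t = markov_step(t)
```

For larger tables and log-linear models the moves are no longer obvious, which is exactly where the Gröbner-basis computation earns its keep.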

Algebraic Geometry has been used by Fienberg and a few others for geometric representation of contingency tables or for finding the MLE in latent class models for categorical variables. Finding the MLE in these popular models is surprisingly difficult. Apparently users are not aware of this, so sometimes EM and other methods are applied without proper justification. The discrete latent class models arose from the pioneering work of Goodman, Haberman, and others as models for joint marginals of manifest variables, which are conditionally independent given an unobservable (latent) variable. These applications are discussed in Part I. For me this was the most interesting part of the book.

However, algebraic techniques, including Gröbner bases, have also been applied to design of experiments by Maruri-Aguilar, Aoki, Takemura, and others. The paper by the last two authors on fractional factorials is quite interesting.

Part III introduces information geometry. Statisticians will recognize the well-known contributions of Efron relating curvature to C. R. Rao's second-order efficiency, and the many new divergences introduced by Amari. Equally important, but not so well known, are Dawid's work on mixtures and Čencov's theorem on Rao's Riemannian metric, which seems to be a fundamental metric for Statistics but is not as well known as it should be. In the case of a multivariate normal with fixed dispersion matrix and arbitrary mean vector, Rao's metric reduces to the famous Mahalanobis distance. Part III also introduces quantum information theory. I missed any reference to the work of Professor K. R. Parthasarathy, the very well-known probabilist, analyst, and quantum probabilist at ISI, Delhi. He has a beautiful recent book on quantum information theory, published by Hindustan Book Agency, India.

In Part IV, Professor Pistone, one of the pioneers of both algebraic and geometric methods, whose 65th birthday is being celebrated with the publication of this monograph, brings the two themes together, focusing on the “interplay of geometry and algebra in various contexts”, from finite state spaces to abstract Wiener spaces and Malliavin calculus.

I see many highly non-trivial applications to hard statistical problems, presented with great care and love, if I may use that word. But I also note the cautionary advice given by Fienberg et al. on p. 160. It applies not only to the latent class models, as they meant it, but to all the very hard applications of Algebra and Geometry. Use these methods with care; all the problems are hard!

I would also draw the attention of the interested reader to the work of Drton on algebraic exponential families, with possible singularities, especially for popular latent variable models like Factor Analysis. Some of his papers and books are listed in the book.

Jayanta K. Ghosh: [email protected]
Department of Statistics, Purdue University

West Lafayette, IN 47909, USA


Selected Works of C. C. Heyde
Ross Maller, Ishwar Basawa, Peter Hall, Eugene Seneta (Editors)
Springer, 2010, xxxviii + 463 pages, €79.95/£72.00/$89.95, hardcover
ISBN: 978-1-4419-5822-8

Readership: This work would be a welcome shelf volume for research workers in probability and statistics and should certainly be a reference available in departmental libraries.

According to Springer, “Springer's Selected Works in Probability and Statistics series offers scientists and scholars the opportunity of assembling and commenting upon major classical works in probability and statistics. Each volume contains the original papers, original commentary by experts on the subject's papers and relevant biographies and bibliographies”.

The selection of papers made by the editors, Ross Maller, Ishwar Basawa, Peter Hall and Eugene Seneta, friends and colleagues of Chris Heyde (their eminence in the field surely requires no elaboration), reflects the breadth of interest of Chris Heyde's work, and his unremitting pursuit of very general and elegant results in probability and statistics.

Indeed the editors remark in their preface that the entire corpus of Chris Heyde's work seems to embody the ethic enunciated by William Feller in 1945: “The history of probability shows that our problems must be treated in their greatest generality: only in this way can we hope to discover the most natural tools and to open channels for new progress. This remark leads naturally to that characteristic of our theory which makes it attractive beyond its importance for various applications: a combination of an amazing generality with algebraic precision”.

Chris's own take on this principle was “to come down on it from a great height” (a remark he made to me when providing guidance on the approach to adopt in one of our joint papers).

The volume contains 50 original papers, a chronological listing of all publications, as well as individual commentary on particular facets of the research by each of the editors. Remarkably, it also contains a short overview by Chris Heyde himself, just prior to his death, of what he regarded as some of his key contributions over his lifetime.

Chris Heyde published over 200 scholarly articles. Some of his principal contributions were in the fields of:

Probability Theory: Rates of convergence in the Central Limit Theorem and the Martingale Central Limit Theorem, Law of the Iterated Logarithm, Branching Processes, Population Genetics

Stochastic processes: Inference using Quasi-Likelihood and Asymptotic Quasi-Likelihood, Parameter estimation for random processes with long-range dependence

Modelling in financial markets: Risky asset modelling; the need for fractal properties and heavy tails in risky asset returns. Chris spent many years perfecting his FATGBM (fractal activity time geometric Brownian motion) model, and was quite proud of it. It is still arguably the most useful model for capturing empirical realities of stock and stock index returns.

Roger Gay: [email protected]
Department of Accounting and Finance, Monash University,

Clayton, Vic 3800, Australia


Using R for Data Management, Statistical Analysis, and Graphics
Nicholas J. Horton, Ken Kleinman
Chapman & Hall/CRC, 2011, xxii + 275 pages, £38.99/$59.95, softcover
ISBN: 978-1-4398-2755-0

Table of contents

1. Introduction to R
2. Data management
3. Common statistical procedures
4. Linear regression and ANOVA
5. Regression generalizations
6. Graphics
7. Advanced applications
Appendix: The HELP study dataset

Readership: Advanced researchers who do statistical analyses in R.

R is an open source package for performing statistical and graphical tasks at all levels. R has the advantage that it is an open programming environment and offers interesting applications and user-contributed packages (in CRAN) for a large variety of users. The application of R is limited only by the user's knowledge of statistical methods and his or her programming skills. A potentially difficult task is the data preparation and management that precedes all statistical analyses (and can take almost as much time), and therefore the book emphasizes this aspect, as stated in the title.

The last decade has seen many textbooks that introduce the R language at all levels. There are essentially two approaches: books that emphasize the statistical theory and give R examples, and books that emphasize the computational side and develop R as statistical software. This book is written in the latter spirit, but is new insofar as it also incorporates knowledge of 40 contributed packages. Given the increasing number of contributed packages in CRAN (the Comprehensive R Archive Network), it is conceivable that we will see more books of this type in the future.

As the authors state in the introduction: “We have written this book as a reference text for users of R. Our primary goal is to provide users with an easy way to learn how to perform an analytic task in [R] . . . We include many common tasks, including data management, descriptive summaries, inferential procedures, regression analysis, multivariate methods, and the creation of graphics.”

Experienced R users will get many new ideas by surfing through websites, since they know what to look for; introductory R books are geared mainly to students and researchers making statistical applications. The book contains a detailed subject index and an R command index describing the R syntax. Many examples involve data from a clinical trial, the HELP study (Table A.1 in the Appendix).

In addition to the HELP examples, there are extended examples and case studies that demonstrate R code and facilitate the reader's search for hints. An extensive index helps the reader find the relevant commands to perform the desired analysis. It would be nice if a book like this one could be made available online, so that search costs become smaller. For example, to change a plot symbol in a scatterplot, most occasional users will have to look it up in the manual or a book. Hopefully future editions of R will come with a menu on the screen to avoid such time-consuming searches.

The book shows the technical possibilities of R for displaying the results of a statistical analysis in graphs. Chapter 1 introduces R and Chapter 2 its data management capabilities. Chapter 3 describes common statistical procedures, Chapter 4 linear regression and ANOVA models, and Chapter 5 regression generalizations.


Chapter 6 is devoted to the graphical possibilities, and the final Chapter 7 is on “Advanced Applications”, like multiple time series plots, the Cox model, power curve calculations (ROC), or simple examples such as how to compute MCMC iterations with, for example, Metropolis steps.

The book comes with a website, http://www.math.smith.edu/r/, that contains program code and data sets, with an R code file for each chapter. The interesting aspect of the book is that it does not only describe the basic statistics and graphics functions of the base R system, but also the use of 40 additional packages available from the CRAN website. The website also contains the R code to install all the packages that contain the described features.

In summary, the book is a useful complement to introductory statistics books and lectures, but cannot be used as a standalone introductory statistics textbook. Readers should be familiar with the concepts of statistics and should have some experience of what can be done with statistics. Those who know R might get additional hints on new features of statistical analyses.

Wolfgang Polasek: [email protected] of Economics and Finance, Institute for Advanced Studies

Stumpergasse 56, 1060 Vienna, Austria

Bayesian Ideas and Data Analysis: An Introduction for Scientists and Statisticians
Ronald Christensen, Wesley Johnson, Adam Branscum, Timothy E. Hanson
Chapman & Hall/CRC, 2011, xvii + 498 pages, £44.99/$69.95, hardcover
ISBN: 978-1-4398-0354-7

Table of contents

1. Prologue
2. Fundamental ideas I
3. Integration versus simulation
4. Fundamental ideas II
5. Comparing populations
6. Simulations
7. Basic concepts of regression
8. Binomial regression
9. Linear regression
10. Correlated data
11. Count data
12. Time to event data
13. Binary diagnostic tests
14. Nonparametric models
Appendix A: Matrices and vectors
Appendix B: Probability
Appendix C: Getting started in R

Readership: MS and PhD students of statistics, biostatistics, epidemiology, and other areas of science.

This is a modern book on Bayesian statistics: as well as outlining the philosophical core of Bayesian inference, it includes discussion and examples of how to apply such methods in practice, using both WinBUGS and R.

The book kicks off with a chapter of examples, illustrating and motivating the development which is to follow. This means that it is the second chapter which outlines basic probability and Bayesian theory. Chapter 3 then introduces the computational machinery of modern Bayesian statistics. This chapter starts very much from the beginning (“Go to the website . . . and download WinBUGS”), but the book does progress through quite elaborate models (for example time-to-event data, survival models, and nonparametric models). Inevitably, and as the contents list indicates, the authors have had to be selective about the classes of methods and problems they cover.

Although concerned with the Bayesian approach, the authors appear to have a commendably pragmatic attitude to statistics (p. 13): “Bayesian statistics appears to be the only logically consistent method of making statistical inferences, although not the only useful one. Our presentation incorporates some non-Bayesian ideas, especially as they relate to model checking”. There are exercises embedded in the text, and there is an associated website which contains data and code and odd bits of other material. I imagine this website will be further developed over the course of time.

This is a very sound introductory text, and is certainly one which teachers of any course on Bayesian statistics beyond the briefest and most elementary should consider adopting.

David J. Hand: [email protected]
Mathematics Department, Imperial College

London SW7 2AZ, UK

A Beginner's Guide to Structural Equation Modeling, Third Edition
Randall E. Schumacker, Richard G. Lomax
Routledge, 2010, xx + 510 pages, £35.95/$59.95, softcover (also available as hardcover)
ISBN: 978-1-84169-891-5

Table of contents

1. Introduction
2. Data entry and data editing issues
3. Correlation
4. SEM basics
5. Model fit
6. Regression models
7. Path models
8. Confirmatory factor models
9. Developing structural equation models: Part I
10. Developing structural equation models: Part II
11. Reporting SEM research: Guidelines and recommendations
12. Model validation
13. Multiple sample, multiple group, and structured means models
14. Second order, dynamic, and multi trait multi method models
15. Multiple indicator multiple indicator cause, mixture, and multi-level models
16. Interaction, latent growth, and Monte Carlo methods
17. Matrix approach to structural equation modeling

Readership: Graduate students and researchers, for example in the social and behavioural sciences, having an understanding of basic statistics, correlation, and regression analysis.

Structural Equation Modelling (SEM) is a collection of analysis methods that are widely applied in the social and behavioural sciences as well as in business, health care, and many other areas. This book tries hard to offer a guide to SEM designed especially for a beginner. It uses the freely available student version of LISREL 8.8 (for Windows) throughout the examples. Some instructions on using SAS and SPSS for related tasks are also given. The style is mostly non-mathematical.

The text is fairly easy to read, and it includes many special tips and suggestions that reflect the practical experience of the authors. The path figures are useful for understanding the traditional and sometimes rather technical LISREL codes and outputs. Each chapter has its own list of references, which may be helpful, although I personally would prefer a single list at the end, perhaps with the referencing chapters marked somehow. In general, the chapters tend to be quite short, with the exception of Chapters 5 and 16, which are a bit more complicated and technical. Chapter 11, with its numerous checklists, is very useful for both reading and reporting results.

This is the third edition, with expanded coverage of various advanced topics such as multiple-group, multi-level and mixture modelling, second-order and dynamic factor models, and Monte Carlo methods. An immediate question is where to draw the line: is all this really meant for a beginner? It might sound a bit scary to offer 17 chapters, a total of 500 pages, to an innocent beginner. The book could well be split in two parts. The first part could be the beginner's guide, while the second part would assume the reader to be already familiar with the basic concepts, principles, and practice of SEM introduced in the first part. Quite clearly, Chapters 1 to 12 could form the first part.

All in all, this is a good textbook of SEM with comprehensive coverage. It is definitely much more than just a beginner's guide.

Kimmo Vehkalahti: [email protected]
Department of Social Research, Statistics
FI-00014 University of Helsinki, Finland

Logistic Regression Models
Joseph M. Hilbe
Chapman & Hall/CRC, 2009, xviii + 637 pages, £49.99/$79.95, hardcover
ISBN: 978-1-4200-7575-5

Table of contents

1. Introduction
2. Concepts related to the logistic model
3. Estimation methods
4. Derivation of the binary logistic algorithm
5. Model development
6. Interactions
7. Analysis of model fit
8. Binomial logistic regression
9. Overdispersion
10. Ordered logistic regression
11. Multinomial logistic regression
12. Alternative categorical response models
13. Panel models
14. Other types of logistic-based models
15. Exact logistic regression
Conclusion
Appendix A: Brief guide to using Stata commands
Appendix B: Stata and R logistic models
Appendix C: Greek letters and major functions
Appendix D: Stata binary logistic command
Appendix E: Derivation of the beta-binomial
Appendix F: Likelihood function of the adaptive Gauss–Hermite quadrature method of estimation
Appendix G: Data sets
Appendix H: Marginal effects and discrete change

Readership: Students of statistics, and researchers whose work involves modelling binary data.

There is considerable and obvious merit in embedding statistical tools in higher-level generalisations—so that, for example, analysis of variance and linear regression lie in the class of linear models, and these and logistic regression lie in the class of generalised linear models. However, there is also considerable insight to be gained by treating such models as entities in their own right, and developing and exploring the implications and ramifications of their special nature. This allows one to tease out useful particular properties, which can be used to shed light when analysing data. This book does exactly that for logistic regression models. Motivation is provided by the author's prefatory comment that “with the exception of multiple linear regression . . . logistic regression is perhaps used more than any other statistical regression technique for research of every variety” (p. xiv).

The book takes the reader from the basics, such as the difference between odds and risk ratios and parameter estimation procedures, through to advanced and even esoteric variants and extensions of the logistic model, including such things as stereotype logistic models, scobit (skewed logistic) regression, and exact logistic regression. As is nowadays necessary for any book purporting to describe both the theory and practice of a class of statistical tools, the development must be illustrated using a particular software tool. In this case it is Stata, with appendices giving an overview of this language, focusing on its use in the book, but R commands are also given at the chapter ends. There is an accompanying web site. The book includes a detailed discussion of the historical development of logistic regression models and software for fitting them, and also covers such important practical issues as handling missing data and errors in the responses. There are exercises at the end of each chapter.
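The odds-ratio versus risk-ratio distinction mentioned among the basics amounts to a short calculation. Here is an illustrative sketch in Python, with made-up numbers rather than any example from the book:

```python
# Illustrative 2x2 table (invented numbers): rows = exposed / unexposed,
# columns = event / no event.
a, b = 30, 70   # exposed:   30 events, 70 non-events
c, d = 10, 90   # unexposed: 10 events, 90 non-events

risk_ratio = (a / (a + b)) / (c / (c + d))   # 0.30 / 0.10 = 3.0
odds_ratio = (a / b) / (c / d)               # (30/70) / (10/90) = 27/7

# The two agree only when the event is rare; logistic regression
# coefficients exponentiate to odds ratios, not risk ratios, which
# is why the distinction matters in practice.
```

With a common event, as here, the odds ratio (about 3.86) noticeably overstates the risk ratio (3.0).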

Overall this is a comprehensive book, which will provide a very useful resource and handbook for anyone whose work involves modelling binary data.

David J. Hand: [email protected]
Mathematics Department, Imperial College

London SW7 2AZ, UK

Simultaneous Inference in Regression
Wei Liu
Chapman & Hall/CRC, 2011, xxi + 270 pages, £57.99/$89.95, hardcover
ISBN: 978-1-4398-2809-0

Table of contents

1. Introduction to linear regression analysis
2. Confidence bands for one simple regression model
3. Confidence bands for one multiple regression model
4. Assessing part of a regression model
5. Comparison of two regression models
6. Comparison of more than two regression models
7. Confidence bands for polynomial regression
8. Confidence bands for logistic regression
Appendix A: Approximation of the percentile of a random variable
Appendix B: Computation of projection π(t, P, Xr)
Appendix C: Computation of projection π∗(t, W, X2)
Appendix D: Principle of intersection-union test
Appendix E: Computation of the K-functions in Chapter 7

Readership: Practitioners and researchers applying regression methods.

This book provides a comprehensive discussion of methods for determining simultaneous confidence bands in regression. Although primarily focused on simple and multiple linear regression, it also covers polynomial and logistic regression. The author's perspective is that simultaneous confidence bands for regression models are often more intuitive and informative than hypothesis tests and confidence intervals for the regression coefficients, a position with which I agree.

The depth of the coverage is indicated by the fact that the chapter on simple linear regression includes discussion of one- and two-sided bands and two- and three-segment bands, along with comparison of the bands arising from different approaches. The book also discusses comparison of different regression models, contrasting the familiar hypothesis-testing approach, which sheds no light on the magnitude of any difference between the models, with a simultaneous confidence band on the difference between models.
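For concreteness, and in my notation rather than necessarily the book's: for simple linear regression the classical two-sided Working–Hotelling simultaneous band replaces the pointwise t critical value by an F-based one,

```latex
\hat{y}(x) \;\pm\; \sqrt{2\,F^{\alpha}_{2,\,n-2}}\;
\hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x-\bar{x})^2}{S_{xx}}},
\qquad\text{versus pointwise}\qquad
\hat{y}(x) \;\pm\; t^{\alpha/2}_{n-2}\,
\hat{\sigma}\sqrt{\frac{1}{n} + \frac{(x-\bar{x})^2}{S_{xx}}}.
```

Since \(\sqrt{2F^{\alpha}_{2,n-2}} > t^{\alpha/2}_{n-2}\), the simultaneous band is uniformly wider: the price of covering the whole regression line at once.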

There are detailed real examples throughout the book.

The book provides a valuable up-to-date review of work in this area. However, for practitioners convinced by the author's arguments of the merits of such tools, who nevertheless do not themselves want to become experts, I think it would be enhanced if more guidance were given on the choice of methods. Perhaps in a second edition?

David J. Hand: [email protected]
Mathematics Department, Imperial College

London SW7 2AZ, UK

Causality: Models, Reasoning and Inference, Second Edition
Judea Pearl
Cambridge University Press, 2009, xix + 464 pages, £37.00/$50.00, hardcover
ISBN: 978-0-521-89560-6

Table of contents

1. Introduction to probabilities, graphs, and causal models
2. A theory of inferred causation
3. Causal diagrams and the identification of causal effects
4. Actions, plans, and direct effects
5. Causality and structural models in social science and economics
6. Simpson's paradox, confounding, and collapsibility
7. The logic of structure-based counterfactuals
8. Imperfect experiments: bounding effects and counterfactuals
9. Probability of causation: interpretation and identification
10. The actual cause
11. Reflections, elaborations, and discussions with readers
Epilogue: The art and science of cause and effect

Readership: Computer scientists, cognitive scientists, statisticians, philosophers, social scientists, and economists; in fact, anyone interested in causality. Some preliminary knowledge of graphical models will help, but it is not essential, since the topic is introduced well in Chapter 1.

This is the second edition of the famous book with the same name. The first edition, publishedin 2000, has seen eight printings up to 2008, showing its immense popularity.

Causality had been a very controversial topic until the first edition of the book came out. In his own words, “the popular reception . . . and rapid expansion of the structural theory of causation call for a new edition to assist causation through her second transformation—from a demystified wonder to a commonplace tool in research and education.” There are substantial additions to each of the ten chapters of the first edition. In addition, there is a new chapter which “elucidates subtle issues” that readers and reviewers have found perplexing in one sense or other.

I recall feeling very humble in the mid or late sixties when I realized I had studied genetics in the late fifties without ever hearing of DNA, though the path-breaking paper of Watson and Crick had appeared several years earlier. I have felt equally humble going through Pearl's second edition, realizing how little I knew about causal, rather than association-based, graphical models, their rigorous justification, and the light that causal models throw on confounding, interventions, and counterfactuals, to mention some of the most fundamental notions of statistics. Having read the book, I would add structural equations to this list. Perhaps the most profound fact I have learnt is that causes can be defined rigorously and, except in certain well-identified cases, identified from a possibly rather large amount of data.

The book is very well written, blending intuition, humor, polemics, and rigorous mathematical and philosophical argument. Even a reader who does not go through the rigorous proofs or algorithms would come away with substantial insight into these basic notions.

There is a catch, however. Apparently many of these ideas are still not accepted by some very distinguished statisticians. Throughout the book Pearl conducts a debate with them. His ideas seem very convincing. Whether all his ideas appearing for the first time will be as "stable" as the earlier material may not become immediately clear, but they certainly deserve full attention.

This is a wonderful book.

Jayanta K. Ghosh: [email protected]
Department of Statistics, Purdue University

West Lafayette, IN 47909, USA

Medical Uses of Statistics, Third Edition
John C. Bailar III, David C. Hoaglin (Editors)
Wiley, 2009, xxx + 491 pages, £88.95/€106.70/$139.00, hardcover
ISBN: 978-0-470-43952-4

Table of contents

Introduction
Section I: Broad Concepts and Analytic Techniques
1. Statistical concepts fundamental to investigations (Lincoln E. Moses)
2. Some uses of statistical thinking (John C. Bailar III)
3. Use of statistical analysis in the New England Journal of Medicine (Shilpi Agarwal, Graham A. Colditz, John D. Emerson)
Section II: Design
4. Randomized trials and other parallel comparisons of treatment (Nancy E. Mayo)
5. Crossover and self-controlled designs in clinical research (John C. Bailar III, Thomas A. Louis, Philip W. Lavori, Marcia Polansky)
6. The series of consecutive cases as a device for assessing outcomes of interventions (Lincoln E. Moses)
7. Biostatistics in epidemiology: design and basic analysis (Mark S. Goldberg)
Section III: Analysis
8. p-Values (James H. Ware, Frederick Mosteller, Fernando Delgado, Christl Donnelly, Joseph A. Ingelfinger)
9. Understanding analyses of randomized trials (Nancy E. Mayo)
10. Linear regression in medical research (Paul J. Rathouz, Amita Rastogi)
11. Statistical analysis of survival data (Stephen W. Lagakos)
12. Analysis of categorical data in medical studies (Paul S. Albert)
13. Analyzing data from ordered categories (Lincoln E. Moses, John D. Emerson, Hossein Hosseini)
Section IV: Communicating Results
14. Guidelines for statistical reporting in articles for medical journals: amplifications and explanations (John C. Bailar III, Frederick Mosteller)
15. Reporting of subgroup analyses in clinical trials (Rui Wang, Stephen W. Lagakos, James H. Ware, David J. Hunter, Jeffrey M. Drazen)
16. Writing about numbers (Frederick Mosteller, Margaret Perkins, Stephen Morrissey)
Section V: Specialized Methods
17. Combining results from independent studies: systematic reviews and meta-analysis in clinical research (Michael A. Stoto)
18. Biostatistics in epidemiology: advanced methods of regression analysis (Mark S. Goldberg)
19. Genetic inference (Dan L. Nicolae, Thorsten Kurz, Carole Ober)
20. Identifying disease genes in association studies (Dan L. Nicolae, Thorsten Kurz, Carole Ober)
21. Risk assessment (A. John Bailer, John C. Bailar III)

Readership: Medical researchers seeking statistical insight for planning clinical and biomedical studies and for drawing statistical conclusions.

The first edition (1986) was based on 13 articles published in the New England Journal of Medicine and included six chapters written specifically for the book. A second and enlarged edition came out in 1992. The present volume is the third and updated version of the earlier two editions. There are 21 chapters, followed by an index, written by experts in this evolving field. While a dozen of the chapters are updated versions of articles in earlier editions, there are some new additions too. Chapters 4, 9, 12, 14, 18, 19, 20, and 21 are all in the category "written for this edition of this book".

In medical and clinical research involving human subjects, not only is there greater variation arising from diverse factors, some ascribable and some not, but there are also additional complications due to the restricted use of randomization and the consequently complex (or at least nonstandard) designs of such investigations. It became quite clear, at least five decades ago, that there was and still is ample evidence of empirics among deterministics in clinical research, calling for nonstandard statistical methodology and for the art of combining statistical interpretation with scientific acumen to facilitate and validate the objective conclusions that constitute the basic motivation of such studies. Back in the 1980s, this venture on the part of the editors was certainly commendable, and the follow-up second and third editions were natural. Section V of the present edition incorporates developments mostly from the past two decades, adding to the appeal of the book for a wider class of researchers across biostatistics, epidemiology, medical sciences, clinical trials, environmental health sciences, and, to a certain extent, the evolving field of genomics and bioinformatics. In this broad interdisciplinary field, statistical reasoning, albeit essential, mingles with computational needs and with contributions from computer science, molecular biology, and toxicology, among others. The present volume is a good introduction to this evolving area of biostatistical research.

The past twenty-five years have witnessed phenomenal growth in the outreach of statistical science into a variety of interdisciplinary fields. The evolution of the journals Statistics in Medicine (published by John Wiley), Randomized Clinical Trials, and Bioinformatics, followed by other allied journals, certainly speaks of this exodus of statistics into clinical and biomedical research. The Encyclopedia of Biostatistics (Wiley) and the Encyclopedia of Environmetrics (Wiley) have articles covering a wider area. John Wiley and SIAM have also initiated new series of introductory books and monographs in this evolving field of clinical and biomedical research. Although it is very difficult to include this broad spectrum of up-to-date developments in a single volume, it is worth mentioning that these accompanying volumes are very useful for going beyond the (mostly) descriptive introduction provided in this volume.

Pranab K. Sen: [email protected] of Statistics & Operations Research, 338 Hanes Hall, CB# 3260,

The University of North Carolina at Chapel Hill, NC 27599-3260, USA

Introduction to Nonparametric Estimation
Alexandre B. Tsybakov
Springer, 2009, xii + 214 pages, £53.99/€59.95/$79.95, hardcover (also available as softcover)
ISBN: 978-0-387-79051-0

Table of contents

1. Nonparametric estimators
2. Lower bounds on the minimax risk
3. Asymptotic efficiency and adaptation

Readership: The potential reader should be conversant with functional analysis and topology beyond the introductory level, so the book may be out of reach of many statisticians without this sophisticated mathematical background. Nevertheless, for a broad spectrum of mathematical statisticians, especially in Continental Europe, it will be welcome reading.

Nonparametric inference, particularly estimation theory, has its genesis in the parametric case. Whereas in the parametric case there is typically a finite-dimensional parameter, in the nonparametric case the target is rarely finite-dimensional; rather, it is a functional of some underlying distribution or probability law. This book, a revised and extended version of a 2003 French text by the same author, attempts to formulate the basic theory of nonparametric functional estimation, including (i) the construction of such estimators, (ii) their (asymptotic) statistical properties, (iii) their optimality, in some sense, and (iv) adaptive estimation.
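
The prototypical instance of such a functional estimator is the kernel density estimator, where the "parameter" being estimated is the density function itself. A minimal sketch of the idea (my own illustration, not code from the book), with a Gaussian kernel and a hand-picked bandwidth:

```python
import math

def kernel_density(sample, x, h):
    """Gaussian-kernel estimate of the density at point x with bandwidth h."""
    n = len(sample)
    k = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    # Average of kernel bumps centred at the observations.
    return sum(k((x - xi) / h) for xi in sample) / (n * h)

sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
print(kernel_density(sample, 0.0, h=0.5))
```

The bandwidth h is the smoothing parameter whose optimal (and adaptive) choice, together with lower bounds on how well any such estimator can do, is precisely the subject matter of the book's later chapters.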

The author aims to present the material at an introductory level, as an introduction to nonparametric estimation, albeit interpreted in his own way and somewhat differently from the broader sense of the term more familiar to statisticians. The book consists of three chapters: Chapter 1 is devoted to the basic methodology of nonparametric estimation, namely density estimation and nonparametric regression. Chapter 2 deals with lower bounds on the minimax risk of nonparametric estimators. The concluding chapter discusses asymptotic efficiency and adaptation.

In view of this journal's readership, I would rather view the developments in this treatise in a more statistical way. In this vein, the first spark of nonparametric estimation may be ascribed to von Mises (1947) and Hoeffding (1948), where the concepts of estimable parameters and regular functionals were clearly formulated in a statistical way. These developments came at least a decade before the spurt of density estimation and, later on, nonparametric regression. I am a bit surprised to see no mention of this genesis in the text or in the bibliography. Likewise, in risk estimation and related topics, there were some good developments in the Soviet school in the 1970s, perhaps a bit earlier than the contemporary developments in the West. Some historical references would have been welcome.

Pranab K. Sen: [email protected] of Statistics & Operations Research, 338 Hanes Hall, CB# 3260,

The University of North Carolina at Chapel Hill, NC 27599-3260, USA

References

Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Annals of Mathematical Statistics, 19, 293–325.

Mises, R. von (1947). On the asymptotic distribution of differentiable statistical functions. Annals of Mathematical Statistics, 18, 300–348.

Inequalities: Theory of Majorization and Its Applications, Second Edition
Albert W. Marshall, Ingram Olkin, Barry C. Arnold
Springer, 2011, xxvii + 909 pages, €69.95/£62.99/$89.95, hardcover
ISBN: 978-0-387-40087-7

Table of contents

Part I. Theory of Majorization
1. Introduction
2. Doubly stochastic matrices
3. Schur-convex functions
4. Equivalent conditions for majorization
5. Preservation and generation of majorization
6. Rearrangements and majorization
Part II. Mathematical Applications
7. Combinatorial analysis
8. Geometric inequalities
9. Matrix theory
10. Numerical analysis
Part III. Stochastic Applications
11. Stochastic majorizations
12. Probabilistic, statistical, and other applications
13. Additional statistical applications
Part IV. Generalizations
14. Orderings extending majorization
15. Multivariate majorization
Part V. Complementary Topics
16. Convex functions and some classical inequalities
17. Stochastic ordering
18. Total positivity
19. Matrix factorizations, compounds, direct products, and M-matrices
20. Extremal representations of matrix functions
Biographies

Readership: Researchers, postgraduate students in mathematical sciences.

In his 2007 interview in Statistical Science, Ingram Olkin is glad to tell that he has a lot of coauthors, but also one co-author with whom he has been involved for more than 50 years, Albert W. Marshall; Olkin describes in an interesting way how their cooperation started and how, from 1967 to when the first edition of the book on inequalities appeared in 1979 (a 12 or 13 year period), they were collecting results, working together, and ultimately wrote the book. In the Preface of the second edition Marshall and Olkin write: "With this background of commitment for the first edition, it was clear to both of us that an appropriate revision would be a major commitment that we were reluctant to consider undertaking alone. Fortunately, we had the wisdom to invite Barry Arnold to join us". Let us also cite Barry Arnold, who in the abstract of his article "Majorization: Here, There and Everywhere" (Statistical Science, 2007) states: "The appearance of Marshall and Olkin's 1979 book on inequalities with special emphasis on majorization generated a surge of interest in potential applications of majorization and Schur convexity in a broad spectrum of fields. After 25 years this continues to be the case".

Now here we have the second edition of the praised classic without which, I know, some people never leave home; these faithful ones must now take into account that the second edition has 909 pages (vs. 569) and a shipping weight of 3.2 pounds (vs. 2.2).

As the authors state, since 1979 many new applications of the majorization ordering have appeared, and this revision attempts to bring these uses to the fore so that the reader can see the extent and variation in its use. The chapters of the original version remain intact; additions appear within the text and as supplements at the end of chapters. The bibliography has increased by over 50%. A large new addition is the discussion of Lorenz curves. Needless to say, the revised volume continues as a celebrated classic.

Simo Puntanen: [email protected]
School of Information Sciences,

FI-33014, University of Tampere, Finland

Medical Applications of Finite Mixture Models
Peter Schlattmann
Springer, 2009, x + 246 pages, £58.99/€64.95/$89.95, hardcover (also available as softcover)
ISBN: 978-3-540-68650-7

Table of contents

1. Overview of the book
2. Introduction: heterogeneity in medicine
3. Modelling count data
4. Theory and algorithms
5. Disease mapping and cluster investigations
6. Modelling heterogeneity in psychophysiology
7. Investigating and analyzing heterogeneity in meta-analysis
8. Analysis of gene expression data

Readership: This lucid and consistent presentation should be welcomed by researchers across the greater domain of biomedical research.

This impressive monograph covers the use of finite mixture models in a variety of biomedical problems, illustrated by appropriate case studies. The book comprises 8 chapters, identifying the basic heterogeneity that prevails in medical research and the use of explanatory variables (or covariates), with specific reference to some problems in the study of so-called PBPK (pharmacokinetic) models. Existing statistical methodology and useful computational algorithms are discussed in Chapters 3 and 4. The rest of the book treats disease mapping, psychophysiology, and gene expression data, areas where mixture models have been appraised, including a treatment of the meta-analysis that is common in such studies.

The statistical analysis is presented consistently at an intermediate level, so that researchers in the broader biomedical field, who constitute the general audience of this book, can better appreciate the rationale of statistical modelling and analysis.

Finite mixture models have an affinity to Bayes methodology, without being in a strict parametric setup or invoking a conjugate prior. Yet, in this setup, there is a pertinent question: can such a simple structure be assumed in a complex biological system which may be marred by structural constraints, non-normal variation, and manipulations of data collection? For example, in Chapter 8, in the context of gene expression data models, it has been tacitly assumed that the response variables are normally distributed so that conventional t-tests can be used. Because such microarray data are far from elementary (and are subject to various layers of data manipulation and standardization), the conventional normality assumption may not stand up well. Further, in high dimensions, such as arise in microarray data models, and in view of small sample sizes, the conventional t-test may turn out to be highly non-robust to model departures. Some nonparametric tests are better competitors in this respect, though they lead to discrete distributions for p-values, requiring special care in handling their attained level. This difficulty may also arise in meta-analysis. The use of finite mixture models in these contexts may also result in non-robustness.
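
The remark about discrete p-values is worth making concrete: an exact rank test with small samples can attain only a short list of p-values, so a nominal level such as 0.05 is usually not exactly achievable. A small sketch (my own illustration, not from the book) that enumerates the exact one-sided Wilcoxon rank-sum distribution:

```python
from itertools import combinations
from fractions import Fraction

def attainable_p_values(n, m):
    """One-sided p-values attainable by the exact Wilcoxon rank-sum
    test with group sizes n and m (assuming no ties)."""
    ranks = range(1, n + m + 1)
    # Rank-sum of the first group under every equally likely assignment.
    sums = [sum(c) for c in combinations(ranks, n)]
    total = len(sums)
    # P(W >= w) for each achievable value w of the statistic.
    return sorted({Fraction(sum(s >= w for s in sums), total) for w in set(sums)})

ps = attainable_p_values(4, 4)
print([float(p) for p in ps])  # the only p-values the test can ever report
print(any(abs(float(p) - 0.05) < 1e-9 for p in ps))  # → False: exact 0.05 unattainable
```

With two groups of four, only 17 one-sided p-values (all multiples of 1/70) are attainable, which is exactly the "special care" in handling the attained level that the review mentions.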

Pranab K. Sen: [email protected]
Department of Statistics & Operations Research, 338 Hanes Hall, CB# 3260,

The University of North Carolina at Chapel Hill, NC 27599-3260, USA

Statistics in Plain English, Third Edition
Timothy C. Urdan
Routledge, 2010, xii + 211 pages, £19.95/$32.95, softcover
ISBN: 978-0-415-87291-1

Table of contents

1. Introduction to social science research principles and terminology
2. Measures of central tendency
3. Measures of variability
4. The normal distribution
5. Standardization and z scores
6. Standard errors
7. Statistical significance, effect size, and confidence intervals
8. Correlation
9. t Tests
10. One-way analysis of variance
11. Factorial analysis of variance
12. Repeated-measures analysis of variance
13. Regression
14. The chi-square test of independence
15. Factor analysis and reliability analysis: data reduction techniques
Appendix A: Area under the normal curve beyond z
Appendix B: Critical values of the t distribution
Appendix C: Critical values of the F distribution
Appendix D: Critical values of the studentized range statistic (for the Tukey HSD test)
Appendix E: Critical values of the χ2 distributions

Readership: Students of introductory statistics.

The concept of a support text describing statistical ideas in simple language has been refined by Urdan across three editions now. Reading it is like reading one of those high school exam revision guides: plain English, references forward and backward to where topics are discussed, all the steps in a calculation described carefully. The book has a very clear purpose, to be a supplement to a standard colourful first-year textbook. I would say Urdan achieves his aims extremely well. There are no exercises, but there are nice sections in each chapter giving a template for reporting the results of each analysis. Each chapter contains an example that is worked through with care, and a glossary appears in each chapter (sometimes terms appear in multiple chapters with slightly different definitions!). The website for the book provides the questions for a survey used several times as an example (and may even provide the data if you have SPSS on your computer, which I didn't).

The fifteen chapters comprise all of the standard topics in an introductory statistics course, although many such courses include design of experiments and sampling schemes, which are not covered here. The later topics more common in social science research, such as factorial and repeated-measures ANOVA, regression inference, and factor analysis, are dealt with in less depth. Output from the statistical package SPSS is occasionally used for illustration, but in a fairly generic way, maintaining the book's usefulness for students using a variety of statistical software.

So what's not to like? I thought the definition of a distribution was a bit clunky. I didn't like seeing H0: μ = x̄. And I did not like the preference for assuming equal variances in the independent-samples t-test. However, as an overall support package, I would have no hesitation in recommending it to students of introductory statistics.
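
On that last point, the difference between the pooled and the Welch (unequal-variance) t-tests is easy to demonstrate by simulation; the sketch below is my own illustration, not material from the book. When the smaller sample has the larger variance, the pooled test's type I error rate drifts well above the nominal 5% while Welch's test stays close to it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, alpha = 2000, 0.05
reject_pooled = reject_welch = 0

for _ in range(n_sims):
    # The null is true (equal means), but the smaller sample has the
    # larger variance: the worst case for the pooled test.
    a = rng.normal(0.0, 1.0, size=30)
    b = rng.normal(0.0, 4.0, size=8)
    reject_pooled += stats.ttest_ind(a, b, equal_var=True).pvalue < alpha
    reject_welch += stats.ttest_ind(a, b, equal_var=False).pvalue < alpha

print(f"type I error: pooled {reject_pooled / n_sims:.3f}, "
      f"Welch {reject_welch / n_sims:.3f}")
```

This is why defaulting to the equal-variance test in an introductory text is a questionable habit to teach.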

Alice Richardson: [email protected]
Faculty of Information Sciences and Engineering,

University of Canberra, ACT 2601, Australia

Logistic Regression: A Self-Learning Text, Third Edition
David G. Kleinbaum, Mitchel Klein
Springer, 2010, xvii + 701 pages, £81.00/€89.95/$99.00, hardcover
ISBN: 978-1-4419-1741-6

Table of contents

1. Introduction to logistic regression
2. Important special cases of the logistic model
3. Computing the odds ratio in logistic regression
4. Maximum likelihood techniques: an overview
5. Statistical inferences using maximum likelihood techniques
6. Modelling strategy guidelines
7. Modelling strategy for assessing interaction and confounding
8. Additional modelling strategy issues
9. Assessing goodness of fit for logistic regression
10. Assessing discriminatory performance of a binary logistic model: ROC curves
11. Analysis of matched data using logistic regression
12. Polytomous logistic regression
13. Ordinal logistic regression
14. Logistic regression for correlated data: GEE
15. GEE examples
16. Other approaches for analysis of correlated data
Appendix: Computer programs for logistic regression

Readership: Researchers, undergraduate students, graduate students.

The third edition of this book continues the authors' tradition of a two-column book that really does act as a self-learning text. The left-hand column is like a collection of PowerPoint slides, including generic-style computer output and diagrams to visualize the relationships between concepts.

Each chapter contains about 10 exercises, some routine calculation and some asking for explanation of particular points. Answers are provided immediately. Tests consist of about 20 multiple-choice questions and about 15 longer questions that mirror the exercises. There are fewer multiple-choice questions as the chapters progress, and answers are given at the back of the book.

The reference list includes about 40 items and has been updated to include publications up to 2008. The authors' website (http://www.sph.emory.edu/∼dkleinb/logreg2.htm) provides links to the data sets that are used for illustration throughout the book.

Alice Richardson: [email protected]
Faculty of Information Sciences and Engineering,

University of Canberra, ACT 2601, Australia

Multiple Comparisons Using R
Frank Bretz, Torsten Hothorn, Peter Westfall
Chapman & Hall/CRC, 2011, xvii + 187 pages, £49.99/$79.95, hardcover
ISBN: 978-1-58488-574-0

Table of contents

1. Introduction
2. General concepts
3. Multiple comparisons in parametric models
4. Applications
5. Further topics

Readership: Researchers and users of multiple comparisons in R.

An excellent first chapter gently explains to the reader why multiple comparison techniques are required. Successive chapters delve into more and more sophisticated scenarios for multiple comparisons. The benefits of applying multiple testing in the context of ordinary linear models are, in my opinion, not often stressed in undergraduate statistics courses; it was refreshing to see an entire chapter devoted to this idea.

The more than 150 references make this an excellent entry point into the literature, but there are no exercises at the end of each chapter.

The main statistical tool employed is of course R, in particular the package multcomp. The preface confidently states that the package includes demos relevant to each chapter of the book. Make sure you are close to an Internet connection before you begin, as I found that some of the demos required the installation of extra packages.

The preface states that the book's main point of difference is its concentration on maximum statistics, but that did not come through too clearly in my reading. I found more of an emphasis on closure and nested hypotheses, both interesting concepts in their own right, of course.

Do not buy this book if you are expecting a thorough introduction to multiple testing in the context of microarrays and the false discovery rate of Benjamini and Hochberg (1995): there is only one reference to that technique, in Chapter 2. More importantly, the preface clearly states who will benefit from buying this book. Do consider it if you are a user of R who is also a researcher or teacher of linear modelling and needs to apply multiple comparisons in their work. Your work will be the better for it.
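
For the record, the Benjamini–Hochberg step-up procedure referred to here is simple to state: with m ordered p-values p(1) ≤ ... ≤ p(m), find the largest k with p(k) ≤ kα/m and reject the hypotheses with the k smallest p-values. A minimal sketch (my own, unrelated to the book's multcomp-based R code):

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where H0 is rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Step-up: largest (1-based) rank k with p_(k) <= k * alpha / m, if any.
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank
    # Reject the hypotheses with the k smallest p-values.
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.74]))
# → [True, True, False, False, False, False, False]
```

Note the step-up character: once the largest qualifying k is found, all k smallest p-values are rejected, even those that individually exceed their own threshold.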

Alice Richardson: [email protected]
Faculty of Information Sciences and Engineering,

University of Canberra, ACT 2601, Australia

References

Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57, 289–300.

Introduction to Psychometric Theory
Tenko Raykov, George A. Marcoulides
Routledge, 2011, xii + 335 pages, £44.95/$75.00, hardcover
ISBN: 978-0-415-87822-7

Table of contents

1. Measurement, measuring instruments, and psychometric theory
2. Basic statistical concepts and relationships
3. An introduction to factor analysis
4. Introduction to latent variable modeling and confirmatory factor analysis
5. Classical test theory
6. Reliability
7. Procedures for estimating reliability
8. Validity
9. Generalizability theory
10. Introduction to item response theory
11. Fundamentals and models of item response theory
Appendix: A brief introduction to some graphics applications of R in item response modeling

Readership: "Advanced undergraduate students, graduate students, and researchers in the behavioural, social, educational, marketing, business and biomedical disciplines". And I would add statisticians who wish to see a particular domain of application of these important statistical modelling ideas.

This book gives an impressively clear introduction to the ideas underlying modern psychometric theory, emphasizing the core concept of latent variable modelling. It assumes essentially no statistical knowledge, with a chapter introducing things from the level of the calculation of a sample mean, defining matrix operations, etc. This chapter also includes a brief introduction to R. Later chapters illustrate the use of other software packages, such as SPSS and the specialized latent variable modelling language Mplus.

The third chapter gives an admirably clear outline of factor analysis, noting the difference between that and principal components analysis and including coverage of such matters as rotation and the determination of the number of factors. The distinction between exploratory and confirmatory factor analysis is also touched on, but is explored in more depth in Chapter 4. The key topics of psychometric theory are all discussed: factor analysis, reliability and validity assessment, classical test theory, generalizability theory, and item response theory. Inevitably, in a book of finite length, some topics are not covered. Incomplete data is one such topic, of potential importance when one comes to applying the methods in practice. The authors recognize this importance, and in an epilogue draw attention to it, directing the reader to appropriate literature.

A useful discussion notes some misconceptions, such as the attribution of the term "model" to descriptions which have not imposed any constraining structure, the belief that classical test theory assumes independence of various kinds, and the assertion that the theory requires particular measurement properties of the underlying scale. A reader new to the area, or indeed to statistical modelling in general, who reads and thinks about the points being made here will gain deep insight.

From a broader statistical perspective, it would have been attractive to have some explicit links to other literatures touching on some of the topics covered here, such as multilevel models for generalizability theory and the distinction between "pragmatic" and "representational" aspects of measurement theory; but as an introduction for its intended audience, perhaps this is not too critical.

If I have a criticism of the book, it is the very minor one that the index is disappointingly sparse. Overall, however, the book is to be recommended as giving a thoughtful (and, in just 335 pages, surprisingly comprehensive) introduction to psychometric theory. I would certainly recommend it to students or other researchers, statisticians and otherwise, who wish to gain understanding and insight into modern psychometric theory.

David J. Hand: [email protected]
Mathematics Department, Imperial College,

London SW7 2AZ, UK

Linear Causal Modeling with Structural Equations
Stanley A. Mulaik
Chapman & Hall/CRC, 2009, xxiii + 444 pages, £50.99/$82.95, hardcover
ISBN: 978-1-4398-0038-6

Table of contents

1. Introduction
2. Mathematical foundations for structural equation modelling
3. Causation
4. Graph theory for causal modelling
5. Structural equation models
6. Identification
7. Estimation of parameters
8. Designing SEM studies
9. Confirmatory factor analysis
10. Equivalent models
11. Instrumental variables
12. Multilevel models
13. Longitudinal models
14. Non-recursive models
15. Model evaluation
16. Polychoric correlation and polyserial correlation

Readership: Quantitative methodologists, graduate students in methodology programmes, and others seeking a deeper understanding of causation, linear causal modelling, and structural equation modelling.

Causality has become a hot topic amongst statisticians and researchers from related disciplines in recent years, where the practical opportunities provided by the computer have led to significant advances after centuries of rather slow progress.

The first quarter of this book lays the foundations, first of the mathematics which will be needed later in the book, and second of the philosophical aspects of causality. So, in principle, little mathematical knowledge is needed: it includes a (55-page) chapter which begins from a very basic level (e.g., multiplication of a vector by a scalar, using calculus to find the maximum of a function). In practice, of course, and as is always the case, some prior familiarity with the mathematical tools will probably be necessary to make the remainder of the book comfortably accessible. It then proceeds through an excellent (48-page) detailed historical presentation of the ideas and philosophy of causation, and the relationship between causation and correlation, beginning with the ancient Greeks and working through Descartes, Locke, Berkeley, Hume, Kant, and so on up to present-day contributions.

It goes on to describe modern approaches, models, and also estimation and model evaluation, and includes chapters on more specialised aspects of causal issues, such as multilevel models and longitudinal models.

The book benefits very substantially from the author's mixed background in multivariate analysis, psychometrics, and philosophy of science, a background which is ideally suited to the eclectic issues raised by considerations of causality.




I am sure the volume will prove to be a very useful contribution to the literature, an excellent text for someone intending to research in this area, and a useful reference source for those already doing so.

David J. Hand: [email protected]
Mathematics Department, Imperial College, London SW7 2AZ, UK

Robust Nonparametric Statistical Methods, Second Edition
Thomas P. Hettmansperger, Joseph W. McKean
Chapman & Hall/CRC, 2011, xvii + 535 pages, £63.99/$99.95, hardcover
ISBN: 978-1-4398-0908-2

Table of contents

1. One-sample problems
2. Two-sample problems
3. Linear models
4. Experimental designs: fixed effects
5. Models with dependent error structure
6. Multivariate
Appendix: Asymptotic results

Readership: Advanced undergraduate students, graduate students, statisticians, and applied statisticians interested in and/or using nonparametric and robust methods.

Rank-based methods offer a unified, robust, and highly efficient approach for modern data analysis. The analysis, as described in this book, proceeds very much as does the traditional analysis based on the normal error distribution. The regular L-2 norm is just replaced by different weighted L-1 norms. The optimal weights that maximize the efficiency of the tests and estimates depend on the underlying error distribution.
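To make the norm substitution concrete, a sketch of the standard rank-based criterion may help; the notation below is mine (following the usual score-function conventions of this literature), not taken verbatim from the book.

```latex
% Classical least squares minimizes the L2 criterion
\hat\beta_{LS} = \arg\min_{\beta} \sum_{i=1}^{n} \bigl(y_i - x_i^{\top}\beta\bigr)^2 .
% Rank-based fitting replaces the squared-error norm by a weighted
% L1-type pseudo-norm, in which each residual is weighted by a score
% attached to its rank:
\|v\|_{\varphi} = \sum_{i=1}^{n} a\bigl(R(v_i)\bigr)\, v_i ,
\qquad a(i) = \varphi\!\left(\tfrac{i}{n+1}\right),
% where R(v_i) is the rank of v_i among v_1,\dots,v_n and \varphi is a
% nondecreasing score function. The rank-based estimate minimizes
\hat\beta_{\varphi} = \arg\min_{\beta} \|\,y - X\beta\,\|_{\varphi} ,
% and the choice of \varphi (hence the weights) governs the efficiency
% of the resulting tests and estimates for a given error distribution.
```

With Wilcoxon scores, for example, this criterion reduces to a weighted L1 measure of the pairwise differences of residuals, which is what makes the fit robust to heavy-tailed errors.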

The book covers univariate tests and estimates for one-sample and two-sample location models, with extensions to linear models and fixed-effects experimental designs. This second edition extends analyses based on ranks to multivariate models, non-linear models, time series models, and models with dependent error structures (mixed models). As in the first edition, the rank-based methods are presented throughout the book as L-1 norm based data fitting and inference. The theory, with careful proofs of the asymptotic results, is fully developed. The authors also illustrate the implementation of the methods using many real-world examples and R software. The methods described in the book can be applied using R libraries and functions also made available by the authors.

This book gives an excellent treatment of modern rank-based methods, with special attention to their practical application to data. This is not at all a surprise, as the authors are among the leading researchers in rank-based nonparametric methods. The book is a welcome, highly up-to-date, and very readable contribution to the field. It will certainly become a standard reference for nonparametric and robust methods, and I recommend it as an important acquisition for research libraries. The book will soon find its place on the shelves and tables of many kinds of researchers and will serve as a graduate course textbook.

Hannu Oja: [email protected]
School of Health Sciences, FI-33014 University of Tampere, Finland




GARCH Models: Structure, Statistical Inference and Financial Applications
Christian Francq, Jean-Michel Zakoian
Wiley, 2010, xiv + 489 pages, £65.00/€78.00/$95.00, hardcover
ISBN: 978-0-470-68391-0

Table of contents

1. Classical time series models and financial series

Part I: Univariate GARCH Models
2. GARCH(p, q) processes
3. Mixing
4. Temporal aggregation and weak GARCH models

Part II: Statistical Inference
5. Identification
6. Estimating ARCH models by least squares
7. Estimating GARCH models by quasi-maximum likelihood
8. Tests based on the likelihood
9. Optimal inference and alternatives to the QMLE

Part III: Extensions and Applications
10. Asymmetries
11. Multivariate GARCH processes
12. Financial Applications

Part IV: Appendices
A. Ergodicity, martingales, mixing
B. Autocorrelation and partial autocorrelation
C. Solutions to the exercises
D. Problems

Readership: Graduate students and researchers who have some knowledge and experience in time series analyses and have an interest in advanced theory and financial applications.

This book is written by two experienced researchers who are both active in time series analysis, statistics, and econometrics. Since the Econometrica paper of Engle (1982), the 2003 Nobel laureate, ARCH and GARCH models have become extremely influential in financial econometrics and at the interface between finance and statistics, with significant real-world applications. The book presents a comprehensive approach to the models, including distributional properties, identification, estimation, testing, optimal inference, asymmetric and multivariate extensions, and financial applications in option pricing and value at risk.
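For readers new to the area, the canonical GARCH(1,1) specification may be worth recalling; the notation below is the standard one, not necessarily the book's own conventions.

```latex
% A GARCH(1,1) process for a return (or error) series \varepsilon_t:
\varepsilon_t = \sigma_t \eta_t ,
\qquad \eta_t \ \text{i.i.d.}, \quad \mathbb{E}[\eta_t] = 0, \quad \mathbb{E}[\eta_t^2] = 1,
% with the conditional variance driven by the previous squared shock
% and the previous conditional variance:
\sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2 ,
\qquad \omega > 0, \quad \alpha \ge 0, \quad \beta \ge 0 .
% The condition \alpha + \beta < 1 gives a finite unconditional
% variance \omega / (1 - \alpha - \beta); ARCH(q) is the special case
% \beta = 0, and GARCH(p, q) adds further lags of both terms.
```

This recursion is what produces the volatility clustering seen in financial series: large shocks raise the conditional variance, which in turn makes further large shocks more likely.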

This book indeed provides an up-to-date coverage of current research in these areas. In addition to definitions and theorems, the chapters give an appropriate collection of examples, applications, and exercises with solutions, and sometimes explanatory remarks on the theoretical results introduced. The chapters also include bibliographical notes. These features help this book to be accessible to those who wish to become familiar with the modelling techniques devoted to financial time series. There is an author website, http://perso.univ-lille3.fr/~cfrancq/Christian-Francq/book-GARCH.html, where data sets and programs in R, Mathematica, and Fortran are provided.

Shuangzhe Liu: [email protected]
Faculty of ISE, University of Canberra, Canberra, ACT 2601, Australia

Reference

Engle, R. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica, 50, 987–1008.
