geoenv vi – geostatistics for environmental...

30
geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL APPLICATIONS

Upload: vantruc

Post on 19-Mar-2018

217 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL APPLICATIONS

Page 2: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Quantitative Geology and Geostatistics

VOLUME 15

The titles published in this series are listed at the end of this volume.

Page 3: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

geoENV VI – GEOSTATISTICSFOR ENVIRONMENTAL

APPLICATIONS

Proceedings of the Sixth European Conferenceon Geostatistics for Environmental Applications

Edited by

AMÍLCAR SOARES

Instituto Superior Técnico,Lisbon, Portugal

MARIA JOÃO PEREIRA

Instituto Superior Técnico,Lisbon, Portugal

and

ROUSSOS DIMITRAKOPOULOS

McGill University, Montreal, Canada

Page 4: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Editors

Amílcar SoaresInstituto Superior TécnicoLisbon, Portugal

Maria Joao PereiraInstituto Superior TécnicoLisbon, Portugal

Roussos DimitrakopoulosMcGill UniversityMontreal, Canada

ISBN: 978-1-4020-6447-0 e-ISBN: 978-1-4020-6448-7

Library of Congress Control Number: 2007936934

c© 2008 Springer Science+Business Media B.V.No part of this work may be reproduced, stored in a retrieval system, or transmittedin any form or by any means, electronic, mechanical, photocopying, microfilming, recordingor otherwise, without written permission from the Publisher, with the exceptionof any material supplied specifically for the purpose of being enteredand executed on a computer system, for exclusive use by the purchaser of the work.

Cover illustrations by A. Soares

Printed on acid-free paper

springer.com

Page 5: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Contents

Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

Organizing Committee and International Scientific Committee . . . . . . . . . . . . . . xi

Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

Part I Environment and Health

Geostatistical Analysis of Health Data: State-of-the-Art and PerspectivesP. Goovaerts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

Early Detection and Assessment of Epidemics by Particle FilteringC. Jégat, F. Carrat, C. Lajaunie and H. Wackernagel . . . . . . . . . . . . . . . . . . . . . . 23

Improvement of Forecast Noise Levels in ConfinedSpaces by Means of Geostatistical MethodsG. A. Degan, D. Lippiello, M. Pinzari and G. Raspa . . . . . . . . . . . . . . . . . . . . . . 37

Geostatistical Modeling of Environmental Sound PropagationO. Baume, H. Wackernagel, B. Gauvreau, F. Junker, M. Bérengierand J.-P. Chilès . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

Geostatistical Estimation of Electromagnetic ExposureY. O. Isselmou, H. Wackernagel, W. Tabbara and J. Wiart . . . . . . . . . . . . . . . . . . 59

How Spatial Analysis Can Help in Predictingthe Level of Radioactive Contamination of CerealsC. Mercat-Rommens, J.-M. Metivier, B. Briand and V. Durand . . . . . . . . . . . . . . 71

Stochastic Modelling Applied to Air Quality Space-Time CharacterizationA. Russo, R. M. Trigo and A. Soares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Automatic Mapping Algorithm of Nitrogen Dioxide Levelsfrom Monitoring Air Pollution Data Using ClassicalGeostatistical Approach: Application to the French Lille CityG. Cardenas and E. Perdrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

v

Page 6: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

vi Contents

King Prawn Catch by Grade Category from an Economicand a Stock Management PerspectiveU. Mueller, L. Bloom, M. Kangas and N. Caputi . . . . . . . . . . . . . . . . . . . . . . . . . . 103

Part II Hydrology

Machine Learning Methods for Inverse ModelingD. M. Tartakovsky, A. Guadagnini and B. E. Wohlberg . . . . . . . . . . . . . . . . . . . . . 117

Statistical Moments of Reaction Rates in Subsurface ReactiveSolute TransportX. Sanchez-Vila, A. Guadagnini, M. Dentz and D. Fernàndez-Garcia . . . . . . . . 127

Including Conceptual Model Information when KrigingHydraulic HeadsM. Rivest, D. Marcotte and P. Pasquier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

Effect of Sorption Processes on Pump-and-Treat RemediationPractices Under Heterogeneous ConditionsM. Riva, A. Guadagnini and X. Sanchez-Vila . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

A Stochastic Approach to Estimate Block Dispersivitiesthat Includes the Effect of Mass Transfer Between Grid BlocksD. Fernàndez-Garcia and J. J. Gómez-Hernández . . . . . . . . . . . . . . . . . . . . . . . . 165

Fracture Analysis and Flow Simulations in the RoselendFractured GraniteD. Patriarche, E. Pili, P. M. Adler and J.-F. Thovert . . . . . . . . . . . . . . . . . . . . . . . 175

Assessment of Groundwater Salinisation Risk UsingMultivariate GeostatisticsA. Castrignanò, G. Buttafuoco and C. Giasi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

A MultiGaussian Kriging Application to the EnvironmentalImpact Assessment of a New Industrial Site in Alcoy (Spain)J. R. Ilarri and J. J. Gómez-Hernández . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

Hydrogeological Modeling of Radionuclide Transport in HeterogeneousLow-Permeability Media: A Comparison Between Boom Clayand Ieper ClayM. Huysmans and A. Dassargues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

Topological Kriging of RunoffJ. O. Skøien and G. Blöschl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

Page 7: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Contents vii

Part III Meteorology

Quantifying the Impact of the North Atlantic Oscillation on Western IberiaR. M. Trigo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

Monthly Average Temperature ModellingM. Andrade-Bejarano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

Improving the Areal Estimation of Rainfall in Galicia (NW Spain)Using Digital Elevation InformationJ. M. M. Avalos and A. P. González . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

Identification of Inhomogeneities in PrecipitationTime Series Using Stochastic SimulationA. C. M. Costa, J. Negreiros and A. Soares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275

Bayesian Classification of a Meteorological Risk Indexfor Forest Fires: DSRR.M. Durão, A. Soares, J.M.C. Pereira, J.A. Corte-Realand M.F.E.S. Coelho . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283

Part IV Remote Sensing

Super Resolution Mapping with Multiple Point GeostatisticsA. Boucher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

Super-Resolution Mapping Using the Two-PointHistogram and Multi-Source ImageryP. M. Atkinson . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

Dating Fire Events on End of Season Maps of Burnt ScarsT. J. Calado and C. C. DaCamara . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

Influence of Climate Variability on Wheat Production in PortugalC. Gouveia and R. M. Trigo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335

Part V Soil

Joint Simulation of Mine Spoil Uncertainty for RehabilitationDecision MakingR. Dimitrakopoulos and S. Mackie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349

Temporal Geostatistical Analyses of N2O Fluxes from DifferentlyTreated SoilsJ. M. M. Avalos, A. Furon, C. Wagner-Riddle and A. P. González . . . . . . . . . . . . 361

Page 8: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

viii Contents

Zinc Baseline Level and its Relationship with Soil Texturein Flanders, BelgiumT. Meklit, M. V. Meirvenne, F. Tack, S. Verstraete, E. Gommerenand E. Sevens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

Assessing the Quality of the Soil by Stochastic SimulationA. Horta, J. Carvalho and A. Soares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Interpolation of Soil Moisture Content Aided by FDR Sensor ObservationsK. Vanderlinden, J.A. Jiménez, J.L. Muriel, F. Perea, I. Garcíaand G. Martínez . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397

Geostatistics for Contaminated Sites and Soils: Some Pending QuestionsD. D’Or, H. Demougeot-Renard and M. Garcia . . . . . . . . . . . . . . . . . . . . . . . . . . 409

Evaluation of an Automatic Procedure Based on GeostatisticalMethods for the Characterization of Contaminated SedimentsG. Raspa, C. Innocenti, F. Marconi, E. Mumelter and A. Salmeri . . . . . . . . . . . . 421

Part VI Methods

Nonlinear Spatial Prediction with Non-Gaussian Data:A Maximum Entropy ViewpointP. Bogaert and D. Fasbender . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445

Data Fusion in a Spatial Multivariate Framework:Trading off Hypotheses Against InformationD. Fasbender and P. Bogaert . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457

The Challenge of Real-Time Automatic Mapping forEnvironmental Monitoring Network ManagementE. J. Pebesma, G. Dubois and D. Cornford . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467

Geostatistical Applications of Spartan Spatial Random FieldsS. N. Elogne and D. T. Hristopulos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477

A New Parallelization Approach for Sequential SimulationH.S. Vargas, H. Caetano and H. Mata-Lima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489

Clustering in Environmental Monitoring Networks:Dimensional Resolutions and Pattern DetectionD. Tuia, C. Kaiser and M. Kanevski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497

Page 9: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Foreword

geoENV: Ten Years Later

The environment has unquestionably become a key topic of focus and concern fortoday’s society, encompassing themes that include sustainable development, cli-mate change, reduction of biological diversity, and carbon emissions along withthe need for new energy paradigms. These themes are no longer the exclusive do-main of academic and scientific exploration. They are now high-priority issues forgovernments and environmental agencies of all industrialized countries because ofthe tremendous effects on the industrialized and, to an even grater extent, devel-oping world. Quantifying and predicting global environmental impacts and risksencompasses political, social, economic as well as technical dimensions, and isnow an integral part of strategic planning for both governments and internationalorganizations.

Geostatistics has become an important set of technical tools for environmen-tal problem-solving, in particular spatial and temporal assessment of uncertaintyof physical/environmental phenomena and related natural resources. Geostatisticshas been applied to a variety of fields from the characterization of desertification,degradation of soil, air and water quality, to the evaluation of health and pollutantspace-time relationships in the field of environmental epidemiology, and the assess-ment of climatic and meteorology for predicting the dynamic of natural phenomena.

Geostatistics for Environmental Applications (geoENV), a series of bi-annualconferences, was started in 1996 with the ambitious goal of bringing together thedisparate geostatisticical community to discuss ideas and methods regarding newand diverse applications in the environmental field. Thanks to everyone involvedin the organization and scientific coordination of the conferences, first in Lisbon,then Valencia, Avignon, Barcelona, Neufchâtel and most recently in Rhodes, thegeoENV international conferences and subsequent publication of selected papershave contributed to maintaining the high standards of scientific quality in approach-ing the diversity of new environmental modeling problems. Ten years after Lisbon,we are proud to see that geoENV has become a well-respected and well-supportedscientific project.

This book marks the first decade of geoENV and reflects the status of the mostup-to-date research in the field as presented in Rhodes. As in past years, scientists

ix

Page 10: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

x Foreword

who approach environmental problems with different methodological perspectivesthan those encountered in our field, were invited to present their methodologiesat geoENV conferences, with the goal of enriching our own field through cross-fertilization with related fields and their approaches, concepts, tools and develop-ments. Ricardo Trigo, the keynote speaker in Rhodes, introduced us to new methodsfor modeling climate change and assessing corresponding impacts on natural re-sources and human health. The additional two keynote papers from Pierre Goovaertsand Philippe Renard presented the state of the art of geostatistical applications inanalyzing public health data and stochastic hydrology, respectively. The 42 papersof this book were presented in oral sessions of Methods, Environment and Health,Soil, Hydrology, Remote Sensing and Meteorology.

We would like to thank to all the authors and reviewers for their outstandingefforts and technical contributions to the present volume.

Amílcar SoaresMaria João Pereira

Roussos Dimitrakopoulos

Page 11: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Organizing Committee geoENV2006

Amilcar Soares, IST, Lisbon, PortugalDenis Allard, INRA, Avignon, FranceHélène Demougeot-Renard, FSS International, Neuchâtel, SwitzerlandJaime Gómez-Hernández, UPV, Valencia, SpainMaria João Pereira, IST, Lisbon, PortugalPascal Monestiez, INRA, Avignon, FrancePhaedon Kyriakidis, UC Santa Barbara, Santa Barbara, USAPhilippe Renard, University of Neuchâtel, SwitzerlandPierre Goovaerts, BioMedware, Inc., USARoland Froidevaux, FSS International, Geneva, SwitzerlandRoussos Dimitrakopoulos, BRC, U. QueenslandXavier Sanches-Vila, UPC, Barcelona, Spain

International Scientific Committee geoENV2006

Aldo FioriAlberto GuadagniniAlexander BoucherAlfred SteinAmilcar SoaresCarla NunesCarol Gotway CrawfordChantal de FouquetChris LoydDenis AllardDenis MarcotteDick BrusDimitri D’OrEdzer PebesmaGeoffrey M. JacquezHans WackernagelHarrie-Jan HendricksHelene DemougeotHenrique G. PereiraHirotaka SaitoJacques RivoirardJaime Gomez-HernandezJean Paul ChilésJennifer MckinleyJorge SousaJosé Almeida

Júlia CarvalhoLaura GuadagniniMarc Van MerveinneMargaret OliverMaria João PereiraMaria-Theresa SchafmeisterMohan SrivastavaMonica RivaMuray LarkOy LeuangthongPascal MonestiezPatrick BogaertPeter AtkinsonPeter KettlewellPhaedon KiriakidisPhilippe RenardPierre GoovaertsRicardo Garcia-HerreraRichard WebsterRoland FroidevauxRoussos DimitrakopoulosSouheil EzzedineVicente SerranoXavier EmeryXavier Sanchez-Vila

xi

Page 12: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Contributors

P.M. AdlerUPMC – Sisyphe, 4 place Jussieu,75252 Paris cedex 05, [email protected]

P.M. AtkinsonSchool of Geography, University ofSouthampton, Highfield,Southampton, SO17 1BJ, [email protected]

O. BaumeDepartment of EnvironmentalSciences, Wageningen University,The [email protected]

M. Andrade-BejaranoSchool of Biological Sciences,Statistics Section, Harry Pitt Building,The University of Reading,RG6 6FN, Reading, [email protected]

M. BérengierSection Acoustique Routière etUrbaine, Laboratoire Centraldes Ponts et Chaussées, 44341Bouguenais, [email protected]

L. BloomSchool of Engineering andMathematics, Edith Cowan University,Perth, Western [email protected]

G. BlöschlInstitute for Hydraulic and WaterResources Engineering, ViennaUniversity of Technology, Karlsplatz13/222, A-1040 Vienna, [email protected]

P. BogaertDepartment of Environmental Sciencesand Land Use Planning,Université catholique de Louvain,[email protected]

A. BoucherDpt. of Geological and EnvironmentalSciences, Stanford University,CA, [email protected]

B. BriandLaboratory of Radioecological Studiesfor Marine and Terrestrial Ecosystems,Institute for Radioprotection andNuclear Safety (IRSN),DEI/SESURE/LERCM, Cadarache,Bld 153, 13105 St-Paul-lez-Durancecedex, France

G. ButtafuocoCNR – Institute for Agricultural andForest Systems in theMediterranean, Via Cavour, 87030Rende (CS), [email protected]

xiii

Page 13: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

xiv Contributors

H. CaetanoCMRP – Centre for ModellingPetroleum Reservoirs, Mining andGeoresources Department, InstitutoSuperior Técnico, Av. RoviscoPais, 1049-001, Lisboa, [email protected]

T.J. CaladoCentro Geofísico da Universidade deLisboa, 1749-016 Lisboa,[email protected]

N. CaputiMarine and Fisheries ResearchLaboratories, Department of Fisheries,North Beach, WA 6020,[email protected]

G. CardenasInstitut National de l’EnvironnementIndustriel et des Risques,Parc Technologique ALATA,BP 2, 60550 Verneuil-en-Halatte,[email protected]

F. CarratINSERM UMR-S 707, Paris, [email protected]

J. CarvalhoEnvironmental Group of the Centre forModelling Petroleum Reservoirs– CMRP, Instituto Superior Técnico,Lisboa, [email protected]

A. CastrignanòCRA – Agronomic Research Institute,Bari, [email protected]

J.-P. ChilèsCentre de Géosciences, École desMines de Paris, 77305Fontainebleau, [email protected]

M.F.E.S. CoelhoInstituto de Meteorologia, Lisboa,[email protected]

D. CornfordNeural Computing Research Group,Aston University, Birmingham B4 [email protected]

J.A. Corte-RealCentro de Geofísica de Évora,Universidade de Évora,[email protected]

A.C.M. CostaISEGI, Universidade Nova de Lisboa,Campus de Campolide, 1070-312Lisbon, [email protected]

C.C. DaCamaraCentro Geofísico da Universidade deLisboa, 1749-016 Lisboa,Portugal

A. DassarguesApplied Geology and Mineralogy,Department of Geology-Geography,Katholieke Universiteit Leuven,Belgium; Hydrogeology andEnvironmental Geology, University ofLiège, [email protected]

G.A. DeganDipartimento di Ingegneria Meccanicae Industriale; Universitàdegli studi Roma Tre, Rome, [email protected]

Page 14: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Contributors xv

H. Demougeot-RenardFSS International r&d 1956, Av. RogerSalengro, 92370 Chaville,[email protected]

M. DentzDepartment of GeotechnicalEngineering and Geosciences,Technical University of Catalonia,Gran Capità S/N, 08034 Barcelona,[email protected]

R. DimitrakopoulosDepartment of Mining and MaterialsEngineering, McGillUniversity, Montreal, Qc, CanadaH3A [email protected]

D. D’OrFSS International r&d 1956, Av. RogerSalengro, 92370 Chaville, [email protected]

G. DuboisEuropean Commission – DG JointResearch Centre, Institute forEnvironment and Sustainability, Ispra,[email protected]

V. DurandLaboratory of radioecological studiesfor marine and terrestrial ecosystems,Institute for Radioprotection andNuclear Safety (IRSN),DEI/SESURE/LERCM, Cadarache,Bld 153, 13105 St-Paul-lez-Durancecedex, France

R.M. DurãoCentro de Geofísica de Évora,Universidade de Évora, [email protected]

S.N. ElogneDepartment of Mineral ResourcesEngineering, Technical University ofCrete, Chania 73100, [email protected]

D. FasbenderDepartment of Environmental Sciencesand Land Use Planning,Université catholique de Louvain,[email protected]

D. Fernàndez-GarciaPolytechnic University of Valencia,Ingeniería Hidráulica yMedio Ambiente, Camino de Vera s/n.,46022 Valencia, [email protected]

A. FuronDept. of Land Resource ScienceUniversity of Guelph, Guelph,Ontario, Canada

I. GarcíaIFAPA, Centro “Las Torres-Tomejil”,Junta de Andalucía. Ctra.Sevilla-Cazalla km 12.2, 41200 Alcaládel Río (Seville), [email protected]

M. GarciaFSS International r&d 1956, Av. RogerSalengro, 92370 Chaville,[email protected]

B. GauvreauSection Acoustique Routière et Urbaine,Laboratoire Centraldes Ponts et Chaussées,44341 [email protected]

Page 15: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

xvi Contributors

C. GiasiDepartment of Environmental and CivilEngineering, Polytechnic ofBari, [email protected]

J.J. Gómez HernándezPolytechnic University of Valencia,Department of Hydraulic Engineeringand EnvironmentCamino de Vera s/n., 46022 Valencia,[email protected]

E. GommerenOpenbare Afvalstoffenmaatschappijvoor het Vlaamse Gewest (OVAM),Stationstraat 110, 2800 Mechelen,[email protected]

P. GoovaertsBioMedware, Inc. 516 North StateStreet, Ann Arbor, MI 48104 – 1236,[email protected]

C. GouveiaCGUL, Departamento de Física,Faculdade de Ciências,Universidade de Lisboa, Portugal;Escola Superior de Tecnologia,Instituto Politécnico de Setúbal,[email protected]

A. GuadagniniDipartimento di Ingegneria Idraulica,Ambientale, InfrastruttureViarie, Rilevamento (DIIAR),Politecnico di Milano, Piazza L. DaVinci 32, 20133 Milano, [email protected]

A. HortaEnvironmental Group of the Centre forModelling Petroleum Reservoirs –CMRP, Instituto Superior Técnico,Lisboa, [email protected]

D.T. HristopulosDepartment of Mineral ResourcesEngineering, Technical University ofCrete, Chania 73100, [email protected]

M. HuysmansApplied Geology and Mineralogy,Department of Geology-Geography,Katholieke Universiteit Leuven,[email protected]

J.R. IlarriUniversidad Politécnica de Valencia,Instituto de Ingeniería del Agua yMedio Ambiente, Camino de Vera s/n.,46022 Valencia, [email protected]

C. InnocentiCentral Institute for Marine Research,Via di Casalotti, 300,00166 Roma, [email protected]

Y.O. IsselmouFrance Télécom R & D, RESA/FACE,Issy-les-Moulineaux,France; Ecole des Mines, CG –Geostatistics, Fontainebleau, France

C. JégatINSERM UMR-S 707, Paris, France;Ecole des Mines de Paris –Geostatistics group, Fontainebleau,[email protected]

Page 16: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Contributors xvii

J.A. JiménezIFAPA, Centro “Las Torres-Tomejil”,Junta de Andalucía. Ctra.Sevilla-Cazalla km 12.2, 41200 Alcaládel Río (Seville),Spain

F. JunkerDépartement Analyse Mécanique etAcoustique, Electricité de France,[email protected]

M. KangasMarine and Fisheries ResearchLaboratories, Department of Fisheries,Northbeach,Western Australia [email protected]

C. KaiserInstitute of Geography, University ofLausanne, CH-1015 Switzerland

M. KanevskiInstitute of Geomatics and Analysis ofRisk, University of Lausanne, CH-1015Switzerland

C. LajaunieEcole des Mines de Paris – Geostatisticsgroup, Fontainebleau,[email protected]

D. LippielloDipartimento di Ingegneria Meccanicae Industriale; Universitàdegli studi Roma Tre, [email protected]

S. MackieXstrata, Brisbane, Qld, Australia

F. MarconiCentral Institute for Marine Research,via di Casalotti, 300 – 00166 Roma,Italy

D. MarcotteÉcole Polytechnique de Montréal,Département des génies civil,géologique et des mines,C.P. 6079, Succ.Centre-ville, Montréal, Qc, Canada,H3C [email protected]

G. MartínezIFAPA, Centro “Las Torres-Tomejil”,Junta de Andalucía, Ctra.Sevilla-Cazalla km 12.2, 41200 Alcaládel Río (Seville),Spain

H. Mata-LimaCMRP – Centre for ModellingPetroleum Reservoirs, Mining andGeoresources Department, InstitutoSuperior Técnico, Av. RoviscoPais, 1049-001, Lisboa, PortugalDepartamento de Matemática eEngenharias, Universidade daMadeira, 9000-390 Funchal, [email protected]

M. Van MeirvenneDept. Soil Management and Soil Care,Ghent University, Coupure 653,9000 Gent, [email protected]

T. MeklitDept. Soil Management and Soil Care,Ghent University, Coupure 653,9000 Gent, [email protected]

Page 17: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

xviii Contributors

C. Mercat-RommensLaboratory of radioecological studiesfor marine and terrestrial ecosystems,Institute for Radioprotection andNuclear Safety (IRSN),DEI/SESURE/LERCM, Cadarache,Bld 153, 13105 St-Paul-lez-Durancecedex, [email protected]

J.-M. MetivierLaboratory of radioecological studiesfor marine and terrestrial ecosystems,Institute for Radioprotection andNuclear Safety (IRSN),DEI/SESURE/LERCM, Cadarache,Bld 153, 13105 St-Paul-lez-Durancecedex, [email protected]

J.M.M. AvalosFaculty of Sciences, University ofCoruña, A Zapateira 15071,A Coruña, [email protected]

U. MuellerSchool of Engineering andMathematics, Edith Cowan University,Perth, Western [email protected]

E. MumelterCentral Institute for Marine Research,Via di Casalotti, 300,00166 Roma, [email protected]

J.L. MurielIFAPA, Centro “Las Torres-Tomejil”,Junta de Andalucía. Ctra.Sevilla-Cazalla km 12.2, 41200 Alcaládel Río (Seville)[email protected]

J. NegreirosInstituto Superior de Línguas eAdministração, Estradada Correia 53, 1500-210 Lisbon,[email protected]

P. PasquierGolder Associates, 9200, L’Acadieblvd., Montréal, Québec,Canada, H4N 2T2

D. PatriarcheGaz de France361, Avenue du President Wilson,BP 33, 93211 Saint Denis La PlaineCedex, [email protected]

A.P. GonzálezFaculty of Sciences, University ofCoruña, La Zapateira 15071,La Coruña, Spain

E.J. PebesmaGeosciences Faculty,Utrecht University, The [email protected]

E. PerdrixEcole des Mines de Douai, [email protected]

F. PereaIFAPA, Centro “Las Torres-Tomejil”,Junta de Andalucía. Ctra.Sevilla-Cazalla km 12.2, 41200 Alcaládel Río (Seville),Spain

J.M.C. PereiraCartography Centre, Tropical ResearchInstitute, Lisboa, Portugal;Department of Forestry, InstitutoSuperior de Agronomia, Lisboa,[email protected]

Page 18: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Contributors xix

E. PiliCommissariat à l’Energie Atomique,Département AnalyseSurveillance Environnement, BP 12,91680 Bruyères-le-Châtel,[email protected]

M. PinzariDipartimento di Ingegneria Meccanicae Industriale; Universitàdegli studi Roma Tre, [email protected]

G. RaspaDipartimento di Ingegneria Chimica,dei Materiali delle MateriePrime e Metallurgia; La SapienzaUniversità di Roma, Dept. ICMMPM,via Eudossiana, 18 – 00184Roma, [email protected]

M. RivaDipartimento Ingegneria Idraulica,Ambientale, InfrastruttureViarie, Rilevamento (DIIAR),Politecnico di Milano, Piazza L. DaVinci 32, 20133 Milano, [email protected]

M. RivestÉcole Polytechnique de Montréal,Département des génies civil,géologique et des mines, C.P. 6079,Succ. Centre-ville, Montréal, Qc,Canada, H3C [email protected]

A. RussoCMRP, Instituto Superior Técnico, Av.Rovisco Pais, 1049-001Lisboa, [email protected]

A. SalmeriCentral Institute for Marine Research,Via di Casalotti, 300,00166 Roma, [email protected]

X. Sanchez-VilaDepartment of Geotechnical Engineer-ing and Geosciences, TechnicalUniversity of Catalonia, Gran CapitàS/N, 08034 Barcelona,[email protected]

E. SevensOpenbare Afvalstoffenmaatschappijvoor het Vlaamse Gewest (OVAM),Stationstraat 110, 2800 Mechelen,[email protected]

J.O. SkøienDepartment of Physical Geography,Utrecht University, P.O. box 80115,3508 TC Utrecht, The [email protected]

A. SoaresEnvironmental Group of the Centre forModelling Petroleum Reservoirs –CMRP, Instituto Superior Técnico,Lisbon, [email protected]

W. TabbaraSupélec, DRE-LSS, Gif-sur-Yvette,[email protected]

F. TackDept. Soil Management and Soil Care,Ghent University, Coupure 653,9000 Gent, [email protected]

Page 19: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

xx Contributors

D.M. TartakovskyDepartment of Mechanical andAerospace Engineering, University ofCalifornia, San Diego, La Jolla, CA92093, USA; TheoreticalDivision, Los Alamos NationalLaboratory, Los Alamos, NM 87545,[email protected]

J.-F. ThovertLaboratoire de Combustion etDétonique, SP2MI, BP 179, 86960Futuroscope Cedex, [email protected]

R.M. TrigoCGUL, Departamento de Física,Faculdade de Ciências,Universidade de Lisboa, Portugal;Departamento de Engenharias,Universidade Lusófona, Lisboa,[email protected]

D. TuiaInstitute of Geomatics and Analysis ofRisk, University of Lausanne, [email protected]

K. VanderlindenIFAPA, Centro “Las Torres-Tomejil”,Junta de Andalucía. Ctra.Sevilla-Cazalla km 12.2, 41200 Alcaládel Río (Seville),[email protected]

H.S. VargasCMRP – Centre for ModellingPetroleum Reservoirs, Mining andGeoresources Department, InstitutoSuperior Técnico, Av. RoviscoPais, 1049-001, Lisboa, [email protected]

S. VerstraeteDept. Soil Management and Soil Care,Ghent University, Coupure 653,9000 Gent, [email protected]

H. WackernagelCentre de Géosciences, École desMines de Paris, 77305Fontainebleau, FranceEcole des Mines de Paris – Geostatisticsgroup, Fontainebleau,[email protected]

C. Wagner-RiddleDept. of Land Resource Science,University of Guelph, Guelph,Ontario, [email protected]

J. WiartFrance Télécom R & D, RESA/FACE,Issy-les-Moulineaux,[email protected]

B.E. WohlbergTheoretical Division, Los AlamosNational Laboratory, Los Alamos, NM87545, [email protected]

Page 20: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Part IEnvironment and Health

Page 21: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Geostatistical Analysis of Health Data:State-of-the-Art and Perspectives

P. Goovaerts

Abstract The analysis of health data and putative covariates, such as environmental,socio-economic, behavioral or demographic factors, is a promising application forgeostatistics. It presents, however, several methodological challenges that arise fromthe fact that data are typically aggregated over irregular spatial supports and consistof a numerator and a denominator (i.e. population size). This paper presents anoverview of recent developments in the field of health geostatistics, with an em-phasis on three main steps in the analysis of aggregated health data: estimation ofthe underlying disease risk, detection of areas with significantly higher risk, andanalysis of relationships with putative risk factors. The analysis is illustrated usingage-adjusted cervix cancer mortality rates recorded over the 1970–1994 period for118 counties of four states in the Western USA. Poisson kriging allows the filter-ing of noisy mortality rates computed from small population sizes, enhancing thecorrelation with two putative explanatory variables: percentage of habitants livingbelow the federally defined poverty line, and percentage of Hispanic females. Area-to-point kriging formulation creates continuous maps of mortality risk, reducingthe visual bias associated with the interpretation of choropleth maps. Stochasticsimulation is used to generate realizations of cancer mortality maps, which allowsone to quantify numerically how the uncertainty about the spatial distribution ofhealth outcomes translates into uncertainty about the location of clusters of highvalues or the correlation with covariates. Last, geographically-weighted regressionhighlights the non-stationarity in the explanatory power of covariates: the highermortality values along the coast are better explained by the two covariates than thelower risk recorded in Utah.

1 Introduction

Since its early development for the assessment of mineral deposits, geostatisticshas been used in a growing number of disciplines dealing with the analysis of data

P. GoovaertsBioMedware, Inc. 516 North State Street, Ann Arbor, MI 48104-1236, USAe-mail: [email protected]

A. Soares et al. (eds.), geoENV VI – Geostatistics for Environmental Applications, 3–22C© Springer Science+Business Media B.V. 2008 3

Page 22: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

4 P. Goovaerts

distributed in space and/or time. One field that has received little attention in thegeostatistical literature is medical geography or spatial epidemiology, which is con-cerned with the study of spatial patterns of disease incidence and mortality and theidentification of potential “causes” of disease, such as environmental exposure orsocio-demographic factors (Waller and Gotway 2004). This lack of attention con-trasts with the increasing need for methods to analyze health data following theemergence of new infectious diseases (e.g. West Nile Virus, bird flu), the higher oc-currence of cancer mortality associated with longer life expectancy, and the burdenof a widely polluted environment on human health.

Individual humans represent the basic unit of spatial analysis in health research.However, because of the need to protect patient privacy publicly available data areoften aggregated to a sufficient extent to prevent the disclosure or reconstructionof patient identity. The information available for human health studies thus takesthe form of disease rates, e.g. number of deceased or infected patients per 100,000habitants, aggregated within areas that can span a wide range of scales, such ascensus units, counties or states. Associations can then be investigated between theseareal data and environmental, socio-economic, behavioral or demographic covari-ates. Figure 1 shows an example of datasets that could support a study of the impactof demographic and socio-economic factors on cervix cancer mortality. The topmap shows the spatial distribution of age-adjusted mortality rates recorded over the1970-1994 period for 118 counties of four states in the Western USA. The corre-sponding population at risk is displayed in the middle map, either aggregated withincounties or assigned to 25 km2 cells. The bottom maps show two putative explana-tory variables: percentage of habitants living below the federally defined povertyline, and percentage of Hispanic females. Indeed, Hispanic women tend to haveelevated risk of cervix cancer, while poverty reduces access to health care and toearly detection through the Pap smear test in particular (Friedell et al. 1992). Thesesocio-demographic data are available at the census block level and were assigned tothe nodes of a 5 km spacing grid for the purpose of this study (same resolution asthe population map).

A visual inspection of the cancer mortality map conveys the impression that ratesare much higher in the centre of the study area (Nye and Lincoln Counties), as wellas in one Northern California county. This result must however be interpreted withcaution since the population is not uniformly distributed across the study area andrates computed from sparsely populated counties tend to be less reliable, an effectknown as “small number problem” and illustrated by the top scattergram in Fig. 1.The use of administrative units to report the results (i.e. counties in this case) canalso bias the interpretation: had the two counties with high rates been much smallerin size, these high values likely would have been perceived as less problematic. Last,the mismatch of spatial supports for cancer rates and explanatory variables preventstheir direct use in the correlation analysis.

Unlike datasets typically analyzed by geostatisticians, the attributes of interestare here measured exhaustively. Ordinary kriging, the backbone of any geostatisti-cal analysis, thus seems of little use. Yet, I see at least three main applications ofgeostatistics for the analysis of such aggregated data:

Page 23: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Geostatistical Analysis of Health Data 5

1. Filtering of the noise caused by the small number problem using a variant ofkriging with non-systematic measurement errors.

2. Modeling of the uncertainty attached to the map of filtered rates using stochasticsimulation, and propagation of this uncertainty through subsequent analysis, suchas the detection of aggregate of counties (clusters) with significantly higher orlower rates than neighboring counties.

3. Disaggregation of county-level data to map cancer mortality at a resolution com-patible with the measurement support of explanatory variables.

Goovaerts (2005a, 2006a,b) introduced a geostatistical approach to address all threeissues and compared its performances to empirical and Bayesian methods whichhave been traditionally used in health science. The filtering method is based on Pois-son kriging and semivariogram estimators developed by Monestiez et al. (2006) formapping the relative abundance of species in the presence of spatially heterogeneousobservation efforts and sparse animal sightings. Poisson kriging was combined withp-field simulation to generate multiple realizations of the spatial distribution of can-cer mortality risk. A limitation of all these studies is the assumption that the sizeand shape of geographical units, as well as the distribution of the population withinthose units, are uniform, which is clearly inappropriate in the example of Fig. 1.The last issue of change of support was addressed recently in the geostatisticalliterature (Gotway and Young 2002, 2005; Kyriakidis 2004). In its general formkriging can accommodate different spatial supports for the data and the prediction,while ensuring the coherence of the predictions so that disaggregated estimates ofcount data are non-negative and their sum is equal to the original aggregated count.The coherence property needs however to be tailored to the current situation whereaggregated rate data have various degree of reliability depending on the size of thepopulation at risk (Goovaerts, 2006b).

This paper discusses how geostatistics can benefit three main steps of the anal-ysis of aggregated health data: estimation of the underlying disease risk, detec-tion of areas with significantly higher risk, and analysis of relationships with pu-tative risk factors. An innovative procedure is proposed for the deconvolution ofthe semivariogram of aggregated rates and the disaggregation of these rates, ac-counting for heterogeneous population densities and the shape and size of admin-istrative units. The different concepts are illustrated using the cervix cancer data ofFig. 1.

2 Estimating Mortality Risk from Observed Rates

For a given number N of entities vα (e.g. counties), denote the observed mortalityrates as z(vα) = d(vα)/n(vα), where d(vα) is the number of recorded mortality casesand n(vα) is the size of the population at risk. Let us assume for now that all entitiesvα have similar shapes and sizes, with a uniform population density. These entitiescan thus be referenced geographically by their centroids with the vector of spatial

Page 24: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

6 P. Goovaerts

Log10 (counts)

Population at risk (counties)

Cervix cancer mortality

Poverty level Hispanic population

Population at risk (25 km2 cells)

deaths/106 hab.

Log10 (counts)

Fig. 1 Geographical distribution of cervix cancer mortality rates recorded for white females overthe period 1970–1994, and the corresponding population at risk (aggregated within counties orassigned to 25 km2 cells). Scatterplot illustrates the larger variance of rates computed from sparselypopulated counties. Bottom maps show two putative risk factors: percentage of habitants livingbelow the federally defined poverty line, and percentage of Hispanic females

coordinate’s uα = (xα, yα). The disease count d(uα) is interpreted as a realizationof a random variable D(uα) that follows a Poisson distribution with one parameter(expected number of counts) that is the product of the population size n(uα) by thelocal risk R(uα), see Goovaerts (2005a) for more details.

Page 25: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Geostatistical Analysis of Health Data 7

In Poisson kriging (PK), the risk over a given entity vα is estimated as a linearcombination of the kernel rate z(uα) and the rates observed in (K -1) neighboringentities:

rP K (uα) =K∑

i=1

λi (uα)z(ui ) (1)

where λi (uα) is the weight assigned to the rate z(ui) when estimating the risk at uα .The K weights are the solution of the following system of linear equations:

K∑j=1

λ j (uα)[CR(ui − u j ) + δi j

m∗n(ui )

]+ μ(uα) = CR(ui − uα) i = 1,. . . ,K

K∑j=1

λ j (uα) = 1(2)

where δi j =1 if ui=uj and 0 otherwise, and m* is the population-weighted mean ofthe N rates. The addition of an “error variance” term, m*/n(ui), for a zero distanceaccounts for variability arising from population size, leading to smaller weights forless reliable data (i.e. measured over smaller populations). The prediction varianceassociated with the estimate (1) is computed using the traditional formula for theordinary kriging variance:

σ 2P K (uα) = CR(0) −

K∑

i=1

λi (uα)CR(ui − uα) − μ(uα) (3)

The computation of kriging weights and kriging variance (Equations (2) and (3))requires knowledge of the covariance of the unknown risk, CR(h), or equivalently itssemivariogram γ R(h)=CR(0)- CR(h). Following Monestiez et al. (2006) the semi-variogram of the risk is estimated as:

γR(h) = 1

2N(h)∑α=1

n(uα)n( uα+h)n(uα)+n( uα+h)

N(h)∑

α=1

{n(uα)n(uα + h)

n(uα) + n(uα + h)[z(uα) − z( uα + h)]2 −m∗

}

(4)

where the different pairs [z(uα) − z(uα+h)] are weighted by the correspondingpopulation sizes to homogenize their variance.

2.1 Area-to-Area (ATA) Poisson Kriging

In the situation where the geographical entities have very different shapes and sizes,areal data can not be simply collapsed into their respective polygon centroids. Fol-lowing the terminology in Kyriakidis (2004), ATA kriging refers to the case where

Page 26: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

8 P. Goovaerts

both the prediction and measurement supports are blocks (or areas) instead of points.The PK estimate (1) for the areal risk value r(vα) thus becomes:

rP K (vα) =K∑

i=1

λi (vα)z(vi ) (5)

The Poisson kriging system (2) is now written as:

K∑

j=1

λ j(vα)[

CR(vi , v j ) + δi jm∗

n(vi )

]+ μ(vα) = CR(vi , vα) i = 1,. . . ,K

K∑

j=1

λ j(vα) = 1.

(6)

The main change is that point-to-point covariance terms CR(ui − u j ) are replacedby area-to-area covariances CR(vi , v j ) = Cov{Z(vi),Z(vj)}. Like in the traditionalblock kriging, those covariances are approximated by the average of the point sup-port covariance C(h) computed between any two locations discretizing the areas vi

and vj :

CR(vi , v j ) = 1Pi∑

s=1

Pj∑s ′=1

wss ′

Pi∑

s=1

Pj∑

s ′=1

wss ′C(us, us ′) (7)

where Pi and Pj are the number of points used to discretize the two areas vi and vj,respectively. For the example of Fig. 1 a grid with a spacing of 5 km was overlaidover the study area, yielding a total of 11 to 2,082 discretizing points per countydepending on its area. The high-resolution population map in Fig. 1 clearly showsthe heterogeneous distribution of population within counties. To account for spa-tially varying population density in the computation of the area-to-area covariance,the weights wss ′ were identified to the product of population sizes within the 25 km2

cells centred on the discretizing point us and u′s :

wss ′ = n(us) × n(us ′) withPi∑

s=1

n(us) = n(vi ) andPj∑

s ′=1

n(us ′) = n(v j ) (8)

The kriging variance for the areal estimator is computed as:

σ 2P K (vα) = CR(vα, vα) −

K∑

i=1

λi (vα)CR(vi , vα) − μ(vα) (9)

Page 27: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Geostatistical Analysis of Health Data 9

where CR(vα, vα) is the within-area covariance that depends on the form of thegeographical entity vα and decreases as its area increases. Thus, ignoring the size ofthe prediction support in the computation of the kriging variance (3) can lead to asystematic overestimation of the prediction variance of large blocks.

2.2 Area-to-Point (ATP) Poisson Kriging

A major limitation of choropleth maps is the common biased visual perception thatlarger rural and sparsely populated areas are of greater importance. A solution is tocreate continuous maps of mortality risk, which amounts to perform a disaggrega-tion or area-to-point interpolation. At each discretizing point us within an entity vα ,the risk r (us) can be estimated as the following linear combination of areal data:

rP K (us) =K∑

i=1

λi (us)z(vi ) (10)

The Poisson kriging system is similar to system (6), except for the right-hand-sideterm where the area-to-area covariances CR(vi , vα) is replaced by the area-to-pointcovariance CR(vi , us). The latter is approximated by a procedure similar to the onedescribed in equation (7). A critical property of the ATP kriging estimator is itscoherence, that is the aggregation of the Pα point risk estimates within any givenentity vα yields the areal risk estimate rP K (vα) :

rP K (vα) = 1

n(vα)

Pα∑

s=1

n(us)rPK (us) (11)

Condition (11) differs from the constraint commonly found in the geostatisticalliterature (Kyriakidis, 2004) in that: 1) the observation z(vα) is uncertain, henceit is the reproduction of the PK risk estimate rP K (vα) that is imposed, and 2) theincorporation of the population density in the computation of the areal covarianceimplies that it is the population-weighted average of the point risk estimates, nottheir arithmetical average, that satisfies the coherence condition. The constraint (11)is satisfied if the same K areal data are used for the estimation of the Pα point riskestimates. Indeed, in this case the population-weighted average of the right-hand-side covariance terms of the K ATP kriging systems is equal to the right-hand-sidecovariance of the single ATA kriging system:

1n(vα)

Pα∑s=1

n(us)CR(vi , us) = 1n(vα)

Pα∑s=1

n(us)[

1n(vi )

Pi∑s ′=1

n( us ′)C(us ′ , us)]

= CR(vi , vα) ,

(12)

Page 28: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

10 P. Goovaerts

per relations (7) and (8). Therefore, the following relationship exists between thetwo sets of ATA and ATP kriging weights:

λi (vα) = 1

n(vα)

Pα∑

s=1

n(us)λi (us) i = 1, . . . , K (13)

which ensures the coherence of the estimation.

2.3 Deconvolution of the Semivariogram of the Risk

Both ATA and ATP kriging require knowledge of the point support covariance ofthe risk C(h), or equivalently the semivariogram γ (h). This function cannot be es-timated directly from the observed rates, since only aggregated data are available.Derivation of a point support semivariogram from the experimental semivariogramof areal data is called “deconvolution”, an operation that is frequent in mining andhas been the topic of much research (Journel and Huijbregts, 1978). However, intypical mining applications all blocks (areas) have the same size and shape, whichmakes the deconvolution reasonably straightforward. Goovaerts (2008) proposedan iterative approach to conduct the deconvolution in presence of irregular geo-graphical units. This innovative algorithm starts with the derivation of an initialdeconvoluted model γ (0)(h); for example the model γR(h) fitted to the areal data.This initial model is then regularized using the following expression:

γregul(h) = γ (0)(v, vh) − γ(0)h (v, v) (14)

where γ (0)(v, vh ) is the area-to-area semivariogram value for any two countiesseparated by a distance h. It is approximated by the population-weighted average(7), using γ (0)(h) instead of C(h). The second term, γ

(0)h (v, v), is the within-area

semivariogram value. Unlike the expression commonly found in the literature, thisterm varies as a function of the separation distance since smaller areas tend to bepaired at shorter distances. To account for heterogeneous population density, thedistance between any two counties is estimated as a population-weighted average ofdistances between locations discretizing the pair of counties:

Dist(vi , v j ) = 1Pi∑

s=1

Pj∑s ′=1

n(us)n(us ′)

Pi∑

s=1

Pj∑

s ′=1

n(us)n( us ′) ‖us − us ′‖ (15)

Note that the block-to-block distances (15) are numerically very close to the Euclid-ian distances computed between population-weighted centroids (Goovaerts, 2006b).The theoretically regularized model,γregul(h), is compared to the model fitted toexperimental values, γR(h), and the relative difference between the two curves,

Page 29: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

Geostatistical Analysis of Health Data 11

denoted D, is used as optimization criterion. A new candidate point-support semi-variogram γ (1)(h) is derived by rescaling of the initial point-support model γ (0)(h),and then regularized according to expression (14). Model γ (1)(h) becomes the newoptimum if the theoretically regularized semivariogram model γ

(1)regul(h) gets closer

to the model fitted to areal data, that is if D(1) < D(0). Rescaling coefficients arethen updated to account for the difference between γ

(1)regul(h) and γR(h), leading to

a new candidate model γ (2)(h) for the next iteration. The procedure stops when themaximum number of allowed iterations has been tried (e.g. 35 in this paper) or thedecrease in the D statistic becomes negligible from one iteration to the next. The useof lag-specific rescaling coefficients provides enough flexibility to modify the initialshape of the point-support semivariogram and makes the deconvolution insensitiveto the initial solution adopted. More details and simulation studies are available inGoovaerts (2006b, 2008).

2.4 Application to the Cervix Cancer Mortality Data

Figure 2 (top graph, dark gray curve) shows the experimental and model semi-variograms of cervix cancer mortality risk computed from aggregated data usingestimator (4) and the distance measure (15). This model is then deconvoluted and,as expected, the resulting model (light gray curve) has a higher sill since the punc-tual process has a larger variance than its aggregated form. Its regularization usingexpression (14) yields a semivariogram model that is close to the one fitted to ex-perimental values, which validates the consistency of the deconvolution.

The deconvoluted model was used to estimate aggregated risk values at thecounty level (ATA kriging) and to map the spatial distribution of risk values withincounties (ATP kriging). Both maps are much smoother than the map of raw ratessince the noise due to small population sizes is filtered. In particular, the high riskarea formed by two central counties in Fig. 1 disappeared, which illustrates howhazardous the interpretation of the map of observed rates can be. The highest risk(4.081 deaths/100,000 habitants) is predicted for Kern County, just west of SantaBarbara County. ATP kriging map shows that the high risk is not confined to this solecounty but spreads over four counties, which is important information for designingprevention strategies. By construction, aggregating the ATP kriging estimates withineach county using the population density map of Fig. 1 (right medium graph) yieldsthe ATA kriging map.

The map of ATA kriging variance essentially reflects the higher confidence inthe mortality risk estimated for counties with large populations. The distribution ofpopulation can however be highly heterogeneous in large counties with contrastedurban and rural areas. This information is incorporated in the ATP kriging variancemap that shows clearly the location of urban centers, such as Los Angeles, SanFrancisco, Salt Lake City, Las Vegas or Tucson. The variance of point risk estimatesis much larger than the county-level estimates, as expected.

Page 30: geoENV VI – GEOSTATISTICS FOR ENVIRONMENTAL …download.e-bookshelf.de/download/0000/0044/37/L-G-0000004437... · Geostatistics for Contaminated Sites and Soils: Some Pending Questions

12 P. Goovaerts

ATA kriged risk ATA kriging variance

ATP kriged risk ATP kriging variance

deaths/106

deaths/106

Fig. 2 Experimental semivariogram of the risk estimated from county-level rate data, and theresults of its deconvolution (top curve). The regularization of the point support model yields a curve(black dashed line) that is very close to the experimental one. The model is then used to estimatethe cervix cancer mortality risk (deaths/100,000 habitants) and associated prediction variance atthe county level (ATA kriging) or at the nodes of a 5 km spacing grid (ATP kriging)

3 Detection of Spatial Clusters and Outliers

Mapping cancer risk is a preliminary step towards further analysis that mighthighlight areas where causative exposures change through geographic space, the