1 Practical issues and tools for modeling temporal and spatio-temporal trends in atmospheric pollutant monitoring data Paul D. Sampson Department of Statistics University of Washington The International Environmetric Society Modelling Spatio-Temporal Trends Workshop 3 November 2003


  • Slide 2
  • 2 Our experience in analysis of trends in atmospheric pollutants. Part I: Meteorological adjustment and long-term temporal trends in ozone. Meteorological adjustment of western Washington and northwest Oregon surface ozone observations with investigation of trends. Reynolds, Das, Sampson, Guttorp, NRCSE TRS #15 ( http://www.nrcse.washington.edu/pdf/trs15_doe.pdf ). Meteorological adjustment of Chicago, Illinois, surface ozone observations with investigation of trends. Reynolds, Caccia, Sampson, Guttorp, NRCSE TRS #25 ( http://www.nrcse.washington.edu/pdf/trs25_chicago.pdf ). A review of statistical methods for the meteorological adjustment of tropospheric ozone. Thompson, Reynolds, Cox, Guttorp, Sampson, Atmospheric Environment 35, 617-630, 2001. Part II: Spatial trend for health effects studies. Spatial estimation of ambient air concentrations for ozone, 1986-94, for chronic health effects modeling in 83 counties in the U.S. Current contract with U.S. EPA. Spatio-temporal modeling and prediction of ambient PM2.5 concentrations for acute and chronic health effects modeling with the NIH/NHLBI cohort study, MESA (Multi-Ethnic Study of Atherosclerosis). Current proposal and ongoing collaboration with colleagues at the Univ. of Washington's Northwest Center for Particulate Matter and Health.
  • Slide 3
  • 3 Spatio-temporal modeling of ambient PM exposure for chronic health effect studies Paul D. Sampson Department of Statistics University of Washington Northwest Center for Particulate Matter & Health External Science Advisory Committee Meeting 12 November 2003
  • Slide 4
  • 4 Motivation for fine(r)-scale spatial modeling of pollutant exposure for chronic health effect studies: Major North American cohort studies of PM use a single community-wide exposure/monitor to characterize a metropolitan area. This fails to address important local spatial variation of air pollutants known to exist within regions. Hoek et al.: 3-component regression model to predict exposure to air pollutants (black smoke and NO2), incorporating (a) regional background levels, (b) urban gradient (based on population density), and (c) proximity to heavily trafficked roadways and other point sources. We build on this approach to combine, in a spatial model, average concentration data from fixed-site ambient monitors and spatial covariate information encoded in a GIS, including population density, proximity to roads, and traffic density.
  • Slide 5
  • 5 Aside: U.S. EPA currently funded Epidemiologic Research on Health Effects of Long-Term Exposure to Ambient P.M. and Other Air Pollutants (June 2003). Laden, Schwartz et al. (Harvard): Chronic Exposure to Particulate Matter and Cardiopulmonary Disease. Nurses' Health Study: prospective cohort study of 121,700 women throughout the U.S. Knutson, Beeson et al. (Loma Linda): Relating Cardiovascular Disease Risk to Ambient Air Pollutants using GIS and Bayesian Neural Networks. AHSMOG study. Samet, Zeger, Dominici et al. (Johns Hopkins): Chronic and Acute Exposure to Ambient Fine Particulate Matter and Other Air Pollutants: National Cohort Studies of Mortality and Morbidity. Data from Medicare beneficiary file and National Claims History File. Diez-Roux, Keeler, Samson, Lin (Michigan): Long-Term Exposure to Ambient PM and Subclinical Atherosclerosis. MESA Study.
  • Slide 6
  • 6 EPA apparently mandated/directed that all these studies be concerned with computing exposure estimates from ambient monitoring data and GIS-based information on local traffic, population density, ... Jon Samet (Johns Hopkins): EPA should invest in drawing national maps of exposure, as all our research groups are trying to do the same thing.
  • Slide 7
  • 7 Applications: MESA Air, the NHLBI-funded Multi-Ethnic Study of Atherosclerosis: effects of ambient PM (and other pollutants) on subclinical cardiovascular function. 8700 subjects, aged 50-89, from 9 communities, assessed prospectively and longitudinally. Monitoring data and exposure assessment: current AQS PM monitors (mostly 3-day sampling); supplemental monitors, up to 5 per community (2-week integrated measurements of key pollutants); mobile gradient monitoring (2-week integrated sampling); PLUS distances to nearest major roadways, with traffic volume and composition; distances to pollutant point sources. EVALUATION on PM2.5 and co-pollutants measured at 10% of homes. Preliminary demonstration of spatio-temporal modeling using S. Calif. ozone data.
  • Slide 8
  • 8 Personal PM exposure for subject i at time t is the sum of non-ambient (N) and ambient (A) components. Ambient exposure is ambient concentration times an ambient exposure attenuation factor reflecting time spent outside the home and particle infiltration into the home. Model for ambient concentration: trend + residual.
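The equations on this slide did not survive extraction. A hedged reconstruction from the wording alone might read as follows; every symbol here (E for exposure, α for the attenuation factor, C for ambient concentration, μ for trend, ε for residual, s_i for the location of subject i) is assumed notation, not necessarily the slide's own.

```latex
\begin{align*}
E_{it} &= E^{N}_{it} + E^{A}_{it}, \\
E^{A}_{it} &= \alpha_{it}\, C(s_i, t), \\
C(s, t) &= \mu(s, t) + \varepsilon(s, t).
\end{align*}
```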
  • Slide 9
  • 9 The smoothly varying spatio-temporal trend is further decomposed into two terms. The 1st term represents long-term mean concentration and will derive from a Bayesian analysis of a spatial regression model combining average concentration data from fixed-site ambient monitors and spatial covariate information encoded in a GIS. The 2nd term represents mainly smooth seasonal temporal variation.
  • Slide 10
  • 10 The variance model for the residual term represents the spatio-temporal variation considered primarily at the 2-week time scale of the fixed sites and mobile gradient monitors. Estimation of this component will be based on (extensions to) the Bayesian model for the Sampson-Guttorp spatial deformation approach to nonstationary spatial covariance as demonstrated in Damian et al. (2001, 2003). This modeling strategy accommodates the spatially varying effects of predominant meteorology, coastlines, and topographic features that underlie the statistical relationship between time-varying pollutant levels at different points in space.
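The core idea of the deformation approach can be sketched numerically: correlation is treated as stationary and isotropic in a deformed coordinate system rather than in geographic coordinates. The sketch below is only an illustration, not the Bayesian model of Damian et al.; the deformation `deform`, the exponential correlation function, and the range parameter `phi` are all hypothetical choices made for the example.

```python
import numpy as np

def deform(xy):
    """Hypothetical smooth deformation of geographic coordinates (illustration only)."""
    x, y = xy[:, 0], xy[:, 1]
    # Stretch the x-axis more strongly at larger x (an invented pattern).
    return np.column_stack([x + 0.5 * x**2, y])

def deformation_correlation(xy, phi=1.0):
    """Exponential correlation computed from distances in the deformed space."""
    d = deform(xy)
    dist = np.linalg.norm(d[:, None, :] - d[None, :, :], axis=-1)
    return np.exp(-dist / phi)

# Four sites; the first three are equally spaced along the x-axis.
sites = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
R = deformation_correlation(sites)

# Sites 0-1 and 1-2 are the same geographic distance apart, but their
# correlations differ because the deformation stretches space unevenly.
print(R[0, 1], R[1, 2])
```

In this toy setup the nonstationarity comes entirely from the deformation: a single isotropic correlation function produces location-dependent correlation once distances are measured in the deformed plane.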
  • Slide 23
  • 23 (1) Estimation of the long-term mean spatial field. Following Hoek and colleagues (2002 Atmos Env, 2003 Epidemiology), assume a regression model for the long-term mean whose covariates represent population density, proximity to roads, traffic density, and possibly local topographic and climatic wind patterns. The Bayesian analysis incorporates prior information on the parameters and on the spatial covariance structure of residuals from this regression model in a manner similar to that of our Bayesian framework for spatial estimation of the residual component (see (3) below).
  • Slide 24
  • 24 Note that monitoring observations will be used directly in the estimation, not just in the specification or calibration of the regression model as in the work of Hoek et al. That is, in Hoek et al., (long-term) exposure is estimated from the fitted regression alone. In our (geostatistical) approach, we will be estimating the space-time field; the long-term exposure at a point can be written as the mean field plus an estimated (kriged) spatial residual.
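The contrast between a regression-only estimate and a regression-plus-kriged-residual estimate can be sketched with synthetic data. This is a simplified stand-in for the Bayesian analysis described above: it uses ordinary least squares, simple kriging of the residuals, and an exponential covariance with known parameters. The covariates, `beta_true`, `sill`, and `phi` are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monitoring sites: coordinates and two invented GIS covariates.
n = 30
coords = rng.uniform(0, 10, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])

def exp_cov(a, b, sill=1.0, phi=2.0):
    """Exponential covariance between two sets of points (assumed form)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-dist / phi)

# Simulate long-term means: regression mean + spatially correlated residual.
beta_true = np.array([10.0, 1.5, -0.8])
K = exp_cov(coords, coords)
resid = np.linalg.cholesky(K + 1e-10 * np.eye(n)) @ rng.normal(size=n)
y = X @ beta_true + resid

# Regression fit (ordinary least squares here, for simplicity).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ beta_hat

def predict(x_new, coord_new, use_kriging=True):
    """Regression mean at a new point, optionally plus a simple-kriged residual."""
    mean = x_new @ beta_hat
    if not use_kriging:
        return mean  # regression-only estimate
    k = exp_cov(coord_new[None, :], coords)[0]
    return mean + k @ np.linalg.solve(K, r)

# At a monitored site, the kriged prediction reproduces the observation,
# whereas the regression-only estimate generally does not.
at_site = predict(X[0], coords[0])
regression_only = predict(X[0], coords[0], use_kriging=False)
print(y[0], at_site, regression_only)
```

The point of the sketch is only that the monitoring observations enter the prediction directly through the residual term, so nearby monitors pull the estimate away from the regression surface.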
  • Slide 25
  • 25 (2) Smooth, spatially varying, temporal variation. The spatial index in the 2nd component allows for the possibility that the magnitude and precise details of the seasonal variation may vary from location to location over the spatial scale of the regional target communities. Preliminary analysis of PM2.5 monitoring data in the Los Angeles County region suggests some spatial variation in seasonality, but in some regions we expect to find that this seasonal variation is homogeneous, permitting an additive (separable) decomposition of the spatio-temporal trend.
  • Slide 26
  • 26 Trend decomposition: characterize and estimate the seasonal structure of air pollutant concentrations in terms of a sum of temporal basis functions, which describe possible seasonal trend patterns, each multiplied by a spatially varying coefficient. Example: O3 trend components. (What do we expect with PM more generally?)
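The model equation itself is missing from the extracted text. One plausible way to write it is given below, with all symbols (μ for the trend, f_j for the temporal basis functions, β_j for the spatially varying coefficients) assumed rather than taken from the slide. The indexing from 0 to J is a guess based on the J+1 basis functions mentioned in the technical details.

```latex
\[
\mu(s, t) = \sum_{j=0}^{J} \beta_j(s)\, f_j(t)
\]
```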
  • Slide 27
  • 27 We compute trend components empirically as smoothed versions of the temporal singular vectors of the T × N data matrix (rather than assuming parametric forms such as trigonometric functions). Arbitrary amounts of missing data are accommodated in an EM-like iterative calculation of the SVD. The Bayesian spatial regression model can incorporate the coefficients of these trend components as spatial fields, and thus provide the basis for estimation of the trend at target homes.
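A minimal sketch of an EM-like SVD with missing data, followed by smoothing of the temporal singular vectors, is shown below. It is one common way to implement the idea (iteratively refilling missing cells from a low-rank fit) and is not necessarily the authors' algorithm; the function names, the moving-average smoother, the rank, and the synthetic data are all assumptions.

```python
import numpy as np

def svd_impute(C, rank, n_iter=200, tol=1e-8):
    """EM-like SVD for a T x N matrix C with NaNs marking missing values."""
    missing = np.isnan(C)
    # Initialize missing entries with column (site) means of the observed data.
    filled = np.where(missing, np.nanmean(C, axis=0, keepdims=True), C)
    for _ in range(n_iter):
        U, d, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * d[:rank]) @ Vt[:rank]
        # Replace only the missing entries with the current low-rank fit.
        updated = np.where(missing, approx, C)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled, U[:, :rank]

def smooth(v, window=5):
    """Moving-average smoother (odd window), a stand-in for the actual smoother."""
    kernel = np.ones(window) / window
    padded = np.pad(v, window // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Synthetic T x N data: site-specific mean plus a site-scaled seasonal cycle.
rng = np.random.default_rng(1)
t = np.arange(120)
seasonal = np.sin(2 * np.pi * t / 12)
truth = (np.outer(np.ones(120), rng.uniform(5, 10, 8))
         + np.outer(seasonal, rng.uniform(1, 3, 8)))
C = truth.copy()
C[rng.random(C.shape) < 0.1] = np.nan  # remove roughly 10% of entries at random

filled, U = svd_impute(C, rank=2)
# Smoothed temporal singular vectors serve as empirical trend components.
trend_components = np.column_stack([smooth(U[:, j]) for j in range(2)])
```

Because the data matrix here is T × N, the temporal singular vectors are the left singular vectors; with the S × T orientation used in the technical details they would be the right singular vectors instead.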
  • Slide 38
  • 38 (3) Nonstationary residual spatio-temporal variation. Final component: spatio-temporal variation at the (2-week) time scale of the fixed sites and mobile gradient monitors. The Sampson-Guttorp spatial deformation approach (Damian et al. 2001, 2003) is used to model the nonstationary spatial covariance structure. It allows for spatially varying effects of predominant meteorology, coastlines, and topographic features that underlie the statistical relationship between time-varying pollutant levels at different points in space. Bayesian analysis provides a full posterior distribution for the model parameters, and thus a ready computation of multiple imputations of exposures for the health effects analysis.
  • Slide 41
  • 41 Observed vs Predicted ozone at 3 validation sites.
  • Slide 45
  • 45 Conclusion: We can estimate/predict both the day-to-day deviations from the trend and the seasonal shape of the trend quite well, but we sometimes miss the long-term mean. We therefore need to incorporate extra local information to predict the mean concentration.
  • Slide 47
  • 47 Technical details
  • Slide 48
  • 48 Details, issues, and extensions: Gaussian assumption after transformation. Current AQS data sampling is usually every 3 days; proposed sampling is on 2-week intervals. Conditional, hierarchical approach to estimating the parameters of our space-time models from these incomplete data, beginning with models estimated from the longer-term AQS data and then updating estimated model parameters with data from the new fixed and mobile monitors.
  • Slide 49
  • 49 First stage of analysis: build separate models and estimates for the three major exposures of interest, PM2.5, NOx, and O3. Second stage: take advantage of the association between PM2.5 and NOx in a multivariate (co-kriging) analysis that assumes only that spatial nonstationarity can be expressed in a common underlying deformed coordinate system.
  • Slide 50
  • 50 In matrix form, we are writing C as an S × T (space-time) matrix of observations; the coefficient matrix is S × (J+1) and multiplies the (J+1) × T matrix F, which contains the values of the basis functions; the observation sites are indexed i = 1, ..., S. The obvious calculation is an SVD of the concentration matrix C.
  • Slide 51
  • 51 Here the columns of the (truncated) matrix of right singular vectors are considered to represent the values of the J+1 temporal basis functions, which defines F. Issues: smoothness of the singular vectors as components of trend; computation with missing data.
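The displayed equations on slides 50 and 51 were lost in extraction. A hedged reconstruction consistent with the stated dimensions is given below. The symbols B (coefficient matrix), E (residual matrix), and U, D, V (SVD factors truncated to J+1 components) are assumed notation, and the identification B = U D is simply what follows if F is taken to be the transposed right singular vectors without further smoothing.

```latex
\begin{align*}
C &= B\,F + E, \qquad
  C \in \mathbb{R}^{S \times T},\;
  B \in \mathbb{R}^{S \times (J+1)},\;
  F \in \mathbb{R}^{(J+1) \times T}, \\
C &\approx U_{J+1}\, D_{J+1}\, V_{J+1}^{\top}, \qquad
  F = V_{J+1}^{\top}, \quad
  B = U_{J+1}\, D_{J+1}.
\end{align*}
```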
  • Slide 52
  • 52 Posterior sample
  • Slide 53
  • 53 Site variances