Innovative Uses of Geographic Information Systems Lance A. Waller Department of Biostatistics Rollins School of Public Health Emory University [email protected]

Post on 22-Dec-2015


Innovative Uses of Geographic Information Systems

Lance A. Waller

Department of Biostatistics

Rollins School of Public Health

Emory University

[email protected]

Outline

Why does the geography of immunization matter?

What is GIS?

What does GIS do?

What data do I have?

What questions can I answer with my data?

Why geography?

Is immunization coverage constant?

If you know where coverage is low, can you do something?

If you know where coverage is high, can you learn something?

What is GIS?

A “geographic information system” (GIS) links:

Geographic features: houses, census tracts

Attribute measurements: immunized (yes/no), age, sociodemographics

Think of a map (locations) linked with a table (attributes).

Each cell of the table contains an attribute value.

Objects on the map are features.
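The map-to-table link can be sketched in a few lines of Python. This is only an illustration: the tract identifiers, centroids, and counts below are invented, and a real GIS stores geometry and attributes in its own file formats.

```python
# Hypothetical features: census tracts keyed by an ID, each with a location.
features = {
    "tract_A": {"centroid": (33.79, -84.32)},
    "tract_B": {"centroid": (33.75, -84.39)},
}

# Hypothetical attribute table: one row per feature, one cell per attribute.
attributes = {
    "tract_A": {"children": 120, "immunized": 102},
    "tract_B": {"children": 80, "immunized": 52},
}

def link(features, attributes):
    """Join each map feature to its attribute row through the shared ID."""
    linked = {}
    for fid, geometry in features.items():
        row = attributes.get(fid, {})  # a feature without a row keeps geometry only
        linked[fid] = {**geometry, **row}
    return linked

gis = link(features, attributes)

# Once linked, attributes can be summarized per feature, e.g. coverage per tract.
coverage = {fid: rec["immunized"] / rec["children"] for fid, rec in gis.items()}
```

The shared identifier is what makes the link work: anything recorded against that ID in the table can be drawn on the map, and anything selected on the map can be looked up in the table.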

What does a GIS do?

Basic GIS operation #1: Layering

[Figure: map layers showing compliers, non-compliers, and the health center catchment]
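One common use of layering is to overlay a point layer on an area layer and ask which points fall inside. A minimal sketch, assuming planar coordinates and a simple polygon; the catchment outline and point locations are invented, and the ray-casting test shown here is one standard approach, not necessarily what any particular GIS package uses.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical layers: one catchment polygon and two point layers.
catchment = [(0, 0), (4, 0), (4, 3), (0, 3)]
compliers = [(1, 1), (5, 1), (3, 2)]
non_compliers = [(2, 2.5), (6, 4)]

in_catchment = {
    "compliers": [p for p in compliers if point_in_polygon(p, catchment)],
    "non_compliers": [p for p in non_compliers if point_in_polygon(p, catchment)],
}
```

Points exactly on the boundary are not handled specially in this sketch.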

Basic GIS operation #2: Buffering

Find areas within a user-specified distance of points, lines, or areas.

Buffers around an area

Buffers around a line feature
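A buffer around a line feature amounts to asking which locations lie within a given distance of any segment of the line. A rough sketch, assuming planar (projected) coordinates so that Euclidean distance is meaningful; the road, the homes, and the 2-unit radius are made up for illustration.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line, then clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def within_buffer(points, line, radius):
    """Points lying within `radius` of any segment of a polyline."""
    segments = list(zip(line, line[1:]))
    return [p for p in points
            if min(distance_to_segment(p, a, b) for a, b in segments) <= radius]

# Hypothetical line feature (a road) and point features (homes).
road = [(0, 0), (10, 0), (10, 10)]
homes = [(5, 1), (5, 3), (12, 5), (11, 11)]

near_road = within_buffer(homes, road, radius=2.0)
```

Buffers around points are the special case of a zero-length segment; buffers around areas would also need an inside test for points within the area itself.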

Famous public health map!

Snow, J. (1949) Snow on Cholera. Oxford University Press: London.

Wow! Can we do that?

Many introductions to GIS and public health essentially say:

“If John Snow could do it with shoe leather, ink, and paper, just imagine what we can do with a computer!”

Basic take-home figure

The Whirling Vortex of GIS analysis

The question you want to answer

The data you need to answer that question

The data you can get

The question you can answer with those data

Original source: Toxicologist EPA Region IV

GIS

What kind of questions?

Where is coverage the lowest?

Where is coverage the highest?

Outbreak size starting in high coverage area?

Outbreak size starting in low coverage area?

How could coverages impact the course of an outbreak?

Best response to current outbreak?

What kind of attributes?

Compliers: residence location, census region counts

Sociodemographic data: census summaries on age, race, sex, income of census region residents

Some information on compliers’ sociodemographics

Additional attributes

Noncompliers: residence location, regional counts

School data: school district

Health plan data: billing provides residence address

ZIP codes?

Basic location types

Point data: latitude and longitude, (seems) precise, distance calculations

Regional data: counts (cases/controls) from census regions
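For point data given as latitude and longitude, distance calculations need to account for the curvature of the Earth. One common approximation is the haversine (great-circle) formula, sketched below. It assumes a spherical Earth with a mean radius of 6371 km; the two locations are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical residence and health center locations.
home = (33.80, -84.32)
clinic = (33.75, -84.39)

km_to_clinic = haversine_km(home[0], home[1], clinic[0], clinic[1])
```

This is straight-line distance over the surface, not travel distance along roads, which is part of why point data only "seems" precise.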

Any complications?

Maxcy (1926): Endemic typhus fever in Montgomery, AL

Where is “where”?

Which location for each case?

Maxcy, K.F. (1926) “An epidemiological study of endemic typhus (Brill’s disease) in the Southeastern United States with special reference to its mode of transmission.” Public Health Reports 41, 2967-2995.

Residence:

Lilienfeld, D.E. and Stolley, P.D. (1994) Foundations of Epidemiology, Third Edition. Oxford University Press: New York, pp. 136-140.

Employment:

Complication: Nonconstant population density

Complications with regions

Counts lose some resolution...

[Figure: grid of regional counts]

Modifiable Areal Unit Problem

Different aggregations can lead to different results.

[Figure: regional counts under different aggregations]
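The effect of aggregation can be illustrated with a toy example: the same points counted on two grids that differ only in where the cell boundaries fall. The case locations and grids below are invented for illustration.

```python
from collections import Counter

def grid_counts(points, cell_size, origin=(0.0, 0.0)):
    """Count points per square grid cell; cells are indexed by (column, row)."""
    ox, oy = origin
    counts = Counter()
    for x, y in points:
        cell = (int((x - ox) // cell_size), int((y - oy) // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical cases: a tight cluster of four straddling x = 2, plus two others.
cases = [(1.8, 1.8), (1.9, 1.9), (2.1, 1.9), (2.2, 1.8), (0.5, 3.5), (3.5, 3.5)]

# Same cell size, but grid B is shifted by one unit in each direction.
grid_a = grid_counts(cases, cell_size=2.0)
grid_b = grid_counts(cases, cell_size=2.0, origin=(-1.0, -1.0))

max_a = max(grid_a.values())  # largest regional count under grid A
max_b = max(grid_b.values())  # largest regional count under grid B
```

Under grid A a boundary splits the cluster between two cells, so no cell stands out; under grid B the whole cluster lands in one cell. The data are identical and only the regions changed.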

MAUP example: John Snow

Monmonier, M (1991) How to Lie with Maps. University of Chicago Press: Chicago. p. 142.


What questions can I ask?

Point locations: interesting/uninteresting clusters

Interesting: clusters of non-compliers away from clusters of compliers

Regional counts: interesting/uninteresting raised counts

Interesting: less coverage than “expected”

Point locations

Treat locations as a spatial point process.

Spatial “intensity” (average number of events per unit area)

Think of intensity as a surface.

Compare intensity of compliers to intensity of non-compliers.

Peaks and valleys in same places?
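One way to picture an intensity surface is a kernel estimate: each event spreads a small "bump" around its location, and the bumps are added up. The sketch below uses a Gaussian kernel with a fixed bandwidth and no edge correction, both of which are simplifying assumptions; the point locations, grid, and bandwidth are invented.

```python
import math

def kernel_intensity(location, events, bandwidth):
    """Gaussian-kernel estimate of events per unit area at `location`."""
    x, y = location
    total = 0.0
    for ex, ey in events:
        d_sq = (x - ex) ** 2 + (y - ey) ** 2
        total += math.exp(-d_sq / (2 * bandwidth ** 2))
    # Normalize so each event contributes total mass one over the plane.
    return total / (2 * math.pi * bandwidth ** 2)

# Hypothetical residence locations.
compliers = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.0)]
non_compliers = [(3.1, 2.9), (2.9, 3.2), (3.0, 3.1)]

# Evaluate both surfaces on a coarse grid covering the study area.
grid = [(gx * 0.5, gy * 0.5) for gx in range(9) for gy in range(9)]
surface_c = {g: kernel_intensity(g, compliers, bandwidth=0.5) for g in grid}
surface_n = {g: kernel_intensity(g, non_compliers, bandwidth=0.5) for g in grid}

# Where does each surface peak?
peak_c = max(surface_c, key=surface_c.get)
peak_n = max(surface_n, key=surface_n.get)
```

In this toy data the two surfaces peak in different places, which is the kind of pattern the comparison is meant to reveal. The choice of bandwidth controls how smooth the surface is and can change where peaks appear.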

Monte Carlo simulation

Simulate data sets under the null hypothesis (e.g., constant coverage rate).

See if observed data (actual compliers) appear “unusual”.

To compare intensities, split all locations into compliers and non-compliers at random, and find out how high the peaks and how low the valleys can get.

Most GIS packages will not do this, but it is a very handy tool in spatial statistics.
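The random-splitting idea can be sketched as a small Monte Carlo test. The statistic used here (the highest peak of the non-complier intensity minus the complier intensity over a grid) is one possible choice made for illustration, as are the Gaussian kernel, the bandwidth, the 199 simulations, and the made-up locations.

```python
import math
import random

def intensity(location, events, bandwidth):
    """Gaussian-kernel intensity estimate at `location` (no edge correction)."""
    x, y = location
    weight = sum(math.exp(-((x - ex) ** 2 + (y - ey) ** 2) / (2 * bandwidth ** 2))
                 for ex, ey in events)
    return weight / (2 * math.pi * bandwidth ** 2)

def peak_difference(non_compliers, compliers, grid, bandwidth):
    """Highest peak of (non-complier minus complier intensity) over the grid."""
    return max(intensity(g, non_compliers, bandwidth)
               - intensity(g, compliers, bandwidth) for g in grid)

def random_labeling_test(non_compliers, compliers, grid, bandwidth,
                         n_sim=199, seed=0):
    """Monte Carlo p-value under random assignment of the two labels."""
    rng = random.Random(seed)
    observed = peak_difference(non_compliers, compliers, grid, bandwidth)
    everyone = list(non_compliers) + list(compliers)
    k = len(non_compliers)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(everyone)  # reassign labels at random, keeping group sizes
        simulated = peak_difference(everyone[:k], everyone[k:], grid, bandwidth)
        if simulated >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)

# Hypothetical data: compliers on a lattice, non-compliers in one tight cluster.
compliers = [(float(x), float(y)) for x in range(5) for y in range(4)]
non_compliers = [(2.5, 1.5), (2.45, 1.55), (2.55, 1.45), (2.5, 1.6), (2.6, 1.5)]
grid = [(gx * 0.5, gy * 0.5) for gx in range(9) for gy in range(7)]

observed, p_value = random_labeling_test(non_compliers, compliers, grid,
                                         bandwidth=0.5)
```

A small p-value says the observed peak is higher than almost all peaks produced by random labeling, so the non-compliers look more clustered than chance alone would suggest. The same loop works for valleys by taking the minimum instead of the maximum.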

Regions

Compare observed counts to “expected” counts.

Some basic point process results extend to counts (counts of points in regions).

Constant coverage rate (perhaps age-adjusted) again a common way of obtaining “expected” counts.

Monte Carlo simulation for significance.
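For regional counts, the comparison of observed to “expected” can be sketched as follows. Expected counts assume every region shares the overall rate (no age adjustment here), and significance is assessed by reallocating the observed total among regions in proportion to population. The Pearson-type statistic, the 999 simulations, and the regional data are all illustrative assumptions.

```python
import random

def expected_counts(populations, observed):
    """Expected counts per region if every region shared the overall rate."""
    overall_rate = sum(observed) / sum(populations)
    return [overall_rate * n for n in populations]

def pearson_statistic(observed, expected):
    """Sum over regions of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def monte_carlo_p_value(populations, observed, n_sim=999, seed=0):
    """Monte Carlo p-value for the statistic under a constant rate."""
    rng = random.Random(seed)
    total = sum(observed)
    expected = expected_counts(populations, observed)
    obs_stat = pearson_statistic(observed, expected)
    regions = range(len(populations))
    exceed = 0
    for _ in range(n_sim):
        # Allocate the same total among regions in proportion to population.
        draws = rng.choices(regions, weights=populations, k=total)
        simulated = [draws.count(i) for i in regions]
        if pearson_statistic(simulated, expected) >= obs_stat:
            exceed += 1
    return obs_stat, (exceed + 1) / (n_sim + 1)

# Hypothetical regions: children per region and non-immunized children observed.
children = [200, 150, 400, 250]
non_immunized = [10, 30, 20, 40]

expected = expected_counts(children, non_immunized)
ratios = [o / e for o, e in zip(non_immunized, expected)]  # > 1: more than expected
statistic, p_value = monte_carlo_p_value(children, non_immunized)
```

Regions with a ratio above 1 have more non-immunized children than a constant rate would predict; the Monte Carlo step indicates whether the overall departure from constant coverage is larger than chance variation would typically produce.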

Related work

Cancer registries: North American Association of Central Cancer Registries (NAACCR) report on GIS (Wiggins 2002)

Birth outcome registries

Public Health/Bioterrorism/Syndromic Surveillance

Similarities: registry data

Differences: infectious vs. chronic outcome, urgency of temporality

Conclusion

Best work is a collaboration between geographers, GISers, epidemiologists, and statisticians.

Get the best data you can to answer the questions you want.

Handy references

Wiggins L (Ed). Using Geographic Information Systems Technology in the Collection, Analysis, and Presentation of Cancer Registry Data: A Handbook of Basic Practices. Springfield (IL): North American Association of Central Cancer Registries, October 2002, 68 pp.

Cromley, E.K. and McLafferty, S.L. (2002) GIS and Public Health. The Guilford Press.

Bailey and Gatrell (1995) Interactive Spatial Data Analysis. Longman.

Waller and Crawford (2004) Applied Spatial Statistics for Public Health Data. Wiley.

What kind of software?

Statistical software (SAS, S+ Spatial Stats): spatially and/or visually challenged

Subject-specific: SpaceStat/GeoDa, SaTScan, GS+, ClusterSeer, WinBUGS/GeoBUGS, XGOBI/XGvis, R (many nice spatial modules, must write code, quality control?)

Link to GIS: S+/ArcView 3.x, SAS Bridge to ArcGIS 8.x

Commercial GIS software (ArcView, MapInfo): statistically challenged

Extensions (Analysts): $$$, limited capability

Packages by scientific user: good, but basic

Scripts and macros: user-contributed

Often do not give numerical output