What you don’t know can hurt you: uncertainties in georeferencing John Wieczorek Museum of Vertebrate Zoology University of California, Berkeley

Post on 19-Dec-2015

What you don’t know can hurt you: uncertainties in georeferencing

John Wieczorek

Museum of Vertebrate Zoology

University of California, Berkeley

Uncertainties

What comes out of a system depends on:

a) what goes into it
b) what you ask of it
c) what happens in between

What species occur where?

Basis for:

conservation
bio-prospecting
entertainment
survival?

What species occur where?

species identification

What species occur where?

occurrence location

Problem: most original data are in textual form

Problem: collection resources are scarce and can’t support large-scale digitization

What species occur where?

occurrence location

What species occur where?

What can Biodiversity Informatics do?

Taxonomic Resolution Services

What species occur where?

Georeferencing Services

ID  Species         Locality
1   Lynx rufus      Dawson Rd. N Whitehorse
2   Pudu puda       cerca de Valdivia
3   Canis lupus     20 mi NW Duluth
4   Felis concolor  Pichi Trafúl
5   Lama alpaca     near Cuzco
6   Panthera leo    San Diego Zoo
7   Sorex lyelli    Lyell Canyon, Yosemite
8   Orcinus orca    1 mi W San Juan Island
9   Ursus arctos    Bear Flat, Haines Junction

What we have: Localities we can read

What we want: Localities we can map

Integration – Species Pages

What is a georeference?

A numerical description of a place that can be mapped.

“Davis, Yolo County, California”

“point method”

Coordinates: 38.5463 -121.7425
Horizontal Geodetic Datum: NAD27

What is an acceptable georeference?

A numerical description of a place that can be mapped

and that describes the spatial extent of a locality

and its associated uncertainties.

1) Map inaccuracy

2) Extent of the reference

3) Coordinate imprecision

4) Undocumented datum

5) Distance imprecision

6) Direction imprecision

Scale       Uncertainty (ft)  Uncertainty (m)
1:1,200     3.3 ft            1.0 m
1:2,400     6.7 ft            2.0 m
1:4,800     13.3 ft           4.1 m
1:10,000    27.8 ft           8.5 m
1:12,000    33.3 ft           10.2 m
1:24,000    40.0 ft           12.2 m
1:25,000    41.8 ft           12.8 m
1:63,360    106 ft            32.2 m
1:100,000   167 ft            50.9 m
1:250,000   417 ft            127 m
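The slides do not say where the scale table comes from, but its values appear consistent with the United States National Map Accuracy Standards (NMAS), which allow a horizontal error of 1/30 inch on the printed map for scales larger than 1:20,000 and 1/50 inch otherwise. The sketch below, under that assumption, converts the map tolerance to ground distance; the function names are illustrative only.

```python
METERS_PER_FOOT = 0.3048  # exact, by definition of the international foot


def map_uncertainty_ft(scale_denominator: float) -> float:
    """Ground distance in feet covered by the NMAS horizontal tolerance.

    Assumption: the map follows NMAS. Other mapping standards differ.
    """
    # Tolerance on the printed map: 1/30 inch for scales larger than
    # 1:20,000 (denominator below 20,000), otherwise 1/50 inch.
    tolerance_in = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    ground_inches = tolerance_in * scale_denominator
    return ground_inches / 12  # 12 inches per foot


def map_uncertainty_m(scale_denominator: float) -> float:
    """Same tolerance expressed in meters."""
    return map_uncertainty_ft(scale_denominator) * METERS_PER_FOOT


for denom in (1_200, 24_000, 63_360):
    print(f"1:{denom:,}  {map_uncertainty_ft(denom):.1f} ft  "
          f"{map_uncertainty_m(denom):.1f} m")
```

Under this rule the 1:24,000 row (40.0 ft, 12.2 m) is reproduced exactly; a few printed values differ from the computed ones in the last digit.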

Sources of uncertainty

“Davis, Yolo County, California”

“bounding-box method”

Coordinates: 38.5486 -121.7542
             38.5450 -121.7394

Horizontal Geodetic Datum: NAD27
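A bounding box can be reduced to a center point and a radius that encloses it. The sketch below is one way to do that, assuming a spherical Earth and a box that does not cross the antimeridian; `haversine_m` and `bbox_to_point_radius` are hypothetical helper names, not part of any tool named in the slides. The resulting radius covers only the extent of the box, not the other sources of uncertainty listed earlier.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bbox_to_point_radius(lat_a, lon_a, lat_b, lon_b):
    """Return (center_lat, center_lon, radius_m) for a box given two
    opposite corners. The radius reaches the farthest corner."""
    center_lat = (lat_a + lat_b) / 2
    center_lon = (lon_a + lon_b) / 2
    corners = [(lat, lon) for lat in (lat_a, lat_b) for lon in (lon_a, lon_b)]
    radius_m = max(haversine_m(center_lat, center_lon, lat, lon)
                   for lat, lon in corners)
    return center_lat, center_lon, radius_m


# The Davis bounding box from the slide.
lat, lon, radius = bbox_to_point_radius(38.5486, -121.7542, 38.5450, -121.7394)
print(f"{lat:.4f} {lon:.4f} radius {radius:.0f} m")
```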

“Davis, Yolo County, California”

“point-radius method”

Coordinates: 38.5468 -121.7469
Horizontal Geodetic Datum: NAD27
Maximum Uncertainty: 8325 m
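A point-radius record makes quality assessment a simple distance comparison. The sketch below stores the three values from the slide and tests whether a site could lie within the georeference; the class and method names are hypothetical, and the distance check assumes a spherical Earth and ignores datum shifts between the compared coordinates.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass(frozen=True)
class PointRadius:
    """Hypothetical record type for a point-radius georeference."""
    lat: float
    lon: float
    uncertainty_m: float
    datum: str

    def may_contain(self, lat: float, lon: float) -> bool:
        """True if the site lies within the uncertainty radius."""
        return haversine_m(self.lat, self.lon, lat, lon) <= self.uncertainty_m

    def may_overlap(self, other: "PointRadius") -> bool:
        """True if the two uncertainty circles touch or intersect."""
        separation = haversine_m(self.lat, self.lon, other.lat, other.lon)
        return separation <= self.uncertainty_m + other.uncertainty_m


# Values taken from the Davis point-radius slide.
davis = PointRadius(38.5468, -121.7469, 8325.0, "NAD27")
# The point-method coordinates for Davis fall inside the radius.
print(davis.may_contain(38.5463, -121.7425))
```

A user can filter on `uncertainty_m` directly, for example by rejecting records whose radius exceeds the resolution an analysis needs.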

What is an ideal georeference?

A numerical description of a place that can be mapped

and that describes the spatial extent of a locality

and its associated uncertainties

as well as possible.

“Davis, Yolo County, California”

“shape method”

“20 mi E Hayfork, California”

“probability method”
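An offset locality such as "20 mi E Hayfork" starts from a reference point and a displacement. The sketch below computes only the central estimate of that displacement on a spherical Earth; it is not the probability method itself, which would also model how the distance and direction imprecision spread around this point. The function name is hypothetical, and because the slides give no coordinates for Hayfork, the usage line borrows the Davis point purely as a stand-in.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation
METERS_PER_MILE = 1609.344    # international mile


def destination_point(lat, lon, heading_deg, distance_m):
    """Point reached by travelling distance_m along a great circle
    from (lat, lon), starting on heading_deg (0 = north, 90 = east)."""
    phi1 = math.radians(lat)
    lmb1 = math.radians(lon)
    theta = math.radians(heading_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance in radians

    sin_phi2 = (math.sin(phi1) * math.cos(delta)
                + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    phi2 = math.asin(max(-1.0, min(1.0, sin_phi2)))  # clamp rounding error
    lmb2 = lmb1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    # Normalize longitude to the range [-180, 180).
    lon2 = (math.degrees(lmb2) + 540.0) % 360.0 - 180.0
    return math.degrees(phi2), lon2


# Stand-in start point (Davis, from the slides), displaced 20 mi due east.
lat2, lon2 = destination_point(38.5463, -121.7425, 90.0, 20 * METERS_PER_MILE)
print(f"{lat2:.4f} {lon2:.4f}")
```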

Method Comparison

point: easy to produce; no data quality
bounding-box: simple spatial queries; difficult quality assessment
point-radius: easy quality assessment; difficult spatial queries
shape: accurate representation; complex, uniform
probability: accurate representation; complex, non-uniform

Global Biodiversity Information Facility (GBIF)

Point-radius Method

“Manual” Georeferencing Tools

Semi-automated Georeferencing Tools

Rowe, 2005. Elevational gradient analysis of historical museum specimens: a cautionary tale

What species occur where?

Conclusions:

1) We can help users find relevant records

2) We can help users assess data quality and fitness for use

3) In the end, users must exercise due diligence. Without 1) and 2), they can’t.