Post on 20-Dec-2015
Networked Biodiversity Data and Credibility:
Citizen Science and Occurrence Data in CalFlora
Nancy Van House
SIMS, UC Berkeley
www.sims.berkeley.edu/~vanhouse
Argument
Networked info >> ready access to unpublished information
  Information from outside own epistemic community
  Accessed by people from outside own epistemic community
Issues of trust and credibility
  Of info
  Of sources
  Of users
This paper: empirical study of a user-designed, state-level biodiversity digital library:
  How has a consortium of users and producers of info addressed problems of networked data?
  Practical development of knowledge spaces
Biodiversity Data
Broad range of datasets: biological, geographical, meteorological, geological…
Many, varied data producers and users
Created, used for different purposes
Large quantities of data that vary in
  Precision and accuracy
  Methods of data collection, description, storage
Politically, economically sensitive data
Old data particularly valuable
  Change over time
  How things used to be before…
Biodiversity Data, Epistemic Communities, Knowledge Spaces
Boundary crossings
  Scientific specialties
  Planners, governmental agency decision-makers
  Plant enthusiasts
  Resource-extraction industries
  Environmental activists
Boundary objects (Star)
Users, designers, managers, technologists
Botanical Occurrence Data
Specimen in hand (herbarium)
Report of sighting, no specimen
Literature – published, scientific
Literature – e.g. flora: ‘this species occurs in Marin County…’
List – e.g., all species observed on the Bootjack Trail in Tamalpais State Park, Nov. 4, 2002, by Joe Smith
List – “Common plants of Tamalpais State Park”
Observers in the Field as Source of Fine-Grained Data
How far north does California Sun Cup grow?
Is Desert Sand Verbena still growing in Los Angeles County?
  (The last observation was recorded in 1935)
Does Five-finger Fern grow in Ventura County?
  (There are no direct observations, but it has been reported from surrounding counties)
Does anyone know about that new patch of invasive Artichoke Thistle growing on my local hillside?
  (Early alerts to new infestations while they are small and easier to eradicate than well-established ones)
What local biodiversity will be lost if the city decides to allow that new housing development on the edge of town?
  (You can record the plants that are growing there now as a record for history)
Sources of Data
Academic researchers
Professionals in government agencies, environmental organizations
  Park rangers, forest service botanists, professionals in environmental groups…
Consultants
  Much government research/planning done by consultants
  Land developers, resource extraction industries hire them to prepare environmental impact documents
Native plant enthusiasts
  “Expert amateurs” -- “citizen science”
  California Native Plant Society
People belong to multiple communities; roles overlap
Risks
Unreliable info
  Erroneous info
  Undetected duplication > belief that a species is prevalent >> not preserving a population of a rare species
  Chasing after erroneous reported sighting of a rare species
  Confusing naturally-occurring and cultivated populations
Accurate but not credible info
  Discounting significant sighting as amateur’s error
Inappropriate use of info
  Private landowners destroying specimens of a rare plant to avoid legal limits on land development
  Collectors (over-)collecting specimens of rare or valuable species
    Cacti, orchids, floristic materials, mushrooms
CalFlora
http://www.calflora.org
Comprehensive web-accessible database of plant distribution information for California
Independent non-profit organization
Designed/managed by people from botanical community, not librarians or technologists
Contributors and users: a coalition of public and private organizations and individuals
To exist, has to be responsive to users’ needs, concerns, practices and negotiate differences across epistemic and organizational boundaries
In conjunction with UC Berkeley Digital Library (http://elib.cs.berkeley.edu)
CalFlora Priorities
Focus on people; put technology in the back seat
Pay attention to how the world works for the people who produce and use information
Honor existing traditions of data exchange
CalFlora Occurrence Database
> 850,000 geo-referenced reports of observations
  Specimens in collections
  Reports from literature
  Reports from field
  Checklists
Sources
  19 institutions
  Recently began accepting reports from registered contributors via Internet
Changing Emphases in Occurrence Data
Existing data - emphasis on
  Unusual taxa
  Interesting locations, or where observers happened to be
Surprisingly small numbers
  Most Calif species distributions based on <100 obs
Data collection methods
  Some emphasize rare taxa, underestimate common
  Some emphasize common taxa, underestimate rare
New emphasis on common plants
  Preserving species requires preserving community
  Better understanding of current distributions
  Baseline data for future developments
CalFlora Occurrence Database: Significance
Most comprehensive source by far (for Calif)
Data from many sources ‘synoptically present’
Adding data from the public:
  “When you have 5 million little trail lists for the whole state of Calif…all of a sudden you have a real density of observations [that] would be meaningful.”
Common as well as rare taxa
Reasonably easy to use
Data downloadable, manipulable
Updated quickly
Remote access via Internet
CalFlora Tensions
Dangers, benefits of info about rare taxa
  Controversy over photos, location info – benefits outweigh dangers?
Data quality
  Accuracy, (undetectable) duplication
  Inclusiveness of observations vs. selectivity, quality
Trusting users
  Benefits vs. dangers of wide access to information
  Users’ abilities to understand info, use appropriately vs. guidance from CalFlora, e.g. re quality
Tensions, Cont.
Quality, precision of mapping
  County level too gross; not too specific for rarities
Who bears the cost?
  If free, no one has incentive to support it
  Fee may discourage frivolous use
  State: if they charge for their data, even $1, they can deny people access
Archiving
  Deletion of modalities?
  Track data back to source, definitions, conditions of collection
  Stability of electronic media
  Stability of independent organization
Tensions, Cont.
Between technologists and information creators and users
  Techies not understanding the complex social, organizational, epistemic issues around creation, maintenance, curation, use of digital libraries
  Discussed elsewhere – http://www.sims.berkeley.edu/~vanhouse/p84vanhouse.pdf
How (Some) Experts Assess Occurrence Reports
The evidence:
  Type of report (specimen, field observation, list)
  Type of search (casual, directed)
The source:
  Personal knowledge of contributor’s expertise
  Examination of other contributions, same person
  Annotations by trusted others
Ancillary conditions:
  Likelihood of that species appearing at that time, habitat, geographical location
  Other, similar reports
Current Practice
Know the individuals:
  “If they are active in CNPS [Calif Native Plant Society], the people in CNPS know each another…That’s where you get to that really personal level of quality control and assurance and data reliability.”
  “We have a collection of the usual suspects.”
  “When I started my job I went to lots of meetings but I know everybody now.”
Review the observations one by one
  “That’s why we have a fairly large concern about any sort of automated library like CalFlora. No one is looking at those kind of things.”
How CalFlora Presents Occurrence Data
Links to data source(s) – personal and institutional
Compliance with institutional source’s requirements
  Fuzzed locations
  Links to institutional source’s caveats, explanations
Publicly-contributed observations
  Info about observer
  Info about observation
  Annotations by experts
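The slides do not say how locations are fuzzed. As a minimal sketch of one common approach (an assumption, not CalFlora's documented method), a coordinate can be snapped to the centre of a coarse grid cell so that a rare-plant site cannot be pinpointed. The function name and the 0.1-degree default cell size are hypothetical.

```python
import math


def fuzz_location(lat, lon, cell_deg=0.1):
    """Snap a coordinate to the centre of its grid cell.

    Assumption: grid snapping stands in for whatever fuzzing the
    institutional data sources actually require; cell_deg is illustrative.
    """
    def snap(value):
        # Find the cell the value falls in, then return that cell's midpoint.
        return (math.floor(value / cell_deg) + 0.5) * cell_deg

    # Round away floating-point noise so equal cells compare equal.
    return round(snap(lat), 6), round(snap(lon), 6)
```

Two observations inside the same cell come back with identical coordinates, so the published record shows only the general area.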
Data from the public -- How to identify ‘expert amateurs’?
May be expert in
  Particular place
    Know the common flora
    Know when something unusual shows up (not necessarily what it is)
  Particular taxon
    Know this taxon and its species and subspecies (but not necessarily others)
  Wide range of common taxa
    But not unusual ones
Contributor Registration
Biography, credentials (free text)
Expertise/interests (free text)
Affiliation
Contact info/web site
Vows
  “I will submit only my own observations of wild plants. I realize that this system is only for first-hand reports about plants, native and introduced, that are growing without deliberate planting or cultivation.”
  “I will…make sure I have the correct scientific name…I will submit uncertain identifications only if I believe them to be very important and time sensitive, and will label such reports ‘uncertain.’”
Contributor Registration (cont)
Experience level (self-assessment)
  I am a professional biologist/botanist, or have professional training in botany.
  Although I do not have formal credentials, I am recognized as a peer by professional botanists.
  Although I do not consider myself to have professional-level knowledge, I am quite experienced in the use of keys and descriptions, and/or have expertise with the plants for which I will be submitting observations.
  I do not have extensive experience or background in botany, but I am confident that I can accurately identify the plants for which I will be submitting observations.
Occurrence Report
Species identification, habitat, location, date
Method of identification
  “I recognize …from prior determinations and experience”
  “I compared this plant with herbarium specimens”
  “I keyed this plant in a botanical reference”
  “I compared … with published taxonomic descriptions”
  “An expert reviewed and confirmed this identification”
Certainty of identification
  “I am confident of this identification, and submit this as a positive observation.”
  “I am not certain of this identification but believe it to be a significant observation and submit it here as an alert only.”
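The report fields above could be modelled roughly as follows. This is a sketch only: the class, field, and enum names are hypothetical, and the slides do not describe CalFlora's actual schema. The point is that method and certainty are fixed choices that travel with each record, so a later reader can weigh the evidence.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class IdMethod(Enum):
    """The five identification methods a contributor can declare."""
    PRIOR_EXPERIENCE = "recognized from prior determinations and experience"
    HERBARIUM_COMPARISON = "compared with herbarium specimens"
    KEYED = "keyed in a botanical reference"
    PUBLISHED_DESCRIPTION = "compared with published taxonomic descriptions"
    EXPERT_CONFIRMED = "reviewed and confirmed by an expert"


class Certainty(Enum):
    """The two certainty statements offered on the form."""
    POSITIVE = "positive observation"
    ALERT_ONLY = "alert only"


@dataclass(frozen=True)
class OccurrenceReport:
    # Hypothetical field names mirroring the slide: species, habitat,
    # location, date, plus the declared method and certainty.
    species: str
    habitat: str
    location: str
    observed_on: date
    method: IdMethod
    certainty: Certainty

    def counts_as_positive(self) -> bool:
        """Alert-only reports are flagged for attention, not counted as sightings."""
        return self.certainty is Certainty.POSITIVE
```

Keeping the declared method and certainty on the record, rather than discarding them after entry, matches the talk's emphasis on retaining information about how each observation was made.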
Observation Contribution Process
Data entered
Photo appears (if available) – i.e., “Are you sure?”
If new county record, notice appears
  I.e., “Are you sure?”
  Lists who will be notified – record likely to be reviewed
If new county record, notice sent to county agricultural officials
If listed as rare species, notice sent to appropriate state agency
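The two notification rules in this process can be sketched as a small routing function. The function name, argument names, and recipient labels are hypothetical; the slides state only the rules, not how CalFlora implements them.

```python
def notifications_for(species, county, known_counties, rare_species):
    """Return the parties to notify about a newly submitted observation.

    known_counties maps a species name to the set of counties where it
    has already been recorded; rare_species is a set of listed species.
    Both structures are assumptions made for this sketch.
    """
    notices = []
    # Rule 1: a species not yet recorded in this county is a new county record.
    if county not in known_counties.get(species, set()):
        notices.append("county agricultural officials")
    # Rule 2: listed rare species go to the appropriate state agency.
    if species in rare_species:
        notices.append("state agency")
    return notices
```

Showing this list to the contributor before submission is what gives the “Are you sure?” prompt its weight: the observer sees who will review the record.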
Annotations
Herbarium practice: experts annotate records with corrections, comments
CalFlora: registered experts can annotate photos and occurrence records
Annotation by an expert raises the credibility of a record.
  Actually – how often?
Annotation history viewable
Current Developments: CalFlora Meeting Tomorrow
Invited wide range of interested parties to come discuss future of CalFlora
  Services
  Funding
Seeking to create an engaged user group
Seeking to create a community around CalFlora
Attendees: many people no one seems to already know
Knowledge Spaces
“Knowledge is not simply local, it is located....It has place and creates a space….
“Knowledge spaces have a wide diversity of components: people, skills, local knowledge and equipment … linked by social strategies and technical devices …
“To move knowledge from the local site and moment of its production and application to other places and times, knowledge producers deploy a variety of social strategies and technical devices for creating the equivalences and connections between otherwise heterogeneous and isolated knowledges….
“Knowledge spaces acquire their … seemingly unchallengeable naturalness thru the suppression and denial of work involved in their construction.”
--David Turnbull, Masons, Tricksters, and Cartographers p. 19-20
CalFlora and Local Knowledge
Not as opposed to scientific, but intimate, specific
In biodiversity:
  Baseline data
  Early warning of subtle changes
How to collect, report, evaluate?
CalFlora: retain the modalities
  Retain link to observer, info about observer
CalFlora as a Knowledge Space
Links layers of data, knowledge
Allows user flexibility in moving local knowledge, combining, filtering different kinds of data, different sources, making linkages, equivalences
Seeks to preserve the work and multiple voices behind the data
Seeks to create a knowledge space, epistemic community by making linkages among CalFlora users and contributors
Moving from small-scale and personal to large-scale and impersonal
  To large-scale and personal?
Conclusion
Trust as always a critical issue in knowledge
Networking as
  Foregrounding taken-for-granted practices
  Making new practices possible
  Creating new knowledge spaces
  Making linkages and equivalences across different kinds of knowledge
  Empowering users to make own linkages, assessments for different purposes
Information systems as sociotechnical networks
  Often invisible to the participants who see them as merely technical
Using concepts of knowledge spaces, epistemic cultures to help understand and contribute to system design and use