U.S. Department of the Interior
U.S. Geological Survey
Scientific Information Management at the U.S. Geological Survey: Issues, Challenges, and a Collaborative Approach to Identifying and Applying Solutions
David L. Govoni and Thomas M. Gunther
USGS Geospatial Information Office
Geoinformatics 2006
May 12, 2006
Geospatial Information Office (GIO)
Science Information and Education Office
Responsibilities:
Publishing policy and coordination
Libraries and Information Centers
Web infrastructure and content policy
Product Warehouse and distribution
Education and outreach
Scientific information management
Geospatial Information Office (GIO)
Science Information and Education Office
Accomplished in partnership with USGS science and administrative programs through a combination of:
Governance
Consultation
Facilitation
Collaborative development
Goal is to enable and support an “Integrated Information Environment” for the USGS
Integrated Information Environment (IIE)
Problems, problems … everywhere
Common issues identified from discussions with scientists and others across USGS disciplines:
Search and discovery (especially by place and topic)
Database access and integration
Interoperability of tools and processes
Advanced visualization, modeling, other tools
Archive and preservation
Compliance with mandates: Security, science quality, publishing, records management, accessibility, …
The solution? Good news … bad news
Lots of talent, innovation, and motivation, but:
Widely scattered geographically and organizationally
Many local efforts unknown to others in USGS
Duplicative or overlapping in purpose, capabilities
Built on multiple platforms in multiple languages
Some good, some not so good
Some potentially scalable, some not
“Costly” to organization as a whole
So how do we …
Increase awareness?
Identify “best of breed”?
Accelerate diffusion?
Provide support?
Institutionalize?
Communities of Practice (CoPs)
What is a “Community of Practice”?
Communities of Practice are groups of people who share a concern or a passion for something they do and learn how to do it better through the process of collective learning as they interact regularly.
CoPs are:
Problem driven
Self-organizing, voluntary, and motivated
Not constrained by position in formal organizations
Not formally chartered or accountable through management chains, as teams are
Modified after Etienne Wenger (www.ewenger.com)
USGS Scientific Information Management (SIM) Workshop
Three-day Scientific Information Management Workshop, March 2006
150+ people representing all USGS regions and both science and administrative programs
Other DOI bureaus, other public and private-sector organizations also participated
Explicit focus on intersection of SIM and CoPs
SIM Workshop
Three parts:
Overviews of problems and approaches to SIM both inside and outside of the USGS
Introduction to “Community of Practice” concept as a framework for collective learning and collaborative problem solving
Breakouts designed to simultaneously:
Identify key issues and needs
Explore and encourage the formation of CoPs to develop solutions
Potential communities
Data/information management: Field data for small research projects; Large time series data sets; Scientific data from monitoring programs
Classification and discovery: Metadata; Knowledge organization systems
Delivery: Digital libraries; Portals and frameworks
Potential communities
Interoperability and integration: Database networks
Preservation and long-term access: Archiving of scientific data and information; Preservation of physical collections
Knowledge management: Knowledge capture; Emerging workforce
Outcomes
At least 9 of 12 potential communities agreed to continue on as “formal” CoPs
Other potential communities proposed, e.g.:
Open access
Open source software
Search
Program management
Management commitment to support creation of bureau-wide infrastructure to enable current and future CoPs
USGS Communities Network
Common gateway to all known USGS CoPs
Framework of shared collaborative services and tools available to support interested communities:
Discussion forums
Document management
Digital library and bibliography management
News and Events calendar
Wikis and annotation
RSS feeds
…
Initially USGS-only but eventually available to external collaborators and partners
Workshop evaluation
Reviews positive:
Met or exceeded expectations: 89%
Change practices as result: 33%
Participate in communities: 72%
Learned new tools or approaches: 50%
Make valuable new contacts: 90%
Suggests broad interest and appeal of communities approach
(based on ~50% survey response)
What was learned
One size won’t always fit all, but …
Many issues are common to all USGS disciplines
Local approaches may be broadly applicable, scalable, and cost-effective for the USGS as a whole
Those “in the trenches” know best: Cannot implement top-down SIM solutions
Solutions can come from (and be managed from) anywhere
What was learned … a digression
SIM needs to be considered from two distinct but intimately related perspectives:
“Information life-cycle” or Producer perspective: Course of data and information from initial acquisition to final disposition
Consumer perspective: How data and information are used to accomplish tasks
Producer perspective
Collect: refers to fieldwork (in situ, in vitro, in silico); includes direct & remote observation, monitoring & recording
Analyze: refers to analysis, synthesis & interpretation; includes laboratory experiments, modeling, visualization
Publish: refers to preparation & distribution (via any medium); includes publications, data, talks, seminars, models, libraries
Preserve: refers to preservation & archiving; includes records management, data rescue, physical sample preservation
Consumer perspective
Find: refers to resource discovery (search, browse, mine); depends on catalogs; controlled vocabularies, ontologies; geospatial, topical & preservation metadata
Get: refers to acquisition (contact, determine restrictions, access, & download, etc.); depends on documented exchange formats & protocols; administrative & legal metadata
Understand: refers to evaluation (assess relevance, quality, significance, suitability); depends on contextual information, e.g., documentation, reports; quality & functionality metadata
Use: refers to use for/integration into studies, models, visualizations, experiments, etc.; depends on documented data schema, service models, protocols, etc.
“Metainformation” is critical to both
Broadly defined here to encompass both “classic” metadata and “contextual information” (rules, assumptions, ontologies, schema, documentation, etc.) that impart deeper understanding or facilitate use
Metainformation:
Critical to our ability to conduct integrated studies
Critical to maintaining long-term access
Should be, but very often is not, formally captured and preserved all along the information life-cycle
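One way to picture "captured all along the information life-cycle" is a record that accumulates metainformation at each Producer-perspective stage. The following is only a minimal sketch under stated assumptions: the class name InformationProduct, its methods, and the sample values are hypothetical illustrations, not a USGS system or schema. Only the four stage names come from the slides.

```python
from dataclasses import dataclass, field

# Life-cycle stages taken from the Producer perspective slide.
STAGES = ("Collect", "Analyze", "Publish", "Preserve")


@dataclass
class InformationProduct:
    """Hypothetical record for a data set moving through the life-cycle."""
    title: str
    # Maps a stage name to the metainformation captured at that stage.
    metainformation: dict = field(default_factory=dict)

    def capture(self, stage: str, **details: str) -> None:
        """Attach metadata or contextual notes to one life-cycle stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown life-cycle stage: {stage}")
        self.metainformation.setdefault(stage, {}).update(details)

    def missing_stages(self) -> list:
        """Return stages, in life-cycle order, with nothing captured yet."""
        return [s for s in STAGES if not self.metainformation.get(s)]


# Illustrative values only (assumptions, not real USGS data).
product = InformationProduct("Example time series data set")
product.capture("Collect", method="in situ monitoring", location="example site")
product.capture("Publish", exchange_format="CSV")
print(product.missing_stages())  # prints ['Analyze', 'Preserve']
```

A check like missing_stages() makes the gap on the slide visible: stages where metainformation "should be, but very often is not" captured show up explicitly instead of being discovered at archive time.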
(End of digression)
What was learned … SIM is not easy
Despite advances in technology, many tasks:
Remain time-consuming
Require significant involvement by scientists (sometimes at the expense of their science)
Lack incentives to “do the right thing”
Volume outpacing resources
Legacy data may already be beyond saving
SIM is not an option
Good stewardship of data, information, physical artifacts, and associated metainformation is an obligation of the research community:
As a matter of self-interest (e.g., as a precondition for being viewed as a “trusted source”)
Data and information are of little value if they cannot be found or delivered in a timely manner or in usable condition
Reproducibility of results – a hallmark of the scientific method – may be impaired or impossible without it
Meeting the challenges … There is hope!
Communities of practice, if encouraged and supported, offer several benefits:
Strength in numbers: Multiple perspectives and insights brought to bear on problems; yield better solutions, faster
Organizational adaptability: Ability to coalesce rapidly around issues driven by changing technologies, research needs, or other challenges without time-consuming organizational realignments
There is hope!
Cost-effectiveness: Fewer development “stovepipes”; less likely to “reinvent the wheel”; useful knowledge, tools, and techniques are rapidly distributed throughout the organization; standardization, interoperability more likely
Collaborative learning: Participation increases knowledge and skills of all participants; overall organizational competence is enhanced; knowledge is more likely to be preserved for the next generation