Cyber-Infrastructure Challenges & Ecoinformatics: An Ecologist’s Perspective William Michener LTER Network Office Department of Biology University of New Mexico


Page 1: Cyber-Infrastructure Challenges & Ecoinformatics: An Ecologist’s Perspective William Michener LTER Network Office Department of Biology University of New

Cyber-Infrastructure Challenges & Ecoinformatics:

An Ecologist’s Perspective

William Michener
LTER Network Office

Department of Biology
University of New Mexico

Page 2

Today’s Road Map

• Science
• Cyber-Infrastructure Challenges
• Ecoinformatics

Page 3

Today’s Road Map

• Science
• Cyber-Infrastructure Challenges
• Ecoinformatics

Page 4

Most studies use a single scale of observation -- commonly 1 m²

The literature is biased toward single and small-scale results

Page 5

Change

transition from one state or condition to another

[Figure: Variable plotted against Time (yrs)]

Page 6

Thinking Outside the “Box”

[Figure: axes Space, Time, and Parameters; labels LTER, Biocomplexity, and ????]

Increase in breadth and depth of understanding.....

Page 7

LTER

[Figure: map with the LTER sites numbered 1 to 24]

24 NSF LTER Sites in the U.S. and the Antarctic: > 1500 Scientists; 6,000+ Data Sets with different themes, methods, units, structure, ….

Page 8

Page 9

Today’s Road Map

• Science
• Cyber-Infrastructure Challenges
• Ecoinformatics

Page 10

Knowledge

Data

Information

Phenomena

Page 11

Knowledge

Data

Information

Phenomena

Page 12

Abstraction of phenomena

Date        Site  Species            Density
10/1/1993   N654  Picea rubens       13
10/3/1994   N654  Picea rubens       14.5
10/1/1993   N654  Betula papyrifera  3
10/31/1993  1     Picea rubens       13.5
10/31/1993  1     Betula papyrifera  1.6
11/14/1994  1     Picea rubens       8.4
11/14/1994  1     Betula papyrifera  1.8

Page 13

Knowledge

Data

Information

Phenomena

Page 14

Hunter-gatherers

[Figure labels: Hither; Yon]

Page 15

A Paradigm Shift

Harvesters

[Figure labels: Taxon 1; Taxon 2; Taxon 3; Taxon 4; Abiotic factors; Integrated, Interdisciplinary Databases]

Page 16

Data Entropy

[Figure: Information Content plotted against Time. Labels: Time of publication; Specific details; General details; Accident; Retirement or career change; Death]

(Michener et al. 1997)

Page 17

Semantics

Data set A:

Date       Site  Species  Area  Count
10/1/1993  N654  PIRU     2     26
10/3/1994  N654  PIRU     2     29
10/1/1993  N654  BEPA     1     3

Data set B:

Date       Site  picrub  betpap
31Oct1993  1     13.5    1.6
14Nov1994  1     8.4     1.8

Data set C:

Date        Site  Species            Density
10/1/1993   N654  Picea rubens       13
10/3/1994   N654  Picea rubens       14.5
10/1/1993   N654  Betula papyrifera  3
10/31/1993  1     Picea rubens       13.5
10/31/1993  1     Betula papyrifera  1.6
11/14/1994  1     Picea rubens       8.4
11/14/1994  1     Betula papyrifera  1.8

• Schema transform
• Coding transform
• Taxon Lookup
• Semantic transform

Imagine scaling!!
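The transforms named on this slide (schema, coding, taxon lookup, semantic) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup table `TAXON_LOOKUP`, the helper names `to_mdy` and `integrate`, and the rule "density = count / area" are not from the slides, although that rule does reproduce the density values shown for site N654.

```python
from datetime import datetime

# Data set A: one row per species, with a sampled area and a raw count.
DATASET_A = [
    {"Date": "10/1/1993", "Site": "N654", "Species": "PIRU", "Area": 2, "Count": 26},
    {"Date": "10/3/1994", "Site": "N654", "Species": "PIRU", "Area": 2, "Count": 29},
    {"Date": "10/1/1993", "Site": "N654", "Species": "BEPA", "Area": 1, "Count": 3},
]

# Data set B: one row per visit, one density column per species.
DATASET_B = [
    {"Date": "31Oct1993", "Site": "1", "picrub": 13.5, "betpap": 1.6},
    {"Date": "14Nov1994", "Site": "1", "picrub": 8.4, "betpap": 1.8},
]

# Taxon lookup (hypothetical table): local codes -> scientific names.
TAXON_LOOKUP = {
    "PIRU": "Picea rubens", "picrub": "Picea rubens",
    "BEPA": "Betula papyrifera", "betpap": "Betula papyrifera",
}


def to_mdy(text, fmt):
    """Coding transform: rewrite a date in the m/d/yyyy style of the merged table."""
    d = datetime.strptime(text, fmt)
    return f"{d.month}/{d.day}/{d.year}"


def integrate(a_rows, b_rows):
    """Merge both source layouts into Date / Site / Species / Density rows."""
    merged = []
    for row in a_rows:
        merged.append({
            "Date": to_mdy(row["Date"], "%m/%d/%Y"),
            "Site": row["Site"],
            "Species": TAXON_LOOKUP[row["Species"]],
            # Semantic transform (assumed rule): density = count / area.
            "Density": row["Count"] / row["Area"],
        })
    for row in b_rows:
        # Schema transform: each species column becomes its own row.
        for code in ("picrub", "betpap"):
            merged.append({
                "Date": to_mdy(row["Date"], "%d%b%Y"),
                "Site": row["Site"],
                "Species": TAXON_LOOKUP[code],
                "Density": row[code],
            })
    return merged


DATASET_C = integrate(DATASET_A, DATASET_B)
```

Even for two tiny tables, every step needs knowledge that lives outside the data files (what the codes mean, what the area units are), which is the point behind "Imagine scaling!!".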

Page 18

Semantics: Linking Taxonomic Semantics to Ecological Data

[Figure, from R. Peet: concepts of Rhynchospora plumosa s.l. according to Elliot 1816, Gray 1834, Chapman 1860, Kral 1998, and Peet 2002?. Names shown: R. plumosa; R. plumosa v. intermedia; R. plumosa v. plumosa; R. plumosa v. interrupta; R. intermedia; R. pineticola; R. plumosa v. pineticola; R. sp. 1]

Taxon concepts change over time (and space)
Multiple competing concepts coexist
Names are re-used for multiple concepts

Date  Species    #
1830  R.plumosa  39
1840  R.plumosa  49
1900  R.plumosa  42
1985  R.plumosa  48
1995  R.plumosa  22
2000  R.plumosa  19

[Chart: series A, B, C; value axis 0 to 60; dates 1/1/00 to 1/6/00]

Page 19

Knowledge

Data

Information

Phenomena

Page 20

Characteristics of Ecological Data

[Figure: data types placed by Complexity/Metadata Requirements (Low to High) and Data Volume (per dataset) (up to High). Labels: Satellite Images; Weather Stations; Business Data; Gene Sequences; GIS; Primary Productivity; Soil Cores; Population Data; Biodiversity Surveys; Most Ecological Data; Most Software]

Page 21

What Users Really Want…

Page 22

Data Collection

Analysis

Translation

Use By Non-Scientists

Publish For Other Scientists

Page 23

Today’s Road Map

• Science
• Cyber-Infrastructure Challenges
• Ecoinformatics

Page 24

Ecological Informatics

A broad interdisciplinary science that incorporates both conceptual and practical tools for the understanding, generation, processing, and propagation of ecological data and information.

Page 25

[Figure labels: Project Initiation; Data Design and Metadata; Data Acquisition and Quality Control; Data Manipulation and Quality Assurance; Analysis and Interpretation; Access and Archiving; Publication]

Page 26

[Figure: data processing flowchart. Labels: Research Program Investigators; Studies; Experimental Design; Methods; Data Design; Data Forms; Data Entry; Field Computer Entry; Electronically Interfaced Field Equipment; Electronically Interfaced Lab Equipment; Raw Data File; Quality Assurance Checks; Data verified? (yes / no); Data Contamination; Data Validated; Archive Data File; Archival Mass Storage (Magnetic Tape / Optical Disk / Printouts); Off-site Storage; Access Interface; Investigators; Secondary Users; Summary Analyses; Synthesis; Publication; Quality Control; Metadata]

Page 27

“Ecological Informatics” Activities

• Project / experimental design

• Data design [Porter & McCartney]

• Data acquisition

• QA/QC [Vanderbilt]

• Data documentation (metadata) [Michener]

• Data archival

Page 28

“Ecological Informatics” Activities

• Project / experimental design

• Data design

• Data acquisition

• QA/QC

• Data documentation (metadata)

• Data archival

Page 29

Project / Experimental Design

[Figure labels: Experimental Design; Analyses; Data / Database Design]

Page 30

Project / Experimental Design

• Some Classic References

– Green, R.H. 1979. Sampling Design and Statistical Methods for Environmental Biologists. John Wiley & Sons, Inc., New York.

– Resetarits, Jr., W.J. and J. Bernardo (eds.). 1998. Experimental Ecology. Oxford University Press, New York.

– Scheiner, S.M. and J. Gurevitch (eds.). 1993. Design and Analysis of Ecological Experiments. Chapman & Hall, New York.

– Sokal, R.R. and F.J. Rohlf. 1995. Biometry. W.H. Freeman & Company, New York.

– Underwood, A.J. 1997. Experiments in Ecology: Their Logical Design and Interpretation Using Analysis of Variance. Cambridge University Press, Cambridge, UK.

Page 31

“Ecological Informatics” Activities

• Project / experimental design

• Data design

• Data acquisition

• QA/QC

• Data documentation (metadata)

• Data archival

Page 32

Data Design

• Conceptualize and implement a logical structure within and among data sets that will facilitate data acquisition, entry, storage, retrieval and manipulation.

Page 33

Data Set Design: Best Practices

• Assign descriptive file names
• Use consistent and stable file formats
• Define the parameters
• Use consistent data organization
• Perform basic quality assurance
• Assign descriptive data set titles
• Provide documentation (metadata)

from Cook et al. 2000

Page 34

1. Assign descriptive file names

• File names should be unique and reflect the file contents

• Bad file names
– Mydata
– 2001_data

• A better file name
– Sevilleta_LTER_NM_2001_NPP.asc

• Sevilleta_LTER is the project name
• NM is the state abbreviation
• 2001 is the calendar year
• NPP represents Net Primary Productivity data
• asc stands for the file type (ASCII)
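As a minimal sketch, the naming pattern above can be assembled by a small helper so that every file in a project follows the same order of parts. The function name `build_file_name` and its validation rule (no empty parts, no spaces) are assumptions made for illustration, not part of the cited best practices.

```python
def build_file_name(project, state, year, parameter, extension):
    """Join the descriptive parts of a data file name with underscores."""
    parts = [project, state, str(year), parameter]
    for part in parts:
        # Assumed rule: parts must be non-empty and free of spaces.
        if not part or " " in part:
            raise ValueError(f"invalid file name part: {part!r}")
    return "_".join(parts) + "." + extension


name = build_file_name("Sevilleta_LTER", "NM", 2001, "NPP", "asc")
print(name)  # Sevilleta_LTER_NM_2001_NPP.asc
```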

Page 35

2. Use consistent and stable file formats

• Use ASCII file formats; avoid proprietary formats
• Be consistent in formatting
– don’t change or re-arrange columns
– include header rows (first row should contain file name, data set title, author, date, and companion file names)
– column headings should describe content of each column, including one row for parameter names and one for parameter units
– within the ASCII file, delimit fields using commas, pipes (|), tabs, or semicolons (in order of preference)
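One way to follow these formatting rules is sketched below with Python's standard `csv` module: a comma-delimited ASCII table whose first row carries the file-level information, followed by one row of parameter names and one row of units. The helper name `write_ascii_table` and all header values in the example call are made up for illustration.

```python
import csv
import io


def write_ascii_table(stream, file_info, names, units, rows):
    """Write a comma-delimited ASCII table with a descriptive header."""
    writer = csv.writer(stream, lineterminator="\n")
    # First row: file name, data set title, author, date, companion files.
    writer.writerow(file_info)
    writer.writerow(names)  # one row of parameter names
    writer.writerow(units)  # one row of parameter units
    for row in rows:
        if len(row) != len(names):
            raise ValueError("every data row must match the column layout")
        writer.writerow(row)


buffer = io.StringIO()
write_ascii_table(
    buffer,
    # Hypothetical header values, for illustration only.
    ["example_1996_met.asc", "Example daily weather", "A. Author",
     "19970115", "example_1996_met_metadata.asc"],
    ["Station", "Date", "Temp", "Precip"],
    ["Units", "YYYYMMDD", "C", "mm"],
    [["HOGI", "19961001", 12, 0], ["HOGI", "19961002", 14, 3]],
)
text = buffer.getvalue()
```

Writing to any text stream keeps the sketch testable; in practice the same function could be handed a file opened with `open(path, "w", encoding="ascii", newline="")`.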

Page 36

3. Define the parameters

• Use commonly accepted parameter names that describe the contents (e.g., precip for precipitation)

• Use consistent capitalization (e.g., not temp, Temp, and TEMP in same file)

• Explicitly state units of reported parameters in the data file and the metadata (SI units are recommended)

• Choose a format for each parameter, explain the format in the metadata, and use that format throughout the file
– e.g., use yyyymmdd; January 2, 1999 is 19990102
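The yyyymmdd convention is easy to enforce in code. The sketch below uses the standard `datetime` module; the function names are hypothetical, and the strict length check is an added assumption so that short or malformed strings are rejected rather than parsed loosely.

```python
from datetime import date, datetime


def to_yyyymmdd(d):
    """Format a date as an 8-digit yyyymmdd string."""
    return d.strftime("%Y%m%d")


def parse_yyyymmdd(text):
    """Parse a yyyymmdd string, rejecting anything that is not 8 digits."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected yyyymmdd, got {text!r}")
    return datetime.strptime(text, "%Y%m%d").date()


stamp = to_yyyymmdd(date(1999, 1, 2))
print(stamp)  # 19990102
```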

Page 37

4. Use consistent data organization (one good approach)

Station  Date      Temp  Precip
Units    YYYYMMDD  C     mm
HOGI     19961001  12    0
HOGI     19961002  14    3
HOGI     19961003  19    -9999

Note: -9999 is a missing value code for the data set
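A missing value code only helps if every program that reads the file honors it. The following sketch, with a hypothetical `read_table` helper, reads the table from this slide and turns -9999 into `None` so that it cannot leak into a summary statistic.

```python
MISSING = "-9999"  # missing value code stated for this data set

TABLE = """\
Station Date Temp Precip
Units YYYYMMDD C mm
HOGI 19961001 12 0
HOGI 19961002 14 3
HOGI 19961003 19 -9999
"""


def read_table(text):
    """Parse the whitespace-delimited table, mapping the missing code to None."""
    lines = text.splitlines()
    names = lines[0].split()
    records = []
    for line in lines[2:]:  # skip the names row and the units row
        fields = line.split()
        record = {"Station": fields[0], "Date": fields[1]}
        for name, value in zip(names[2:], fields[2:]):
            record[name] = None if value == MISSING else float(value)
        records.append(record)
    return records


records = read_table(TABLE)
precip = [r["Precip"] for r in records if r["Precip"] is not None]
mean_precip = sum(precip) / len(precip)
print(mean_precip)  # 1.5
```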

Page 38

4. Use consistent data organization (a second good approach)

Station  Date      Parameter  Value  Unit
HOGI     19961001  Temp       12     C
HOGI     19961002  Temp       14     C
HOGI     19961001  Precip     0      mm
HOGI     19961002  Precip     3      mm
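The two layouts carry the same information, so one can be derived from the other. Here is a minimal sketch of the conversion from the one-column-per-parameter layout to the one-row-per-measurement layout; the function name `wide_to_long` and the choice to drop rows holding the missing value code are assumptions for illustration.

```python
def wide_to_long(names, units, rows, key_count=2, missing="-9999"):
    """Turn one-column-per-parameter rows into one-row-per-measurement rows."""
    long_rows = []
    for row in rows:
        keys = row[:key_count]  # Station and Date identify the observation
        measured = zip(names[key_count:], units[key_count:], row[key_count:])
        for name, unit, value in measured:
            if value == missing:
                continue  # assumed choice: omit missing measurements
            long_rows.append(keys + [name, value, unit])
    return long_rows


names = ["Station", "Date", "Temp", "Precip"]
units = ["Units", "YYYYMMDD", "C", "mm"]
wide = [["HOGI", "19961001", "12", "0"], ["HOGI", "19961002", "14", "3"]]
long_rows = wide_to_long(names, units, wide)
```

The result holds the same four measurements as the table on this slide, grouped by date rather than by parameter.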

Page 39

5. Perform basic quality assurance

• Assure that data are delimited and line up in proper columns
• Check that there are no missing values for key parameters
• Scan for impossible and anomalous values
• Perform and review statistical summaries
• Map location data (lat/long) and assess errors
• Verify automated data transfers
• For manual data transfers, consider double keying data and comparing the 2 data sets
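Several of these checks are simple to automate. The sketch below covers three of them: missing key values, out-of-range values, and comparison of double-keyed data. All function names, the example records, and the temperature limits of -60 to 60 C are hypothetical; real limits would come from the metadata for each parameter.

```python
import statistics


def qa_report(records, key_fields, valid_ranges):
    """List (row index, field, message) for missing keys and out-of-range values."""
    problems = []
    for i, rec in enumerate(records):
        for key in key_fields:
            if rec.get(key) in (None, ""):
                problems.append((i, key, "missing key value"))
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if value is not None and not (low <= value <= high):
                problems.append((i, field, f"value {value} outside {low}..{high}"))
    return problems


def summarize(records, field):
    """Basic statistical summary to review by eye."""
    values = [r[field] for r in records if r.get(field) is not None]
    return {"n": len(values), "min": min(values), "max": max(values),
            "mean": statistics.mean(values)}


def compare_double_keyed(first, second):
    """Return the indexes of rows that differ between two independent entries."""
    if len(first) != len(second):
        raise ValueError("the two keyed data sets have different lengths")
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]


keyed_once = [
    {"Station": "HOGI", "Date": "19961001", "Temp": 12},
    {"Station": "HOGI", "Date": "19961002", "Temp": 14},
    {"Station": "HOGI", "Date": "19961003", "Temp": 190},  # keying error
]
keyed_twice = [
    {"Station": "HOGI", "Date": "19961001", "Temp": 12},
    {"Station": "HOGI", "Date": "19961002", "Temp": 41},   # digits swapped
    {"Station": "HOGI", "Date": "19961003", "Temp": 190},
]

problems = qa_report(keyed_once, ["Station", "Date"], {"Temp": (-60, 60)})
summary = summarize(keyed_once, "Temp")
mismatches = compare_double_keyed(keyed_once, keyed_twice)
```

Note that the two checks catch different errors: the range check flags the impossible 190 in both copies, while double keying flags only the row where the two entries disagree.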

Page 40

6. Assign descriptive data set titles

• Data set titles should ideally describe the type of data, time period, location, and instruments used (e.g., Landsat 7).
• Titles should be restricted to 80 characters.
• Data set title should be similar to names of data files
– Good: “Shrub Net Primary Productivity at the Sevilleta LTER, New Mexico, 2000-2001”
– Bad: “Productivity Data”
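The 80-character limit is mechanical and can be checked automatically; whether a title is descriptive enough is a judgment call. The sketch below checks the limit and adds a crude word-count heuristic, which is purely an assumption for illustration and not part of the guidance above.

```python
MAX_TITLE_LENGTH = 80  # limit stated in the guidance
MIN_WORDS = 4          # hypothetical heuristic for "descriptive enough"


def title_problems(title):
    """Return a list of problems found in a data set title."""
    problems = []
    if len(title) > MAX_TITLE_LENGTH:
        problems.append(
            f"title has {len(title)} characters; limit is {MAX_TITLE_LENGTH}")
    if len(title.split()) < MIN_WORDS:
        problems.append("title may be too short to describe data type, period, and location")
    return problems


good = "Shrub Net Primary Productivity at the Sevilleta LTER, New Mexico, 2000-2001"
bad = "Productivity Data"
good_problems = title_problems(good)
bad_problems = title_problems(bad)
```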

Page 41

7. Provide documentation (metadata)

• To be discussed in detail

Page 42

Database Types

• File-system based

• Hierarchical

• Relational

• Object-oriented

• Hybrid (e.g., combination of relational and object-oriented schema)

Porter 2000

Page 43

File-system-based Database

Directory

Files

Porter 2000

Page 44

Hierarchical Database

Project

Data sets Investigators

Variables Locations

Codes Methods

Porter 2000

Page 45

Relational Database

[Figure: tables Projects, Data sets, and Locations; key fields shown are Data_id and Location_id (Location_id appears twice)]

Porter 2000
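A relational design of this kind can be sketched with Python's built-in `sqlite3` module. Only the key names Data_id and Location_id come from the slide; the assumption that the data sets table refers to the locations table through Location_id, along with the other column names and the example rows, is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (
    location_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE data_sets (
    data_id     INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    -- Shared key: each data set points at the location it describes.
    location_id INTEGER NOT NULL REFERENCES locations(location_id)
);
""")

# Hypothetical example rows.
conn.execute("INSERT INTO locations VALUES (1, 'Rio Salado')")
conn.execute("INSERT INTO data_sets VALUES (10, 'Phenology transect 1', 1)")

# The join follows the shared key from data sets to locations.
rows = conn.execute("""
    SELECT d.title, l.name
    FROM data_sets AS d
    JOIN locations AS l ON l.location_id = d.location_id
""").fetchall()
```

Storing the location once and referring to it by key means a correction to a site name is made in one place, not in every data set record.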

Page 46

“Ecological Informatics” Activities

• Project / experimental design

• Data design

• Data acquisition

• QA/QC

• Data documentation (metadata)

• Data archival

Page 47

High-quality data depend on:

• Proficiency of the data collector(s)
• Instrument precision and accuracy
• Consistency (e.g., standard methods and approaches)
– Design and ease of data entry
• Sound QA/QC
• Comprehensive metadata (e.g., documentation of anomalies, etc.)

Page 48

How are data to be acquired?

• Automatic Collection ?

• Tape Recorder

• Data Sheet

• Field entry into hand-held computer

Page 49

[Data sheet: the headings “Plant” and “Life Stage” followed only by blank lines]

What’s wrong with this data sheet?

Page 50

Important questions

• How well does the data sheet reflect the data set design?

• How well does the data entry screen (if available) reflect the data sheet?

Page 51

PHENOLOGY DATA SHEET
Rio Salado - Transect 1

Collectors: _________________________________
Date: ___________________ Time: _________
Notes: ______________________________________

Plant  Life Stage
ardi   P/G V B FL FR M S D NP
arpu   P/G V B FL FR M S D NP
atca   P/G V B FL FR M S D NP
bamu   P/G V B FL FR M S D NP
zigr   P/G V B FL FR M S D NP
____   P/G V B FL FR M S D NP
____   P/G V B FL FR M S D NP

P/G = perennating or germinating
V = vegetating
B = budding
FL = flowering
FR = fruiting
M = dispersing
S = senescing
D = dead
NP = not present

Page 52

PHENOLOGY DATA SHEET
Rio Salado - Transect 1

Collectors: Troy Maddux

Date: 16 May 1991 Time: 13:12

Notes: Cloudy day, 3 gopher burrows on transect

ardi  P/G  V    B    FL   FR   M    S    D    NP
      Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N

arpu  P/G  V    B    FL   FR   M    S    D    NP
      Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N

asbr  P/G  V    B    FL   FR   M    S    D    NP
      Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N

deob  P/G  V    B    FL   FR   M    S    D    NP
      Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N  Y N
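A data sheet with an explicit Y/N choice for every life stage maps directly onto a data entry routine that can refuse incomplete rows. The sketch below is one possible encoding; the function name, the record fields, and the circled answers in the example call are hypothetical, since the slide does not show which answers were marked.

```python
STAGES = ["P/G", "V", "B", "FL", "FR", "M", "S", "D", "NP"]


def sheet_row_to_records(collector, date, species, circled):
    """Turn one species row (nine Y/N answers) into one record per life stage."""
    answers = circled.split()
    if len(answers) != len(STAGES) or any(a not in ("Y", "N") for a in answers):
        raise ValueError("expected one Y or N for each of the nine life stages")
    return [
        {"Collector": collector, "Date": date, "Species": species,
         "Stage": stage, "Present": answer == "Y"}
        for stage, answer in zip(STAGES, answers)
    ]


# Hypothetical answers for the species coded "ardi".
records = sheet_row_to_records(
    "Troy Maddux", "19910516", "ardi", "N Y Y N N N N N N")
```

Because every stage needs an answer, a blank can no longer be confused with "not observed", which is the weakness of a free-form sheet.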

Page 53

“Ecological Informatics” Activities

• Project / experimental design

• Data design

• Data acquisition

• QA/QC

• Data documentation (metadata)

• Data archival

Page 54

Generic Data Processing

[Figure: data processing flowchart. Labels: Research Program Investigators; Studies; Experimental Design; Methods; Data Design; Data Forms; Data Entry; Field Computer Entry; Electronically Interfaced Field Equipment; Electronically Interfaced Lab Equipment; Raw Data File; Quality Assurance Checks; Data verified? (yes / no); Data Contamination; Data Validated; Archive Data File; Archival Mass Storage (Magnetic Tape / Optical Disk / Printouts); Off-site Storage; Access Interface; Investigators; Secondary Users; Summary Analyses; Synthesis; Publication; Quality Control; Metadata]

Brunt 2000

Page 55

“Ecological Informatics” Activities

• Project / experimental design

• Data design

• Data acquisition

• QA/QC

• Data documentation (metadata)

• Data archival

Page 56

“Ecological Informatics” Activities

• Project / experimental design

• Data design

• Data acquisition

• QA/QC

• Data documentation (metadata)

• Data archival

Page 57

Traditional Fates of Data Post-Publication

• Paper to filing cabinets

• Data to floppy disks or tape

• Data and information lost over time

entropy

Page 58

Data Archive

• A collection of data sets, usually electronic, stored in such a way that a variety of users can locate, acquire, understand and use the data.

• Examples:
– ESA’s Ecological Archive
– NASA’s DAACs (Distributed Active Archive Centers)

Page 59

Cycles of Research: “A Conventional View”

[Figure labels: Problem; Planning; Collection; Data; Analysis and modeling; Publications]

Page 60

Cycles of Research: “A New View”

[Figure labels: Planning; Problem Definition (Research Objectives); Analysis and modeling; Planning; Collection; Selection and extraction; Archive of Data; Original Observations; Secondary Observations; Publications]

Page 61

Keys to Success

• Start small and keep it simple: building on simple successes is much easier than failing on large, inclusive attempts.

• Involve scientists: ecological data management is a scientific endeavor that touches every aspect of the research program. Scientists should be involved in the planning and operation of a data management system.

• Support science: data management must be driven by the research and not the other way around. A data management system must produce the products and services that are needed by the community.

Page 62

Increasing value of data over time

[Figure: Data Value plotted against Time. Labels: Gradual Increase In Data Equity; Serendipitous Discovery; Inter-site Synthesis; Methodological Flaws, Instrumentation Obsolescence; Non-scientific Monitoring]

Page 63

LTER Data Access Policy

1) There are two types of data:

Type I (data that are freely available within 2-3 years, with minimum restrictions) and

Type II (exceptional data sets that are available only with written permission from the PI/investigator(s)). Implied in this timetable is the assumption that some data sets require more effort to get on-line and that no "blanket policy" is going to cover all data sets at all sites. However, each site would pursue getting all of its data on-line in the most expedient fashion possible.

2) Data sets assigned Type II status should be rare, and the justification for exceptions must be well documented and approved by the lead PI and site data manager. Some examples of Type II data may include: locations of rare or endangered species, data that are covered by copyright laws (e.g., TM and/or SPOT satellite data), or some types of census data involving human subjects.

Page 64

Reasons to Not Share Data:

• Fear of getting scooped
• Number of publications will decrease
• People will find errors
• Someone will misinterpret my data

Page 65

Benefits of Data Sharing

• Publicity, accolades, media attention
• Renewed or increased funding
• Teaching:
  • long-term data sets adapted for teaching & texts
• Archival: back-up copy of critical data sets
• Research:
  • new synthetic studies
  • peer-reviewed publications
• Document global and regional change
• Conservation and resource management:
  • species and natural areas protection
  • new environmental laws

Page 66

Brunt (2000) Ch. 2 in Michener and Brunt (2000)

Porter (2000) Ch. 3 in Michener and Brunt (2000)

Edwards (2000) Ch. 4 in Michener and Brunt (2000)

Michener (2000) Ch. 7 in Michener and Brunt (2000)

Cook, R.B., R.J. Olson, P. Kanciruk, and L.A. Hook. 2000. Best practices for preparing ecological and ground-based data sets to share and archive. (online at http://www.daac.ornl.gov/cgi-bin/MDE/S2K/bestprac.html)

Page 67

Thanks !!!