
Ecological Information Management (EIM) 2008
LTER Information Management Committee Meeting, July 23-25, 2013

Don Henshaw
H.J. Andrews Experimental Forest

Considering best practices in managing sensor data


COMMON THEMES FROM PARTICIPATING SITES
JOINT NERC ENVIRONMENTAL SENSOR NETWORK/SENSOR NIS WORKSHOP, HUBBARD BROOK EXPERIMENTAL FOREST, NH, OCTOBER 25-27TH, 2011

Greatest Needs
• Middleware between sensor/data logger and database/applications
• Programming support
• Training workshops to disseminate knowledge & solutions
• Ways to share experiences with software and tools that are useful
• Clearinghouse for sharing code and solutions
• Knowledge Base (web page) organized by topics (http://wiki.esipfed.org/index.php/EnviroSensing_Cluster)


Joint NERC Environmental Sensor Network/LTER SensorNIS Workshop, October 25-27th, 2011


ESIP EnviroSensing Cluster: Building a sensor network resource guide through community participation

o Online resource guide outline
  • Sensor, site, and platform selection
  • Data acquisition and transmission
  • Sensor management, tracking, documentation
  • Streaming data management middleware
  • Sensor data quality assurance/quality control (QA/QC)
  • Sensor data archiving

Software Tools for Sensor Networks, April 23-26, 2013


Sensor, site, and platform selection

o Problem statement
  • Vast array of possible sensor/hardware packages for multiple science applications
  • Communication among PIs, techs, and specialists
    o work together in considering options and planning
  • Deployment may be based on interacting factors
    o e.g., permitting, geography, access
  • Considerations:
    o seasonal weather patterns, power sources, communications options, land ownership, distance from managing institution, available personnel/expertise, and potential expansion/future-proofing


Data acquisition and transmission

Problem statement
• Manual downloads of environmental sensor data may not be sufficient to assure data security or data integrity, or allow direct control of devices
• Considerations:
  o need for immediate access
  o need for one- or two-way transmission methods
  o bandwidth requirements to transfer the data
  o need for line-of-sight communication or repeaters
  o hardware and network protocols
  o power consumption of the system components
  o physical and network security requirements


Sensor management, tracking, and documentation

Problem statement
• Documentation of field procedures needs to be sufficient to withstand personnel changes over time
• Noted sensor issues and problems need to be quickly communicated among field technicians, lead investigators and data managers
• Sensor histories are typically tracked in field notebooks or field check sheets and are essential for internal review of data streams, but are often inaccessible to data handlers
• Noted field problems may provide insight into quality control issues and data behavior and should be captured in data qualifier flags


SENSOR MANAGEMENT, TRACKING, DOCUMENTATION

Software Tools for Sensor Networks Training, 1 May 2012

• Develop protocols for installation, calibration, maintenance, and removal of sensors
• Track sensor events and history
  o Record sensor events and failures, deployment information, calibration events, maintenance history, operational dates, etc.
  o Record sensor descriptions, methodology changes, sampling frequency, geo-location, photo points, etc.
• Documentation
  o Standardize field notebooks or field checklists
  o Build log files or databases for annotation of sensor events, e.g.,
    • Timestamp (or range), DataloggerID, SensorID, event category, description and note taker of event
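The event-log fields listed above could be captured as a small structured record. The following is a minimal sketch in Python, assuming a CSV log file; the class name `SensorEvent`, the field names, the helper `write_events`, and all sample values are hypothetical and only illustrate one possible layout.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import datetime
from typing import Optional


@dataclass
class SensorEvent:
    """One annotated sensor event (hypothetical field names)."""
    start: datetime               # timestamp, or start of a time range
    end: Optional[datetime]       # None for a single-instant event
    datalogger_id: str
    sensor_id: str
    category: str                 # e.g. "calibration", "failure", "maintenance"
    description: str
    note_taker: str

    def covers(self, t: datetime) -> bool:
        """True if timestamp t falls on or inside the event's time range."""
        end = self.end if self.end is not None else self.start
        return self.start <= t <= end


def write_events(events, handle):
    """Write events to an open text handle as CSV rows with a header."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(SensorEvent)])
    writer.writeheader()
    for event in events:
        row = asdict(event)
        row["start"] = event.start.isoformat()
        row["end"] = event.end.isoformat() if event.end else ""
        writer.writerow(row)


# Illustrative values only
event = SensorEvent(
    start=datetime(2013, 7, 23, 9, 0),
    end=datetime(2013, 7, 23, 11, 30),
    datalogger_id="LOGGER-01",
    sensor_id="AIRTEMP-01",
    category="calibration",
    description="Sensor removed for bench calibration",
    note_taker="field technician",
)
buffer = io.StringIO()
write_events([event], buffer)
```

A method such as `covers` lets a data handler look up which logged events overlap a given data value, which is one way field notes could feed into qualifier flags.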


Sensor data quality assurance and quality control (QA/QC)

• Preventative QA measures in the field are desirable
• Automated QC is necessary for
  o near real-time use of data
  o efficient processing of high volume data streams
• Manual methods are unavoidable
  o a hybrid QC system will include subsequent manual inspection and additional QC checking
• QC system must
  o provide qualifier flags to sensor data
  o accommodate feedback to policies and procedures
  o assure that all QC workflows are documented


QUALITY ASSURANCE – PREVENTATIVE MEASURES

• Routine calibration and maintenance
  o Anticipate common repairs and replacement parts
  o Record known events that may impact measurements
• Continuous monitoring and evaluating of sensor network
  o Early detection of problems
  o Automated alerts; in situ web cams
• Sensor redundancy
  o Ideal: Triple the sensor, triple the logger!
  o Practical: Cheaper, lower cost, lower resolution sensors, or correlated (proxy) sensors
  o Alternative: Datalogger-independent sensor spot checks; portable instrument package
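One way redundant sensors might support early detection is to compare simultaneous readings against their median and report any sensor that disagrees. The sketch below assumes Python; the function name `redundancy_check`, the sensor names, the readings, and the tolerance are all hypothetical, and a suitable tolerance would depend on the site and sensor type.

```python
from statistics import median


def redundancy_check(readings, tolerance):
    """Compare simultaneous readings from redundant sensors against their median.

    Returns (consensus, suspects), where suspects lists the sensor names whose
    reading differs from the median by more than `tolerance`.
    """
    consensus = median(readings.values())
    suspects = sorted(
        name for name, value in readings.items() if abs(value - consensus) > tolerance
    )
    return consensus, suspects


# Illustrative simultaneous air temperature readings (deg C) from three sensors
readings = {"sensor_a": 12.2, "sensor_b": 12.4, "sensor_c": 15.9}
consensus, suspects = redundancy_check(readings, tolerance=1.0)
```

With three sensors the median is robust to a single failing unit, which is the practical appeal of the "triple the sensor" ideal; a non-empty `suspects` list could trigger an automated alert.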


QUALITY CONTROL ON STREAMING DATA: POSSIBLE QUALITY CONTROL CHECKS IN NEAR REAL-TIME

• Timestamp integrity (Date/time)
  o Sequential, fixed intervals, i.e., checks for time step or frequency variation
• Range checks
  o Sensor specifications - identify impossible values; not unlikely ones
  o Seasonal/reasonable historic values
• Internal (plausibility) checks
  o E.g., TMAX-TMIN>0, snow depth>snow water equivalent
  o Consistency of derived values
• Variance checks
  o Sigma (standard deviation), Delta/step (difference of subsequent pairs), change in slope checks
  o e.g., outlier detections, indicator of sensor degradation
  o Sensitivity is specific to site and sensor type
• Persistence checks
  o Check for repeating values that may indicate sensor failure
  o E.g., freezing, sensor capacity issues
• Spatial checks
  o Use correlations with redundant or nearby sensors, e.g., check for sensor drift
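Several of the checks above can be sketched as small functions that return the indexes of records needing a qualifier flag. This is an illustrative Python sketch, not a reference implementation: the function names, the 15-minute interval, and every threshold are assumptions, and as noted above the sensitivity of such checks is specific to site and sensor type.

```python
from datetime import datetime, timedelta


def check_timestamps(times, interval):
    """Return indexes whose gap from the previous record is not exactly `interval`."""
    return [i for i in range(1, len(times)) if times[i] - times[i - 1] != interval]


def check_range(values, low, high):
    """Return indexes of values outside the physically possible range."""
    return [i for i, v in enumerate(values) if not low <= v <= high]


def check_step(values, max_delta):
    """Return indexes where the change from the previous value exceeds max_delta."""
    return [i for i in range(1, len(values)) if abs(values[i] - values[i - 1]) > max_delta]


def check_persistence(values, max_repeats):
    """Return indexes where a value has repeated more than max_repeats times in a row."""
    flagged, run = [], 1
    for i in range(1, len(values)):
        run = run + 1 if values[i] == values[i - 1] else 1
        if run > max_repeats:
            flagged.append(i)
    return flagged


# Illustrative air temperature series (deg C) at a nominal 15-minute interval;
# one record is missing between the third and fourth entries.
start = datetime(2013, 7, 23, 0, 0)
times = [start + timedelta(minutes=15 * k) for k in (0, 1, 2, 4, 5, 6, 7)]
temps = [12.1, 12.3, 99.9, 12.4, 12.4, 12.4, 12.4]

bad_time = check_timestamps(times, timedelta(minutes=15))
bad_range = check_range(temps, low=-40.0, high=60.0)
bad_step = check_step(temps, max_delta=5.0)
stuck = check_persistence(temps, max_repeats=3)
```

Returning indexes rather than deleting values keeps the raw series intact, so each failed check can be recorded as a flag for later manual review.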


QUALITY CONTROL ON STREAMING DATA: DATA QUALIFIERS (DATA FLAGS)

• Many vocabularies of data flags
• Good approach
  o Rich vocabulary of fine-grained flags for streaming data – intended to guide local review; site-specific flags
  o Simpler vocabulary of flags for “final” data for public consumption, e.g., ‘Accepted’, ‘Missing’, ‘Estimated’, ‘Suspicious’, estimate uncertainty
• Certain types of qualifiers may be better as data columns
  o Method shifts, sensor shifts
  o Place key documentation as close to data value as possible
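The two-vocabulary approach could be implemented as a lookup from fine-grained review flags to the simpler public flags. In this Python sketch the public flags come from the list above, while the review flag names (`RANGE_FAIL`, `STUCK`, and so on), the `method` column, and the sample records are hypothetical.

```python
# Hypothetical fine-grained flags used during local review of streaming data,
# mapped to the simpler public vocabulary.
PUBLIC_FLAG = {
    "OK": "Accepted",
    "NO_DATA": "Missing",
    "GAP_FILLED": "Estimated",
    "RANGE_FAIL": "Suspicious",
    "STEP_FAIL": "Suspicious",
    "STUCK": "Suspicious",
}


def publish_flag(review_flag):
    """Translate a review flag to its public flag; unknown flags become Suspicious."""
    return PUBLIC_FLAG.get(review_flag, "Suspicious")


# Each record keeps the value, its flag, and the method as separate columns
records = [
    {"value": 12.3, "review_flag": "OK", "method": "sensor A"},
    {"value": None, "review_flag": "NO_DATA", "method": "sensor A"},
    {"value": 99.9, "review_flag": "RANGE_FAIL", "method": "sensor B"},
]
published = [
    {"value": r["value"], "flag": publish_flag(r["review_flag"]), "method": r["method"]}
    for r in records
]
```

Carrying `method` as its own column, rather than encoding it in the flag, is one reading of the advice to keep method and sensor shifts in data columns close to the value.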

Image from Campbell et al., BioScience, In Press.


SENSOR DATA ARCHIVING

• Archiving strategies
  o create well documented data snapshots
  o assign unique, persistent identifiers
  o maintain data and metadata versioning
  o store data in text-based formats
• Partner with community supported archives
  o E.g., the LTER NIS, or federated archive initiatives such as DataONE
• Best practices
  o develop an archival data management plan
  o implement a sound data backup plan
  o archive raw data (but they do not need to be online)
  o make data publicly available that have appropriate QA/QC procedures applied
  o assign QC level to published data sets
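A documented, versioned, text-based snapshot might look like the following Python sketch, which serializes rows to CSV and builds a small manifest with a checksum. The function name `make_snapshot`, the manifest keys, and the sample rows are assumptions; the UUID is only a stand-in, since a community archive would normally assign its own persistent identifier.

```python
import csv
import hashlib
import io
import json
import uuid


def make_snapshot(rows, header, version):
    """Serialize rows to CSV text and build a manifest describing the snapshot."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    text = buffer.getvalue()
    manifest = {
        # Stand-in identifier; a real archive would typically assign its own
        "identifier": str(uuid.uuid4()),
        "version": version,
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "row_count": len(rows),
        "format": "text/csv",
    }
    return text, manifest


# Illustrative values only
header = ["timestamp", "airtemp_c", "flag"]
rows = [["2013-07-23T00:00", 12.1, "Accepted"], ["2013-07-23T00:15", 12.3, "Accepted"]]
text, manifest = make_snapshot(rows, header, version=1)
manifest_json = json.dumps(manifest, indent=2)
```

Storing the checksum beside the snapshot lets a later reader confirm that the archived text has not changed since that version was created.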


QUALITY CONTROL ON STREAMING DATA: QUALITY LEVELS

Quality control is performed at multiple levels
• Level 0 (Raw streaming data)
  o Raw data, no QC, no data qualifiers applied (data flags)
  o Preservation of original data streams is essential
• Level 1 (QC applied, qualifiers added)
  o Provisional level (near real-time preparation)
    • if released, provisional data must be labeled clearly
  o Published level (delayed release)
    • QC process is complete; data is unlikely to change
• Level 2 (Gap-filled or estimated data)
  o Involves interpretation – may be controversial
  o Desirable when generating summarized data, but transparency critical – flag estimated values
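Moving from Level 1 to Level 2 while keeping estimates transparent can be sketched as a gap-filling step that flags every value it creates. The Python below uses linear interpolation purely as an illustration; the choice of method, the function name `gap_fill`, and the sample series are assumptions, not a recommended estimation procedure.

```python
def gap_fill(values):
    """Linearly interpolate interior gaps (None) and flag every estimated value.

    Returns (filled_values, flags). Gaps at the start or end of the series
    are left as None and flagged 'Missing', since they cannot be interpolated.
    The input list is not modified.
    """
    filled = list(values)
    flags = ["Accepted" if v is not None else "Missing" for v in values]
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            filled[i] = values[left] + fraction * (values[right] - values[left])
            flags[i] = "Estimated"
    return filled, flags


level1 = [None, 10.0, None, None, 16.0, 17.0, None]   # illustrative Level 1 values
level2, level2_flags = gap_fill(level1)
```

Because the function returns a new list, the lower-level series stays untouched, in line with preserving original data streams.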


Streaming data management middleware

o Definition/Purpose
  • “Middleware” in conjunction with sensor networks is computer software that enables communication and management of data from field sensors to a client such as a database or a website
  • Purpose of middleware includes the collection, analysis, and visualization of data
  • Middleware is chained together into a scientific workflow
  • Examples:
    o Read, reformat, export of different data types or structures (input/output)
    o Automated QA/QC on data streams
    o Integration of field notes and documentation with the data
    o Archiving
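The idea of chaining middleware steps into a workflow can be sketched with Python generators, where each step consumes the output of the previous one. The step names (`parse`, `qc`, `annotate`), the "timestamp,value" line format, the range limits, and the sample data are all hypothetical.

```python
def parse(lines):
    """Read step: turn 'timestamp,value' text lines into records."""
    for line in lines:
        timestamp, raw = line.strip().split(",")
        yield {"timestamp": timestamp, "value": float(raw) if raw else None}


def qc(records, low, high):
    """QC step: attach a qualifier flag to each record."""
    for record in records:
        value = record["value"]
        if value is None:
            flag = "Missing"
        elif low <= value <= high:
            flag = "Accepted"
        else:
            flag = "Suspicious"
        yield {**record, "flag": flag}


def annotate(records, field_notes):
    """Documentation step: merge field notes keyed by timestamp."""
    for record in records:
        yield {**record, "note": field_notes.get(record["timestamp"], "")}


raw_lines = ["2013-07-23T00:00,12.1", "2013-07-23T00:15,", "2013-07-23T00:30,99.9"]
notes = {"2013-07-23T00:30": "sensor serviced"}

# Chain the steps; each one consumes the previous step's output
workflow = annotate(qc(parse(raw_lines), low=-40.0, high=60.0), notes)
results = list(workflow)
```

Generators process one record at a time, so the same chain could in principle run over a continuous stream rather than a fixed list.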


Streaming data management middleware

o Middleware/software – Proprietary
  • Campbell Scientific LoggerNet
    o functionality to set up and configure a network of loggers
    o tools to program, visualize, monitor, and publish data
  • Vista Engineering: Vista Data Vision (VDV)
    o tools to store and organize data from various loggers
    o visualization, alarming, reporting, and web publishing features
  • YSI EcoNet (for YSI monitoring instrumentation)
    o delivery of data from the field to the YSI web server
    o visualization, reports, alarms, and email notification tools
  • NexSens: iChart
    o Windows-based data acquisition software package
    o interfaces with popular products such as YSI, OTT, ISCO sensors


SENSOR DATA MANAGEMENT MIDDLEWARE OPEN SOURCE ENVIRONMENTS FOR STREAMING DATA

• Matlab GCE toolbox (Proprietary/limited open source)
  o GUI, visualization, metadata-based analysis, manages QA/QC rules and qualifiers, tracks provenance
• Open Source DataTurbine Initiative
  o Streaming data engine, receives data from various sources and sends to analysis and visualization tools, databases, etc.
• Kepler Project (open source)
  o GUI, reuse and share analytical components/workflows with other users, tracks provenance, integrates software components and data sources


SENSOR MANAGEMENT BEST PRACTICES WORKSHOP PARTICIPANTS

Don Henshaw (AND) - organizer
Corinna Gries (NTL) - organizer
Renee Brown (SEV)
Adam Kennedy (AND)
Richard Cary (CWT)
Mary Martin (HBR)
Christine Laney (UTEP, JRN)
Jennifer Morse (NWT)
Chris Jones (DataONE)
Branko Zdravkovic (Univ of Saskatchewan)
Scotty Strachan (Univ of Nevada-Reno)
Jordan Read (USGS) - vtc
Wade Sheldon (GCE) - vtc
