linked sensor data cube

A Linked Sensor Data Cube for a 100 year homogenised daily temperature dataset CSIRO ICT CENTRE Laurent Lefort 5th Semantic Sensor Network Workshop, 12 November 2012

DESCRIPTION

Presentation at the SSN workshop at ISWC 2012 (best paper award)

TRANSCRIPT

Page 1: Linked Sensor Data cube

A Linked Sensor Data Cube for a 100 year homogenised daily temperature dataset

CSIRO ICT CENTRE

Laurent Lefort

5th Semantic Sensor Network Workshop, 12 November 2012

Page 2: Linked Sensor Data cube

Outline

• ACORN-SAT dataset
• Role of SSN ontology
• Role of RDF Data Cube vocabulary
• Integration of SSN and RDF Data Cube
• Lessons learned
• Conclusions

Page 3: Linked Sensor Data cube

The ACORN-SAT dataset

• Released by the Australian Bureau of Meteorology (23 March 2012)
• Available at http://www.bom.gov.au/climate/change/acorn-sat/
• 112 stations in total, 60 with data from 1910 to 2011
• Homogenised (adjusted) daily temperatures
• Tabular format (1 file per time series/station)

Page 4: Linked Sensor Data cube

The Linked Data version of ACORN-SAT

• Experimental version of ACORN-SAT data
• Available at http://lab.environment.data.gov.au/
• Developed for the Australian Bureau of Meteorology (BoM) by CSIRO in cooperation with the Australian Government Information Management Office (AGIMO)
• Temperature (homogenised) plus Rainfall (not homogenised)
• First version presented at Australian GovHack Day
• Alternative to tabular data
• Latest version uploaded to the LOD cloud
• http://thedatahub.org/dataset/acorn-sat

Page 5: Linked Sensor Data cube

Motivation: linked gov. agencies data in Australia

• Linked data (and well managed URIs) to build the bridges between the different agencies

• Current linked data pilot covers one agency (BoM) and one server, but applies solutions and schemes already in place in multi-agency, multi-service-provider contexts (e.g. UK)

• Thanks to AGIMO for helping us to set up http://lab.environment.data.gov.au/

Page 6: Linked Sensor Data cube

SSN Ontology

• SSN-XG report: http://www.w3.org/2005/Incubator/ssn/XGR-ssn/
• SSN Ontology: http://purl.oclc.org/NET/ssnx/ssn
• Navigable documentation on the wiki (automatically derived): http://www.w3.org/2005/Incubator/ssn/wiki/SSN

Page 7: Linked Sensor Data cube

SSN: deployed systems and observations

[Diagram: SSN ontology modules (Skeleton, Device, Deployment, PlatformSite, System). Classes shown: ssn:System, ssn:Deployment, ssn:DeploymentRelatedProcess, ssn:Platform, ssn:Device, ssn:Sensor, ssn:SensingDevice, ssn:Property, ssn:Observation. Properties shown: onPlatform, hasSubsystem, hasDeployment, deploymentProcessPart, deployedSystem, deployedOnPlatform, attachedSystem, observes, inDeployment, observedBy, observedProperty.]

Page 8: Linked Sensor Data cube

Specific challenges for the SSN Ontology

• ACORN-SAT data derived from multiple stations with complex history
• Uses homogenisation algorithm to make adjustments to raw data
• “Metadata” used by the algorithm to identify “breakpoints” in time series
– Site changes (moves, building or vegetation having an impact on the quality of observation), sensor (and sensor screens) changes, procedure changes (hours of observations)
• BoM station numbering system “somewhat confusing over time”
• Desire to retain a single site number for upper-air observations at obs sites
• Several numbering conventions have been used at one or more locations where an overlap occurs between an old (comparison) and new site:
– Old site retains old number, new site opens with new number.
– Old site switches to new number for the duration of the comparison, new site takes over old number from the start of its observations.
– New site opens under new number then switches to old number after end of comparison.

Page 9: Linked Sensor Data cube

Linked ACORN SAT deployment data with SSN

• Data describing the deployment history
• Available in ACORN-SAT station catalogue (pdf)
• Not available in tabular format distribution
• ACORN-SAT composite stations
– System composed of one or several BoM stations
• BoM (Bureau of Meteorology) stations
– System composed of one or several stations sharing the same code
• Textual description of significant events
• Data describing the detailed conditions of observations
• Sensors
• Screens
• Automatic Weather stations
• Procedures e.g. hours of observation

Page 10: Linked Sensor Data cube

Example (Darwin)

Time series – Weather stations – Sites – (Sensors)

Darwin Post Office 014016 (1910-1942)

Darwin Airport 014015 (1941-2007 & 2001-now)

2 sites – 1km apart – same code used
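The Darwin example can be sketched as a small data model, to show why a station code alone does not identify a site. This is only an illustration in plain Python, not the SSN-based model used in the dataset: the class names and the site labels ("Post Office site", "Airport site A", "Airport site B") are hypothetical, while the station codes and years come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SitePeriod:
    """One physical site used by a BoM station, with its years of operation."""
    label: str                # hypothetical label, for illustration only
    start_year: int
    end_year: Optional[int]   # None means the site is still operating

    def active_in(self, year: int) -> bool:
        return self.start_year <= year and (
            self.end_year is None or year <= self.end_year
        )


@dataclass(frozen=True)
class BomStation:
    """A BoM station: one code that may cover several sites over time."""
    code: str
    name: str
    sites: List[SitePeriod]


@dataclass(frozen=True)
class AcornSeries:
    """An ACORN-SAT composite series built from one or several BoM stations."""
    name: str
    stations: List[BomStation]

    def active_sites(self, year: int) -> List[Tuple[str, str]]:
        """Return (station code, site label) pairs operating in the given year."""
        return [
            (station.code, site.label)
            for station in self.stations
            for site in station.sites
            if site.active_in(year)
        ]


# Codes and years as given on the slide; site labels are made up.
darwin = AcornSeries(
    name="Darwin",
    stations=[
        BomStation("014016", "Darwin Post Office",
                   [SitePeriod("Post Office site", 1910, 1942)]),
        BomStation("014015", "Darwin Airport",
                   [SitePeriod("Airport site A", 1941, 2007),
                    SitePeriod("Airport site B", 2001, None)]),
    ],
)

# In 2005 the single code 014015 covers two sites about 1 km apart.
print(darwin.active_sites(2005))
```

During the overlap years a lookup by code returns more than one site, which is the kind of ambiguity the deployment description has to resolve.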

Page 11: Linked Sensor Data cube

Deployment phases in Darwin

Page 12: Linked Sensor Data cube

RDF Data cube http://purl.org/linked-data/cube

• RDF Data Cube: a method to organise linked data in slices
• A vocabulary published by the W3C Government Linked Data (GLD) Working Group (Working Draft)
• Also the method used to publish statistical data and environmental data in Europe e.g. for Bathing Water Quality in UK http://www.epimorphics.com/web/projects/bathing-water-quality
• Advantages
• Allows multiple views on the same data (similar to OLAP)
• Generic approach which supports the links to domain-specific definitions
• Usable:
• In any browser via Linked Data API (HTML output)
• In JavaScript via Linked Data API (JSON output)
• In R via SPARQL

Page 13: Linked Sensor Data cube

From: The RDF Data Cube Vocabulary, W3C Working Draft 05 April 2012, http://www.w3.org/TR/vocab-data-cube/

Page 14: Linked Sensor Data cube

Data cube, slice and observation

[Figure: a data cube with dimensions d1 to d7, measures (m1, m2, …) and attributes (a1, a2, …), illustrating the Cube, a Slice, and a single Observation.]

Page 15: Linked Sensor Data cube

QB: Dataset, Slice, Observation

[Diagram, "Cube and Slice" module: classes qb:DataSet, qb:Slice and qb:Observation, with links labelled slice, subslice, observation and "Cube observation".]

Page 16: Linked Sensor Data cube

Data Cube Structure: dimensions, measure, attributes

[Diagram: each Observation carries MinTemperature, MaxTemperature, Rainfall and Booleans for missing data, and is reached through the dimensions (1) ACORN-SAT Series/System (station), (2) Year, (3) Month, and Day.]

Current Data Cube structure (and URI/API logic)
• Stations/time series
• Year
• Month
• All linking to observations
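The station, year and month levels above can be pictured as nested groupings that all point at the same daily observations. The sketch below is a plain-Python illustration of that idea, not the RDF Data Cube encoding itself: the `Observation` tuple, the `build_slices` helper and the temperature values are all made up for the example, and `None` stands in for the "missing data" booleans.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Observation(NamedTuple):
    """One daily observation; None marks a missing value."""
    station: str
    day: date
    min_temp: Optional[float]
    max_temp: Optional[float]


def build_slices(observations: Iterable[Observation]) -> Dict[Tuple, List[Observation]]:
    """Index each observation under its station, station/year and
    station/year/month keys, so every level links to the observations."""
    slices: Dict[Tuple, List[Observation]] = defaultdict(list)
    for obs in observations:
        slices[(obs.station,)].append(obs)
        slices[(obs.station, obs.day.year)].append(obs)
        slices[(obs.station, obs.day.year, obs.day.month)].append(obs)
    return dict(slices)


# Made-up values, for illustration only.
observations = [
    Observation("014015", date(2010, 12, 31), 26.0, 31.0),
    Observation("014015", date(2011, 1, 1), 25.0, 32.5),
    Observation("014015", date(2011, 1, 2), 24.5, None),
    Observation("014015", date(2011, 2, 1), 25.5, 33.0),
]

slices = build_slices(observations)
print(len(slices[("014015",)]))          # station level
print(len(slices[("014015", 2011)]))     # year level
print(len(slices[("014015", 2011, 1)]))  # month level
```

The same observation objects appear at each level, which mirrors the idea that a slice is a view on the cube rather than a copy of the data.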

Page 17: Linked Sensor Data cube

Slices and URI scheme

Page 18: Linked Sensor Data cube

Coupling SSN and RDF Data Cube

Page 19: Linked Sensor Data cube

[Diagram: overview of how the ACORN-SAT ontologies couple SSN and the RDF Data Cube. Recoverable labels:
• ACORN-SAT and BoM modules: acorn-system (acorn-system:System), bom-station (bom-station:System, bom-station:Station), acorn-deploy (acorn-deploy:Deployment, acorn-deploy:PreDeployment, acorn-deploy:PostDeployment, acorn-deploy:StandaloneOperation), acorn-site (acorn-site:Site), acorn-series (acorn-series:TimeSeries, acorn-series:xxx), acorn-sat (acorn-sat:Observation, acorn-sat:xxx), etccdi (etccdi:xxx), raindist (raindist:RainfallDistrict), rainstate (rainstate:RainfallState)
• Time and place: OWL Time (time:Interval (Instant)), Intervals (interval:CalendarInterval (Instant)), geonames (gn:Feature)
• SSN modules (Skeleton, Device, Deployment, PlatformSite, System): ssn:System, ssn:Deployment, ssn:DeploymentRelatedProcess, ssn:Platform, ssn:Device, ssn:Sensor, ssn:SensingDevice, ssn:Property, ssn:Observation
• RDF Data Cube: Cube and Slice (qb:DataSet, qb:Slice, qb:Observation), DSD (qb:DataStructureDefinition, qb:ComponentSpecification, qb:ComponentProperty), Dataset (void:Dataset), Concepts (skos:Concept)
• Link labels: hasSubsystem, hasDeployment, deploymentProcessPart, deployedSystem, deployedOnPlatform, onPlatform, attachedSystem, currentSite, observes, inDeployment, observedBy, observedProperty, parentFeature, parentCountry, parentADM1, parentADM2, locatedIn, slice, observation, "Cube observation", structure, component, componentProperty, concept]

Page 20: Linked Sensor Data cube

Access to data with Elda via http://lab.environment.data.gov.au/

[Screenshot labels: ssn:hasSubSystem, ssn:hasDeployment, ssn:deploymentProcessPart, ssn:observedBy]

Page 21: Linked Sensor Data cube

Mashups

• Display the station locations and their average temperature readings on a map
• http://lab.environment.data.gov.au/mashup/drilldown
• Select a Date range for climate readings for a given location
• http://lab.environment.data.gov.au/mashup

Page 22: Linked Sensor Data cube

Lessons learned

• Flexible URI scheme
• ELDA-friendly, UK-style: using nested list endpoints and item endpoints
– http://lab.environment.data.gov.au/data/acorn/climate/slice/station
– http://lab.environment.data.gov.au/data/acorn/climate/slice/station/014015
• Extra slice(s) easy to add to allow multiple access to the same observations
• RDF Data Cube vocabulary (QB)
• Some clarifications needed for qb:structure, qb:sliceKey, qb:sliceStructure, qb:component and qb:componentAttachment properties e.g. through the publication of validation rules
• Coupling of SSN ontology and RDF Data Cube vocabulary
• Different ecosystems (OWL vs. RDF/RDFS)
– OK for RDF Data Cube, not OK for other reused vocabularies e.g. UK Intervals (Jena Eyeball used for validation)
• Observed properties are classes in the SSN ontology and properties in the RDF Data Cube
– Possibility to reuse/extend the qb:concept properties defined to manage references to skos:Concept in QB
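The nested list and item endpoints in the URI scheme can be illustrated with a small helper. This is a sketch only: the base URI and the two resulting URIs are the ones quoted on the slide, while the function names `list_endpoint` and `item_endpoint` are hypothetical and not part of Elda or of the published service.

```python
# Base URI taken from the two example URIs quoted on the slide.
BASE = "http://lab.environment.data.gov.au/data/acorn/climate/slice"


def list_endpoint(dimension: str) -> str:
    """URI of the list endpoint enumerating all slices for one dimension."""
    return f"{BASE}/{dimension}"


def item_endpoint(dimension: str, key: str) -> str:
    """URI of the item endpoint for one slice, nested under its list endpoint."""
    if not key:
        raise ValueError("an item endpoint needs a non-empty key")
    return f"{list_endpoint(dimension)}/{key}"


print(list_endpoint("station"))
print(item_endpoint("station", "014015"))
```

Because each item URI is its list URI plus one more path segment, adding an extra slice only requires one more dimension name under the same base.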

Page 23: Linked Sensor Data cube

Conclusions

• Approach is applicable to all climate time series
• Several climate-specific issues not addressed
• Transparency/reproducibility of homogenisation process
– Requires raw data plus extra (meta)data (sensors, screen types, sensor exposure, “qualified” observed properties during a specific observation interval), plus data used/generated during the homogenisation algorithm (ACORN-SAT uses different values for different value distribution percentiles)
– More ontology work needed (compared to SSN) on homogenisation algorithm parameters, types of breakpoints and types of adjustment lookup tables
• Opportunities to link to other datasets (Australia, World)
• Geo-features (e.g. GeoNames - done) for weather station sites, districts
• Other climate data e.g. regional and world climate data archives, cyclone tracks (not yet available as linked data)
• Other environmental data (not yet available as linked data)

Page 24: Linked Sensor Data cube

Laurent Lefort
Ontologist
t +61 2 9123 4567
e [email protected]
ict.csiro.au

CSIRO ICT CENTRE

Thank you

Page 25: Linked Sensor Data cube

Images credits

• Blair Trewin: The ACORN-SAT station at Butlers Gorge in central Tasmania (surfacetemperatures.blogspot.com.au)

Page 26: Linked Sensor Data cube

Reused ontologies

Ontology | Short description | URL
DOLCE Ultra Lite (DUL) | A lightweight foundational ontology for modeling either physical or social contexts | http://www.loa-cnr.it/ontologies/DUL.owl
Semantic Sensor Network | An ontology for the description of sensors and observations, and related concepts. | http://purl.oclc.org/NET/ssnx/ssn
RDF Data Cube | A vocabulary for the publication of multi-dimensional data as linked data | http://purl.org/linked-data/cube
OWL Time | An ontology of temporal concepts | http://www.w3.org/2006/time
Intervals | A vocabulary (and URI scheme) for the definition of instants and intervals. | http://reference.data.gov.uk/def/intervals
WGS84_Pos | A vocabulary for representing latitude, longitude and altitude information in the WGS84 geodetic reference datum | http://www.w3.org/2003/01/geo/wgs84_pos
GeoNames | An ontology for the description of geographical features, their characteristics and relationships | http://www.geonames.org/ontology/ontology_v3.01.rdf
VoID (Vocabulary of Interlinked Datasets) | A vocabulary for expressing metadata about RDF datasets | http://vocab.deri.ie/void

Page 27: Linked Sensor Data cube

Developed ontologies

Ontology | Short description | URL
ETCCDI | Indicators defined by the joint CCl/CLIVAR/JCOMM Expert Team on Climate Change Detection and Indices | http://purl.oclc.org/NET/ssnx/etccdi
Rainfall districts and states | Geographical areas defined as part of the Bureau's numbering system for observation sites | http://lab.environment.data.gov.au/def/stations/raindist …/rainstate
BoM Station | Definition for the weather stations registered in the Bureau’s Weather Station Directory | http://lab.environment.data.gov.au/def/stations/station
Surface Air Temperature | ACORN-SAT observation (temperature, rainfall) for one day | http://lab.environment.data.gov.au/def/acorn/sat
Time Series | Time series data defined as data cube slices (aggregated at different levels) | http://lab.environment.data.gov.au/def/acorn/time-series
ACORN-SAT deployment | Phases and sub-phases recorded in the ACORN-SAT documentation pack | http://lab.environment.data.gov.au/def/acorn/deployment
ACORN-SAT system | The sensing asset used for a deployment phase (or sub-phase) | http://lab.environment.data.gov.au/def/acorn/system
ACORN-SAT site | The site used for a deployment phase (or sub-phase) | http://lab.environment.data.gov.au/def/acorn/site

Page 28: Linked Sensor Data cube

RDF Data Cube (qb:componentAttachment)

Page 29: Linked Sensor Data cube

Reference to skos:Concept
