linked sensor data cube
Presentation at the SSN workshop at ISWC 2012 (best paper award)
A Linked Sensor Data Cube for a 100 year homogenised daily temperature dataset
CSIRO ICT CENTRE
Laurent Lefort
5th Semantic Sensor Network Workshop, 12 November 2012
Outline
• ACORN-SAT dataset
• Role of SSN ontology
• Role of RDF Data Cube vocabulary
• Integration of SSN and RDF Data Cube
• Lessons learned
• Conclusions
The ACORN-SAT dataset
• Released by the Australian Bureau of Meteorology (23 March 2012)
• Available at http://www.bom.gov.au/climate/change/acorn-sat/
• 112 stations in total - 60 from 1910 to 2011
• Homogenised (adjusted) daily temperatures
• Tabular format (1 file per time series/station)
The Linked Data version of ACORN-SAT
• Experimental version of ACORN-SAT data
  • Available at http://lab.environment.data.gov.au/
  • Developed for the Australian Bureau of Meteorology (BOM) by CSIRO in cooperation with the Australian Government Information Management Office (AGIMO)
  • Temperature (homogenised) plus Rainfall (not homogenised)
• First version presented at Australian GovHack Day
  • Alternative to tabular data
• Latest version uploaded to the LOD cloud
  • http://thedatahub.org/dataset/acorn-sat
Motivation: linked gov. agencies data in Australia
• Linked data (and well managed URIs) to build the bridges between the different agencies
• The current linked data pilot covers one agency (BoM) and one server, but applies solutions and schemes already in place in multi-agency, multi-service-provider contexts (e.g. UK)
• Thanks to AGIMO for helping us to set up http://lab.environment.data.gov.au/
SSN Ontology
• SSN-XG report http://www.w3.org/2005/Incubator/ssn/XGR-ssn/
• SSN Ontology http://purl.oclc.org/NET/ssnx/ssn
• Navigable documentation on the wiki (automatically derived) http://www.w3.org/2005/Incubator/ssn/wiki/SSN
SSN: deployed systems and observations
[Figure: SSN ontology modules (Skeleton, Device, Deployment, PlatformSite, System) with the classes ssn:System, ssn:DeploymentRelatedProcess, ssn:Deployment, ssn:Platform, ssn:Device, ssn:Sensor, ssn:SensingDevice, ssn:Property and ssn:Observation, and the properties onPlatform, hasSubsystem, hasDeployment, deploymentProcessPart, deployedSystem, deployedOnPlatform, attachedSystem, observes, inDeployment, observedBy and observedProperty]
Specific challenges for the SSN Ontology
• ACORN-SAT data derived from multiple stations with complex history
• Uses homogenisation algorithm to make adjustments to raw data
• “Metadata” used by the algorithm to identify “breakpoints” in time series
  – Site changes (moves, building or vegetation having an impact on the quality of observation), sensor (and sensor screen) changes, procedure changes (hours of observation)
• BoM station numbering system “somewhat confusing over time”
  • Desire to retain a single site number for upper-air observations at obs sites
  • Several numbering conventions have been used at one or more locations where an overlap occurs between an old (comparison) and new site:
    – Old site retains old number, new site opens with new number.
    – Old site switches to new number for the duration of the comparison, new site takes over old number from the start of its observations.
    – New site opens under new number then switches to old number after end of comparison.
Linked ACORN SAT deployment data with SSN
• Data describing the deployment history
  • Available in ACORN-SAT station catalogue (pdf)
  • Not available in tabular format distribution
• ACORN-SAT composite stations
  – System composed of one or several BoM stations
• BoM (Bureau of Meteorology) stations
  – System composed of one or several stations sharing the same code
• Textual description of significant events
• Data describing the detailed conditions of observations
  • Sensors
  • Screens
  • Automatic Weather Stations
  • Procedures e.g. hours of observation
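The composite-station structure above can be sketched as a handful of triples. This is a minimal illustration, not the published ACORN-SAT data: the `http://example.org/acorn/` namespace, the identifier patterns and the site names are assumptions, and only the SSN property names (hasSubSystem, hasDeployment, deployedOnPlatform) come from the ontology.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

SSN = "http://purl.oclc.org/NET/ssnx/ssn#"
EX = "http://example.org/acorn/"  # hypothetical namespace, not the real one


def composite_station(series: str, stations: Dict[str, str]) -> List[Triple]:
    """Describe a composite series as a system with one subsystem per station.

    `stations` maps a station code to the (made-up) site identifier
    the station is deployed on.
    """
    system = f"{EX}system/{series}"
    triples: List[Triple] = []
    for code, site in stations.items():
        station = f"{EX}station/{code}"
        deployment = f"{EX}deployment/{series}-{code}"
        triples += [
            (system, SSN + "hasSubSystem", station),
            (station, SSN + "hasDeployment", deployment),
            (deployment, SSN + "deployedOnPlatform", f"{EX}site/{site}"),
        ]
    return triples


def objects(triples: List[Triple], subject: str, predicate: str) -> Set[str]:
    """Return all objects of the triples matching subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def sites_of(triples: List[Triple], system: str) -> Set[str]:
    """Follow hasSubSystem, hasDeployment and deployedOnPlatform from a system."""
    sites: Set[str] = set()
    for station in objects(triples, system, SSN + "hasSubSystem"):
        for deployment in objects(triples, station, SSN + "hasDeployment"):
            sites |= objects(triples, deployment, SSN + "deployedOnPlatform")
    return sites


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```

Calling `sites_of` on the composite system returns every site reached through its BoM stations, which is the navigation path the deployment history needs.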
Example (Darwin)
Time series – Weather stations – Sites – (Sensors)
Darwin Post Office 014016 (1910-1942)
Darwin Airport 014015 (1941-2007 & 2001-now)
2 sites – 1km apart – same code used
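The Darwin periods can be turned into a small lookup that answers "which station was contributing in a given year". A minimal sketch: the `Deployment` record and `active_deployments` function are illustrative names, and only the station names, codes and years are taken from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    """One period during which a BoM station contributes to a series."""
    station_name: str
    station_code: str
    start_year: int
    end_year: Optional[int]  # None means "still operating"


# Periods copied from the Darwin slide; the two airport periods are the
# two sites that share code 014015.
DARWIN = [
    Deployment("Darwin Post Office", "014016", 1910, 1942),
    Deployment("Darwin Airport", "014015", 1941, 2007),
    Deployment("Darwin Airport", "014015", 2001, None),
]


def active_deployments(series: List[Deployment], year: int) -> List[Deployment]:
    """Return every deployment whose period covers the given year."""
    return [
        d for d in series
        if d.start_year <= year and (d.end_year is None or year <= d.end_year)
    ]


if __name__ == "__main__":
    for year in (1920, 1941, 2005, 2012):
        codes = [d.station_code for d in active_deployments(DARWIN, year)]
        print(year, codes)
```

Years inside an overlap (1941-1942 and 2001-2007) return two deployments, which is why the station code alone cannot identify the site.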
Deployment phases in Darwin
RDF Data Cube http://purl.org/linked-data/cube
• RDF Data Cube: a method to organise linked data in slices
  • A vocabulary published by the W3C Government Linked Data (GLD) Working Group (Working Draft)
  • Also the method used to publish statistics data and environmental data in Europe e.g. for Bathing Water Quality in UK http://www.epimorphics.com/web/projects/bathing-water-quality
• Advantages
  • Allows multiple views on the same data (similar to OLAP)
  • Generic approach which supports the links to domain-specific definitions
• Useable:
  • In any browser via Linked Data API (HTML output)
  • In JavaScript via Linked Data API (JSON output)
  • In R via SPARQL
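As an illustration of the SPARQL route, the sketch below builds a query listing the observations attached to a slice and wraps it in a SPARQL Protocol GET URL. Python stands in for R here. The endpoint address and slice URI passed in are assumptions supplied by the caller; only the qb: namespace and the qb:observation property come from the vocabulary.

```python
from urllib.parse import urlencode

QB = "http://purl.org/linked-data/cube#"


def observations_query(slice_uri: str, limit: int = 10) -> str:
    """Build a SPARQL query listing the observations attached to a slice."""
    return (
        f"PREFIX qb: <{QB}>\n"
        "SELECT ?obs WHERE {\n"
        f"  <{slice_uri}> qb:observation ?obs .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL; the query goes in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Both URIs below are placeholders, not real services.
    q = observations_query("http://example.org/slice/station/014015")
    print(sparql_get_url("http://example.org/sparql", q))
```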
From: The RDF Data Cube Vocabulary, W3C Working Draft 05 April 2012, http://www.w3.org/TR/vocab-data-cube/
Data cube, slice and observation
[Figure: a data cube with dimensions d1 to d7, measures m1, m2, … and attributes a1, a2, …, highlighting the Cube, a Slice and an Observation]
QB: Dataset, Slice, Observation
[Figure: "Cube and Slice" module with the classes qb:Dataset, qb:Slice and qb:Observation and the properties slice, subslice and observation]
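The dataset, slice and observation links can be sketched as plain triples. This is an illustration under assumptions: the `http://example.org/cube/` namespace and the identifiers are made up, while qb:slice, qb:observation and qb:dataSet are the property names defined by the RDF Data Cube vocabulary.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

QB = "http://purl.org/linked-data/cube#"
EX = "http://example.org/cube/"  # hypothetical namespace


def link_cube(dataset: str, slices: Dict[str, List[str]]) -> List[Triple]:
    """Link a dataset to its slices and each slice to its observations.

    `slices` maps a slice identifier to the identifiers of its observations.
    """
    triples: List[Triple] = []
    for slice_id, observation_ids in slices.items():
        slice_uri = f"{EX}slice/{slice_id}"
        triples.append((dataset, QB + "slice", slice_uri))
        for obs_id in observation_ids:
            obs_uri = f"{EX}observation/{obs_id}"
            # The slice points at the observation ...
            triples.append((slice_uri, QB + "observation", obs_uri))
            # ... and the observation points back at the whole cube.
            triples.append((obs_uri, QB + "dataSet", dataset))
    return triples
```

An observation can be listed under several slices, which is how the same data gets multiple access paths.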
Data Cube Structure: dimensions, measure, attributes
[Figure: cube structure. Each Observation carries MinTemperature, MaxTemperature, Rainfall and Booleans for missing data; the numbered dimensions are (1) ACORN-SAT Series/System (station), (2) Year and (3) Month, followed by Day]
Current Data Cube structure (and URI/API logic)
• Stations/time series
• Year
• Month
• All linking to observations
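The station, year and month drill-down can be mimicked in memory with nested dictionaries. A minimal sketch: the `DailyObservation` record, the function names and any values fed to it are illustrative assumptions, not the published schema or real ACORN-SAT readings.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DailyObservation:
    station: str
    day: date
    min_temp: Optional[float]  # None stands for a missing value
    max_temp: Optional[float]
    rainfall: Optional[float]

    @property
    def missing_min(self) -> bool:
        """Boolean flag for a missing minimum temperature."""
        return self.min_temp is None


Cube = Dict[str, Dict[int, Dict[int, List[DailyObservation]]]]


def build_cube(observations: List[DailyObservation]) -> Cube:
    """Index observations by station, then year, then month."""
    cube: Cube = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for obs in observations:
        cube[obs.station][obs.day.year][obs.day.month].append(obs)
    return cube


def month_slice(cube: Cube, station: str, year: int, month: int) -> List[DailyObservation]:
    """Return the observations of one month without creating empty entries."""
    return cube.get(station, {}).get(year, {}).get(month, [])


def year_slice(cube: Cube, station: str, year: int) -> List[DailyObservation]:
    """Return the observations of one year, month by month."""
    months = cube.get(station, {}).get(year, {})
    return [obs for month in sorted(months) for obs in months[month]]
```

Each level of the index corresponds to one slice level, and every level ends up pointing at the same observation objects.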
Slices and URI scheme
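A nested slice URI scheme of this kind can be sketched with a small builder. Only the station list endpoint and the station item endpoint (with code 014015) appear in this deck; the deeper year and month segments and the format suffix are assumptions made for illustration.

```python
from typing import Optional

# Station list endpoint, as quoted in the deck.
BASE = "http://lab.environment.data.gov.au/data/acorn/climate/slice/station"


def slice_url(
    station: Optional[str] = None,
    year: Optional[int] = None,
    month: Optional[int] = None,
    fmt: Optional[str] = None,
) -> str:
    """Build a list endpoint (no station) or a nested item endpoint.

    The year and month path segments and the `.fmt` suffix are hypothetical.
    """
    if year is not None and station is None:
        raise ValueError("a year needs a station")
    if month is not None and year is None:
        raise ValueError("a month needs a year")
    parts = [BASE]
    if station is not None:
        parts.append(station)
    if year is not None:
        parts.append(f"{year:04d}")
    if month is not None:
        parts.append(f"{month:02d}")
    url = "/".join(parts)
    return f"{url}.{fmt}" if fmt else url


if __name__ == "__main__":
    print(slice_url())
    print(slice_url("014015"))
    print(slice_url("014015", 2011, 3))
```

Keeping each level as a path segment means a list endpoint is always the parent path of its item endpoints.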
Coupling SSN and RDF Data Cube
[Figure: coupling of the SSN ontology and the RDF Data Cube vocabulary in the ACORN-SAT linked data. Developed vocabularies: acorn-system (acorn-system:System), bom-station (bom-station:System, bom-station:Station), acorn-deploy (acorn-deploy:Deployment, acorn-deploy:PreDeployment, acorn-deploy:PostDeployment, acorn-deploy:StandaloneOperation), acorn-site (acorn-site:Site), acorn-series (acorn-series:TimeSeries), acorn-sat (acorn-sat:Observation), etccdi, raindist (raindist:RainfallDistrict) and rainstate (rainstate:RainfallState). Reused vocabularies: SSN (Skeleton, Device, Deployment, PlatformSite and System modules, ssn:Observation), RDF Data Cube (qb:Dataset, qb:Slice, qb:Observation, qb:DataStructureDefinition, qb:ComponentSpecification, qb:ComponentProperty), OWL Time (time:Interval, Instant), Intervals (interval:CalendarInterval), GeoNames (gn:Feature), VoID (void:Dataset) and SKOS (skos:Concept). Linking properties shown: hasSubsystem, deploymentProcessPart, deployedOnPlatform, currentSite, observedBy, locatedIn, parentFeature, parentCountry, parentADM1, parentADM2, structure, component, componentProperty and concept]
Access to data with Elda via http://lab.environment.data.gov.au/
[Screenshot: Elda HTML view with callouts for ssn:hasSubSystem, ssn:hasDeployment, ssn:deploymentProcessPart and ssn:observedBy]
Mashups
• Display the station locations and their average temperature readings on a map
  • http://lab.environment.data.gov.au/mashup/drilldown
• Select a date range for climate readings for a given location
  • http://lab.environment.data.gov.au/mashup
Lessons learned
• Flexible URI scheme
  • ELDA-friendly, UK-style: using nested list endpoints and item endpoints
    – http://lab.environment.data.gov.au/data/acorn/climate/slice/station
    – http://lab.environment.data.gov.au/data/acorn/climate/slice/station/014015
  • Extra slice(s) easy to add to allow multiple access to the same observations
• RDF Data Cube vocabulary (QB)
  • Some clarifications needed for qb:structure, qb:sliceKey, qb:sliceStructure, qb:component and qb:componentAttachment properties e.g. through the publication of validation rules
• Coupling of SSN ontology and RDF Data Cube vocabulary
  • Different ecosystems (OWL vs. RDF/RDFS)
    – OK for RDF Data Cube, not OK for other reused vocabularies e.g. UK Intervals (Jena Eyeball used for validation)
  • Observed properties are classes in the SSN ontology and properties in the RDF Data Cube
    – Possibility to reuse/extend the qb:concept properties defined to manage references to skos:Concept in QB
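The last point, bridging an observed property that is a class in SSN and a property in QB, could look roughly like the sketch below. It is one possible reading, not the deck's actual modelling: the `http://example.org/def/` namespace, the resource names and the `observedPropertyClass` link are hypothetical, while qb:MeasureProperty, qb:concept and skos:Concept are real vocabulary terms.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
QB = "http://purl.org/linked-data/cube#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/def/"  # hypothetical namespace


def bridge(measure: str, ssn_property_class: str, concept: str) -> List[Triple]:
    """Tie a QB measure property to an SSN property class via a shared concept."""
    return [
        (measure, RDF + "type", QB + "MeasureProperty"),
        (measure, QB + "concept", concept),
        (concept, RDF + "type", SKOS + "Concept"),
        # Hypothetical extension property: not defined by QB, SKOS or SSN.
        (concept, EX + "observedPropertyClass", ssn_property_class),
    ]


def concept_of(triples: List[Triple], measure: str) -> Optional[str]:
    """Return the concept a measure property refers to, if any."""
    for s, p, o in triples:
        if s == measure and p == QB + "concept":
            return o
    return None
```

The shared skos:Concept acts as the meeting point, so neither vocabulary has to treat the same resource as both a class and a property.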
Conclusions
• Approach is applicable to all climate time series
• Several climate-specific issues not addressed
  • Transparency/reproducibility of homogenisation process
    – Requires raw data plus extra (meta)data (sensors, screen types, sensor exposure, “qualified” observed properties during a specific observation interval), plus data used/generated during homogenisation algorithm (ACORN-SAT uses different values for different value distribution percentiles)
    – More ontology work needed (compared to SSN) on homogenisation algorithm parameters, types of breakpoints and types of adjustment lookup table
• Opportunities to link to other datasets (Australia, World)
  • Geo-features (e.g. GeoNames - done) for weather station sites, districts
  • Other climate data e.g. regional and world climate data archives, cyclone tracks (not yet available as linked data)
  • Other environmental data (not yet available as linked data)
Laurent Lefort
Ontologist
t +61 2 9123 4567
e [email protected]
ict.csiro.au
CSIRO ICT CENTRE
Thank you
Image credits
• Blair Trewin: The ACORN-SAT station at Butlers Gorge in central Tasmania (surfacetemperatures.blogspot.com.au)
Reused ontologies
Ontology | Short Description | URL
DOLCE Ultra Lite (DUL) | A lightweight foundational ontology for modeling either physical or social contexts | http://www.loa-cnr.it/ontologies/DUL.owl
Semantic Sensor Network | An ontology for the description of sensors and observations, and related concepts. | http://purl.oclc.org/NET/ssnx/ssn
RDF Data Cube | A vocabulary for the publication of multi-dimensional data as linked data | http://purl.org/linked-data/cube
OWL Time | An ontology of temporal concepts | http://www.w3.org/2006/time
Intervals | A vocabulary (and URI scheme) for the definition of instants and intervals. | http://reference.data.gov.uk/def/intervals
WGS84_Pos | A vocabulary for representing latitude, longitude and altitude information in the WGS84 geodetic reference datum | http://www.w3.org/2003/01/geo/wgs84_pos
GeoNames | An ontology for the description of geographical features, their characteristics and relationships | http://www.geonames.org/ontology/ontology_v3.01.rdf
VoID (Vocabulary of Interlinked Datasets) | A vocabulary for expressing metadata about RDF datasets | http://vocab.deri.ie/void
Developed ontologies
Ontology | Short Description | URL
ETCCDI | Indicators defined by the joint CCl/CLIVAR/JCOMM Expert Team on Climate Change Detection and Indices | http://purl.oclc.org/NET/ssnx/etccdi
Rainfall districts and states | Geographical areas defined as part of the Bureau's numbering system for observation sites | http://lab.environment.data.gov.au/def/stations/raindist …/rainstate
BoM Station | Definition for the weather stations registered in the Bureau’s Weather Station Directory | http://lab.environment.data.gov.au/def/stations/station
Surface Air Temperature | ACORN-SAT observation (temperature, rainfall) for one day | http://lab.environment.data.gov.au/def/acorn/sat
Time Series | Time series data defined as data cube slices (aggregated at different levels) | http://lab.environment.data.gov.au/def/acorn/time-series
ACORN-SAT deployment | Phases and sub-phases recorded in the ACORN-SAT documentation pack | http://lab.environment.data.gov.au/def/acorn/deployment
ACORN-SAT system | The sensing asset used for a deployment phase (or sub-phase) | http://lab.environment.data.gov.au/def/acorn/system
ACORN-SAT site | The site used for a deployment phase (or sub-phase) | http://lab.environment.data.gov.au/def/acorn/site
RDF Data Cube (qb:ComponentAttachment)
Reference to skos:Concept