To Boldly Go

PC-Axis Reference Group, Copenhagen, 2014

Central Statistics Office, Cork, Ireland

Kevin Healy, [email protected], (00 353 21) 453 5719

Eoin MacCuirc, [email protected], (00 353 21) 453 5504


Post on 17-Dec-2015



Linked Open Data

The Tower of Babel

“If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them. Come, let us go down and confuse their language so they will not understand each other.”

Tim Berners-Lee – Founder of the Web

“In an extreme view, the world can be seen as only connections, nothing else. We think of a dictionary as the repository of meaning, but it defines words only in terms of other words. I liked the idea that a piece of information is really defined only by what it's related to, and how it's related. There really is little else to meaning. The structure is everything. There are billions of neurons in our brains, but what are neurons? Just cells. The brain has no knowledge until connections are made between neurons. All that we know, all that we are, comes from the way our neurons are connected.”

How open is the data? - Linked Open Data star scheme

Tim Berners-Lee suggested a 5-star deployment scheme for Linked Open Data and Ed Summers provided a nice rendering of it. In the following, examples are given for each level. The example data used throughout is 'the temperature forecast for Galway, Ireland for the next 3 days':

★ make your stuff available on the Web (whatever format) under an open license

★★ make it available as structured data (e.g., Excel instead of image scan of a table)

★★★ use non-proprietary formats (e.g., CSV instead of Excel)

★★★★ use URIs to identify things, so that people can point at your stuff

★★★★★ link your data to other data to provide context

http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
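
As a rough illustration of the step from three to five stars, the sketch below takes a forecast as CSV (three stars), mints a URI for each forecast day (four stars), and adds one link to an external dataset (five stars). The `example.org` namespaces, the property names, the dates and temperatures, and the choice of DBpedia as the link target are all assumptions made for illustration; none of them come from the star-scheme examples themselves.

```python
import csv
import io

# Three stars: the forecast as plain CSV. All values are made-up placeholders.
CSV_TEXT = """day,low_c,high_c
2010-10-12,2,10
2010-10-13,3,11
2010-10-14,4,12
"""

BASE = "http://example.org/forecast/galway/"   # hypothetical namespace
VOCAB = "http://example.org/vocab#"            # hypothetical vocabulary
PLACE = "http://example.org/place/galway"      # hypothetical URI for the place
EXTERNAL = "http://dbpedia.org/resource/Galway"  # assumed external link target
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return "<" + value + ">"


def int_literal(value):
    """Render a value as a typed integer literal."""
    return '"%s"^^<%s>' % (value, XSD_INTEGER)


def four_star_triples(csv_text):
    """Four stars: give every forecast day its own URI and describe it."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = uri(BASE + row["day"])
        triples.append((day, uri(VOCAB + "place"), uri(PLACE)))
        triples.append((day, uri(VOCAB + "lowCelsius"), int_literal(row["low_c"])))
        triples.append((day, uri(VOCAB + "highCelsius"), int_literal(row["high_c"])))
    return triples


def five_star_triples(csv_text):
    """Five stars: add one link from our place URI to an external dataset."""
    link = (uri(PLACE), uri(OWL_SAME_AS), uri(EXTERNAL))
    return four_star_triples(csv_text) + [link]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join("%s %s %s ." % triple for triple in triples)


print(to_ntriples(five_star_triples(CSV_TEXT)))
```

The only difference between the four-star and five-star output is the final `owl:sameAs` statement, which is what lets a consumer follow the data out to another source.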

Linked Open Data cloud

http://lod-cloud.net/

Media

Government

Geo

Publications

User-generated

Life sciences

Cross-domain

Linked Open Data – The Semantic Web

Copenhagen – 99,100,000 hits: looking for a needle in a haystack

URI – Uniform Resource Identifier: give the thing a name and an address

The following picture shows the desired relationships between a resource and its representing documents:
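
One common way to realise this relationship is a 303 redirect combined with content negotiation: the resource URI returns no document itself, and the server points the client at an HTML page or an RDF document depending on the `Accept` header. The sketch below is a minimal, hypothetical illustration. The `/id/`, `/doc/`, and `/data/` path layout and the function name are assumptions, not a description of any CSO service.

```python
def redirect_for(resource_path, accept_header):
    """Return (status, location) for a request to a resource URI.

    Assumed layout: /id/<name> names the thing itself,
    /doc/<name> is the HTML page about it, /data/<name> is the RDF document.
    """
    if not resource_path.startswith("/id/"):
        raise ValueError("not a resource URI: " + resource_path)
    name = resource_path[len("/id/"):]

    # Deliberately simplified negotiation: q-values and wildcards are ignored.
    accepted = [part.split(";")[0].strip().lower()
                for part in accept_header.split(",")]
    if "application/rdf+xml" in accepted or "text/turtle" in accepted:
        return 303, "/data/" + name
    # Anything else falls back to the human-readable page.
    return 303, "/doc/" + name


print(redirect_for("/id/galway", "text/html,application/xhtml+xml"))
print(redirect_for("/id/galway", "application/rdf+xml"))
```

Keeping the thing's URI separate from its documents means the name can stay stable even if the documents are later moved or reformatted.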

Tim’s cool URIs

Cool URIs don't change

What makes a cool URI? A cool URI is one which does not change.

What sorts of URI change? URIs don't change: people change them.

It is the duty of a Webmaster to allocate URIs which you will be able to stand by in 2 years, in 20 years, in 200 years. This needs thought, and organization, and commitment.

The Web of Things – The Internet of Things

The Internet of Things is coming, but it needs a semantic backbone to flourish. With some 25 billion devices expected to be connected to the Internet by 2015 and 50 billion by 2020, providing interoperability among the things on the IoT “is one of the most fundamental requirements to support object addressing, tracking, and discovery as well as information representation, storage, and exchange.” So write the authors of Semantics for the Internet of Things: Early Progress and Back to the Future, Payam Barnaghi and Wei Wang, Centre for Communication Systems Research, University of Surrey, Guildford, UK and Cory Henson, Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing.

“The suite of technologies developed in the Semantic Web … such as ontologies, semantic annotation, Linked Data and semantic Web services … can be used as principal solutions for the purpose of realizing the IoT,” they state. “Defining an ontology and using semantic descriptions for data will make it interoperable for users and stakeholders that share and use the same ontology.”

Where is the CSO with all this?

• In partnership with DERI/NUIG/INSIGHT

• One of the first NSIs in the world to upload census data as linked open data – data.cso.ie – Census 2011

• One of the organisations involved in the EU Open Cube pilot projects

• Launched apps4gaps competition

data.cso.ie

Census – Linked Open Data

• 12 million RDF triples from Census

• Geographical entities (counties, cities, etc.)

• Codelists

• Most technical work done by students/interns at NUIG

• CSO supplied data, use cases, and expertise

• Lots of manual work and ad-hoc solutions

• Results not fully “owned” by CSO

• Skills needed to maintain/extend are mostly in NUIG
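
To give a feel for what such triples look like, the sketch below describes a single table cell (one area, one count) as an observation, assuming the W3C RDF Data Cube vocabulary is used. The `example.org` namespace, the dimension and measure names, and the figure 1000 are placeholders; they are not taken from data.cso.ie.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EX = "http://example.org/census2011/"  # hypothetical namespace, not data.cso.ie


def observation_triples(obs_id, area, population):
    """Describe one table cell (one area, one count) as four triples."""
    obs = EX + "observation/" + obs_id
    return [
        (obs, RDF_TYPE, QB + "Observation"),
        (obs, QB + "dataSet", EX + "dataset/population"),
        (obs, EX + "dimension/area", EX + "area/" + area),
        (obs, EX + "measure/population", population),
    ]


def to_ntriples(triples):
    """Serialise triples; integer objects become typed literals, the rest URIs."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, int):
            rendered = '"%d"^^<%s>' % (obj, XSD_INTEGER)
        else:
            rendered = "<%s>" % obj
        lines.append("<%s> <%s> %s ." % (subject, predicate, rendered))
    return "\n".join(lines)


# 1000 is a placeholder value, not a census result.
triples = observation_triples("cork-city", "cork-city", 1000)
print(to_ntriples(triples))
```

Because every cell of every table turns into several statements like these, a full census can plausibly add up to millions of triples.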

18-19 November 2013 OpenCube kick-off meeting

CSO/NUIG collaboration summary position


Open Cube Project Pilots

Pilot: DCLG
Focus: Publish
Tool/platform: Swirrl’s PublishMyData
Data sets: 50-100 open datasets regarding finance, planning performance, land use, housing and homelessness
Type of users: Public servants (members of the DCLG statistical data management team) as well as statisticians/researchers
Number of users: 3-4 members of the data management team and 5 test users (statisticians, research analysts)
Evaluation cycle: 2 evaluation cycles: M9-M12 and M18-M21

Pilot: Flemish Gov
Focus: Publish/Reuse
Tool/platform: FluidOps’ IWB
Data sets: 1100 open datasets; VRIND
Type of users: A varied audience ranging from public servants to data scientists
Number of users: 5-10
Evaluation cycle: 2 evaluation cycles: M9-M12 and M18-M21

Pilot: Central Statistics Office
Focus: Publish/Reuse
Tool/platform: OpenCube toolkit
Data sets: 2011 Census dataset & StatBank dataset
Type of users: Public servants
Number of users: 25 employees
Evaluation cycle: 2 evaluation cycles: M9-M12 and M18-M21

OpenCube Pilots

• Publishing statistics from StatBank as linked data

• Publishing statistics from StatBank as SDMX-ML

• Facilitate the creation of general reports aimed at the general public

• Assist with answering queries from the public

• Help third parties to tell stories with CSO data

Open Cube business case for the CSO

• Own the data.cso.ie process and technology
– Enable in-house maintenance, changes, etc.

• Publish StatBank* data as Linked Open Data
– Ongoing publication process
– Adhering to release schedule is critical
– Publish data that are regularly updated (monthly, quarterly, annual) as linked open data (Census 2011 is static data)

*StatBank is the CSO's published time series database (PC-Axis)

• Deploy tools that enable analytics and exploitation of linked data
– Both internally and externally

CSO goals (independent from OpenCube)

The Role of the CSO in the Future of Linked Data in Ireland

As the technology trends that drive adoption of Linked Data continue, and the importance of Open Data increases, the CSO is well-positioned to play a leading role as a “hub” in the Irish data Web.

Some key steps include:

1. Proactively encourage the adoption of standard classifications and metadata for Open Data that are published by different public bodies within Ireland. The CSO is already documenting classifications on its StatCentral (Portal) website, and has more experience in disseminating data on the Web than perhaps any other organisation in the public sector. Ideally, the classifications themselves would be published as Linked Data.

2. Going beyond pure classifications, encourage the use of standard identifiers (URIs) for geographical areas.

3. Support Linked Data as a new dissemination format for the CSO StatBank. Key economic and demographic statistics are necessary in all sorts of data analysis tasks, and ideally they should be published as Linked Data directly by the source (CSO).

Application Programming Interface (API)

StatBank API

StatBank API – by theme

StatBank API – Download

http://www.cso.ie/StatbankServices/StatbankServices.svc/jsonservice/responseinstance/AAA01
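
A download like this could be scripted along the following lines. This is a minimal sketch that assumes the last path segment is the table code and that the service answers with a UTF-8 JSON document; the slide does not describe the response structure, so the code only parses it and returns it. The function names are illustrative, and the address is the one shown on the slide, which may have changed since.

```python
import json
from urllib.request import urlopen

# Base address copied from the slide; the table code is assumed to be
# the last path segment (AAA01 in the slide's example).
BASE_URL = ("http://www.cso.ie/StatbankServices/StatbankServices.svc/"
            "jsonservice/responseinstance/")


def table_url(table_code):
    """Build the download address for one StatBank table."""
    return BASE_URL + table_code


def fetch_table(table_code, opener=urlopen, timeout=30):
    """Download one table and parse the response body as JSON.

    `opener` defaults to urllib's urlopen; passing a different callable
    makes it possible to exercise the function without network access.
    """
    with opener(table_url(table_code), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_table("AAA01")` would request the table from the slide and return whatever JSON structure the service sends back.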

Key Indicators, quick tables and multi-quick tables

Key Economic Indicators

http://www.cso.ie/indicators/Maintable.aspx

Multi-quicktables

http://www.cso.ie/multiquicktables/quickTables.aspx?id=qnq34

Public Sector Statistics Network (PSSN)

PSSN – Organisations hosted

OGP as a driver

http://www.ogpireland.ie/

data.gov.ie – Irish OGP portal

http://data.gov.ie/dataset

Context and Impact Indicators

CSO - Context and Impact                                    2011        2012        2013

Printed output
No. of releases and publications                             238         306         304

Online output – CSO website
Visits                                                 2,387,000   2,303,441   2,718,287
Page views                                            10,070,000  13,997,031  17,034,035
Downloaded files                                       1,539,000   1,733,833   1,856,176
StatBank table accesses                                  400,400   1,042,750   1,282,674

Online output – StatCentral site
Visits                                                   131,400     158,117     179,527
Page views                                               300,200     418,564     451,788

Publication of statistics on social media
Followers (at year-end)                                    3,030       5,644       8,548

Burden Reduction
Annual reduction in statistical burden on business          -28%       -4.7%         n/a
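
To make the trend in these indicators easier to read, the following sketch computes the percentage change from 2011 to 2013 for a few rows of the table above. The figures are copied from the table; the helper name and the selection of rows are illustrative choices.

```python
# Figures copied from the Context and Impact table (2011, 2012, 2013).
INDICATORS = {
    "Visits (CSO website)": (2387000, 2303441, 2718287),
    "Page views (CSO website)": (10070000, 13997031, 17034035),
    "Downloaded files": (1539000, 1733833, 1856176),
    "StatBank table accesses": (400400, 1042750, 1282674),
    "Followers (at year-end)": (3030, 5644, 8548),
}


def percent_change(start, end):
    """Relative change from start to end, expressed in percent."""
    return (end - start) / start * 100.0


for name, (y2011, _y2012, y2013) in INDICATORS.items():
    print("%-28s %+7.1f%%" % (name, percent_change(y2011, y2013)))
```

On these figures, StatBank table accesses show by far the largest relative growth of the website indicators, more than tripling between 2011 and 2013.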


Questions?