Data Linkage

Alasdair J G Gray

[email protected]

alasdairjggray.co.uk

@gray_alasdair

Estuarine Flooding

Financial implications
• Damage
• Loss of business

Personal factors
• Emotional impact

Flood prediction
• Locations
• Severity
• Requires correlating: sea-state data, weather forecasts, details of sea defences

Response planning
• Evacuation routes
• Personnel deployment
• Requires more data: traffic reports, shipping

8 April 2015 SICSA Env. & Social Databases 2

Image: http://www.metro.co.uk/

Flood Prediction

Solent Use Case

Busy shipping channel

Two major ports

Complex tidal and wave patterns


Flood defences data (database)

Flood Detection

“Detect overtopping events in the Solent region”

sea-level > sea-defence

• Sea-level: sensors
• Defence heights: databases
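
The overtopping condition above can be sketched as a join between live sea-level readings and stored defence heights. This is a minimal Python sketch under stated assumptions: the site names, heights, `SeaLevelReading`, and `detect_overtopping` are hypothetical illustrations and are not taken from the Solent deployment.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SeaLevelReading:
    site: str        # hypothetical site identifier
    timestamp: int   # seconds since an arbitrary start
    level_m: float   # observed sea level in metres


def detect_overtopping(
    readings: Iterable[SeaLevelReading],
    defence_height_m: Dict[str, float],
) -> List[SeaLevelReading]:
    """Return the readings where sea-level > sea-defence at a known site."""
    events = []
    for reading in readings:
        height = defence_height_m.get(reading.site)
        # Sites with no recorded defence height are skipped, not flagged.
        if height is not None and reading.level_m > height:
            events.append(reading)
    return events


# Hypothetical data: defence heights from a database, readings from sensors.
DEFENCES = {"site-a": 3.2, "site-b": 4.0}
READINGS = [
    SeaLevelReading("site-a", 0, 2.9),
    SeaLevelReading("site-a", 60, 3.4),
    SeaLevelReading("site-b", 60, 3.9),
    SeaLevelReading("site-c", 60, 9.9),
]

EVENTS = detect_overtopping(READINGS, DEFENCES)
```

Only the second `site-a` reading qualifies here; `site-c` is ignored because the sketch has no defence height for it, which is one of several reasonable policies for missing data.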


Real-time sensor data: Wave, Wind, Tide

Meteorological forecasts

Response Planning

“Provide contextual information”

• Web feeds
• Other sources: maps, models
• Real-time merging of datasets



Data Linkage and Querying

Web of Data


1. Global ID – URI
2. Resolvable ID
3. Useful content: HTML for humans, RDF for machines
4. Link to other resources

Like the Web, but for data!
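
The four principles above can be illustrated with a tiny in-memory dataset. This is a sketch only: the `example.org` URIs and the `describe`, `render`, and `linked_resources` helpers are hypothetical, and a dictionary lookup stands in for resolving a URI over HTTP.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical dataset: every resource and property has a global URI.
TRIPLES: List[Triple] = [
    ("http://example.org/sensor/42", "http://example.org/vocab/locatedAt",
     "http://example.org/place/solent"),
    ("http://example.org/sensor/42", "http://example.org/vocab/measures",
     "http://example.org/property/sea-level"),
    ("http://example.org/place/solent", "http://example.org/vocab/label",
     "Solent"),
]


def describe(uri: str) -> List[Triple]:
    """Stand-in for resolving a URI: return the triples about that resource."""
    return [t for t in TRIPLES if t[0] == uri]


def render(uri: str, media_type: str) -> str:
    """Serve one resource as HTML for humans or N-Triples-style text for machines."""
    triples = describe(uri)
    if media_type == "text/html":
        items = "".join(f"<li>{p} {o}</li>" for _, p, o in triples)
        return f"<h1>{uri}</h1><ul>{items}</ul>"
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if o.startswith("http://") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def linked_resources(uri: str) -> List[str]:
    """Follow links: objects that are URIs can be resolved in turn."""
    return [o for _, _, o in describe(uri) if o.startswith("http://")]
```

Calling `linked_resources("http://example.org/sensor/42")` yields two further URIs, each of which could be passed back to `describe`, which is the "like the Web" navigation the slide refers to.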

Linked Data Approach


“RDF and OWL do not solve the interoperability problem, they just lay it bare on the table!”

Olympics 2012


Linking Data


Querying Approach

Use ontologies as common model

Requires:

• Representation of data: sensors and databases
• Establishing mappings between ontology models and data source schemas
• Accessing data sources through queries over ontology model
• Expressing continuous queries over sensors
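
The mapping and query-access requirements above can be sketched as follows. Everything named here is an assumption made for illustration: the `onto:` property names, the two source schemas, and the `query` function. Real systems would typically express the mappings declaratively and rewrite queries rather than filter rows in Python.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical ontology property names shared by all sources.
SEA_LEVEL = "onto:seaLevel"
SITE = "onto:site"

# Mappings from ontology properties to each source's own schema.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "tide_gauge_db": {SITE: "station_id", SEA_LEVEL: "height_m"},
    "buoy_feed": {SITE: "buoy", SEA_LEVEL: "lvl"},
}

# Hypothetical source contents, each using its own column names.
SOURCES: Dict[str, List[dict]] = {
    "tide_gauge_db": [{"station_id": "A", "height_m": 3.4}],
    "buoy_feed": [{"buoy": "B", "lvl": 2.1}, {"buoy": "C", "lvl": 3.8}],
}


def query(properties: List[str], where: Callable[[dict], bool]) -> Iterator[dict]:
    """Answer a query phrased in ontology terms by translating per source."""
    for name, rows in SOURCES.items():
        mapping = MAPPINGS[name]
        for row in rows:
            # Rewrite the source row into the ontology vocabulary.
            record = {prop: row[mapping[prop]] for prop in properties}
            if where(record):
                yield record


RESULT = list(query([SITE, SEA_LEVEL], lambda r: r[SEA_LEVEL] > 3.0))
```

The caller never sees `height_m` or `lvl`; it asks only in terms of the shared model. Because `query` is a generator, the same shape could be applied to an unbounded sensor stream, though a real continuous query would also need windowing, which this sketch omits.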


WSN Resource Concerns

Energy
• Running off battery

Computation capabilities
• Limited CPU
• Limited memory
• Limited storage

Radio transmission
• Limited range
• Energy impact
• Lost transmissions


Data Matching

Administrative Data Research Centre - Scotland

• Messy data
• Probabilistic matches
• Schema matching

John Grant, Fisherman
Fiona Sinclair
Ian Grant, Smithy, Born: 1861

Stuart Adam, Wheelwright
Morag Scott
Flora Adam, Seamstress, Born: 1866

Married: 1884

John Grant, Farmer
Fiona Grant
Iain Grant, Born: 1860
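
A probabilistic match between two such records can be sketched with a weighted similarity score. This is an illustration under assumptions, not the centre's method: the field layout (treating John Grant as the `father` in both records) is a reading of the slide's diagram, and the weights and one-year tolerance are arbitrary choices. String similarity uses Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted agreement score; the weights are illustrative assumptions."""
    name = name_similarity(rec_a["name"], rec_b["name"])
    father = name_similarity(rec_a["father"], rec_b["father"])
    # Birth years within one year count as agreeing (assumed tolerance).
    year = 1.0 if abs(rec_a["born"] - rec_b["born"]) <= 1 else 0.0
    return 0.5 * name + 0.3 * father + 0.2 * year


# Records assembled from the labels above; the field assignment is assumed.
record_1 = {"name": "Ian Grant", "father": "John Grant", "born": 1861}
record_2 = {"name": "Iain Grant", "father": "John Grant", "born": 1860}
record_3 = {"name": "Flora Adam", "father": "Stuart Adam", "born": 1866}

SCORE_LIKELY = match_score(record_1, record_2)
SCORE_UNLIKELY = match_score(record_1, record_3)
```

The score only ranks candidate pairs. Whether "Ian" and "Iain" are the same person still depends on evidence the score ignores, such as the differing occupations recorded for John Grant.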


Administrative Data Research Network


Administrative Data Service


ADRC-Scotland


Co-located with Farr Institute, Scottish Government and NHS.

Universities of Aberdeen, Dundee, Edinburgh, Glasgow, Heriot-Watt, St Andrews and Stirling.

Expertise in administrative data and public engagement, linkage, law and relevant computer science techniques.

Provide research support, facilities, training


Research Focus


http://www.gov.scot/Resource/0044/00442276-390.jpg

Schools, colleges and universities

The criminal and justice system

Social work services

Social welfare

Housing system

Transport system

Health system

Historical administrative data


Multiple Identities

Andy Law's Third Law

“The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study”

http://bioinformatics.roslin.ac.uk/lawslaws/


P12047X31045

GB:29384

http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL1642

https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1642
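
One way to cope with several identifiers for the same thing is to record equivalence links and group identifiers into sets. The sketch below uses a union-find structure; the `IdentityIndex` class is hypothetical. Only the two ChEMBL URLs above are linked, on the assumption that both denote molecule CHEMBL1642. No claim is made about what `P12047X31045` or `GB:29384` identify.

```python
from typing import Dict, List


class IdentityIndex:
    """Union-find over identifiers asserted to denote the same thing."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def _find(self, ident: str) -> str:
        self._parent.setdefault(ident, ident)
        root = ident
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited identifier at the root.
        while self._parent[ident] != root:
            next_ident = self._parent[ident]
            self._parent[ident] = root
            ident = next_ident
        return root

    def link(self, a: str, b: str) -> None:
        """Record that identifiers a and b refer to the same entity."""
        self._parent[self._find(a)] = self._find(b)

    def same(self, a: str, b: str) -> bool:
        return self._find(a) == self._find(b)

    def aliases(self, ident: str) -> List[str]:
        root = self._find(ident)
        return sorted(i for i in list(self._parent) if self._find(i) == root)


RDF_URI = "http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL1642"
WEB_URL = "https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1642"

index = IdentityIndex()
index.link(RDF_URI, WEB_URL)
index.link(WEB_URL, "CHEMBL1642")  # local accession, assumed equivalent
```

Links are transitive here, so `index.same(RDF_URI, "CHEMBL1642")` holds even though that pair was never linked directly. That is convenient, but a single wrong link merges two groups, so in practice each link would need provenance.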

Query Performance

Response time

Data freshness

Reliability

Volume of requests

Hosting resources


[Diagram: Queries, Data Warehouse, two Data Sources]

[Diagram: Queries, Mediator, two Data Sources]
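
The two architectures trade response time against data freshness. The sketch below is a simplified illustration with hypothetical `Warehouse` and `Mediator` classes and a made-up source: the warehouse copies data when loaded and answers from the copy, while the mediator holds nothing and asks the sources on every query.

```python
from typing import Callable, Dict, List

Source = Callable[[], List[dict]]  # a source returns its current rows


class Warehouse:
    """Copies source data at load time; queries read the local copy."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self._sources = sources
        self._rows: List[dict] = []

    def load(self) -> None:
        self._rows = [row for fetch in self._sources.values() for row in fetch()]

    def query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        return [row for row in self._rows if predicate(row)]


class Mediator:
    """Holds no data; each query is forwarded to the sources when asked."""

    def __init__(self, sources: Dict[str, Source]) -> None:
        self._sources = sources

    def query(self, predicate: Callable[[dict], bool]) -> List[dict]:
        return [row for fetch in self._sources.values()
                for row in fetch() if predicate(row)]


# Hypothetical mutable source used to show the freshness difference.
live_rows = [{"site": "A", "level_m": 2.0}]
sources = {"gauge": lambda: list(live_rows)}

warehouse = Warehouse(sources)
warehouse.load()
mediator = Mediator(sources)

live_rows.append({"site": "A", "level_m": 3.5})  # source changes after load


def is_high(row: dict) -> bool:
    return row["level_m"] > 3.0


STALE = warehouse.query(is_high)   # answered from the earlier copy
FRESH = mediator.query(is_high)    # answered from the sources right now
```

The warehouse misses the new reading until `load()` runs again, but its queries never touch the sources. The mediator sees the new reading immediately, at the cost of hitting every source on each request, so its reliability and response time depend on theirs.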

How FAIR is your Data?


Summary

Web of Data
• Global identifiers
• Interoperable data
• Domain ontologies

Challenges
• Data matching
• Multiple identifiers
• Query performance
• FAIR data


www.alasdairjggray.co.uk

[email protected]

@gray_alasdair