Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game Irene Celino, Dario Cerizza, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Federico Piccinini Semantic Web Challenge @ ISWC 2012 - 2012/11/14

Page 1: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Irene Celino, Dario Cerizza, Simone Contessa, Marta Corubolo, Daniele Dell’Aglio, Emanuele Della Valle, Stefano Fumeo and Federico Piccinini

Semantic Web Challenge @ ISWC 2012 - 2012/11/14

Page 2: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Citizen Science?

Mount Rainier NPS http://www.flickr.com/photos/mountrainiernps/6997851139/

Glacier NPS http://www.flickr.com/photos/glaciernps/4427412443/

Glacier NPS http://www.flickr.com/photos/glaciernps/4427416227/

Page 3: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

What about those citizens?

Gareth 1953 http://www.flickr.com/photos/gareth1953/6786545520/

paul_houle http://www.flickr.com/photos/paul_houle/3301438074/

Boris van Hoytema http://www.flickr.com/photos/borisvanhoytema/685879933/

graziano88 http://www.flickr.com/photos/8482460@N06/6884509346/

Page 4: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Citizen Computation

Human Computation: exploiting human capabilities to solve computational tasks difficult for machines

Citizen Science: exploiting volunteers to collect scientific data or to conduct experiments "in the world"

Citizen Computation: exploiting human capabilities to contribute to a mixed computational system by living "in the world"

Page 5: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Human Computation Games with a Purpose

Purpose outside the game: You help image search engines by manually tagging images

Purpose within the game:

Page 6: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Citizen Computation Game with a Purpose

Purpose outside the game: Collect and verify information about your city by playing with the neighborhood around you

Purpose within the game: Create your venues' portfolio and become the greatest landlord ever!

http://bit.ly/urbanopoly

Page 7: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly – high-level view

[Diagram, four numbered steps: LinkedGeoData + Lombardia Open Data; bootstrap of "venues" data; data about venues as missions; players; game to buy / sell venues with missions; GWAP approach to consolidate data; verified / improved data + new data]

Game purpose: check and correct geo-spatial data from pre-existing sources + collect missing data

Page 8: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly Input Data

OpenStreetMap (OSM, http://www.openstreetmap.org/) via LinkedGeoData (LGD, http://linkedgeodata.org/): data as linked data, described by an ontology

Lombardia Open Data (https://dati.lombardia.it): data about "agriturismo" places as CSV converted to RDF

Urbanopoly data bootstrap: venues are "instances" of selected LGD "classes" with their OSM tags as features, thus Urbanopoly data are RDF statements of the form:

<venue> <feature> <value>
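
The bootstrap step described above can be illustrated with a minimal sketch in plain Python. It only shows the shape of the data: one (venue, feature, value) statement per OSM tag. The function name `venue_to_triples`, the example URI and the tag values are hypothetical and are not taken from the Urbanopoly code base or from LinkedGeoData.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def venue_to_triples(venue_uri: str, osm_tags: Dict[str, str]) -> List[Triple]:
    """Turn one venue and its OSM tags into <venue> <feature> <value> statements."""
    # Sorting by feature name only makes the output order deterministic.
    return [(venue_uri, feature, value) for feature, value in sorted(osm_tags.items())]


# Hypothetical venue: the URI and the tags are made up for illustration.
triples = venue_to_triples(
    "http://example.org/venue/42",
    {"cuisine": "pizza", "amenity": "restaurant"},
)

for venue, feature, value in triples:
    print(f"<{venue}> <{feature}> \"{value}\" .")
```

Each printed line mirrors the `<venue> <feature> <value>` form used on the slide; a real pipeline would presumably use full property URIs from the LGD ontology rather than bare tag names.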

Page 9: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly gameplay

http://bit.ly/u-video

Page 10: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly mini-games for Data Collection

Data acquisition challenges as contributions to an advertising campaign (left: inserting a value, right: taking a picture)

Data validation challenges to check pre-existing data or other players’ contributions (left: answering a quiz, right: rating a poster)

Page 11: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly Data Consolidation

Each statement has a confidence score:

{ <venue> <feature> <value> . } <confidence>

which indicates the probability that the statement is true

Each player action is taken as evidence for the associated knowledge and alters the confidence score

A weighted majority voting algorithm aggregates the evidence, weighting each contribution by:

Difficulty of acquiring the contribution (e.g., typing vs. check box)

Player’s reputation (e.g., number of errors)

Player’s distance from the venue at contribution time (as sensed by the device)

When the confidence score exceeds a threshold, the triple <venue> <feature> <value> gets consolidated
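
The consolidation idea above can be sketched as a weighted majority vote. The slide names the three weighting factors but not how they are combined, so everything numeric here is an assumption: the product of the three factors, the linear decay of proximity over `max_distance_m`, the 0.8 threshold, and the batch (rather than incremental) computation are illustrative choices, not the authors' algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Evidence:
    value: str         # value the player asserted for a given <venue> <feature>
    difficulty: float  # 0..1, higher for harder inputs (typing > check box)
    reputation: float  # 0..1, higher for players with fewer errors
    distance_m: float  # player's distance from the venue, in meters


def weight(e: Evidence, max_distance_m: float = 500.0) -> float:
    """Assumed weighting: product of the three factors named on the slide."""
    proximity = max(0.0, 1.0 - e.distance_m / max_distance_m)
    return e.difficulty * e.reputation * proximity


def confidence_scores(evidence: List[Evidence]) -> Dict[str, float]:
    """Share of the total weight that backs each candidate value."""
    totals: Dict[str, float] = {}
    for e in evidence:
        totals[e.value] = totals.get(e.value, 0.0) + weight(e)
    total = sum(totals.values())
    return {v: w / total for v, w in totals.items()} if total > 0 else {}


def consolidated(evidence: List[Evidence], threshold: float = 0.8) -> List[str]:
    """Values whose confidence exceeds the (assumed) threshold."""
    return [v for v, s in confidence_scores(evidence).items() if s > threshold]


# Made-up evidence for one <venue> <cuisine> statement.
evidence = [
    Evidence("pizza", difficulty=1.0, reputation=0.9, distance_m=50.0),
    Evidence("pizza", difficulty=0.5, reputation=0.8, distance_m=100.0),
    Evidence("sushi", difficulty=0.5, reputation=0.4, distance_m=400.0),
]
scores = confidence_scores(evidence)
print(scores)
print(consolidated(evidence))
```

In this toy run the two nearby, reputable votes for "pizza" outweigh the distant, low-reputation vote for "sushi", so only "pizza" crosses the threshold.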

Page 12: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly Data Publication

True statements are published as linked open data

If a statement's confidence exceeds the threshold, the statement is asserted: <venue> <feature> <value> (as in LGD/OSM)

But there is more interesting information to publish: false statements, statements' confidence, provenance info, etc.

We created a Human Computation ontology (http://swa.cefriel.it/ontologies/hc) extending the W3C PROV-O ontology

We published this further knowledge as annotations to the reified <venue> <feature> <value> statements

Cf. http://swa.cefriel.it/linkeddata/
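
As a rough illustration of annotating a reified statement, the sketch below builds the standard RDF reification quad (rdf:Statement with rdf:subject, rdf:predicate, rdf:object) and attaches a confidence value to it. The `EX` namespace and its `confidence` property are hypothetical placeholders; the actual terms of the Human Computation ontology are not listed in the slides and are not reproduced here.

```python
from typing import List, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace, standing in for the real Human Computation ontology terms.
EX = "http://example.org/hc-sketch#"

Triple = Tuple[str, str, str]


def reify(stmt_id: str, venue: str, feature: str, value: str,
          confidence: float) -> List[Triple]:
    """Describe <venue> <feature> <value> as an rdf:Statement and annotate it."""
    return [
        (stmt_id, RDF + "type", RDF + "Statement"),
        (stmt_id, RDF + "subject", venue),
        (stmt_id, RDF + "predicate", feature),
        (stmt_id, RDF + "object", value),
        # Assumed annotation property; the real ontology may name it differently.
        (stmt_id, EX + "confidence", str(confidence)),
    ]


# Made-up identifiers for illustration only.
annotated = reify(
    "http://example.org/statement/1",
    "http://example.org/venue/42",
    "http://example.org/feature/cuisine",
    "pizza",
    0.97,
)
for triple in annotated:
    print(triple)
```

Reification lets false or still-unconsolidated statements be described without asserting them, which is why it suits publishing confidence and provenance alongside the consolidated data.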

[Figure: Human Computation ontology diagram. Classes: Contributor, Contribution, Human Computation Task, Human Computation Algorithm, Consolidated Information. PROV-O superclasses: provo:Agent, provo:Entity, provo:Activity. Properties: aggregatedFrom, aggregatedBy, solvedBy, solutionTo, enabledBy, contributionFrom]

Page 13: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly Evaluation (1/2)

"Enjoyability" of the game (engagement potential):

Average life play: ALP = Played Time / Active Players

ALP ~ 100 minutes: very good result

"Effectiveness" of the GWAP mechanism:

Throughput = Solved Problems / Played Time

~ 287 pieces of evidence collected / hour: very good

~ 5 consolidated statements / hour: can be improved

"Precision" of the results (measured on a subset of the results):

Accuracy = ( (P - FP) + (N - FN) ) / (P + N)

~ 92 %: very good result
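
The three formulas above translate directly into code. The sketch below uses the formulas as written on the slide; the input numbers are invented so that the outputs land on the reported orders of magnitude and are not the study's raw data. Reading P and N as the statements classified positive and negative, with FP and FN the wrongly classified ones among them, is an interpretation of the slide's notation.

```python
def average_life_play(played_minutes: float, active_players: int) -> float:
    """ALP = Played Time / Active Players."""
    return played_minutes / active_players


def throughput(solved_problems: int, played_hours: float) -> float:
    """Throughput = Solved Problems / Played Time."""
    return solved_problems / played_hours


def accuracy(p: int, fp: int, n: int, fn: int) -> float:
    """Accuracy = ((P - FP) + (N - FN)) / (P + N)."""
    return ((p - fp) + (n - fn)) / (p + n)


# Invented inputs, chosen only to reproduce the magnitudes quoted on the slide.
print(average_life_play(played_minutes=1000, active_players=10))  # 100.0
print(throughput(solved_problems=574, played_hours=2.0))          # 287.0
print(accuracy(p=50, fp=4, n=50, fn=4))                           # 0.92
```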

Page 14: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Urbanopoly Evaluation (2/2)

"Playability" of the game

Evaluation survey at http://bit.ly/u-survey, with questions about usability, social aspects, physical presence, motivation, etc.

Feedback very encouraging

"Sociability" through the Facebook channel

With Facebook Insights (http://www.facebook.com/insights/), tracking of installs, demographics, log-ins, content sharing, etc.

Example of a published "story" on the Facebook Timeline (screenshot not reproduced)

Statistics about "stories" and "impressions" (chart not reproduced)

Interesting results, but the channel is to be further exploited

Page 15: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Conclusions

Urbanopoly is an end-user mobile application with an attractive multi-language user interface

Urbanopoly manages urban data at a real scale (ca. 50,000 venues) from heterogeneous sources

The meaning of data is core to the application, and consolidated data are published as linked open data

Urbanopoly is aimed at geo-spatial data collection and quality assurance, especially for dynamic data

Our rigorous evaluation shows the high accuracy of the results and the feasibility of the approach

Urbanopoly shows a clear commercial potential: further data collection or validation needs can be added as further mini-games or challenges within the game

Page 16: Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via a Human Computation Game

Thanks for your attention!

Questions?

Keep on playing Urbanopoly!

Irene Celino – CEFRIEL, ICT Institute Politecnico di Milano

web: http://swa.cefriel.it

slides at: http://www.slideshare.net/iricelino