Presentation of the TripleCheckMate tool (http://aksw.org/Projects/TripleCheckMate.html) at KESW 2013 (kesw.ifmo.ru/kesw2013/)
![Page 1: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/1.jpg)
TripleCheckMate: A Tool for Crowdsourcing the Quality Assessment of Linked Data
Dimitris Kontokostas, Amrapali Zaveri, Sören Auer and Jens Lehmann
KESW 2013 Oct 08, 2013
![Page 2: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/2.jpg)
Outline
❏ Data Quality
❏ Data Quality Assessment Methodology
❏ Evaluation Methodology - Manual
  ❏ Phase I: Quality Problem Taxonomy
  ❏ Phase II: Crowdsourcing Quality Assessment
❏ TripleCheckMate
  ❏ Architecture
  ❏ Demo
❏ Conclusion & Future Work
![Page 3: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/3.jpg)
Data Quality
● Data Quality (DQ) is defined as:
  ○ fitness for a certain use case*
● On the Data Web - varying quality of information covering various domains
● High-quality datasets
  ○ curated over decades - life science domain
  ○ crowdsourcing process - extracted from unstructured and semi-structured information, e.g. DBpedia

* J. Juran. The Quality Control Handbook. McGraw-Hill, New York, 1974.
![Page 4: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/4.jpg)
Data Quality Assessment Methodology
4-Step Methodology:
❏ Step 1: Resource selection
  ❏ Per class
  ❏ Completely random
  ❏ Manual
❏ Step 2: Evaluation mode selection
  ❏ Manual
  ❏ Semi-automatic
  ❏ Automatic
❏ Step 3: Resource evaluation
❏ Step 4: DQ improvement
  ❏ Direct
  ❏ Indirect
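
The three resource selection modes of Step 1 can be sketched as follows. This is only an illustration under stated assumptions, not the tool's actual code: the function name `select_resource`, the in-memory `POOL` standing in for a dataset, and the example URIs are all hypothetical.

```python
import random

# Hypothetical in-memory stand-in for a dataset: class URI -> resource URIs.
POOL = {
    "http://dbpedia.org/ontology/Person": [
        "http://dbpedia.org/resource/Ada_Lovelace",
        "http://dbpedia.org/resource/Alan_Turing",
    ],
    "http://dbpedia.org/ontology/Place": [
        "http://dbpedia.org/resource/Leipzig",
    ],
}


def select_resource(mode, pool, class_uri=None, manual_uri=None, rng=None):
    """Pick one resource to evaluate, following the Step 1 modes."""
    rng = rng or random.Random()
    if mode == "manual":
        # Manual: the evaluator names the resource directly.
        if manual_uri is None:
            raise ValueError("manual mode needs manual_uri")
        return manual_uri
    if mode == "per_class":
        # Per class: random pick restricted to instances of one class.
        return rng.choice(pool[class_uri])
    if mode == "random":
        # Completely random: pick from all resources, ignoring the class.
        everything = [uri for uris in pool.values() for uri in uris]
        return rng.choice(everything)
    raise ValueError(f"unknown mode: {mode}")
```

A call such as `select_resource("per_class", POOL, class_uri="http://dbpedia.org/ontology/Person")` returns one of the two person URIs; against a real dataset the pool would presumably come from a SPARQL endpoint rather than a dictionary.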
![Page 5: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/5.jpg)
Evaluation Methodology - Manual
❏ Phase I: Creation of quality problem taxonomy
❏ Phase II: Crowdsourcing quality assessment
![Page 6: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/6.jpg)
Phase I: Quality Problem Taxonomy
A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. Quality assessment methodologies for Linked Open Data: A Review. Under review, available at http://www.semantic-web-journal.net/content/quality-assessment-methodologies-linked-open-data.
![Page 7: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/7.jpg)
Phase II: Crowdsourcing Quality Assessment
|              | Crowdsourcing                            | Our Approach                                  |
|--------------|------------------------------------------|-----------------------------------------------|
| Type         | Human Intelligence Tasks (HITs)          | Contest-based                                 |
| Participants | Labor market                             | Linked Data (LD) experts                      |
| Task         | Detect quality issues in triples         | Detect & classify quality issues in resources |
| Reward       | Per task/triple                          | Most no. of resources evaluated               |
| Tool         | Amazon Mechanical Turk, CrowdFlower etc. | TripleCheckMate                               |
![Page 8: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/8.jpg)
TripleCheckMate - Architecture (1/2)
![Page 9: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/9.jpg)
TripleCheckMate - Architecture (2/2)
● Built on Java / GWT
  ○ GWT compiles to native cross-browser HTML/JS
● Tomcat / Jetty & MySQL as minimal backend
  ○ store/retrieve evaluation data only
● Application logic is built on the client
  ○ SPARQL executed on client
  ○ Portable
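
To illustrate what "SPARQL executed on client" can look like, here is a minimal sketch in Python (used only for brevity; the tool itself is Java / GWT). It builds a query for all triples of one resource and the corresponding GET URL, using the standard `query` parameter of the SPARQL protocol. The function names and the choice of the public DBpedia endpoint are assumptions, and no request is actually sent.

```python
from urllib.parse import urlencode


def triples_query(resource_uri: str) -> str:
    """SPARQL query listing every predicate/object pair of one resource."""
    return f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }}"


def request_url(endpoint: str, query: str) -> str:
    """GET URL for a SPARQL endpoint; the query goes in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Assumed example values, not taken from the slides.
query = triples_query("http://dbpedia.org/resource/Leipzig")
url = request_url("http://dbpedia.org/sparql", query)
```

Because the browser talks to the endpoint directly in this design, the backend only has to store and retrieve evaluation data, which matches the "minimal backend" point above.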
![Page 10: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/10.jpg)
Evaluation storage schema
● Designed to support multiple campaigns and different ontologies
● Quality taxonomy is stored in the database, which makes it easy to adapt
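
The schema diagram from this slide is not reproduced in the transcript, so the following is only a hypothetical sketch of how such a schema could support multiple campaigns and a database-stored taxonomy. All table names, column names, and taxonomy labels are invented for illustration, and SQLite (via Python's `sqlite3`) is swapped in for the MySQL backend the tool actually uses.

```python
import sqlite3

# Hypothetical schema: not the tool's real tables.
SCHEMA = """
CREATE TABLE campaign (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    sparql_endpoint TEXT NOT NULL
);
CREATE TABLE error_type (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES error_type(id),
    label TEXT NOT NULL
);
CREATE TABLE evaluated_triple (
    id INTEGER PRIMARY KEY,
    campaign_id INTEGER NOT NULL REFERENCES campaign(id),
    evaluator TEXT NOT NULL,
    subject TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    error_type_id INTEGER REFERENCES error_type(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Adapting the taxonomy is a matter of inserting rows (placeholder labels).
conn.execute("INSERT INTO error_type (id, parent_id, label) VALUES (1, NULL, 'Accuracy')")
conn.execute("INSERT INTO error_type (id, parent_id, label) VALUES (2, 1, 'Incorrect datatype')")

# Sub-categories of a taxonomy node are found through parent_id.
children = conn.execute(
    "SELECT label FROM error_type WHERE parent_id = ?", (1,)
).fetchall()
```

Keeping the taxonomy as rows rather than hard-coded values means a new campaign with a different ontology or a revised set of problem categories needs no code change, only data.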
![Page 11: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/11.jpg)
TripleCheckMate - Demo
http://tinyurl.com/TCM-Demo
http://tinyurl.com/TCM-Screencast
![Page 12: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/12.jpg)
Conclusion & Future Work
● TripleCheckMate
  ○ Tool for crowdsourcing quality assessment
  ○ Linked Data quality assessment
  ○ Supports inter-rater agreement
  ○ Can be used with any Linked Dataset
● Future Work
  ○ Directly integrating semi-automatic methods
  ○ Improve efficiency of quality assessment
  ○ Include support for Patch Ontology* as output format

* M. Knuth, J. Hercher, and H. Sack. Collaboratively patching linked data. CoRR, 2012.
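
The slides do not say which agreement statistic is used. As one common choice, the sketch below computes Cohen's kappa for two evaluators who judged the same triples; the function name and the "ok" / "error" labels are assumptions made for the example.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's own label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Two hypothetical evaluators judging the same four triples.
kappa = cohens_kappa(
    ["ok", "ok", "error", "error"],
    ["ok", "ok", "error", "ok"],
)
```

Here the raters agree on 3 of 4 triples (0.75) against a chance agreement of 0.5, giving a kappa of 0.5. With more than two evaluators per resource, a multi-rater statistic such as Fleiss' kappa would be the usual alternative.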
![Page 13: TripleCheckMate](https://reader033.vdocuments.net/reader033/viewer/2022052505/556a0b66d8b42af0198b486c/html5/thumbnails/13.jpg)
Thank You
Questions?
http://nl.dbpedia.org:8080/TripleCheckMate-Demo/
https://github.com/AKSW/TripleCheckMate
http://aksw.org/
[email protected]
Twitter: @amrapaliz