News Fact-checking: One Practical Application of Linked Statistics


Upload: spaziodati

Post on 05-Jul-2015


DESCRIPTION

This is the poster for SemStat at ISWC 2014 in Riva del Garda. SemStat 2014 was the "Second International Workshop on Semantic Statistics". Our poster is about a use case on fact-checking using the potential of Linked Statistics.

TRANSCRIPT

Page 1: News Fact-checking: One Practical Application of Linked Statistics

News Fact-checking: One Practical Application of Linked Statistics

LinkedSTAT http://linkedstat.spaziodati.eu

ISTAT SDMX SOAP Web Service http://sdmx.istat.it/WS_T/NsiStdV20Service.asmx

SDMX-ML SDMX-to-RDF XSL transformations https://github.com/csarven/linked-sdmx

Virtuoso Quad Store

http://www.ladige.it/articoli/2012/07/17/poverta-trentino-resta-isola-felice

NEWS FACT-CHECKING is the process of verifying the accuracy of facts in publications.

Fact-checking is a tedious, time- and resource-consuming, error-prone process.

* The original text is: “Nel 2011, secondo i dati Istat, ... la provincia di Trento (3,4%), la Lombardia (4,2%), la Valle d'Aosta e il Veneto (4,3%) presentano i valori più bassi dell'incidenza di povertà [relativa].”

“In 2011, according to Istat, ... the province of Trento (3.4%), Lombardy (4.2%), Valle d'Aosta and Veneto (4.3%) have the lowest values of the incidence of [relative] poverty.” *
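Before a sentence like the one quoted above can be checked, the claimed figures have to be separated from the prose. The following is a minimal sketch of that step, not part of the LinkedSTAT project: the pattern, the function name `extract_claims` and the handling of the Italian decimal comma are all illustrative assumptions.

```python
import re

# "<label> (<number>%)": the label may not contain brackets or sentence
# punctuation; the number may use a decimal comma (Italian) or a point.
# This pattern is an illustrative assumption, not taken from the poster.
CLAIM_PATTERN = re.compile(r"([^(),.;]+?)\s*\((\d+(?:[.,]\d+)?)\s*%\)")


def extract_claims(sentence):
    """Return a list of (label, percentage) pairs quoted in the sentence."""
    claims = []
    for label, number in CLAIM_PATTERN.findall(sentence):
        # Normalise the decimal comma before converting to a float.
        claims.append((label.strip(), float(number.replace(",", "."))))
    return claims


quote = (
    "Nel 2011, secondo i dati Istat, ... la provincia di Trento (3,4%), "
    "la Lombardia (4,2%), la Valle d'Aosta e il Veneto (4,3%) presentano "
    "i valori più bassi dell'incidenza di povertà [relativa]."
)
for label, percentage in extract_claims(quote):
    print(label, percentage)
```

Note that two territories sharing one figure ("la Valle d'Aosta e il Veneto") come back as a single label, so a real pipeline would still need to split such labels and map each one to a territory code.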

RDF/XML

RDF Data Cube Vocabulary, PROV-O Ontology, SKOS and XKOS, SDMX-RDF, ...

● Fact-checking: how to find the right set of dimension/value pairs for a given fact in order to construct queries for it?

● ISTAT: new possibilities to disseminate statistics and facilitate data certification.

● DBpedia/Wikipedia: automatic updates of statistical data

“In 2011, according to Istat, ... the province of Trento (3.4%) ... value of the incidence of [relative] poverty.”

Dimension | Value

territory, linked-istat-property:REF_AREA | “Provincia Autonoma Trento”, <http://linkedstat.spaziodati.eu/code/1.1/CL_REFAREA/ITD2>

reference time period, linked-istat-property:TIME_PERIOD | “2011”, <http://reference.data.gov.uk/id/year/2011>

statistical indicator, linked-istat-property:IND_TYPE | “incidenza di povertà relativa familiare”, <http://linkedstat.spaziodati.eu/code/1.1/CL_AGGREG_FAMIGLIE/INCID_POVREL_FAM>
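Once the dimension/value pairs for a fact are known, the observation query can be assembled mechanically. The sketch below shows one way to do that; the function name `build_observation_query` is hypothetical, and the `linked-istat-property:` namespace is an assumption inferred from the full REF_AREA property URI used in the poster's code-list query. The code URIs are copied from the pairs listed above; the poster's own observation query uses a different version segment in the REF_AREA code URI, so that value should be verified against the endpoint.

```python
# Assumed namespace, inferred from <http://linkedstat.spaziodati.eu/property/REF_AREA>.
PROPERTY_NS = "http://linkedstat.spaziodati.eu/property/"


def build_observation_query(dimensions):
    """Build a SELECT query for observations matching every dimension/value pair.

    `dimensions` maps a dimension local name (e.g. "REF_AREA") to a code URI.
    """
    lines = [
        "PREFIX qb: <http://purl.org/linked-data/cube#>",
        "PREFIX linked-istat-property: <%s>" % PROPERTY_NS,
        "SELECT DISTINCT ?obs ?value WHERE {",
        "  ?obs a qb:Observation .",
    ]
    for name, code_uri in dimensions.items():
        # Reject characters that would break out of the <...> URI syntax.
        if any(ch in code_uri for ch in "<> \n"):
            raise ValueError("not a plain URI: %r" % code_uri)
        lines.append("  ?obs linked-istat-property:%s <%s> ." % (name, code_uri))
    lines.append("  ?obs linked-istat-property:OBS_VALUE ?value .")
    lines.append("}")
    return "\n".join(lines)


# Dimension/value pairs for the Trento claim, as listed above.
TRENTO_2011 = {
    "REF_AREA": "http://linkedstat.spaziodati.eu/code/1.1/CL_REFAREA/ITD2",
    "TIME_PERIOD": "http://reference.data.gov.uk/id/year/2011",
    "IND_TYPE": "http://linkedstat.spaziodati.eu/code/1.1/CL_AGGREG_FAMIGLIE/INCID_POVREL_FAM",
}

print(build_observation_query(TRENTO_2011))
```

Keeping the pairs in a plain mapping means the same function can serve any other fact, as long as its dimensions and code URIs have been identified first.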

http://linkedstat.spaziodati.eu/sparql

MANUAL FACT-CHECKING: review of the citations' content

● dedicated fact-checking departments
- only major, infrequently published periodicals can afford them (Der Spiegel, The Guardian, Esquire, Forbes); small publishing organisations have no budget for them
- impractical for frequent publications

● nonprofit fact-checking organisations (FactCheck.org, PolitiFact.com)

● crowd-checking platforms (FactCheckEU.org)

Tatiana Tarasova, SpazioDati, Trento, Italy

“Poverty, Trentino remains a happy island”, l'Adige.it, 17 July 2012

What if the facts were linked to the underlying data sources?

Publishing ISTAT data (http://dati.istat.it/) as Linked Data

All the queries and scripts produced during the LinkedSTAT project are available at https://www.assembla.com/spaces/linked-istat/

Fact-checking with LinkedSTAT

Step 1: retrieve the structure of the relevant dataset

    PREFIX qb: <http://purl.org/linked-data/cube#>
    PREFIX dcterms: <http://purl.org/dc/terms/>

    SELECT DISTINCT ?dataset ?title ?structure
    WHERE {
      ?dataset a qb:DataSet .
      ?dataset dcterms:title ?title .
      FILTER(contains(str(?title), "Incidenza di povertà relativa"))
      ?dataset qb:structure ?structure .
    }

Step 2: retrieve code lists that provide values

    PREFIX qb: <http://purl.org/linked-data/cube#>

    SELECT DISTINCT ?codeList ?p ?o
    WHERE {
      <http://linkedstat.spaziodati.eu/property/REF_AREA> qb:codeList ?codeList .
      ?codeList ?p ?o
    }

Step 3: retrieve the value of the required observation

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX qb: <http://purl.org/linked-data/cube#>
    # Namespace inferred from the full REF_AREA property URI used in Step 2.
    PREFIX linked-istat-property: <http://linkedstat.spaziodati.eu/property/>

    SELECT DISTINCT ?obs ?value
    WHERE {
      ?obs rdf:type qb:Observation .
      ?obs linked-istat-property:REF_AREA <http://linkedstat.spaziodati.eu/code/1.2/CL_REFAREA/ITD2> .
      ?obs linked-istat-property:TIME_PERIOD <http://reference.data.gov.uk/id/year/2011> .
      ?obs linked-istat-property:IND_TYPE <http://linkedstat.spaziodati.eu/code/1.1/CL_AGGREG_FAMIGLIE/INCID_POVREL_FAM> .
      ?obs linked-istat-property:OBS_VALUE ?value .
    }
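The three steps end with a value that still has to be compared with the figure printed in the article. The sketch below shows how the last query could be sent to the endpoint and its result checked. It assumes the endpoint answers the standard SPARQL protocol over HTTP GET and can return the SPARQL JSON results format; the helper names and the 0.05 tolerance (half a unit of the last printed decimal) are illustrative assumptions, not part of the LinkedSTAT project.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://linkedstat.spaziodati.eu/sparql"  # endpoint named on the poster


def build_request(endpoint, query):
    """Prepare a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def extract_values(results_json, var="value"):
    """Read the numeric bindings of one variable from SPARQL JSON results."""
    data = json.loads(results_json)
    return [
        float(binding[var]["value"])
        for binding in data["results"]["bindings"]
        if var in binding
    ]


def matches_claim(claimed, observed, tolerance=0.05):
    """True when the observed value is within `tolerance` of the claimed figure."""
    return abs(claimed - observed) <= tolerance


def check_fact(query, claimed, endpoint=ENDPOINT):
    """Run the observation query and compare every returned value with the claim."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        values = extract_values(response.read().decode("utf-8"))
    return [(value, matches_claim(claimed, value)) for value in values]
```

Calling `check_fact(step3_query, 3.4)`, where `step3_query` holds the text of the observation query, would return each observed value together with a flag saying whether it agrees with the 3.4% quoted for the province of Trento. An empty list would mean that no observation matched the chosen dimension/value pairs.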

Future