semantic, terminological and linguistic analysis of xbrl


Upload: tobias-wunner

Post on 05-Dec-2014


DESCRIPTION

Analysis of financial vocabularies, presented at the EKAW 2010 terminology & ontology workshop in Lisbon.

[1] Reuse and Adaptation of Ontologies and Terminologies, http://www-limbio.smbh.univ-paris13.fr/ReuseOnto-EKAW2010/

[2] http://ekaw2010.inesc-id.pt/

[3] Monnet Twitter, http://twitter.com/monnetproject

TRANSCRIPT

Page 1: Semantic, terminological and linguistic analysis of xbrl

Semantic, Terminological & Linguistic Analysis of XBRL

(eXtensible Business Reporting Language)

Tobias Wunner

DERI, National University of Ireland, Galway

Copyright 2010 Digital Enterprise Research Institute. All rights reserved, Paul Buitelaar

us-gaap: GainLossOnSaleOfOilAndGasPropertyAbstract

ifrs:ImpairmentLossRecognisedInProfitOrLossGoodwill

xbrl-es:InstrumentosFinancierosLargoPlazoClasesActivoInstrumentosPatrimonio

xbrl-de:ass.deficitNotCoveredByCapital.netIncome.showndebit

Page 2: Semantic, terminological and linguistic analysis of xbrl

Multilingual Ontologies for networked knowledge

Context and Motivation

• Monnet Use Case in Financial Domain

– Query financial information in your own language

– Across countries and languages

– Get results in your own language

• Monnet Research Challenges

1 XBRL taxonomy/ontology translation

2 Ontology-driven cross-lingual information extraction

http://www.monnet-project.eu

http://twitter.com/monnetproject

Page 3: Semantic, terminological and linguistic analysis of xbrl

Finance Terminology is complex!

“minimum finance lease payments receivable, at present value, end of period not later than one year”

Representative term of the financial domain

16 words, complex structure (conceptually & linguistically)

Page 4: Semantic, terminological and linguistic analysis of xbrl

Preliminary Terminological Findings

Terminology (SAPTerm): domain-related

Dictionary (WordNet): domain-independent

XBRL (IFRS): domain-specific

Page 5: Semantic, terminological and linguistic analysis of xbrl

Breaking down complexity

Semantic, Terminological, Linguistic

Three-faceted enrichment of terms

Semantic enhancement

Term decomposition

Linguistic enrichment

Page 6: Semantic, terminological and linguistic analysis of xbrl

XBRL – Semantic Analysis

Page 7: Semantic, terminological and linguistic analysis of xbrl

XBRL – Semantic Analysis

Page 8: Semantic, terminological and linguistic analysis of xbrl

XBRL – Semantic Analysis

“Enhance semantics to facilitate translation and information extraction.”

Page 9: Semantic, terminological and linguistic analysis of xbrl

XBRL – Terminological Analysis

Minimum finance lease payments receivable, at present value

Layers: domain-specific, domain-related, domain-independent

ifrs:MinimumFinanceLeasePaymentsReceivable

ifrs:MinimumFinanceLeasePaymentsReceivableAtPresentValue

sapTerm:payments

googleDefine:leasePayments

googleDefine:Finance_lease
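
The element names on this slide pack a whole term into one camel-case identifier. As a minimal sketch of how such a name could be split back into words before any term decomposition, the snippet below uses a regular expression; the function name `split_xbrl_name` is hypothetical and not part of the slides, and the approach assumes camel-case names such as the `ifrs:` and `us-gaap:` ones (it does not handle the dotted `xbrl-de:` style).

```python
import re

# Matches one word inside a camel-case identifier:
# a capitalised word, an acronym, a lower-case run, or a number.
WORD = re.compile(r"[A-Z][a-z]+|[A-Z]+(?![a-z])|[a-z]+|\d+")


def split_xbrl_name(qname):
    """Split 'prefix:CamelCaseName' into (prefix, list of lower-cased words).

    Hypothetical helper for illustration only.
    """
    prefix, _, local_name = qname.partition(":")
    return prefix, [word.lower() for word in WORD.findall(local_name)]


if __name__ == "__main__":
    name = "ifrs:MinimumFinanceLeasePaymentsReceivableAtPresentValue"
    prefix, words = split_xbrl_name(name)
    print(prefix, words)
```

The resulting word list is what a lookup against terminological resources (here IFRS, SAPTerm, or a dictionary) would start from.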

Page 10: Semantic, terminological and linguistic analysis of xbrl

XBRL – Linguistic Analysis

Financial text: “… lease payment …”

XBRL term: minimum finance lease payments receivable

Simple linguistic transformation

Page 11: Semantic, terminological and linguistic analysis of xbrl

XBRL – Linguistic Analysis

Financial text: “… received minimum finance lease payments …”

XBRL term: minimum finance lease payments receivable

Complex linguistic transformation
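
The two examples (singular vs. plural, and "received" vs. "receivable" with a changed word order) can be illustrated with a rough sketch. It assumes a deliberately naive suffix stripper, `naive_stem`, as a stand-in for real morphological analysis; both function names are hypothetical and the slides do not say which linguistic tools are used.

```python
XBRL_TERM = "minimum finance lease payments receivable"


def naive_stem(word):
    """Strip a few English suffixes; a crude stand-in for a real lemmatiser."""
    word = word.lower()
    for suffix in ("able", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matched_words(text, term=XBRL_TERM):
    """Return the words of `term` whose stem also occurs in `text`.

    Word order is ignored, which is what lets 'received ... payments'
    line up with 'payments receivable'.
    """
    text_stems = {naive_stem(word) for word in text.split()}
    return [word for word in term.split() if naive_stem(word) in text_stems]


if __name__ == "__main__":
    # Simple transformation: only number (payment/payments) differs.
    print(matched_words("lease payment"))
    # Complex transformation: derivation (received/receivable) and reordering.
    print(matched_words("received minimum finance lease payments"))
```

A real system would need proper lemmatisation and derivational morphology; the sketch only shows why the second case is harder than the first.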

Page 12: Semantic, terminological and linguistic analysis of xbrl

Application of STL

• Machine Translation of domain terms

• Ontology-based Information Extraction

Lexicon:

ifrs:Revenue

ifrs:ProfitLossBeforeTax

ifrs:MinimumFinanceLeasePaymentsPayable

Ontology:

ifrs:ProfitLossBeforeTax(Tesco,3176)

ifrs:Revenue(Tesco,56910)

ifrs:ProfitLossBeforeTax(SAP,676)

ifrs:Revenue(SAP,2894)
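
The facts above follow a `concept(entity,value)` pattern. As a small sketch of how such extracted facts might be held in memory, the snippet below parses that notation into tuples and groups them by entity. The pattern, the function name `parse_fact`, and the grouping are assumptions made for illustration; the slide gives no units or periods for the values, so none are modelled.

```python
import re
from collections import defaultdict

# concept(entity,value), e.g. ifrs:Revenue(Tesco,56910)
FACT_PATTERN = re.compile(
    r"^(?P<concept>[\w-]+:\w+)\((?P<entity>[^,()]+),(?P<value>-?\d+)\)$"
)


def parse_fact(text):
    """Parse one fact string into (concept, entity, value)."""
    match = FACT_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a fact in concept(entity,value) form: {text!r}")
    return match["concept"], match["entity"], int(match["value"])


def group_by_entity(fact_strings):
    """Build {entity: {concept: value}} from a list of fact strings."""
    grouped = defaultdict(dict)
    for fact in fact_strings:
        concept, entity, value = parse_fact(fact)
        grouped[entity][concept] = value
    return dict(grouped)


if __name__ == "__main__":
    facts = [
        "ifrs:ProfitLossBeforeTax(Tesco,3176)",
        "ifrs:Revenue(Tesco,56910)",
        "ifrs:ProfitLossBeforeTax(SAP,676)",
        "ifrs:Revenue(SAP,2894)",
    ]
    print(group_by_entity(facts))
```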

Page 13: Semantic, terminological and linguistic analysis of xbrl

Application in Machine Translation

minimum finance lease payments receivable

decompose: IFRS, sapTerm, googleDefine

translate parts: IATE, DBPedia, leo

mindest Finanzierungsleasing Zahlungen Forderungen

reconstruct

Mindestfinanzierungsleasingzahlungsforderungen

Google-Translate: Minimum finance lease payments receivable → Minimum Leasingzahlungen Forderungen

Official IFRS: Im Rahmen von Finanzierungs-Leasingverhältnissen zu erhaltende Mindestzahlungen
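
The decompose, translate parts, reconstruct steps can be sketched as follows. This is only an illustration under stated assumptions: `PART_LEXICON` is a hypothetical hand-written table standing in for lookups in IATE, DBPedia or leo, the compound-internal forms (for example the linking "s" in "zahlungs") are supplied by hand, and greedy longest-match splitting is one possible decomposition strategy, not necessarily the one used in the project.

```python
# Hypothetical part lexicon: English part -> (German word, form used inside a compound).
PART_LEXICON = {
    "minimum": ("mindest", "mindest"),
    "finance lease": ("Finanzierungsleasing", "finanzierungsleasing"),
    "payments": ("Zahlungen", "zahlungs"),
    "receivable": ("Forderungen", "forderungen"),
}


def decompose(term, lexicon=PART_LEXICON):
    """Split a term into the longest parts that have a lexicon entry."""
    words = term.lower().split()
    parts, start = [], 0
    while start < len(words):
        # Try the longest candidate first, then shrink it word by word.
        for end in range(len(words), start, -1):
            candidate = " ".join(words[start:end])
            if candidate in lexicon:
                parts.append(candidate)
                start = end
                break
        else:
            raise KeyError(f"no lexicon entry covers {words[start]!r}")
    return parts


def translate_parts(parts, lexicon=PART_LEXICON):
    """Translate each part on its own (word-by-word result)."""
    return " ".join(lexicon[part][0] for part in parts)


def reconstruct(parts, lexicon=PART_LEXICON):
    """Join the compound-internal forms into one capitalised German compound."""
    compound = "".join(lexicon[part][1] for part in parts)
    return compound[0].upper() + compound[1:]


if __name__ == "__main__":
    parts = decompose("minimum finance lease payments receivable")
    print(parts)
    print(translate_parts(parts))
    print(reconstruct(parts))
```

Note that the compound produced this way still differs from the official IFRS translation, which paraphrases the term instead of compounding it.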

Page 14: Semantic, terminological and linguistic analysis of xbrl

Application in Information Extraction (IE)

Minimum finance lease payments receivable

term analysis

semantically lifted

:MinimumFinanceLeasePaymentsReceivable rdfs:subClassOf xbrli:monetaryItemType ; rdfs:label "Minimum finance lease payments receivable"@en .

linguistic analysis: payments received receivables
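
As a rough sketch of the "semantically lifted" step, the snippet below generates a Turtle statement of the same shape as the one on the slide from an element's local name. The function names and the default item type are assumptions for illustration, plain string formatting is used instead of an RDF library to keep the example self-contained, and prefix declarations are omitted as they are on the slide.

```python
import re

# One word inside a camel-case identifier.
WORD = re.compile(r"[A-Z][a-z]+|[A-Z]+(?![a-z])|[a-z]+|\d+")


def label_from_local_name(local_name):
    """Turn 'MinimumFinanceLease...' into 'Minimum finance lease ...'."""
    return " ".join(WORD.findall(local_name)).capitalize()


def to_turtle(local_name, item_type="xbrli:monetaryItemType", lang="en"):
    """Render a subclass statement plus a language-tagged label in Turtle."""
    label = label_from_local_name(local_name)
    return (
        f":{local_name} rdfs:subClassOf {item_type} ;\n"
        f'    rdfs:label "{label}"@{lang} .'
    )


if __name__ == "__main__":
    print(to_turtle("MinimumFinanceLeasePaymentsReceivable"))
```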

Page 15: Semantic, terminological and linguistic analysis of xbrl

Ongoing and Future Work

• Lexicalization Experiment

– Lexicalize XBRL vocabulary

– Evaluate IE and translation tasks

• Define data set

– Financial dataset (reports & news text)

– Financial vocabulary (IFRS and xEBR - XBRL European Business Register)

• Future Work

– Use case implementation & Demo