26.08.2012 Ontology Modelling of an Engineering Document – Perspectives of Linguistics Analysis

Upload: victor-agroskin

Post on 02-Dec-2014


DESCRIPTION

TechInvestLab.ru is starting a research program into the automation of formal modelling. The first project is being developed together with ABBYY, a leading linguistic technology company. The project studies the possibility of building a Gellish-like formal model of a natural-language technical document, for further transformation into an ISO 15926 compliant data model with the TabLan.15926 engine. This presentation shows preliminary comparisons between the syntactic and semantic structures parsed by ABBYY Compreno and manually prepared formal text models.

TRANSCRIPT

Page 1

Ontology Modelling of an Engineering Document – Perspectives of Linguistics Analysis

26.08.2012

Page 2

First Step: Requirements Modelling

ROSENERGOATOM project, July 2011
– Manual processing methodology for Technical Requirements document
– Special software for ISO 15926 data model transformation
– Sample Nuclear Power Plant requirements processing:
• Sample size: 12 paragraphs of text
• Content identified: 16 requirements, 3 classifiers
• Resulting model: 96 items, 35 relationships

Page 3

Technical Document Semantic Modelling

TabLan methodology, March 2012
– Manual processing methodology for technical documents (English)
– Using a subset of Gellish http://sourceforge.net/apps/trac/gellish/
– Mapping to the enhanced Initial Template Set
– .15926 Editor for ISO 15926 data model transformation
– Download free from http://techinvestlab.ru/files/TabLan/TabLan.rar

Page 4

Document Modelling Lessons

• Technical document modelling promise:
– Requirements verification
– Project IT systems customisation (classifiers for CAD/CAM/PLM/ERP/etc.)
– Data integration support (reference data library content generation)
– Tracing design decisions to requirements
– Design decisions verification
• Formal modelling problems:
– Labour-intensive process of manual modelling
– Large volume of «dumb» preparatory work
– Need for professional engineering verification in a new formalism unknown to engineers
– Fragmented architecture of project IT environment: an obstacle for model reuse

Page 5

Preconditions for Automation of Technical Document Modelling

• Restricted and relatively formal engineering subset of natural language
• Contemporary developments in computer-based natural language processing
• Contemporary developments in ontology extraction from natural language texts
• Controlled language for engineering (Gellish)
• Gellish to ISO 15926 mapping development

Page 6

Experimenting with ABBYY Compreno

Technology That Translates from Human into Computer Language
http://www.abbyy.ru/science/technologies/business/compreno

Page 7

ABBYY Compreno

ABBYY Compreno is ABBYY’s innovative technology that performs full semantic and syntactic analysis for comprehensive handling of natural language texts. ABBYY Compreno is the first ever practical implementation of fundamental linguistic research carried out internationally over the past fifty years. A result of seventeen years of intensive R&D, ABBYY Compreno offers robust solutions to many long-standing language processing problems of the information age, such as:

• Intelligent search and retrieval
– Intelligent semantic search
– Multilingual search
– Semantic tagging of documents for more powerful searching
• Comprehensive text analysis
– Information monitoring
– Controlling access to confidential information
– Summarizing and annotating documents
– Sentiment analysis
• Efficient handling of text documents
– Document classification and filtering
– Text comparison
• High quality machine translation

Page 8

Research Plan

• Starting point – comparison between:
– syntactic and semantic structure (parsed by ABBYY Compreno)
– formal text model (manually prepared)
• Rule development for mapping between linguistic and engineering ontologies (current)
• Customisation with domain thesauri (plans)
• Testing on a corpus of engineering texts (plans)
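The first two steps of the plan (mapping rules, then comparison against a manual model) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `parsed` dictionary is a hand-written stand-in and not the ABBYY Compreno output format, and `VERB_TO_RELATION`, `apply_rules` and `compare` are hypothetical names, not part of TabLan or the .15926 Editor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One Gellish-like statement: left object, relation phrase, right object."""
    left: str
    relation: str
    right: str


# Hypothetical, hand-written stand-in for a parsed sentence.
# This is NOT the ABBYY Compreno output format.
parsed = {
    "subject": "containment system",
    "modal": "shall",
    "verb": "include",
    "objects": ["primary containment", "secondary containment"],
}

# Assumed rule table: verb lemma -> Gellish-like relation phrase.
VERB_TO_RELATION = {"include": "is a whole for"}


def apply_rules(sentence):
    """Turn one parsed sentence into facts; unknown verbs yield no facts."""
    relation = VERB_TO_RELATION.get(sentence["verb"])
    if relation is None:
        return []
    subject = sentence["subject"].capitalize()
    return [Fact(subject, relation, obj.capitalize()) for obj in sentence["objects"]]


def compare(generated, manual):
    """Return (matched, missing, extra) fact sets relative to the manual model."""
    g, m = set(generated), set(manual)
    return g & m, m - g, g - m


# Manually prepared facts for the same sentence.
manual = [
    Fact("Containment system", "is a whole for", "Primary containment"),
    Fact("Containment system", "is a whole for", "Secondary containment"),
]

matched, missing, extra = compare(apply_rules(parsed), manual)
print(len(matched), "matched,", len(missing), "missing,", len(extra), "extra")
```

Comparing sets of facts rather than strings keeps the check independent of the order in which a rule emits its statements.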

Page 9

«The containment system shall include a primary containment and a secondary containment.»

ABBYY Compreno parser results: text view

Page 10

ABBYY Compreno parser results: tree view

Page 11

«The containment system shall include a primary containment and a secondary containment.»

Formal model:

Containment system
A: is a whole for Primary containment
B: is a whole for Secondary containment
A is classified as a Requirement
B is classified as a Requirement
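As a rough illustration, the formal model on this slide can be held as labelled facts, where the labels A and B let a later fact classify an earlier one. The sketch below is an assumption about one possible in-memory representation; the `Fact` class and the `requirements` helper are hypothetical and do not reflect the TabLan or .15926 Editor data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """A Gellish-like statement; `label` lets other facts refer to this one."""
    left: str
    relation: str
    right: str
    label: Optional[str] = None


# The facts shown on the slide, transcribed by hand.
model = [
    Fact("Containment system", "is a whole for", "Primary containment", label="A"),
    Fact("Containment system", "is a whole for", "Secondary containment", label="B"),
    Fact("A", "is classified as a", "Requirement"),
    Fact("B", "is classified as a", "Requirement"),
]


def requirements(facts):
    """Return the labelled facts that another fact classifies as a Requirement."""
    by_label = {f.label: f for f in facts if f.label is not None}
    return [
        by_label[f.left]
        for f in facts
        if f.relation == "is classified as a"
        and f.right == "Requirement"
        and f.left in by_label
    ]


for req in requirements(model):
    print(f"{req.label}: {req.left} {req.relation} {req.right}")
```

Treating a classification as a fact about another fact mirrors the slide, where the requirement is the composition statement itself and not the containment system.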

Page 12

«Inner surfaces should be smooth to prevent corrosion residue and to simplify decontamination.»

ABBYY Compreno parser: tree view

Page 13

«Inner surfaces should be smooth to prevent corrosion residue and to simplify decontamination.»

Formal model:

Inner surfaces
is a specialization of Surface
is a specialization of Inner

Inner surfaces
A: is a specialization of Smooth
A is classified as a Requirement
is intended to achieve To prevent corrosion residue and to simplify decontamination

To prevent corrosion residue and to simplify decontamination
is a whole for To prevent corrosion residue
has as subject Corrosion residue
is a whole for To simplify decontamination
has as subject Decontamination
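The second model nests a composite purpose under the requirement. A small Python sketch, under stated assumptions, shows how such a model might be queried for the parts of the purpose and their subjects. The slide does not spell out the left-hand object of "is intended to achieve" or of the "has as subject" lines; the sketch assumes fact A and the individual purposes respectively, and the `related` helper is hypothetical.

```python
PURPOSE = "To prevent corrosion residue and to simplify decontamination"

# (left, relation, right) triples transcribed from the slide.
facts = [
    ("Inner surfaces", "is a specialization of", "Surface"),
    ("Inner surfaces", "is a specialization of", "Inner"),
    ("Inner surfaces", "is a specialization of", "Smooth"),  # labelled A on the slide
    ("A", "is classified as a", "Requirement"),
    # Assumption: the left-hand object of this relation is fact A.
    ("A", "is intended to achieve", PURPOSE),
    (PURPOSE, "is a whole for", "To prevent corrosion residue"),
    (PURPOSE, "is a whole for", "To simplify decontamination"),
    # Assumption: each "has as subject" line belongs to the part listed before it.
    ("To prevent corrosion residue", "has as subject", "Corrosion residue"),
    ("To simplify decontamination", "has as subject", "Decontamination"),
]


def related(triples, left, relation):
    """Return right-hand objects of all triples with the given left and relation."""
    return [r for (l, rel, r) in triples if l == left and rel == relation]


# Walk from the composite purpose to its parts, then to each part's subject.
for part in related(facts, PURPOSE, "is a whole for"):
    subjects = related(facts, part, "has as subject")
    print(part, "->", subjects)
```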

Page 14

Thank you!

Anatoly Levenchuk
http://ailev.ru (Rus)
http://levenchuk.com (Eng)
[email protected]

Victor [email protected]

.15926 Editor http://techinvestlab.ru/dot15926Editor

Feedback and comments:
[email protected]
community.livejournal.com/dot15926/

TechInvestLab.ru
+7 (495) 748-5388