ontology engineering
Ontology Engineering
Introduction
First results are about bats and dolphins
Google Scholar search for query Sentiment Analysis
Reading times assume five papers per day.

Sr. No  Query on GS                   Returned results  Years to read results  Citations of top 10 results  Years to read citations
1       Digital Libraries             1.8 million       936 years              3411                         1.8 years
2       Ontology Evaluation           343,000           188 years              1073                         Half year
3       Schema integration            427,000           234 years              3389                         1.8 years
4       Requirement Engineering       1.4 million       761 years              341                          0.2 years
5       Turing Machines               154,000           84 years               3957                         2.2 years
6       Distributed computing models  2.2 million       1238 years             5469                         3 years
7       Fuzzy Logic                   1 million         507 years              27154                        15 years
8       Hypermedia                    176,000           96 years               12359                        6.7 years
9       Virtual Reality               1.9 million       1030 years             17098                        9.3 years
10      Fault Tolerance               601,000           329 years              7246                         4 years
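The reading-time figures follow from a simple division: number of papers divided by papers read per year. The sketch below reproduces that arithmetic for two of the rows, assuming a 365-day year; the function name years_to_read is hypothetical and chosen only for illustration. Rows whose result counts are given in rounded form (for example "1.8 million") will only approximate the figures in the table.

```python
PAPERS_PER_DAY = 5      # reading rate stated in the table
DAYS_PER_YEAR = 365     # assumption: no days off


def years_to_read(paper_count, papers_per_day=PAPERS_PER_DAY):
    """Return the years needed to read paper_count papers at a fixed daily rate."""
    return paper_count / (papers_per_day * DAYS_PER_YEAR)


# Result counts taken from the table
for query, results in [("Ontology Evaluation", 343_000),
                       ("Schema integration", 427_000)]:
    print(f"{query}: about {years_to_read(results):.0f} years")
```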
Why?
Web content is currently formatted for human readers rather than programs
HTML is the predominant language in which Web pages are written (directly or using tools)
Vocabulary describes presentation
HTML?
<HTML><BODY>
<H2 align=center>Nonmonotonic Reasoning: Context-Dependent Reasoning</H2>
<P align=center>
<I>by <B>V. Marek</B> and <B>M Truszczynski</B></I>
<BR>Springer 1993<BR>ISBN 0387976892</P>
</BODY></HTML>
HTML?
Inability to cover content – HTML describes only the appearance of a document, not what its content means. It is therefore unsuitable for explicit queries.
Inability to mark up semantics – Individual elements on a page cannot be labelled with their meaning (for example, as an author or a title).
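To make the limitation concrete, here is a minimal sketch that parses the sample page from the earlier slide with Python's standard html.parser module. The class name TagCollector is hypothetical. The parser can report which tags occur and which text is bold, but nothing in the markup says that the bold text is an author name; a program can only guess that from the presentation.

```python
from html.parser import HTMLParser

# The sample page from the slide, unchanged
PAGE = """<HTML><BODY><H2 align=center>Nonmonotonic Reasoning: Context-
Dependent Reasoning</H2><P align=center>
<I>by<B>V. Marek</B> and <B>M Truszczynski</B></I>
<BR>Springer 1993<BR>ISBN 0387976892</P></BODY></HTML>"""


class TagCollector(HTMLParser):
    """Record every tag name and the text that appears inside <B> elements."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.bold_text = []
        self._in_bold = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)          # tag names arrive lower-cased
        if tag == "b":
            self._in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        if self._in_bold:
            self.bold_text.append(data.strip())


collector = TagCollector()
collector.feed(PAGE)

# Only presentation vocabulary is available: headings, italics, bold, breaks
print(sorted(set(collector.tags)))
# The bold text happens to be the authors, but the markup does not say so
print(collector.bold_text)
```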
Why does this happen?
Web content is not machine-accessible
Lack of semantics
Not in a proper structure
Not in a machine-understandable manner
Keyword-based search engines (e.g. Google, AltaVista, Yahoo)
How to overcome these limitations
The current situation can be improved by adopting the following two strategies:
Use the content as it is represented today, and develop techniques based on artificial intelligence and computational linguistics. This approach has been followed for some time now, but despite the advances that have been made, the task still appears too ambitious.
Represent Web content in a form that is more easily machine-processable, then use intelligent techniques to take advantage of these representations (Semantic Web).
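As an illustration of the second strategy, the sketch below stores the same book details in a form whose tags name the content rather than its appearance, and queries it with Python's standard xml.etree.ElementTree module. The element names (book, title, author, publisher, year, ISBN) are assumptions made for this example and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical machine-processable description of the book from the HTML sample
BOOK_XML = """<book>
  <title>Nonmonotonic Reasoning: Context-Dependent Reasoning</title>
  <author>V. Marek</author>
  <author>M Truszczynski</author>
  <publisher>Springer</publisher>
  <year>1993</year>
  <ISBN>0387976892</ISBN>
</book>"""

book = ET.fromstring(BOOK_XML)

# Explicit queries become possible because each element states what it holds
authors = [author.text for author in book.findall("author")]
year = int(book.findtext("year"))

print(authors)
print(year)
```

With this representation a program can ask for "the authors" directly instead of guessing from bold or italic text.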
A Layered Approach
XML