Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman from FSTM, UKM


Upload: khirulnizam-abd-rahman

Post on 03-Jul-2015


DESCRIPTION

Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman from FSTM, UKM Presentation for MyREN Seminar 2014 Berjaya Hotel, Kuala Lumpur 27 November 2014

TRANSCRIPT

Page 1: Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman from FSTM, UKM

Application of Ontology in Semantic Information Retrieval

Presentation for MyREN Seminar

Berjaya Hotel, Kuala Lumpur

27 November 2014

Page 2

Brief speaker's info

Shahrul Azman Mohd. Noah, Ph.D.

Knowledge Technology Research Group

Center for AI Technology (CAIT)

[email protected]

• BSc (Mathematics) from UKM

• MSc (IS) from Sheffield U.

• PhD (IS) from Sheffield U. (knowledge-based systems)

• From Muar, Johor

Page 3

ONTOLOGY

Page 4

What is ontology?

• Ontology may be considered as a kind of method to represent knowledge.

• From a philosophical discipline: the science of "what is"; the kinds and structures of objects, properties, events, processes and relations in every area of reality.

• Aristotle's classification of animals is one of the first ontologies developed.

Page 5

Ontology in Computing

• An ontology is an engineering artifact:

– It is constituted by a specific vocabulary used to describe a certain reality, plus

– A set of explicit assumptions regarding the intended meaning of the vocabulary.

• Thus, an ontology describes a formal specification of a certain domain:

– Shared understanding of a domain of interest

– Formal and machine-manipulable model of a domain of interest

Page 6

Ontology Definition

"Formal, explicit specification of a shared conceptualization" [Gruber93]

• Formal: machine-readability with computational semantics

• Explicit: unambiguous terminology definitions

• Shared: commonly accepted understanding

• Conceptualization: conceptual model of a domain (ontological theory)

Page 7

An ontology is…

[Figure: artifacts that have been called ontologies, placed along an axis labelled "Complexity". Items as extracted from the figure:]

• a catalog

• a set of text files

• a glossary

• a thesaurus

• a collection of taxonomies

• a set of general logical constraints

• a collection of frames

Source: Smith & Welty (2001)

Page 8

Various approaches to classify ontologies

• Classify ontologies according to the information the ontology needs to express and the richness of its internal structure (Lassila & McGuinness, 2001)

• Classify along two orthogonal dimensions: the amount and type of structure, and the subject (Van Heijst et al., 1997)

• Classify ontologies according to their level of dependence on a particular task (Guarino, 1998)

Page 9

Ontology language

• Ontology languages are formal languages used to construct ontologies:

– allow the encoding of knowledge about specific domains, and

– often include reasoning rules that support the processing of that knowledge

• Various languages have been proposed: CycL, KL-ONE, Ontolingua, F-Logic, OCML, LOOM, Telos, RDF(S), OIL, DAML+OIL, XOL, SHOE, OWL etc.

• Usually based on Description Logic (DL).

• Summarised as (Kalibatiene & Vasilecas, 2011): [summary table not included in the transcript]

Page 10

Example of ontologies

• Top-level ontology: Suggested Upper Merged Ontology (SUMO)

Page 11

[Figure: Portion of SUMO ontology with USGS Geo-concepts inserted]

Page 12

Example of ontologies (cont.)

• Lexical ontology: WordNet

Page 13

Example of ontologies (cont.)

• Domain ontology: Simple News and Press Ontologies (SNaP)

Page 14

Linked Data…?

Page 15

Applications of ontology

• Searching & browsing

• Decision support system

• Question answering system

• Recommendation

• Data integration

• Etc.

Page 16

INFORMATION RETRIEVAL

Page 17

Concepts

• "Information retrieval (IR) is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information." (Salton, 1968).

• Applications of IR: recommendations, Q&A, filtering… and of course searching.

Page 18

Issues in IR

• Some issues in IR:

– Relevance

– Evaluation

– Users and information needs

• Context-based search

• Semantic search

• Etc.

Page 19

IR process

Page 20

ONTOLOGY + INFORMATION RETRIEVAL

Page 21

Ontology and semantic search

• Various ways to support semantic search:

– Query expansion: the user's query is expanded with related terms

– Disambiguation: resolving terms or concepts when they refer to more than one topic

– Classifying: classify documents such as ads into ontological topics to support semantic search

– Enhanced IR model: embed an ontology into an existing IR model, resulting in a modified IR model

Page 22

Query Expansion

• Query expansion (QE) is needed due to the ambiguity of natural language.

• Main aim of QE: to add new meaningful terms to the initial query.

Bhogal, J., Macfarlane, A. & Smith, A. 2007. A review of ontology based query expansion. Information Processing and Management, 43: 866-886.
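A minimal sketch of ontology-based query expansion, assuming a toy ontology held in a Python dictionary. The terms, the `synonyms` and `narrower` relation names, and the `expand_query` function are all illustrative assumptions, not part of the presentation.

```python
# Toy ontology: every term and relation below is an illustrative assumption.
TOY_ONTOLOGY = {
    "car": {"synonyms": ["automobile"], "narrower": ["taxi"]},
    "theft": {"synonyms": ["stealing"], "narrower": []},
}


def expand_query(query, ontology=TOY_ONTOLOGY):
    """Return the original query terms followed by related ontology terms."""
    expanded = []
    for term in query.lower().split():
        entry = ontology.get(term, {})
        candidates = [term] + entry.get("synonyms", []) + entry.get("narrower", [])
        for candidate in candidates:
            if candidate not in expanded:  # preserve order, skip duplicates
                expanded.append(candidate)
    return expanded


print(expand_query("car theft"))
```

Terms that are absent from the ontology pass through unchanged, so the expanded query always contains the initial query.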

Page 23

Query Expansion

Page 24

Semantic index

• Textual documents are indexed according to some ontology model.

• Remember the concept of vocabulary in IR?

[Figure: terms are extracted from a computer science collection and indexed. Index terms (vocabulary) of the collection: architecture, bus, computer, database, …, xml]

Page 25

Semantic index

[Figure: same pipeline as the previous slide (computer science collection, Extract, Indexing; terms architecture, bus, computer, database, …, xml), annotated "Replace the index with ontological-index"]
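The contrast between a term index and an ontological index can be sketched as follows. This is only an illustration: the documents, the `TERM_TO_CONCEPT` mapping and the concept names are hypothetical stand-ins for a real ontology lookup.

```python
from collections import defaultdict

# Hypothetical term-to-concept mapping standing in for an ontology lookup.
TERM_TO_CONCEPT = {
    "bus": "ComputerArchitecture",
    "architecture": "ComputerArchitecture",
    "database": "InformationSystem",
    "xml": "DataFormat",
}

# Hypothetical three-document collection.
DOCS = {
    "d1": "bus design",
    "d2": "xml database",
    "d3": "processor architecture",
}


def build_term_index(docs):
    """Conventional inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def build_concept_index(docs, term_to_concept):
    """Ontological index: concept -> set of document ids mentioning it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            concept = term_to_concept.get(term)
            if concept is not None:
                index[concept].add(doc_id)
    return index


term_index = build_term_index(DOCS)
concept_index = build_concept_index(DOCS, TERM_TO_CONCEPT)
print(sorted(term_index["bus"]))                       # keyword lookup
print(sorted(concept_index["ComputerArchitecture"]))   # concept lookup
```

In this toy collection the keyword "bus" reaches only one document, while the concept it maps to also reaches the document that uses the word "architecture".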

Page 26

Examples

• Three research projects that illustrate the applications of ontology-based IR:

– Semantic digital library

– Crime news retrieval

– Multi-modality ontology-based image retrieval

Page 27

Semantic digital library

• Proposed an approach for managing, organizing and populating an ontology for document collections in a digital library.

• The document metadata and content are inserted into a knowledge base, which allows sophisticated querying and searching.

• Proposed an ontology-based information retrieval model built on the classic vector space model, comprising document annotation, instance-based weighting and concept-based ranking.
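One way such a model could weight and rank annotated documents is sketched below. It assumes a TF-IDF-style weight over ontology instances and cosine similarity for ranking. The annotation counts, instance names and function names are hypothetical, and the exact weighting used in the project is not given in the slides.

```python
import math

# Hypothetical annotation counts: mentions of each ontology instance per document.
ANNOTATIONS = {
    "Doc1": {"Supervisor1": 1, "Student1": 2},
    "Doc2": {"Student1": 1, "Student2": 3},
    "Doc3": {"Supervisor2": 2},
}


def tfidf_vectors(annotations):
    """Weight each instance by normalised frequency times log(N / document frequency)."""
    n_docs = len(annotations)
    doc_freq = {}
    for counts in annotations.values():
        for instance in counts:
            doc_freq[instance] = doc_freq.get(instance, 0) + 1
    vectors = {}
    for doc_id, counts in annotations.items():
        max_count = max(counts.values())
        vectors[doc_id] = {
            inst: (count / max_count) * math.log(n_docs / doc_freq[inst])
            for inst, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_instances, annotations):
    """Return ids of documents with a positive score, best first."""
    vectors = tfidf_vectors(annotations)
    query = {inst: 1.0 for inst in query_instances}
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]


print(rank(["Student1"], ANNOTATIONS))
```

The query is expressed as ontology instances rather than keywords, so ranking happens in the space of concepts and instances instead of raw terms.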

Page 28

Semantic digital library

• General architecture

Page 29

Semantic digital library

• Involved three ontologies: ACM Topic hierarchies, Geo ontology and Dublin Core metadata

• Portion of domain ontology focusing on academic thesis

Page 30

Semantic digital library

• Document annotation

Page 31

Semantic digital library

• The process

Page 32

VSM Index

<!-- create Class Person -->

<!-- create instance of Class Student -->
<Student rdf:ID="Student1">
  <rdfs:label>Arifah Alhadi</rdfs:label>
</Student>
<Student rdf:ID="Student2">
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Asyraf Arifin</rdfs:label>
</Student>

<!-- create instance of Class Supervisor -->
<Supervisor rdf:ID="Supervisor1">
  <rdfs:label>PM Dr Shahrul Azman</rdfs:label>
  <rdfs:label>Prof. Madya Dr. Shahrul Azman Mohd Noah</rdfs:label>
</Supervisor>
<Supervisor rdf:ID="Supervisor2">
  <rdfs:label>Prof Aziz Deraman</rdfs:label>
</Supervisor>

Concept | Instance | Documents
http://www.ukm.my/thesis/supervisor#, http://www.ukm.my/thesis/person# | Supervisor1 | Doc1
http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student1 | Doc1
http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student2 | Doc1

Id | Term | TFIDF | Frq | Doc Id
1 | Arifah Alhadi | 0.11 | 2 | Doc1
2 | Asyraf Arifin | 0.123 | 1 | Doc1
3 | PM Dr Shahrul Azman | 0.45 | 1 | Doc1

Page 33

Ontology-based IR for crime news retrieval

• Each crime news item must be classified into categories: Traffic Violation, Theft, Sex Crime, Murder, Kidnap, Fraud, Drugs, Cybercrime, Arson and Gang (Chen et al. 2004)

• Useful entities need to be identified: Person, Location, Organisation, Date/Time, Weapon, Amount, Vehicle, Drug, Personal properties, and Age.

• Clustering of crime news into topics, e.g. Nurin Jazlin murder, Canny Ong, Sosilawati etc.

• Clustering of a specific topic into various chronological events.

• Mapping of named entities into a news ontology to support semantic querying and retrieval.

Page 34

Example

[Figure: crime news organised top-down; stage labels in the figure: Classification, Clustering, Cluster into topics]

• Categories: Murder, Kidnap, Theft, Gang, …

• Topics: Nurin Jazlin, Sosilawati, Canny Ong

• Events within the Canny Ong topic: Investigation into Canny Ong case, include medical report and trial; Evidence/Suspect into Canny Ong case, DNA test; Family reacts into Canny Ong and negligence suit; Court Sentence, plead guilty

• Counts shown in the figure: (17) (6) (3) (9) (13)

Page 35

Required methods

• In order to support the aforementioned requirements:

– Conventional text processing: tokenizing, indexing, stopping, stemming etc.

– Named entity recognition (NER)

– Classification and clustering

– Ontology mapping
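The conventional text-processing steps listed above can be sketched in a few lines. The stopword list and the suffix-stripping "stemmer" below are deliberately tiny assumptions for illustration. A real system would use a full stopword list and an established stemming algorithm.

```python
import re

# Tiny stopword and suffix lists, chosen for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "was", "were", "and", "to"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list ("stopping")."""
    return [t for t in tokens if t not in STOPWORDS]


def crude_stem(token):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stopwords, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stopwords(tokenize(text))]


print(preprocess("The suspects were arrested in the parking area"))
```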

Page 36

PRE-PROCESSING TASK + DOCUMENT REPRESENTATION + DOCUMENT ORGANIZATION

PRE-PROCESSING TASK

• Stopword removal

• Stemming

• Parsing

• Indexing

DOCUMENT REPRESENTATION

• Bag of words

• Named entity recognition

DOCUMENT ORGANIZATION

• Classification: AdaBoost

• Clustering: KNN

• Semantic mapping

Page 37

Document representation

• Documents are represented in meaningful forms:

– BoW: Bag of Words

– Named Entity Recognition: used GATE ANNIE and JAPE rules

– Adopt the Vector Space Model (VSM), enhanced with an ontological model
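A document representation that combines a bag of words with recognised entities might look like the sketch below. The gazetteer lookup is only a stand-in for a real NER component such as the GATE pipeline mentioned above. The phrases, entity types and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical gazetteer: surface phrase -> entity type (stand-in for real NER).
GAZETTEER = {
    "kuala lumpur": "Location",
    "knife": "Weapon",
    "van": "Vehicle",
}


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def find_entities(text, gazetteer=GAZETTEER):
    """Return (phrase, type) pairs whose phrase occurs as whole words in the text."""
    lowered = text.lower()
    hits = []
    for phrase, entity_type in gazetteer.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            hits.append((phrase, entity_type))
    return sorted(hits)


def represent(text):
    """Combine both views of a document in one record."""
    return {"bow": bag_of_words(text), "entities": find_entities(text)}


doc = represent("Man with knife seen near Kuala Lumpur")
print(doc["entities"])
```

The word-boundary check keeps "van" from matching inside unrelated words such as "advance".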

Page 38

Document representation

Page 39

Document organization

• Documents need to be organised into categories, topics and events.

– Classification: AdaBoost algorithm

– Clustering: used KNN clustering

– Ontology mapping: we have developed a crime news ontology by extending the existing SNaP ontology. It includes classes/entities that are important to crime, such as classification of crimes, location and weapon.
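As a rough illustration of the nearest-neighbour idea only, the sketch below assigns a news snippet to a category by a majority vote among its k most similar labelled snippets, using cosine similarity over bag-of-words vectors. It does not reproduce the AdaBoost classifier or the project's actual KNN setup. The snippets, labels and function names are invented for the example.

```python
import math
from collections import Counter

# Hypothetical labelled snippets; texts and labels are invented for illustration.
LABELLED = [
    ("victim stabbed knife murder", "Murder"),
    ("body found murder suspect", "Murder"),
    ("child taken ransom kidnap", "Kidnap"),
    ("ransom demand kidnap family", "Kidnap"),
]


def vectorize(text):
    """Bag-of-words vector as a Counter of lowercase tokens."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two Counters (missing terms count as 0)."""
    dot = sum(count * v[term] for term, count in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def knn_label(text, labelled=LABELLED, k=3):
    """Label a text by majority vote among its k most similar labelled texts."""
    query = vectorize(text)
    scored = sorted(
        ((cosine(query, vectorize(t)), label) for t, label in labelled),
        key=lambda pair: -pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


print(knn_label("police say ransom paid after kidnap"))
```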

Page 40

[Figures: Asset ontology; Event ontology]

Page 41

Extending the SNaP ontology and mapping to entities in news documents

[Figure: ontology graph with two regions labelled "SNaP" and "Crime".

Classes shown: pne:Event, event:Event, pna:Asset, pns:Stuff, pns:Tangible, pns:Organization, pns:Location, pns:Person, pns:Weapon, pns:Vehicle, pnc:Classification, pnc:Classifiable, pnt:Tag.

Instances shown: <Murder>, <Kidnap>, <Event 1>.

Relations shown: rdfs:subClassOf, rdf:type, rdfs:domain, rdfs:range, pne:subeventOf, pnc:isClassifiedBy.]
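To show why `rdfs:subClassOf` links matter for retrieval, here is a small sketch that follows subclass chains so that a query for a general class also returns instances of its subclasses. The arrows of the figure are not recoverable from the transcript, so every triple below, including the `ex:` instances, is an assumption made for illustration.

```python
# Illustrative triples only: each edge and instance below is an assumption.
TRIPLES = [
    ("pns:Weapon", "rdfs:subClassOf", "pns:Tangible"),
    ("pns:Vehicle", "rdfs:subClassOf", "pns:Tangible"),
    ("pns:Tangible", "rdfs:subClassOf", "pns:Stuff"),
    ("ex:knife1", "rdf:type", "pns:Weapon"),
    ("ex:van1", "rdf:type", "pns:Vehicle"),
]


def superclasses(cls, triples):
    """Return all classes reachable from cls via rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "rdfs:subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def instances_of(cls, triples):
    """Return instances typed as cls or as any of its subclasses."""
    result = set()
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and (obj == cls or cls in superclasses(obj, triples)):
            result.add(subject)
    return result


print(sorted(instances_of("pns:Tangible", TRIPLES)))
```

A query for the general class returns both the knife and the van, although neither is typed with that class directly.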

Page 42

The Application

• What we need/desire.

Page 43

Ontology-based Image Retrieval

• Rapid growth of visual information (VI) has made it difficult to find and access VI.

• Inability to capture the semantic content.

• The resulting problem: a lack of coincidence between the information extracted from VI and the user's needs.

• Conventional approaches to image retrieval (IMR), namely text-based (TBIR) and content-based (CBIR), have reached their limits in attempting to solve this problem.

• As a result, the semantic-based (SBIR) approach emerged: ontology-based methods provide explicit, domain-oriented semantics for concepts and relationships.

Page 44

Ontology-based Image Retrieval

• Illustrate how images are described based on their visual, textual and domain semantic features.

• Proposed a multi-modality ontology: visual ontology, textual ontology and domain ontology.

• Illustrate how such an ontology can be integrated with an open source knowledge base (DBpedia) to support a more comprehensive search.

Page 45

Proposed Approach

Page 46

Example of multi-modality ontology

Page 47

Example of multi-modality ontology with DBpedia

Page 48

Conclusion - Practical implementation of ontology-based IR

[Figure: architecture of an ontology-based IR system. Components: Ontology (TBox, ABox), Documents, Index, Query Processing. Labelled steps: Extraction, build, Population, Annotation. Input: query. Output: ranked docs.]

Page 49

Research issues

• Index representation: most are still based on the conventional VSM.

• Ranking: weighting and ranking mechanisms

• Automatic population: supervised and unsupervised

• Extraction & annotation

• Multilingual and cross-language

Page 50

References

• Castells, P., Fernandez, M., & Vallet, D. 2007. An Adaptation of Vector Space Model for Ontology Based Information Retrieval. IEEE Transactions on Knowledge and Data Engineering, 19(2):

• Shahrul Azman Noah, Nor Afni Raziah Alias, Nurul Aida Osman, Zuraidah Abdullah, Nazlia Omar, Yazrina Yahya, Maryati Mohd Yusof: Ontology-Driven Semantic Digital Library. AIRS 2010: 141-150.

• Shahrul Azman Noah, Datul Aida Ali: The Role of Lexical Ontology in Expanding the Semantic Textual Content of On-Line News Images. AIRS 2010: 193-202.

• Fernández, M., Cantador, I., López, V., Vallet, D., Castells, P., & Motta, E. 2011. Semantically enhanced information retrieval: an ontology-based approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9: 434-452.

• Kara, S., Alan, O., Sabuncu, O., Akpınar, S., Cicekli, N.K., & Alpaslan, F.N. 2012. An ontology-based retrieval system using semantic indexing. Information Systems, 37: 294-305.

• Kohler, J., Philippi, S., Specht, M., & Ruegg, A. 2006. Ontology based text indexing and querying for the semantic web. Knowledge-Based Systems, 19: 744-754.

• Etc.

Page 51

Example - advanced application of ontology

Page 52

Watson: the science behind an answer

Page 53

Group members:

1. Shahrul Azman Mohd. Noah

2. Juhana Salim

3. Masnizah Mohd

4. Nazlia Omar

5. Mohd Juzaiddin Ab Aziz

6. Nazlena Mohamad Ali

7. Saidah Saad

8. Shereena Mohd Arif

9. Lailaltulqadri Zakaria

10. Sabrina Tiun

11. Maryati Mohd. Yusof

Page 54

END