인공지능 특강 프로젝트 - development of decision tree algorithm for semantic web data -

16
인인인인 인인 인인인인 - Development of Decision Tree Algorithm for Semantic Web data - 2010313148 인인인

Upload: alesia

Post on 25-Jan-2016

40 views

Category:

Documents


0 download

DESCRIPTION

인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -. 2010313148 전동규. Agenda. Project Purpose Motivation Related Work Algorithm Description of Problem. 1. Project Purpose. Relational Data - Semantic Web Data (Linked data) -. Semantic Decision Tree. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

인공지능 특강 프로젝트- Development of Decision Tree Algorithm for Semantic Web data -

2010313148

전동규

Page 2: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

2

Agenda

1. Project Purpose

2. Motivation

3. Related Work

4. Algorithm

5. Description of Problem

Page 3: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

3

Relational Data- Semantic Web Data (Linked data) -

Decision Tree Algorithm- C4.5 Algorithm -

Semantic Decision Tree

1. Project Purpose

Page 4: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

4

1. Project Purpose

• Goal of project

Development of new kind of Decision Tree algorithm which

supports decision making based on Semantic Web environmental

information

Solve the several problems which is already solved by other

related researches

• Data : Linked Data(Semantic Web ontology)

Page 5: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

5

1. Project Purpose

• Ontology

Class : Definition of Set

Property : Relations between instances

Instance : Individuals which are belonged in classes

Schema of Example Ontology

Datatype Property Object Property

Class

Range type

Boolean

IntString

String

Rich

name

working_a

t

hasParent

Person

Workplace Location

hasChil

d PersonDoctor

Teacher

Student

Person

Hospital

School

age

String

name

Page 6: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

6

• Definition of Semantic Web

The Semantic Web is an evolving development of the World Wide Web in which

the meaning (semantics) of information and services on the web is defined,

making it possible for the web to understand and satisfy the requests of people

and machines to use the web content

Semantic Web is based on ontologies which corresponds to Semantic Web data

• Linked Data

The term Linked Data is used to describe a method of exposing, sharing, and

connecting data on the Web

What is Semantic Web ?

[1] Berners-Lee, T. (2001), “ The Semantic Web,” Scientific American, Vol. 501.

2. Motivation

Page 7: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

7

• Increase of Semantic Web Data

Appearance of Semantic Web Document Search Engines

Falcons : Twenty millions over RDF/XML Documents

Swoogle : Three millions over Semantic Web Documents

Open data in Semantic Web

LINKINGOPENDATA :

The goal of the W3C SWEO Linking Open Data community project is

to extend the Web with a data commons by publishing various open

data sets as RDF on the Web and by setting RDF links between data

items from different data sources

The data sets consist of over 4.7 billion RDF triples, which are

interlinked by around 142 million RDF links (May 2009).

Development status of Semantic Web

Date RDF/XML Quadruple

Falcons2009-09-02 21,639,337 2,936,868,638

2009-05-29 19,919,364 2,177,084,709

Date Semantic Web Document Triple

Swoogle 2009-10-17 3,109,616 1,065,799,526

2. Motivation

Page 8: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

8

• Increase of Semantic Web Data

Semantic Web based Portal Site

Twine :

Twine is a Semantic Web Portal that making networks of information

based on user’s posts which consist of their own information and

favorite things.

Every information composing Twine is written in RDF and OWL

format.

Twine have millions of visitors in a month, and they have over

millions of relationships between 3 millions of semantic tags (March

2009)

The necessity of mining useful knowledge from huge size ontology is highly expected. Therefore, Data Mining methodology for Semantic Web should be ready for this necessity.

Development status of Semantic Web

2. Motivation

Page 9: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

9

• Traditional Decision Tree algorithm is impossible to apply in

Semantic Web

Semantic Web based Ontology has special characteristics for mining

Since Semantic Web document has network structure, multi-value issue

is occurred

Traditional Decision Tree just uses value of variables. Therefore,

additional information of Semantic Web are can not be applied

Converting Semantic Web data into single table style that used to use in

traditional decision tree algorithm is impossible

Decision Tree in Semantic Web

2. Motivation

Page 10: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

10

• Arno J.Knobbe[2] developed decision tree algorithm for Multi-relational database

Selection Graph is suggested to do decision tree on RDB

Selection Graph is composed of Node, Edge, and condition and it can be

expressed in SQL syntax

3. Related Work

This research suggested partial solution about multi-value issue which also happened in Semantic Web ontology. However, this methodology can not be applied to Semantic Web which contains a lot of information than RDB

• David Jensen[3] suggested methodology that converting social network data to single table data which can be applied to Traditional Decision Tree algorithm

‘QGraph' that kind of query language to get the local network from entire social

network is suggested

QGraph is composed of Node, Edge, and condition and it can query many

objects at once

Since ontology information are manually converted to single table form, missing information will be occurred a lot

[2] Arno J. Knobbe.,  Arno Siebes.,  Danil Van Der Wallen.,  Syllogic B. V. (1999). “Multi-Relational Decision Tree Induction,” In Proceedings of PKDD’99, [3] D. Jensen., and J. Neville.(to appear) (2002). “Data mining in networks,” Papers of the Symposium on Dynamic Social Network Modeling and Analysis.

Page 11: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

11

4. Algorithm

• Search procedure of algorithm follows C4.5 algorithm

• New methodologies are required to learn concepts in ontology

• ‘Constructor’ can be used as similar as attributes in traditional

Decision tree

• Related works used the terms ‘Refinement’ as an attribute in

Decision Tree

Page 12: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

12

4. Algorithm

• What is a Refinement?

Refinement is a condition for split branches in decision tree. In this algorithm,

property and class from ontology are used as a refinement.

When define a refinement, Role Constructors from Description Logic are applied

to make the best use of information in Semantic Web

• Type of Refinements

Concept Constructor Refinement : Applying type information of instances

Cardinality restriction Refinement : Applying cardinality information on

object property

Domain restriction Refinement : Applying value of datatype property

Qualification Refinement : Applying information of quantification restrictions

and range class of object property

Refinements

Page 13: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

13

Refinements

Example

Concept Constructor Refinement

Hospital

not Human

Domain restriction Refinement

Age.(≥ 21)

Cardinality restriction

Refinement ≥ 3 hasChild

Qualification Refinement

hasChild.Blond

4. Algorithm

• Developed Refinements

Page 14: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

14

• The list of syntax information which can be expressed in ontology

Language Syntax

RDF rdf:type

RDFS

rdfs:domain

rdfs:range

rdfs:subClassOf

rdfs:subPropertyOf

OWL

owl:AllDifferent

owl:allValuesFrom

owl:cardinality

owl:Class

owl:complementOf

owl:DatatypeProperty

owl:DataRange

owl:differentFrom

owl:disjointWith

owl:equivalentClass

Language Syntax

OWL

owl:equivalentProperty

owl:FunctionalProperty

owl:hasValue

owl:intersectionOf

owl:InverseFunctionalProperty

owl:inverseOf

owl:masCardinality

owl:minCardinality

owl:ObjectProperty

owl:oneOf

owl:sameAs

owl:someValuesFrom

owl:SymmetricProperty

owl:TransitiveProperty

owl:unionOf

4. Algorithm

Page 15: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

15

5. Description of Problem

Train problem

Ten trains

• After learning, found definition of eastbound train is as follows

Page 16: 인공지능 특강 프로젝트 - Development of Decision Tree Algorithm for Semantic Web data -

16

5. Description of Problem

• Artificial task of learning to predict whether a train is headed

east or west

• Data is consist of relation tuples

• Relations eastbound(T) : train T is eastbound

has-car(T,C) : C is a car of T

infront(C,D) : car C is in front of D

long(C) : car C is long

open-rectangle(C) : car C is shaped as an open rectangle similar relations for

five other shapes

jagged-top(C) : C has a jagged top

sloping-top(C) : C has a sloping top

open-top(C) : C is open

contains-load(C,L) : C contains load L

1-item(C) : C has one load item similar relations for two and three load items

2-wheels(C) : C has two wheels

3-wheels(C) : C has three wheels

Train problem