PERSONALIZED ONTOLOGY MODEL FOR WEB INFORMATION GATHERING By A PROJECT REPORT Submitted to the Department of Computer Science & Engineering in the FACULTY OF ENGINEERING & TECHNOLOGY In partial fulfillment of the requirements for the award of the degree Of MASTER OF TECHNOLOGY IN COMPUTER SCIENCE & ENGINEERING APRIL 2012

BONAFIDE CERTIFICATE

Certified that this project report titled "Personalized Ontology Model for Web Information Gathering" is the bonafide work of Mr. _____________ who carried out the research under my supervision. Certified further, that to the best of my knowledge the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.

Signature of the Guide          Signature of the H.O.D
Name                            Name

CHAPTER 01

ABSTRACT:

As a model for knowledge description and formalization, ontologies are widely used to represent user profiles in personalized web information gathering. However, when representing user profiles, many models have utilized only knowledge from either a global knowledge base or user local information. In this paper, a personalized ontology model is proposed for knowledge representation and reasoning over user profiles. This model learns ontological user profiles from both a world knowledge base and user local instance repositories. The ontology model is evaluated by comparing it against benchmark models in web information gathering. The results show that this ontology model is successful.

PROJECT PURPOSE:

The amount of web-based information available has increased dramatically. How to gather useful information from the web has become a challenging issue for users. Current web information gathering systems attempt to satisfy user requirements by capturing their information needs. For this purpose, user profiles are created for user background knowledge description. User profiles represent the concept models possessed by users when gathering web information. A concept model is implicitly possessed by users and is generated from their background knowledge. While this concept model cannot be proven in laboratories, many web ontologists have observed it in user behavior.

PROJECT SCOPE:

Ontology mining discovers interesting and on-topic knowledge from the concepts, semantic relations, and instances in an ontology. In this section, a 2D ontology mining method is introduced: Specificity and Exhaustivity. Specificity (denoted spe) describes a subject's focus on a given topic. Exhaustivity restricts a subject's semantic space dealing with the topic. This method aims to investigate the subjects and the strength of their associations in an ontology.

PRODUCT FEATURES:

The ontology model in this paper provides a solution to emphasizing global and local knowledge in a single computational model. The findings in this paper can be applied to the design of web information gathering systems. The model also has extensive contributions to the fields of Information Retrieval, Web Intelligence, Recommendation Systems, and Information Systems. Ontology techniques, clustering and classification in particular, can help to establish the reference, as in previously conducted work. The clustering techniques group the documents into unsupervised clusters based on the document features. These features, usually represented by terms, can be extracted from the clusters. They represent the user background knowledge discovered from the user.

INTRODUCTION:

The amount of web-based information available has increased dramatically. How to gather useful information from the web has become a challenging issue for users. Current web information gathering systems attempt to satisfy user requirements by capturing their information needs. For this purpose, user profiles are created for user background knowledge description. User profiles represent the concept models possessed by users when gathering web information. A concept model is implicitly possessed by users and is generated from their background knowledge. While this concept model cannot be proven in laboratories, many web ontologists have observed it in user behavior.

When users read through a document, they can easily determine whether or not it is of interest or relevance to them, a judgment that arises from their implicit concept models. If a user's concept model can be simulated, then a superior representation of user profiles can be built. To simulate user concept models, ontologies (a knowledge description and formalization model) are utilized in personalized web information gathering. Such ontologies are called ontological user profiles or personalized ontologies.

To represent user profiles, many researchers have attempted to discover user background knowledge through global or local analysis. Global analysis uses existing global knowledge bases for user background knowledge representation. Commonly used knowledge bases include generic ontologies (e.g., WordNet), thesauruses (e.g., digital libraries), and online knowledge bases (e.g., online categorizations and Wikipedia). The global analysis techniques produce effective performance for user background knowledge extraction. However, global analysis is limited by the quality of the knowledge base used. For example, WordNet was reported as helpful in capturing user interest in some areas but useless for others.

Local analysis investigates user local information or observes user behavior in user profiles. For example, Li and Zhong discovered taxonomical patterns from the users' local text documents to learn ontologies for user profiles. Some groups learned personalized ontologies adaptively from the user's browsing history. Alternatively, Sekine and Suzuki analyzed query logs to discover user background knowledge. In some works, users were provided with a set of documents and asked for relevance feedback. User background knowledge was then discovered from this feedback for user profiles. However, because local analysis techniques rely on data mining or classification techniques for knowledge discovery, the discovered results occasionally contain noisy and uncertain information. As a result, local analysis suffers from ineffectiveness at capturing formal user knowledge. From this, we can hypothesize that user background [...] clustering were suggested. These strategies will be investigated in future work to solve this problem. The investigation will extend the applicability of the ontology model to the majority of the existing web documents and increase the contribution and significance of the present work.

EXISTING SYSTEM:

1. Golden Model: TREC Model

The TREC model was used to demonstrate the interviewing user profiles, which reflected user concept models perfectly. For each topic, TREC users were given a set of documents to read and judged each as relevant or nonrelevant to the topic. The TREC user profiles perfectly reflected the users' personal interests, as the relevance judgments were provided by the same people who created the topics, following the fact that only users know their interests and preferences perfectly.

2. Baseline Model: Category Model

This model demonstrated the noninterviewing user profiles, in which a user's interests and preferences are described by a set of weighted subjects learned from the user's browsing history. These subjects are specified with the semantic relations of superclass and subclass in an ontology. When an OBIWAN agent receives the search results for a given topic, it filters and reranks the results based on their semantic similarity with the subjects. The similar documents are awarded and reranked higher on the result list.

3. Baseline Model: Web Model

The web model was the implementation of typical semi-interviewing user profiles. It acquired user profiles from the web by employing a web search engine. The feature terms referred to the interesting concepts of the topic. The noisy terms referred to the paradoxical or ambiguous concepts.

LIMITATIONS OF EXISTING SYSTEM:

The topic coverage of TREC profiles was limited. The TREC user profiles had good precision but relatively poor recall performance.

Using web documents for training sets has one severe drawback: web information has much noise and many uncertainties. As a result, the web user profiles were satisfactory in terms of recall, but weak in terms of precision. There was no negative training set generated by this model.

PROPOSED SYSTEM:

The world knowledge and a user's local instance repository (LIR) are used in the proposed model.

1) World knowledge is commonsense knowledge acquired by people from experience and education.
2) An LIR is a user's personal collection of information items.

From a world knowledge base, we construct personalized ontologies by adopting user feedback on interesting knowledge. A multidimensional ontology mining method, Specificity and Exhaustivity, is also introduced in the proposed model for analyzing concepts specified in ontologies. The users' LIRs are then used to discover background knowledge and to populate the personalized ontologies.

ADVANTAGES OF PROPOSED SYSTEM:

Compared with the TREC model, the Ontology model had better recall but relatively weaker precision performance. The Ontology model discovered user background knowledge from user local instance repositories, rather than documents read and judged by users. Thus, the Ontology user profiles were not as precise as the TREC user profiles.

The Ontology profiles had broad topic coverage. The substantial coverage of possibly-related topics was gained from the use of the W