Efficient Interactive Fuzzy Keyword Search

Efficient Interactive Fuzzy Keyword Search. Shengyue Ji, Guoliang Li, Jianhua Feng, Chen Li. University of California, Irvine. WWW 2009. 1 Dec 2011. Presentation @ IDB Lab. Seminar, presented by Jee-bum Park. Outline: Introduction, Indexing Methods, Single Keyword, Multiple Keywords.

Shengyue Ji, Guoliang Li, Jianhua Feng, Chen Li
University of California, Irvine
WWW 2009

1 Dec 2011
Presentation @ IDB Lab. Seminar
Presented by Jee-bum Park

Outline
  Introduction
  Indexing Methods
  Single Keyword
  Multiple Keywords
  Experiments
  Conclusions

Introduction
http://searchenginewatch.com/article/2128218/Google-Searchers-Use-Autocomplete-Most-Ignore-Google-Instant-Eye-Tracking-Study

Introduction: A typical directory-search form

Introduction: Interactive fuzzy search

Introduction: interactive, fuzzy search
  Interactive: The system searches for the best answers on the fly as the user types in a keyword query.
  Fuzzy: The system tries to find relevant records that include words similar to the keywords in the query, even if they do not match exactly.
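The later single-keyword slides measure "similar" with an edit distance threshold. As a minimal sketch of that notion of fuzziness (the function names `edit_distance` and `fuzzy_matches` are illustrative, not from the slides), the standard dynamic-programming edit distance can be written as:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def fuzzy_matches(query, keywords, threshold):
    """Keywords whose edit distance to the query is within the threshold."""
    return [k for k in keywords if edit_distance(query, k) <= threshold]


# Keywords taken from the indexing slides; the query "lui" is hypothetical.
print(fuzzy_matches("lui", ["li", "lin", "liu", "lu", "luis"], 1))
```

Comparing the query against every keyword like this is the naive baseline; the trie-based slides that follow avoid it by sharing work across common prefixes.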

Indexing Methods: List
  Prefix query over an inverted index (keyword: record IDs)
    li: 1
    lin: 3, 4
    liu: 5
    lu: 4
    luis: 7
  Typed: li
  Typed: lu

Indexing Methods: Trie
  Trie nodes (node ID: character): 0: root, 10: l, 11: i, 12: n, 13: u, 14: u, 15: i, 16: s
  Paths: 0 > 10 (l) > 11 (li) > 12 (lin) or 13 (liu); 0 > 10 (l) > 14 (lu) > 15 (lui) > 16 (luis)
  Record lists at keyword nodes: node 11 (li): 1; node 12 (lin): 3, 4; node 13 (liu): 5; node 14 (lu): 4; node 16 (luis): 7
  Typed: li
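The list and trie slides can be illustrated with a small sketch. It assumes the five keywords and record lists from the inverted-index slide; the dictionary-based node layout and the names `build_trie` and `prefix_search` are hypothetical choices for illustration, not the authors' implementation.

```python
# Inverted index from the slides: keyword -> record IDs.
INDEX = {"li": [1], "lin": [3, 4], "liu": [5], "lu": [4], "luis": [7]}


def build_trie(index):
    """Build a trie whose keyword nodes carry that keyword's record IDs."""
    root = {"children": {}, "records": []}
    for keyword, record_ids in index.items():
        node = root
        for ch in keyword:
            node = node["children"].setdefault(
                ch, {"children": {}, "records": []})
        node["records"] = list(record_ids)
    return root


def prefix_search(root, prefix):
    """Return sorted record IDs of all keywords that start with prefix."""
    node = root
    for ch in prefix:  # walk down to the node for the typed prefix
        node = node["children"].get(ch)
        if node is None:
            return []
    found = set()
    stack = [node]
    while stack:  # collect record lists from the whole subtree
        cur = stack.pop()
        found.update(cur["records"])
        stack.extend(cur["children"].values())
    return sorted(found)


TRIE = build_trie(INDEX)
print(prefix_search(TRIE, "li"))  # records for li, lin, liu
print(prefix_search(TRIE, "lu"))  # records for lu, luis
```

With a flat list, each keystroke would rescan the keywords; with the trie, typing one more character only moves one step down from the current node.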

Single Keyword: Example
  Query = nlis, edit distance threshold = 2
  Trie: keywords li, lin, liu, lu, luis (nodes 0: root, 10: l, 11: i, 12: n, 13: u, 14: u, 15: i, 16: s)
  Legend: edit distance 0, 1, 2; operations: Delete, Substitute, Match, Insert
  Steps: Initial state; Typed: n; Typed: nl; Typed: nli; Typed: nlis
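The walkthrough for the query nlis can be approximated with the sketch below. It keeps, for each typed prefix, the set of trie nodes (represented here by their prefix strings) whose edit distance to the typed text is within the threshold, and updates that set one keystroke at a time using the delete, substitute, match, and insert operations named on the slides. This is a simplified reading, not the paper's exact procedure; `build_trie`, `initial_active`, and `type_char` are hypothetical names.

```python
from collections import defaultdict


def build_trie(keywords):
    """Map each prefix (a trie node) to the set of its child prefixes."""
    children = defaultdict(set)
    for word in keywords:
        for i in range(len(word)):
            children[word[:i]].add(word[:i + 1])
    return children


def _close_insertions(children, active, tau):
    """Insert: moving one level down the trie costs one extra edit."""
    stack = list(active)
    while stack:
        node = stack.pop()
        dist = active[node] + 1
        if dist > tau:
            continue
        for child in children.get(node, ()):
            if dist < active.get(child, tau + 1):
                active[child] = dist
                stack.append(child)
    return active


def initial_active(children, tau):
    """Active nodes for the empty query: every node of depth <= tau."""
    return _close_insertions(children, {"": 0}, tau)


def type_char(children, active, c, tau):
    """Active nodes after typing c, derived only from the previous set."""
    new = {}

    def offer(node, dist):
        if dist <= tau and dist < new.get(node, tau + 1):
            new[node] = dist

    for node, dist in active.items():
        offer(node, dist + 1)  # delete: the typed character is dropped
        for child in children.get(node, ()):
            offer(child, dist + (child[-1] != c))  # match or substitute
    return _close_insertions(children, new, tau)


TRIE = build_trie(["li", "lin", "liu", "lu", "luis"])
ACTIVE = initial_active(TRIE, 2)
for ch in "nlis":
    ACTIVE = type_char(TRIE, ACTIVE, ch, 2)
print(sorted(ACTIVE.items()))
```

Because each step reads only the previous active set, the work per keystroke depends on how many nodes are near the typed prefix rather than on the size of the whole trie.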

Multiple Keywords: Challenges in multiple keywords
  Intersection of multiple lists of keywords
    Each prefix query keyword has multiple predicted complete keywords
    The union of the lists of predicted keywords includes potential answers
    The union lists of multiple query keywords need to be intersected in order to compute the answers to the query
  Cache-based incremental intersection
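A minimal sketch of the union-then-intersect step, with one plausible reading of cache-based incremental intersection: the answer for the first k query keywords is cached, so adding a keyword costs only one more intersection. The index reuses the keywords from the indexing slides; `union_for_prefix`, `answer_query`, and the cache layout are assumptions for illustration.

```python
# Inverted index from the indexing slides: keyword -> record IDs.
INDEX = {"li": [1], "lin": [3, 4], "liu": [5], "lu": [4], "luis": [7]}


def union_for_prefix(index, prefix):
    """Union of the record lists of every keyword the prefix can complete to."""
    ids = set()
    for keyword, record_ids in index.items():
        if keyword.startswith(prefix):
            ids.update(record_ids)
    return ids


def answer_query(index, prefixes, cache):
    """Intersect the union lists of all query prefixes (prefixes non-empty).

    cache maps a tuple of prefixes to its answer set, so a query that
    extends an earlier one reuses the earlier intersection.
    """
    key = tuple(prefixes)
    if key not in cache:
        last = union_for_prefix(index, prefixes[-1])
        if len(prefixes) == 1:
            cache[key] = last
        else:
            cache[key] = answer_query(index, prefixes[:-1], cache) & last
    return cache[key]


CACHE = {}
print(answer_query(INDEX, ["li"], CACHE))        # union for the prefix li
print(answer_query(INDEX, ["li", "lu"], CACHE))  # reuses the cached li result
```

A linear scan over keywords stands in for the trie lookup here to keep the sketch short; in practice the union would come from the subtree under the prefix node.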

Multiple Keywords: HYB (H. Bast, I. Weber. Type Less, Find More: Fast Autocompletion Search with a Succinct Index. In SIGIR 2006)

The intersections can be computed in

The union can be computed in

Total time complexity

Documents
  D.id   D.content
  21     apple iphone
  33     php programming
  64     apple juice
  91     iphone programming
  172    iphone galaxy tab
  308    application iphone
  759    difference ipv4 ipv6

New Data Structure (HYB)
  Word     List
  ip       5, 3000, 5123, ...
  iph      900(iph), 1000, ...
  ipho     950(ipho)
  iphon    NULL
  iphone   1, 5, 21, 91, 172, 300, 308, 3000, 3001, ...
  ipv      5(ipv), 6, 1100, 1200, ...
  ipv4     759(ipv4), 760, ...
  ipv6     400, 759(ipv6), 800(ipv6), ...
  juice    64, 128, 256, 900(juice), 950(juice), ...
  tab      5(tab), 172, 272, 800(tab), ...

W = { iphone, ipv4, ipv6 }
D ∩ (∪_{w ∈ W} D_w) = D' = { 21, 172, 308, 759 }

Multiple Keywords: Forward lists
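The slide gives only the title, so the following is a hedged sketch of how forward lists are commonly used for prefix queries, not a statement of the authors' design. It assumes keywords receive IDs in alphabetical order, so that all completions of a prefix form a contiguous ID range, and that each record stores a sorted list of its keyword IDs. The documents are the seven rows from the HYB example slide; `prefix_range` and `record_matches` are hypothetical names.

```python
from bisect import bisect_left

# Documents from the HYB example slide: record ID -> content.
DOCS = {
    21: "apple iphone",
    33: "php programming",
    64: "apple juice",
    91: "iphone programming",
    172: "iphone galaxy tab",
    308: "application iphone",
    759: "difference ipv4 ipv6",
}

# Assumption: keyword IDs follow alphabetical order.
VOCAB = sorted({w for text in DOCS.values() for w in text.split()})
KEYWORD_ID = {w: i for i, w in enumerate(VOCAB)}

# Forward list: record ID -> sorted keyword IDs in that record.
FORWARD = {
    doc_id: sorted({KEYWORD_ID[w] for w in text.split()})
    for doc_id, text in DOCS.items()
}


def prefix_range(prefix):
    """Half-open range [lo, hi) of keyword IDs that start with prefix."""
    lo = bisect_left(VOCAB, prefix)
    hi = lo
    while hi < len(VOCAB) and VOCAB[hi].startswith(prefix):
        hi += 1
    return lo, hi


def record_matches(doc_id, prefix):
    """True if the record has a keyword in the prefix's ID range."""
    lo, hi = prefix_range(prefix)
    ids = FORWARD[doc_id]
    pos = bisect_left(ids, lo)  # first keyword ID >= lo
    return pos < len(ids) and ids[pos] < hi


print([d for d in sorted(DOCS) if record_matches(d, "ip")])
```

Under this reading, candidate records obtained from one keyword's list can be verified against the other query prefixes with one binary search each, instead of intersecting every union list.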

Experiments
  DBLP: It included about one million computer science publication records (authors, title, conference or journal name, year, page numbers, URL).
  MEDLINE: It had about 4 million latest publication records related to life sciences and biomedical information (authors, their affiliations, article title, journal name, journal issue).

Experiments: Computing prefixes similar to a keyword

Experiments: List intersection of multiple keywords

Experiments: Scalability (MEDLINE)

Conclusions

They proposed an efficient incremental algorithm to answer single-keyword fuzzy queries

They studied various algorithms for computing the answers to a query with multiple keywords that are treated as fuzzy, prefix conditions

Thank You!
Any Questions or Comments?