Efficient Interactive Fuzzy Keyword Search
Post on 22-Feb-2016
Shengyue Ji, Guoliang Li, Jianhua Feng, Chen Li
University of California, Irvine
WWW 2009

1 Dec 2011
Presentation @ IDB Lab. Seminar
Presented by Jee-bum Park

Outline
  Introduction
  Indexing Methods
  Single Keyword
  Multiple Keywords
  Experiments
  Conclusions

Introduction
http://searchenginewatch.com/article/2128218/Google-Searchers-Use-Autocomplete-Most-Ignore-Google-Instant-Eye-Tracking-Study

Introduction
A typical directory-search form

Introduction
Interactive fuzzy search

Introduction
Interactive, fuzzy search
Interactive
  The system searches for the best answers on the fly as the user types in a keyword query
Fuzzy
  The system tries to find relevant records that include words similar to the keywords in the query, even if they do not match exactly
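
The "similar words" notion above is measured later in the slides by edit distance. The following is a minimal sketch of that measure, assuming the standard Levenshtein definition (insertions, deletions, substitutions); the function names `edit_distance` and `fuzzy_matches` are illustrative and do not come from the slides.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))            # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # drop ca
                            curr[j - 1] + 1,      # add cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def fuzzy_matches(query, keywords, threshold):
    """Keywords whose edit distance to the query is within the threshold."""
    return [k for k in keywords if edit_distance(query, k) <= threshold]


# Keywords from the indexing example in these slides, query from the
# single-keyword example, threshold 2.
print(fuzzy_matches("nlis", ["li", "lin", "liu", "lu", "luis"], 2))
# ['li', 'lin', 'liu', 'luis']
```

Comparing the query against every keyword like this is the brute-force baseline; the trie-based method in the later slides avoids it.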

Indexing Methods
List
Prefix query over an inverted index:
  li      1
  lin     3, 4
  liu     5
  lu      4
  luis    7
Typed: li
Typed: lu

Indexing Methods
Trie
Nodes are labeled "ID: character"; keyword nodes carry their inverted lists in brackets
  0: \0 (root)
    10: l
      11: i      [1]
        12: n    [3, 4]
        13: u    [5]
      14: u      [4]
        15: i
          16: s  [7]
Typed: li
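
The two indexing methods above can be sketched as follows, using the five keywords and record IDs from the slides. This is an illustrative sketch rather than the presenters' implementation; `prefix_query_list`, `TrieNode`, `build_trie`, and `prefix_query_trie` are hypothetical names.

```python
from bisect import bisect_left

# Keyword -> record IDs, as listed on the "List" slide.
INDEX = {"li": [1], "lin": [3, 4], "liu": [5], "lu": [4], "luis": [7]}


def prefix_query_list(index, prefix):
    """Scan a sorted keyword list, starting at the first key >= prefix."""
    keys = sorted(index)
    ids = set()
    for key in keys[bisect_left(keys, prefix):]:
        if not key.startswith(prefix):
            break                      # sorted order: no later key can match
        ids.update(index[key])
    return sorted(ids)


class TrieNode:
    def __init__(self):
        self.children = {}             # character -> TrieNode
        self.record_ids = []           # non-empty only at keyword nodes


def build_trie(index):
    root = TrieNode()
    for word, ids in index.items():
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.record_ids = list(ids)
    return root


def prefix_query_trie(root, prefix):
    """Walk down to the prefix node, then collect every list below it."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    ids, stack = set(), [node]
    while stack:
        current = stack.pop()
        ids.update(current.record_ids)
        stack.extend(current.children.values())
    return sorted(ids)


root = build_trie(INDEX)
print(prefix_query_list(INDEX, "li"), prefix_query_trie(root, "li"))  # [1, 3, 4, 5] twice
print(prefix_query_list(INDEX, "lu"), prefix_query_trie(root, "lu"))  # [4, 7] twice
```

The trie keeps the position reached by the typed prefix, so each extra keystroke only needs one more step down instead of a fresh lookup.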

Single Keyword
Example
Query = nlis, edit distance threshold = 2
[Figure: the trie from the Indexing Methods slides, with nodes marked by edit distance 0, 1, or 2; legend: Delete, Substitute, Match, Insert]

Single Keyword
Steps shown, one keystroke at a time:
  Initial state
  Typed: n
  Typed: nl
  Typed: nli
  Typed: nlis
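
The keystroke-by-keystroke walk-through above can be reproduced with a small sketch. It keeps, for every trie node within the threshold, its edit distance to the typed prefix, and updates that set per character using the four operations named in the legend. This is a simplified reading of the idea, not the authors' exact algorithm; all names (`Node`, `initial_active`, `step`, `answers`) are hypothetical.

```python
class Node:
    def __init__(self, prefix=""):
        self.prefix = prefix           # string spelled from the root to this node
        self.children = {}             # character -> Node
        self.record_ids = []           # inverted list if prefix is a keyword


def build_trie(index):
    root = Node()
    for word, ids in index.items():
        node = root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = Node(node.prefix + ch)
            node = node.children[ch]
        node.record_ids = list(ids)
    return root


def relax_descendants(active, tau):
    """Reach descendants at +1 per extra trie character, while within tau."""
    stack = list(active)
    while stack:
        node = stack.pop()
        for child in node.children.values():
            dist = active[node] + 1
            if dist <= tau and dist < active.get(child, tau + 1):
                active[child] = dist
                stack.append(child)


def initial_active(root, tau):
    active = {root: 0}                 # empty query matches the root exactly
    relax_descendants(active, tau)
    return active


def step(active, ch, tau):
    """Active nodes after typing ch, derived only from the previous set."""
    new = {}

    def offer(node, dist):
        if dist <= tau and dist < new.get(node, tau + 1):
            new[node] = dist

    for node, dist in active.items():
        offer(node, dist + 1)                   # typed character is dropped
        for label, child in node.children.items():
            offer(child, dist + (label != ch))  # match costs 0, substitute 1
    relax_descendants(new, tau)                 # extra trie characters
    return new


def answers(active):
    """Record IDs of all keywords at or below an active node."""
    ids, stack = set(), list(active)
    while stack:
        node = stack.pop()
        ids.update(node.record_ids)
        stack.extend(node.children.values())
    return sorted(ids)


INDEX = {"li": [1], "lin": [3, 4], "liu": [5], "lu": [4], "luis": [7]}
TAU = 2
active = initial_active(build_trie(INDEX), TAU)
trace = []
for ch in "nlis":
    active = step(active, ch, TAU)
    trace.append({node.prefix: dist for node, dist in active.items()})
    print(ch, trace[-1])
print(answers(active))
```

With the example trie and threshold 2, the set after typing nlis contains the prefixes li, lin, liu, and luis, each at distance 2, and the answer records are 1, 3, 4, 5, and 7.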

Multiple Keywords
Challenges in multiple keywords
Intersection of multiple lists of keywords
  Each prefix query keyword has multiple predicted complete keywords
  The union of the lists of predicted keywords includes potential answers
  The union lists of multiple query keywords need to be intersected in order to compute the answers to the query
Cache-based incremental intersection
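
The union-then-intersect step, with a simple cache, might look like the sketch below. It assumes exact prefix matching (the fuzzy part is left out for brevity), a cache keyed on the keywords typed so far, and the seven sample records from the D.id / D.content table on the HYB slide; the class `IncrementalSearcher` and its methods are hypothetical.

```python
from collections import defaultdict

# Sample records from the D.id / D.content table in these slides.
RECORDS = {
    21: "apple iphone",
    33: "php programming",
    64: "apple juice",
    91: "iphone programming",
    172: "iphone galaxy tab",
    308: "application iphone",
    759: "difference ipv4 ipv6",
}


class IncrementalSearcher:
    def __init__(self, records):
        self.inverted = defaultdict(set)       # keyword -> record IDs
        for rid, text in records.items():
            for word in text.split():
                self.inverted[word].add(rid)
        self.cache = {}                        # tuple of keywords -> answer set

    def union_for_prefix(self, prefix):
        """Union of the lists of every keyword predicted from the prefix."""
        ids = set()
        for word, rids in self.inverted.items():
            if word.startswith(prefix):
                ids |= rids
        return ids

    def search(self, query):
        keywords = tuple(query.split())
        if not keywords:
            return set()
        if keywords in self.cache:
            return self.cache[keywords]
        last_union = self.union_for_prefix(keywords[-1])
        if len(keywords) == 1:
            result = last_union
        else:
            # Reuse the answer for the earlier keywords, which a previous
            # keystroke has usually cached already.
            result = self.search(" ".join(keywords[:-1])) & last_union
        self.cache[keywords] = result
        return result


searcher = IncrementalSearcher(RECORDS)
print(sorted(searcher.search("ip pr")))    # [91]
print(sorted(searcher.search("app ip")))   # [21, 308]
```

Only the last keyword changes between keystrokes in most sessions, which is why caching the answer for the leading keywords saves most of the intersection work.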

Multiple Keywords
HYB (H. Bast, I. Weber. Type Less, Find More: Fast Autocompletion Search with a Succinct Index. In SIGIR 2006)
The intersections can be computed in
The union can be computed in
Total time complexity

D.id    D.content
21      apple iphone
33      php programming
64      apple juice
91      iphone programming
172     iphone galaxy tab
308     application iphone
759     difference ipv4 ipv6

W = { iphone, ipv4, ipv6 }
D Dw = D = { 21, 172, 308, 759 }

New Data Structure (HYB)
Label     List
ip        5, 3000, 5123, ...
iph       900(iph), 1000, ...
ipho      950(ipho)
iphon     NULL
iphone    1, 5, 21, 91, 172, 300, 308, 3000, 3001, ...
ipv       5(ipv), 6, 1100, 1200, ...
ipv4      759(ipv4), 760, ...
ipv6      400, 759(ipv6), 800(ipv6), ...
juice     64, 128, 256, 900(juice), 950(juice), ...
tab       5(tab), 172, 272, 800(tab), ...

Multiple Keywords
Forward lists
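
The slide gives only the title, so the following is one plausible reading of how forward lists help, stated as an assumption: keywords get IDs in sorted order, so every prefix covers a contiguous ID range; each record stores its sorted keyword IDs (its forward list); checking whether a record contains some keyword with a given prefix is then a binary search. All names are hypothetical, and the `"\uffff"` sentinel assumes no keyword contains that character.

```python
from bisect import bisect_left

# Sample records from the D.id / D.content table in these slides.
RECORDS = {
    21: "apple iphone",
    33: "php programming",
    64: "apple juice",
    91: "iphone programming",
    172: "iphone galaxy tab",
    308: "application iphone",
    759: "difference ipv4 ipv6",
}

# Keyword IDs follow alphabetical order, so a prefix maps to an ID range.
VOCAB = sorted({w for text in RECORDS.values() for w in text.split()})
KEYWORD_ID = {w: i for i, w in enumerate(VOCAB)}

# Forward list: record ID -> sorted IDs of the keywords it contains.
FORWARD = {rid: sorted(KEYWORD_ID[w] for w in text.split())
           for rid, text in RECORDS.items()}


def id_range(prefix):
    """Half-open range [lo, hi) of keyword IDs that start with prefix."""
    lo = bisect_left(VOCAB, prefix)
    hi = bisect_left(VOCAB, prefix + "\uffff")
    return lo, hi


def record_has_prefix(rid, prefix):
    """Binary-search the forward list for any keyword ID inside the range."""
    lo, hi = id_range(prefix)
    forward_list = FORWARD[rid]
    pos = bisect_left(forward_list, lo)
    return pos < len(forward_list) and forward_list[pos] < hi


# Take the candidates for one keyword, then verify the other keyword
# against each candidate's forward list instead of intersecting lists.
candidates = [33, 91]                      # records containing "programming"
print([rid for rid in candidates if record_has_prefix(rid, "ip")])  # [91]
```

Under this reading, the long union list for a short prefix such as "ip" never has to be built; only the candidate records are probed.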

Experiments
DBLP
  It included about one million computer science publication records
  Authors, title, conference or journal name, year, page numbers, URL
MEDLINE
  It had about 4 million latest publication records related to life sciences and biomedical information
  Authors, their affiliations, article title, journal name, journal issue

Experiments
Computing prefixes similar to a keyword

Experiments
List intersection of multiple keywords

Experiments
Scalability (MEDLINE)

Conclusions
They proposed an efficient incremental algorithm to answer single-keyword fuzzy queries
They studied various algorithms for computing the answers to a query with multiple keywords that are treated as fuzzy, prefix conditions

Thank You!
Any Questions or Comments?