Transcript
Page 1: HyKSS: Hybrid Keyword and Semantic Search

HyKSS: Hybrid Keyword and Semantic Search

Andrew Zitzelberger

Page 2: HyKSS: Hybrid Keyword and Semantic Search

Keyword Search

Page 3: HyKSS: Hybrid Keyword and Semantic Search

Form Based Search

Page 4: HyKSS: Hybrid Keyword and Semantic Search

What about?

• over 8,000 meters in elevation
• less than 100K miles
• faster than 100 mph

Page 5: HyKSS: Hybrid Keyword and Semantic Search

Page 6: HyKSS: Hybrid Keyword and Semantic Search

HyKSS

• Hybrid Keyword and Semantic Search
• Semantics – extracted annotations
  – Multiple ontologies
• Keywords – text

Page 7: HyKSS: Hybrid Keyword and Semantic Search

Thesis Statement

• HyKSS (hybrid search)
  – Outperforms keyword and semantic search
  – Dynamic query weighting outperforms various other hybrid search approaches
  – Allows queries over multiple ontologies
  – Allows pay-as-you-go improvement

Page 8: HyKSS: Hybrid Keyword and Semantic Search

Extraction Ontologies

Page 9: HyKSS: Hybrid Keyword and Semantic Search

Data Frames

Page 10: HyKSS: Hybrid Keyword and Semantic Search

Indexing Architecture

[Diagram: the Document Collection feeds a Keyword Indexer and a Semantic Indexer, which build the Keyword Index and the Semantic Index, respectively.]

Page 11: HyKSS: Hybrid Keyword and Semantic Search

Indexing Architecture Implementation

[Diagram: the indexing architecture (Document Collection, Keyword Indexer, Keyword Index, Semantic Indexer, Semantic Index) annotated with its implementation components: Lucene, OntoES, Ontology Library, and Sesame.]

Page 12: HyKSS: Hybrid Keyword and Semantic Search

Query Processing

[Diagram: a Free Form Query is handled by two parallel pipelines, Keyword Processing and Semantic Processing. Each runs Pre-Process Query, Execute Query, and Post-Process Query; a final Combine Results step merges the two outputs.]

Page 13: HyKSS: Hybrid Keyword and Semantic Search

Keyword Query Pre-Processing

• Remove Lucene special characters (except quotes)
• Remove (inequality) comparison constraints
• Remove non-phrase stopwords

hondas in "excellent condition" in orem for under 12 grand

→ hondas "excellent condition" orem
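The three pre-processing steps above can be sketched roughly as follows. This is only an illustration, not the HyKSS implementation: the stopword list, the comparison-phrase pattern, and the function name `preprocess_keyword_query` are all assumptions made for this example.

```python
import re

# Lucene query-syntax special characters, minus the double quote,
# which is kept so that quoted phrases survive.
LUCENE_SPECIALS = r'+-&|!(){}[]^~*?:\/'

# Hypothetical, deliberately tiny stopword list (assumption).
STOPWORDS = {"in", "for", "a", "an", "the", "of", "with", "and", "or"}

# Hypothetical pattern for inequality comparison constraints (assumption):
# a comparison word, one value token, and an optional unit word.
COMPARISON = re.compile(
    r"\b(?:under|over|less than|more than|at least|at most)\s+\S+"
    r"(?:\s+(?:grand|dollars|miles|mph))?",
    re.IGNORECASE,
)


def preprocess_keyword_query(query: str) -> str:
    # Normalize curly quotes so phrases are delimited consistently.
    query = query.replace("\u201c", '"').replace("\u201d", '"')
    # Step 1: replace Lucene special characters (except quotes) with spaces.
    query = "".join(" " if ch in LUCENE_SPECIALS else ch for ch in query)
    # Split into quoted phrases and the text between them.
    parts = re.split(r'("[^"]*")', query)
    kept = []
    for part in parts:
        if len(part) >= 2 and part.startswith('"') and part.endswith('"'):
            kept.append(part)  # phrases are left untouched
        else:
            # Step 2: remove comparison constraints outside phrases.
            part = COMPARISON.sub(" ", part)
            # Step 3: remove stopwords outside phrases.
            kept.extend(w for w in part.split() if w.lower() not in STOPWORDS)
    return " ".join(kept)
```

With these assumed lists, the slide's example query reduces to the same three terms shown above.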

Page 14: HyKSS: Hybrid Keyword and Semantic Search

Keyword Query Execution and Post-Processing

• Executed by Lucene
• Empty Post-Processing step

Page 15: HyKSS: Hybrid Keyword and Semantic Search

Semantic Query Pre-Processing: Individual Ontology Scoring

hondas in "excellent condition" in orem for under 12 grand

Page 16: HyKSS: Hybrid Keyword and Semantic Search

Semantic Query Pre-Processing: Ontology Set Creation

• For each ontology sorted by score:
  – For each remaining ontology:
    • Add point for each new or subsuming match
    • If added points > 0 add ontology
• Completely subsumed ontologies are removed during query generation
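One possible reading of the set-creation loop above, as a greedy sketch. The slide does not define how matches are represented, so this example assumes each ontology's matches are character spans in the query, that a "new" match overlaps nothing already selected, and that a "subsuming" match strictly contains an already selected one. The names `added_points` and `create_ontology_set` are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # [start, end) character offsets in the query (assumption)


def subsumes(a: Span, b: Span) -> bool:
    """True if span a contains span b and is not identical to it."""
    return a[0] <= b[0] and b[1] <= a[1] and a != b


def added_points(candidate: List[Span], selected: List[Span]) -> int:
    """One point per candidate match that is new or subsumes a selected match."""
    points = 0
    for m in candidate:
        is_new = all(m[1] <= s[0] or s[1] <= m[0] for s in selected)
        is_subsuming = any(subsumes(m, s) for s in selected)
        if is_new or is_subsuming:
            points += 1
    return points


def create_ontology_set(matches: Dict[str, List[Span]],
                        scores: Dict[str, float]) -> List[str]:
    # Visit ontologies from highest to lowest individual score.
    ranked = sorted(matches, key=lambda name: scores[name], reverse=True)
    chosen: List[str] = []
    covered: List[Span] = []
    for name in ranked:
        # Add the ontology only if it contributes at least one point.
        if added_points(matches[name], covered) > 0:
            chosen.append(name)
            covered.extend(matches[name])
    return chosen
```

Under these assumptions, an ontology that only repeats an already covered constraint (for example, a second ontology matching the same price constraint) earns no points and is left out.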

Page 17: HyKSS: Hybrid Keyword and Semantic Search

Semantic Query Pre-Processing: Ontology Set Creation

[Diagram: ontology set creation for the example query, comparing the Vehicle, Location, and ContractualServices ontologies. Matched constraints shown: Price < 12000 and US_City="orem". Score labels shown: Vehicle_Score, Vehicle_Score + 1, and ContractualServices_Score + 1.]

Page 18: HyKSS: Hybrid Keyword and Semantic Search

Semantic Query Pre-Processing: Structured Query Generation

• Open world assumption
• SPARQL query
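A rough sketch of what a generated query might look like. The slide gives no query text, so everything here is an assumption: the `ex:` namespace, the `ex:Document` class, the property names, and the choice to model the open world assumption by wrapping each constraint in an `OPTIONAL` block so that documents missing a value are not excluded outright.

```python
from typing import List, Tuple

# (concept name, comparison operator, SPARQL literal) -- hypothetical shape
Constraint = Tuple[str, str, str]


def build_sparql(constraints: List[Constraint],
                 prefix: str = "http://example.org/onto#") -> str:
    """Build a SPARQL SELECT with one OPTIONAL block per constraint."""
    variables = " ".join(f"?{concept.lower()}" for concept, _, _ in constraints)
    lines = [
        f"PREFIX ex: <{prefix}>",
        f"SELECT ?doc {variables}",
        "WHERE {",
        "  ?doc a ex:Document .",  # hypothetical class for indexed documents
    ]
    for concept, op, literal in constraints:
        var = f"?{concept.lower()}"
        # Each constraint is optional: a document that lacks the value
        # still matches, with the variable left unbound.
        lines.append(
            f"  OPTIONAL {{ ?doc ex:{concept} {var} . FILTER({var} {op} {literal}) }}"
        )
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sparql([("Price", "<", "12000"), ("US_City", "=", '"orem"')]))
```

Leaving variables unbound rather than filtering documents out is what lets a later ranking step count how many requested values each document actually satisfied.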

Page 19: HyKSS: Hybrid Keyword and Semantic Search

Semantic Query Execution and Post-Processing

• Sesame query execution
• Semantic ranking:
  – 1 point for each requested projection satisfied
  – Normalized by # of projections requested

hondas in "excellent condition" in orem for under 12 grand
  – Projections on Make, Price and US_City
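The semantic ranking rule above amounts to a simple fraction. A minimal sketch, with the function name `semantic_score` and the zero-projection fallback chosen for this example only:

```python
from typing import Iterable


def semantic_score(requested: Iterable[str], satisfied: Iterable[str]) -> float:
    """One point per requested projection the document satisfies,
    divided by the number of projections requested."""
    requested_set = set(requested)
    if not requested_set:
        return 0.0  # assumption: no projections requested means no semantic score
    return len(requested_set & set(satisfied)) / len(requested_set)
```

For the example query with projections on Make, Price and US_City, a document that satisfies Make and US_City but not Price would score 2/3 under this sketch.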

Page 20: HyKSS: Hybrid Keyword and Semantic Search

Hybrid Query Processing

• Linear interpolation:
  – (kw_weight * kw_score) + (sm_weight * sm_score)
• Dynamic solution:
  – # keywords remaining (#kw)
  – concept match score (cms) = ½ * (selections + projections)
  – kw_weight = #kw/(#kw + cms)
  – sm_weight = cms/(#kw + cms)
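The dynamic weighting formulas above translate directly into code. The sketch below follows the slide's definitions; the function names and the equal-weight fallback when both #kw and cms are zero are assumptions added for this example.

```python
from typing import Tuple


def dynamic_weights(num_keywords: int, selections: int,
                    projections: int) -> Tuple[float, float]:
    """Return (kw_weight, sm_weight) from the slide's dynamic solution."""
    cms = 0.5 * (selections + projections)  # concept match score
    total = num_keywords + cms
    if total == 0:
        return 0.5, 0.5  # assumption: split evenly when nothing matched
    return num_keywords / total, cms / total


def hybrid_score(kw_score: float, sm_score: float, num_keywords: int,
                 selections: int, projections: int) -> float:
    """Linear interpolation of the keyword and semantic scores."""
    kw_weight, sm_weight = dynamic_weights(num_keywords, selections, projections)
    return kw_weight * kw_score + sm_weight * sm_score
```

As an illustration with made-up counts: 3 remaining keywords, 2 selections and 3 projections give cms = 2.5, so kw_weight = 3/5.5 and sm_weight = 2.5/5.5. The two weights always sum to 1, so a query with more recognized constraints shifts weight toward the semantic score.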

Page 21: HyKSS: Hybrid Keyword and Semantic Search

Basic Search

Page 22: HyKSS: Hybrid Keyword and Semantic Search

Results Display

Page 23: HyKSS: Hybrid Keyword and Semantic Search

Form Based Search

Page 24: HyKSS: Hybrid Keyword and Semantic Search

Results Display

Page 25: HyKSS: Hybrid Keyword and Semantic Search

Experimental Setup – Ontology Libraries

• 5 Ontology Levels
  – Number
  – Generic Units
  – Vehicle Units
  – Vehicle
  – Vehicle+

Page 26: HyKSS: Hybrid Keyword and Semantic Search

Experimental Setup – Query Sets

• 113 syntactically unique queries from database students

• 60 syntactically unique queries from linguistics students

Page 27: HyKSS: Hybrid Keyword and Semantic Search

Experimental Setup – Document Collection

• 250 vehicle advertisements (Craigslist)
  – 100 training, 50 validation, 100 test
• 318 mountain pages (Wikipedia)
• 66 roller coaster pages (Wikipedia)
• 88 video game advertisements (Craigslist)

Page 28: HyKSS: Hybrid Keyword and Semantic Search

Experiments

1) Training queries over test vehicle documents
2) Test queries over test vehicle documents
3) Training queries over test vehicle documents + additional noise
4) Test queries over test vehicle documents + additional noise
5) 5 queries over noisy data (Generic Units only)

Page 29: HyKSS: Hybrid Keyword and Semantic Search

Experiments – Metric

• Mean Average Precision
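For reference, mean average precision in its standard form can be computed as below. This is a generic sketch of the metric, not the evaluation code used in the experiments; the function names and the input layout (a ranked list plus a set of relevant documents per query) are assumptions.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Iterable[str]) -> float:
    """Average of precision@k over the ranks k where a relevant document appears."""
    relevant_set: Set[str] = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant_set:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant_set)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

For example, a ranking d1, d2, d3 with d1 and d3 relevant has average precision (1/1 + 2/3) / 2, because precision is measured only at the ranks where relevant documents appear.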

Page 30: HyKSS: Hybrid Keyword and Semantic Search

Experimental Results

Page 31: HyKSS: Hybrid Keyword and Semantic Search

Experimental Results

Page 32: HyKSS: Hybrid Keyword and Semantic Search

Experimental Results

Page 33: HyKSS: Hybrid Keyword and Semantic Search

Conclusions

• Hybrid search outperforms keyword and semantic search

• HyKSS’s dynamic query weighting approach outperforms various other weighting techniques

• Using multiple ontologies does not outperform selecting and using a single ontology

Page 34: HyKSS: Hybrid Keyword and Semantic Search

External Image Citations

• Slide 2 Google search screenshot: http://www.google.com (07/30/11)
• Slide 3 partial car search form screenshots: http://autotrader.com/fyc (07/30/11)
• Slide 4 mountain image: http://en.wikipedia.org/wiki/Lhotse (04/26/11)
• Slide 4 car image: http://en.wikipedia.org/wiki/Honda (04/26/11)
• Slide 4 roller coaster image: http://en.wikipedia.org/wiki/Kingda_Ka (04/26/11)
• Slide 4 Wikipedia logo: http://en.wikipedia.org/wiki/Main_Page (04/26/11)
• Slide 4 craigslist logo: http://provo.craigslist.org/ (04/26/11)
