INFORMATION EXTRACTION FROM QUERIES

Ed Snelson, Joaquin Quiñonero Candela, Ralf Herbrich, Thore Graepel

Post on 18-Jan-2015

Page 2: slides

Information extraction from queries

Page 3: slides

Templates

Page 4: slides

Probabilistic query modelling

Page 5: slides

Key details

EP message passing for inference within single query model

ADF single pass through queries

Sparse messages within query

Bootstrap from initial seed sets of instances/attributes

Directed processing of queries based on current top beliefs
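
The control flow described above (seed sets, a single pass, and query ordering driven by current beliefs) can be illustrated with a small sketch. This is not the EP/ADF inference from the slides: the probabilistic updates are swapped for plain count accumulation, only a two-way "[Inst] [Attr]" split is considered, and every name here (`split_candidates`, `best_split`, `directed_single_pass`) along with the unit seed weight is a hypothetical choice made for illustration.

```python
def split_candidates(query):
    """Return every way to cut a query into a non-empty left and right phrase."""
    words = query.split()
    return [(" ".join(words[:i]), " ".join(words[i:])) for i in range(1, len(words))]


def best_split(query, inst_belief, attr_belief):
    """Return (score, (left, right)) for the best-supported split, or (0.0, None)."""
    best = (0.0, None)
    for left, right in split_candidates(query):
        # Support is the sum of what we currently believe about each side.
        score = inst_belief.get(left, 0.0) + attr_belief.get(right, 0.0)
        if score > best[0]:
            best = (score, (left, right))
    return best


def directed_single_pass(query_counts, seed_instances, seed_attributes):
    """Process each query at most once, always taking the best-supported one next.

    Counting stand-in only: real EP/ADF updates would maintain probabilities,
    whereas this sketch simply adds query counts to a score table.
    """
    # Seeds start with an arbitrary unit weight (an assumption of this sketch).
    inst_belief = {s: 1.0 for s in seed_instances}
    attr_belief = {s: 1.0 for s in seed_attributes}
    pending = dict(query_counts)
    order = []
    while pending:
        scored = {q: best_split(q, inst_belief, attr_belief) for q in pending}
        query = max(scored, key=lambda q: scored[q][0])
        _, split = scored[query]
        if split is None:
            break  # no remaining query connects to anything believed so far
        count = pending.pop(query)
        left, right = split
        inst_belief[left] = inst_belief.get(left, 0.0) + count
        attr_belief[right] = attr_belief.get(right, 0.0) + count
        order.append(query)
    return inst_belief, attr_belief, order
```

Starting from the single seed instance "tom cruise", a query such as "tom cruise movies" is processed first and promotes "movies" to an attribute, which in turn lets "brad pitt movies" introduce a new instance. Queries that never touch a believed phrase are left unprocessed, which is one crude way to see why ordering matters.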

Page 6: slides

Data

10 months, Live Search query logs

100 Million unique queries, with associated counts

Preliminary experiments on small specific subsets, e.g. 50,000 unique queries related to actors, cars and national parks

Page 7: slides

Seed lists

Page 8: slides

Actors

Instances             Attributes
tom cruise            movies
brad pitt             pictures
johnny depp           dealer.com
matt damon            photos
george clooney        angelina jolie
cameron diaz          nude
scarlett johansson    biography
mel gibson            news
grand canyon          height
sharon stone          wedding

Page 9: slides

Cars

Instances             Attributes
dealer                {Year}
honda civic           parts
honda accord          hybrid
ford mustang          dealer
dodge charger         used
toyota camry          world
ford explorer         accessories
toyota corolla        ford
ford focus            cleveland plain
dodge durango         wachovia

Page 10: slides

National Parks

Instances             Attributes
grand canyon          national park
yellowstone           park
yosemite              tours
redwood               lodging
denali                hotels
everglades            lodge
algonquin             west
joshua tree           skywalk
west yellowstone      gmc
shenandoah            college

Page 11: slides

Templates

[Inst] [Attr]
[Attr] [Inst]
{Year} [Inst] [Attr]
[Attr] of [Inst]
[Inst] and [Attr]
[Attr] and [Inst]
[Attr] in [Inst]
the [Attr] [Inst]
how [Attr] is [Inst]
[Attr] [Inst] coupe
[Attr] [Inst] parts
the [Inst] [Attr]
[Inst] 's [Attr]
[Inst] in [Attr]
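
To make the template notation concrete, here is a minimal sketch of how one such template could be matched against a whitespace-tokenised query, enumerating every assignment of words to slots. The function names `parses` and `_match` are hypothetical, and so is the reading of {Year} as a four-digit number starting with 19 or 20; the slides do not define either. The sketch only lists candidate parses; scoring them is the job of the probabilistic model.

```python
import re

# Assumption: {Year} means a four-digit year such as 1998 or 2006.
YEAR = re.compile(r"(19|20)\d{2}")


def parses(query, template):
    """Yield every dict of slot fillers under which `template` matches `query`."""
    return _match(query.split(), template.split(), {})


def _match(words, pattern, slots):
    """Recursively align the remaining query words with the remaining template tokens."""
    if not pattern:
        if not words:
            yield dict(slots)  # both exhausted: a complete parse
        return
    head, rest = pattern[0], pattern[1:]
    if head in ("[Inst]", "[Attr]"):
        # A slot may absorb one or more consecutive words.
        for n in range(1, len(words) + 1):
            filler = " ".join(words[:n])
            yield from _match(words[n:], rest, {**slots, head: filler})
    elif head == "{Year}":
        if words and YEAR.fullmatch(words[0]):
            yield from _match(words[1:], rest, {**slots, head: words[0]})
    elif words and words[0] == head:
        # Literal template word such as "of", "the" or "parts".
        yield from _match(words[1:], rest, slots)


if __name__ == "__main__":
    for parse in parses("tom cruise movies", "[Inst] [Attr]"):
        print(parse)
```

A query of three words under "[Inst] [Attr]" yields two candidate parses, one per split point, so the template alone cannot decide whether the instance is "tom" or "tom cruise". Literal words narrow this down: "height of grand canyon" has a single parse under "[Attr] of [Inst]".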

Page 12: slides

Future improvements

Class/Attribute dependent templates

A garbage class to deal with “noise”

Reducing sensitivity to order of processing initial queries

Disambiguation, synonyms etc.

Use of part-of-speech tagger

Combination with standard hand-crafted entity extraction techniques