learning to query: focused web page harvesting for entity ... · learning to query: focused web...

Learning to Query:Focused Web Page Harvesting for Entity Aspects

Yuan Fang1 Vincent Zheng2 Kevin Chang23

Problem: Learning to Query (L2Q)

1 Institute for Infocomm Research, Singapore2 Advanced Digital Sciences Center, Singapore3 University of Illinois at Urbana-Champaign, USA

Subproblem #1: Domain-aware L2Q

Subproblem #2: Context-aware L2Q

Result A: Effect of Domain+Context

Entity type Common aspects

celebrity spouse, age, net worth

car safety, cost, interior

business address, hours, phone

abundant but scattered

Business Analytics

Vertical portal or search

Overall Workflow: Iterative Querying

Keywords (uniquely) identifying the entitySeed query:

Target aspect:A pre-trained classifierfor the target aspect

Utility: (precision/recall)

In each iteration, ∗ argmaxa) Domain Pages b) Template Abstraction

Target Entity

Marc Snir

Peer Entities

Philips YuAndrew Ng

∗ +entity pages domain pages

Domain Phase Entity Phase

Utility regularization (i.e. supervision on target aspect)

c) Bridging Domain & Entity

Collective precision⋃ ⋂⋃Collective recall⋃ ⋂

pages for target aspect

pages of candidate query

RND: select query randomly

P/R: optimize precision/recall without domain and context

P/R+q: with domain pages but no templates, and without context

P/R+t: with domain pages and templates, without context

L2QP/L2QR: full approaches optimizing precision/recall

Result B: Compare with Indep. Baselines

L2QBAL: optimize for F-score, balancing L2QP & L2QR

LM: language feedback model

AQ: adaptive querying for text databases

HR: harvest rate for hidden structured databases

MQ: manually designed queries

learning to query: focused web page harvesting for entity ... · learning to query: focused web...

Documents

query processing and optimization -...

select, link and rank: diversified query expansion and...

query to knowledge: unsupervised entity extraction from...

national guideline for management, prevention and control...

named entity recognition in an intranet query log richard...

energy harvesting, tekhnologi harvesting

urbangreen rainwater harvesting brochure€¦ · urbangreen...

query: a framework for integrating entity resolution with...

rainwater harvesting and utilisation rainwater harvesting...

select, link and rank: diversiﬁed query expansion and...

query-specific knowledge summarization with entity ... ·...

entity framework query generation – linq-syntax –...

recovering question answering errors via query...

data models matter less than you...

admin พนักงาน - siam university€¦ ·...

entity query feature expansion using knowledge base...

query-driven approach to entity · pdf filequery-driven...

named entity recognition in query

exploiting entity relationship for query expansion in...

harvesting, processing and visualising “jožef stefan”...