
Lecture 5

Introduction to Queries to Knowledge Bases

1. Is it Possible to Beat Apple, Amazon, Google?
2. Formal, Executable Representation
3. Database Queries in Natural Language
4. Training with Paraphrases

Silei Xu, Sina Semnani, Monica Lam

Stanford CS224v Course: Conversational Virtual Assistants with Deep Learning

Genie: Open Pretrained Assistant

[Slide diagram: Genie architecture. Components: pretrained language models; generic dialogue models and agents (read form-filling instructions, fill web forms, customer support, APIs, DB-access dialogues); grounding primitives (restaurants, play songs, turn on lights); abstract API dialogues over DB schemas; a transaction dialogue state machine.]

Semantic Parsing for Database Queries

• Very important: lots of knowledge in structured data (public & private)
• Correctness: can we generate the correct queries?
• Coverage: can we handle what people ask? Head vs. tail queries?
• Completeness: can we understand all possible DB queries?
• Cost: what is the cost of training data?

[Diagram: Natural Language → Neural Semantic Parser → Database Query → Execution]

Outline

1. Is it Possible to Beat Apple, Amazon, Google?
2. Formal, Executable Representation
3. Database Queries in Natural Language
4. Training with Paraphrases

Asking Siri, Alexa, Google for Help with Restaurants

[Screenshots: Alexa, Google Assistant, and Siri responding to "Find me a Chinese restaurant"]

[Screenshots: Alexa, Google Assistant, and Siri responding to "Find me a Chinese restaurant with at least 4.5 stars"]

[Screenshots: Alexa and Siri vs. Genie responding to "Find me a Chinese restaurant with at least 4.5 stars"]

Genie vs Commercial Assistants on Long-Tail Questions

Examples of long-tail questions. Genie handles all of them; ✓ marks those handled by at least one of Alexa, Google Assistant, or Siri:
• Show me restaurants rated at least 4 stars with at least 100 reviews
• Show restaurants in San Francisco rated higher than 4.5 ✓
• What is the highest rated Chinese restaurant near Stanford? ✓
• How far is the closest 4 star restaurant?
• Find a W3C employee that went to Oxford
• Who worked for Google and lives in Palo Alto?
• Who graduated from Stanford and won a Nobel prize? ✓✓
• Who worked for at least 3 companies?
• Show me hotels with checkout time later than 12PM
• Which hotel has a pool in this area? ✓✓

Old Assumption

No real users → No NLP

Traditional Approach: Usage Driven

• Collect data from the wild: questions from users
• Developers annotate each question with its logical form, yielding one training example per question

Example:
  Question: "Show me Chinese restaurants in Stanford."
  Logical form: @restaurant() filter servesCuisine =~ "chinese" && geo == new Location("Stanford")

AMRL: Alexa Meaning Representation Language

• Domains: restaurants, music, sports, news, IoT devices
• Languages: English, Chinese, Hindi, Spanish, Arabic
• Possible requests:
  • "What are the restaurants around here?"
  • "What is the best restaurant?"
  • "Get me an upscale restaurant"
  • "Search for Spanish restaurants"
  • "Search for Spanish restaurants open at 10pm"
  • "Search for an upscale restaurant and then make a reservation for it"
  • "找一家高档餐厅，然后帮我预约" ("Find an upscale restaurant, then make a reservation for me")
• Requests are manually annotated in AMRL and used to train a neural network
• Correctness? Coverage? Completeness? Cost? Alexa has one employee dedicated to restaurants.

Public Knowledge Bases: Wikidata

• A community-driven knowledge graph
  • 95M items; each Wikipedia page has an item in Wikidata
  • 9,143 properties

[Screenshot: a Wikidata item showing its item identifier and property-value pairs]

What do people care about?

Public Knowledge Bases: Schema.org

• The web has a schema: Schema.org• Structure data to mark up web pages• Mainly used by search engines• It covers many domains, including

restaurants, hotels, people, recipes, products, news …

• Core: 792 types, 1447 properties • + extensions

13

<script type="application/ld+json">{

@type: "restaurant",name: "The French Laundry",servesCuisine: “French",aggregateRating: {

@type: "AggregateRating",reviewCount: 2527,ratingValue: 4.5

}...

} Schema.org markup on Yelp

Private Knowledge Bases

• Every company is a data company [Forbes 2018]
  • Grocery store: customers, stock info
  • Insurance: customers, plan information
  • Restaurant: menu, opening hours
• Challenges in data acquisition
  • Company proprietary or confidential information
  • User privacy
  • Limited resources to collect data for question answering

We need to make it easier to answer questions over structured knowledge bases. Databases exist because there is interest.

Traditional Approach: Usage Driven & Logical Form Representation

• Represent the sentence in a logical form
  • Captures what the user is saying
  • The logical form is not executable
• Interpret the logical form to perform the action

[Diagram: Natural Language → Neural Semantic Parser → Logical Form → Interpret logical form & Execute]

New Approach: Machine Driven

[Diagram: Natural Language → Neural Semantic Parser → Database Query → Execution]

• Go backwards: start with what the computer can do
  • If the user asks for new functionality, design the function, repeat
• A formal, executable representation
  • No need to interpret a logical form
  • End-to-end: no intermediate representation between human & computer
• Create a training set for what the computer can do (automatic + manual)

Quiz

• Machine-driven approach: lots of correctly annotated, synthesized data
  • Strengths?
  • Weaknesses?

Quiz

• Is an executable representation suitable for all NLP problems?


Outline

1. Is it Possible to Beat Apple, Amazon, Google?
2. Formal, Executable Representation
3. Database Queries in Natural Language
4. Training with Paraphrases

How Database Query Works

• Relational databases
  • Query language: SQL
  • Relational algebra: 6 fundamental operations + derived operations
    • Selection (σ)
    • Projection (Π)
    • Union (∪)
    • Set difference (−)
    • Cartesian product (×)
    • Rename (ρ): not necessary for single commands in natural language
• The same algebra works for all domains!
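The five fundamental operations map directly onto SQL. A minimal runnable sketch, using an in-memory SQLite database with a tiny hypothetical schema:

```python
import sqlite3

# Two tables with the same columns, so union and difference are well defined.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE restaurant (name TEXT, cuisine TEXT, rating REAL);
CREATE TABLE cafe       (name TEXT, cuisine TEXT, rating REAL);
INSERT INTO restaurant VALUES ('Tai Pan', 'chinese', 4.5), ('Oren''s', 'middle eastern', 4.0);
INSERT INTO cafe       VALUES ('Coupa', 'venezuelan', 4.2), ('Tai Pan', 'chinese', 4.5);
""")

# Selection (σ): rows satisfying a predicate
sel = cur.execute("SELECT * FROM restaurant WHERE rating >= 4.5").fetchall()

# Projection (Π): a subset of columns
proj = cur.execute("SELECT name FROM restaurant").fetchall()

# Union (∪): rows in either table, duplicates removed
union = cur.execute("SELECT name FROM restaurant UNION SELECT name FROM cafe").fetchall()

# Set difference (−): rows in one table but not the other
diff = cur.execute("SELECT name FROM restaurant EXCEPT SELECT name FROM cafe").fetchall()

# Cartesian product (×): every pairing of rows
prod = cur.execute("SELECT r.name, c.name FROM restaurant r CROSS JOIN cafe c").fetchall()
```

The same five statements work unchanged on any schema, which is the point of the slide: the algebra is domain-independent.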

Existing Database Queries in Natural Language

• Existing query languages are not designed for translation from natural language
  • Designed for programmers, not consumers
  • Many ways to write the same program
  • Do not match humans' mental model (e.g., variables, IF-ELSE statements)
  • Cannot easily interoperate with actions
• ThingTalk is designed for consumers:
  • One way to write any program
  • Matches humans' mental model
  • Interoperates with actions

Selection

"Show me restaurants in Palo Alto"

@restaurant() filter geo == new Location("Palo Alto")

Grammar: table [filter filter]?

Strong Typing

• All parameters and return values are strongly typed
  • Boolean, String, Number, Date, Time, Location
  • Measure (unit parameter); e.g. C (Celsius), ft, …
  • Currency (ISO-code parameter)
  • Entity (type parameter): the value is the identifier of an object
    • Type parameters: picture, hashtag, username, path_name, url, phone_number, email_address, device, function
  • Enum (e.g. on, off)
  • Array
  • Compound (structure)

Filter Operators

• Typing enables filters automatically
  • Previous approach: possible expressions are designed domain by domain
• Type-specific operators:
  • Numeric: >=, <=
  • Array: contains, in_array
  • String: =~ (soft match), starts_with, ends_with, prefix_of, suffix_of
• Logical: &&, ||, !
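As a sketch of how typing selects the applicable operators (an illustration in Python, not the ThingTalk runtime; the operator table and records are hypothetical):

```python
# Each operator is keyed by the parameter's type, so once a schema declares
# a field's type, the set of valid filters follows automatically.
OPERATORS = {
    "Number": {">=": lambda a, b: a >= b, "<=": lambda a, b: a <= b},
    "String": {"=~": lambda a, b: b.lower() in a.lower(),  # soft match
               "starts_with": lambda a, b: a.lower().startswith(b.lower())},
    "Array":  {"contains": lambda a, b: b in a,
               "in_array": lambda a, b: a in b},
}

def atom_filter(records, param, param_type, op, value):
    """Selection with an atom filter: param op value."""
    f = OPERATORS[param_type][op]
    return [r for r in records if f(r[param], value)]

restaurants = [  # hypothetical rows
    {"name": "Tai Pan", "servesCuisine": "Chinese", "rating": 4.5},
    {"name": "Coupa", "servesCuisine": "Venezuelan", "rating": 4.2},
]

hits = atom_filter(restaurants, "servesCuisine", "String", "=~", "chinese")
good = atom_filter(hits, "rating", "Number", ">=", 4.5)
```

No per-domain engineering is needed: adding a typed field to a schema makes all of its type's operators available at once.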

Selection

"Show me Chinese restaurants in Palo Alto"

@restaurant() filter geo == new Location("Palo Alto")
  && servesCuisine =~ "Chinese"

"Show me Chinese restaurants in Palo Alto with at least 4.5 stars"

@restaurant() filter geo == new Location("Palo Alto")
  && servesCuisine =~ "Chinese"
  && rating >= 4.5

Grammar: table [filter filter]?

Differences and Unions

• Differences: the popular use of difference is covered by "not"
• Unions: the popular use of union is covered by "or"
• Note: we focus on queries that can be expressed in a single sentence

Projection

Grammar: [field+] of table [filter filter]?

"Show me the address of Chinese restaurants in Palo Alto with at least 4.5 stars"

[address] of @restaurant() filter geo == new Location("Palo Alto")
  && servesCuisine =~ "Chinese"
  && rating >= 4.5

Subquery

Grammar:
  Atom filter: param op value
  Subquery filter: op(param, any(table [filter filter]?))

"Show me the address of Chinese restaurants in Stanford with at least 4.5 stars"

[address] of @restaurant() filter geo == new Location("Stanford")
  && servesCuisine =~ "Chinese"
  && rating >= 4.5

"Show me the address of Chinese restaurants in Stanford reviewed by Bob"

[address] of @restaurant() filter geo == new Location("Stanford")
  && servesCuisine =~ "Chinese"
  && contains(reviews, any(@review() filter author =~ "Bob"))
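The subquery filter above can be sketched over in-memory records as follows (hypothetical data; Python stands in for the ThingTalk runtime):

```python
# contains(reviews, any(@review() filter author =~ "Bob"))
reviews = [
    {"id": 1, "author": "Bob", "text": "Great dumplings"},
    {"id": 2, "author": "Alice", "text": "Too salty"},
]
restaurants = [
    {"name": "Tai Pan", "geo": "Stanford", "servesCuisine": "Chinese",
     "address": "560 Waverley St", "reviews": [1]},
    {"name": "Mandarin Roots", "geo": "Stanford", "servesCuisine": "Chinese",
     "address": "100 Main St", "reviews": [2]},
]

# any(@review() filter author =~ "Bob"): the ids of matching reviews
bob_reviews = {r["id"] for r in reviews if "bob" in r["author"].lower()}

# contains(reviews, any(...)): keep restaurants whose reviews overlap that set
matches = [r for r in restaurants if bob_reviews & set(r["reviews"])]

# [address] of ...: projection onto the requested field
addresses = [r["address"] for r in matches]
```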

Sorting

Grammar: sort(field asc|desc of table [filter filter]?)

"Show me top-rated Chinese restaurants in Stanford reviewed by Bob"

sort(rating desc of @restaurant() filter geo == new Location("Stanford")
  && servesCuisine =~ "Chinese"
  && contains(reviews, any(@review() filter author =~ "Bob")))

Joins

Grammar: table [filter filter]? join table [filter filter]? [on filter]?

"Show me top-rated Chinese restaurants in Stanford with their reviews"

sort(rating desc of @restaurant() filter geo == new Location("Stanford")
  && servesCuisine =~ "Chinese"
  join @review() on contains(first.reviews, second.id))
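A join with an on-filter amounts to a filtered cross product. A sketch in Python over hypothetical rows:

```python
restaurants = [
    {"name": "Tai Pan", "rating": 4.5, "reviews": [1, 2]},
    {"name": "Coupa", "rating": 4.2, "reviews": [3]},
]
reviews = [
    {"id": 1, "text": "Great dumplings"},
    {"id": 2, "text": "Long wait"},
    {"id": 3, "text": "Nice patio"},
]

joined = [
    {**first, **second}                  # merge the two rows
    for first in restaurants
    for second in reviews
    if second["id"] in first["reviews"]  # on contains(first.reviews, second.id)
]

# Top-rated first, as in sort(rating desc of ...)
joined.sort(key=lambda row: row["rating"], reverse=True)
pairs = [(r["name"], r["text"]) for r in joined]
```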

ThingTalk Language Summary

Operators and examples:
• Filter operators: and, or, not, >=, <=, contains, in_array, =~, starts_with, ends_with, prefix_of, suffix_of
• Selection, atom filter: rating >= 4.5
• Selection, computation filter: count(reviews) >= 100
• Selection, subquery: contains(reviews, any(@review() filter author =~ "Bob"))
• Selection, existential subquery: any([speed] of @gps())
• Projection: [address] of @restaurant()
• Boolean question: [rating >= 4.5] of @restaurant()
• Join: @restaurant() join @review() on contains(first.reviews, second.id)
• Sorting & limiting: sort(rating desc of @restaurant())[1:5]
• Aggregation: count(@restaurant()), avg(rating of @restaurant())
• Computation: [distance(geo, $location)] of @restaurant()

ThingTalk: Designed for Semantic Parsing

• Similar in power to core SQL
• Fully compositional
• Constructs match the user's mental model

Question: the restaurant
  ThingTalk: @restaurant()
  SQL: SELECT * FROM restaurant

Question: menu of the restaurant
  ThingTalk: [menu] of @restaurant()
  SQL: SELECT menu FROM restaurant

Question: number of reviews of the restaurant
  ThingTalk: [count(reviews)] of @restaurant()
  SQL: SELECT count(review.id) FROM restaurant JOIN review ON review.restaurant = restaurant.id GROUP BY restaurant.id

Question: menu and number of reviews of the restaurant
  ThingTalk: [menu, count(reviews)] of @restaurant()
  SQL: SELECT count(review.id), restaurant.menu FROM restaurant JOIN review ON review.restaurant = restaurant.id GROUP BY restaurant.id
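The last SQL query can be run as-is; a sketch on a tiny hypothetical schema with SQLite shows why the SQL side is so much heavier than the ThingTalk side:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT, menu TEXT);
CREATE TABLE review (id INTEGER PRIMARY KEY, restaurant INTEGER, text TEXT);
INSERT INTO restaurant VALUES (1, 'Tai Pan', 'dim sum'), (2, 'Coupa', 'arepas');
INSERT INTO review VALUES (10, 1, 'great'), (11, 1, 'long wait'), (12, 2, 'nice patio');
""")

# "menu and number of reviews of the restaurant":
# the user mentions no join and no grouping, yet SQL needs both.
rows = cur.execute("""
    SELECT restaurant.menu, count(review.id)
    FROM restaurant JOIN review ON review.restaurant = restaurant.id
    GROUP BY restaurant.id
""").fetchall()
```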

Canonicalized ThingTalk

"show me a restaurant that serves Italian food or serves Chinese food"
@restaurant() filter servesCuisine =~ "Italian" || servesCuisine =~ "Chinese"

"show me a restaurant that serves Italian or Chinese food"
@restaurant() filter in_array~(servesCuisine, ["Italian", "Chinese"])

• The two semantics are the same
• The first form is converted to the second automatically
• Filters are ordered alphabetically

Sentences with the same meaning are represented by identical ThingTalk.
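A sketch of the two canonicalization steps (merging a disjunction of soft matches on one field into in_array~, then ordering filters alphabetically); the filter encoding here is a hypothetical Python representation, not Genie's implementation:

```python
def canonicalize(or_groups):
    """or_groups: a list of OR-groups, implicitly AND-ed together;
    each group is a list of (field, op, value) atoms."""
    canonical = []
    for group in or_groups:
        fields = {f for f, _, _ in group}
        ops = {op for _, op, _ in group}
        if len(group) > 1 and len(fields) == 1 and ops == {"=~"}:
            # servesCuisine =~ "Italian" || servesCuisine =~ "Chinese"
            #   -> in_array~(servesCuisine, ["Chinese", "Italian"])
            field = group[0][0]
            canonical.append((field, "in_array~", sorted(v for _, _, v in group)))
        else:
            canonical.extend(group)
    return sorted(canonical, key=lambda atom: atom[0])  # alphabetical by field

query = [
    [("servesCuisine", "=~", "Italian"), ("servesCuisine", "=~", "Chinese")],
    [("rating", ">=", 4.5)],
]
result = canonicalize(query)
```

Because equivalent filters normalize to one form, a semantic parser trained on canonicalized programs never has to choose among synonymous targets.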

Quiz

• What is the purpose of canonicalization?


Discussion

• What is the purpose of strong typing?

• Can ThingTalk answer everything the user wants to know?

• How do we prove that ThingTalk is good for semantic parsing?


Quiz

• ThingTalk has a formal grammar. What can we do with the grammar?


Outline

1. Is it Possible to Beat Apple, Amazon, Google?
2. Formal, Executable Representation
3. Database Queries in Natural Language
4. Training with Paraphrases

Relational Operators

• Factor knowledge into relational algebra and domain knowledge
  • Relational algebra: σ, Π, ∪, −, ×
  • Domain knowledge: database schemas and values (Restaurant: name, price, cuisine, …; User: name, email, age, …; City: name, geo, country, …; Product: name, UPC, price, …)
  • Together they form the DB query

Can we do the same thing for natural language?

Relational Operators → Natural Language Constructs

• Factor knowledge the same way for natural language
  • Relational algebra (σ, Π, ∪, −, ×) → canonical NL constructs
  • Domain knowledge: database schemas and values (Restaurant, User, City, Product, …)
  • DB query → NL question

Relational Operators → Canonical NL Constructs

Relational algebra: Projection (Π)
  Canonical NL template: <param> of <table>
  Database schema + values: menu of restaurant; name of user; country of city; price of product

Relational algebra: Selection (σ)
  Canonical NL template: <table> with <param> equal to <value>
  Database schema + values: restaurants with cuisine equal to Chinese; users with name equal to John; city with country equal to United States; product with price equal to $10

How many NL templates do we need?
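A sketch of how a handful of domain-independent templates, instantiated with schema words and values, yield the canonical utterances above (the template strings follow the slide; the `generate` helper is hypothetical):

```python
# One template per relational operator; the domain supplies only words.
TEMPLATES = {
    "projection": "{param} of {table}",                    # Π
    "selection": "{table} with {param} equal to {value}",  # σ
}

def generate(op, table, param, value=None):
    """Instantiate a canonical NL template with schema words and a value."""
    return TEMPLATES[op].format(table=table, param=param, value=value)

examples = [
    generate("projection", "restaurant", "menu"),
    generate("selection", "restaurant", "cuisine", "Chinese"),
    generate("selection", "product", "price", "$10"),
]
```

The templates never change across domains; only the schema and values do.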

Quiz

• How many natural language templates do we need?
• Given a ThingTalk program, can we generate the natural language equivalent?
• What can we do with it?

Outline

1. Is it Possible to Beat Apple, Amazon, Google?
2. Formal, Executable Representation
3. Database Queries in Natural Language
4. Training with Paraphrases

Training Data Through Paraphrasing

• From the wild
  • Expensive to collect and annotate; raises privacy concerns
• Synthetic
  • Limited natural language variety
• Paraphrases: Overnight [Wang et al. 2015]
  1. Generate canonical utterances using templates
  2. Ask crowd workers to paraphrase the synthetic data

Building a Semantic Parser Overnight

• Developers provide a seed lexicon specifying a canonical phrase for each relation and entity type, e.g.:
  article -> TypeNP[article]
  publication date -> RelNP[publicationDate]
• Via a domain-general grammar, generate one clunky but understandable canonical utterance for each logical form, e.g.:
  <TypeNP> whose <RelNP> is <value>
  "article whose publication date is Oct 15" ↔ argmax(type.article, publicationDate)
• Crowd workers paraphrase each canonical utterance into a natural utterance, e.g. "Show me articles published on Oct 15."
• The paraphrases, paired with their logical forms, are used to train the semantic parser.

[Diagram: seed lexicon (by developers) + domain-general grammar → logical forms and canonical utterances → paraphrases (via crowd-sourcing) → train semantic parser]
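The pipeline above can be sketched as follows; the seed lexicon mirrors the slide, while the grammar rule and the logical-form syntax are illustrative, not Overnight's actual lambda-DCS:

```python
# Seed lexicon: the only domain-specific input developers provide.
seed_lexicon = {
    "TypeNP": {"article": "article"},
    "RelNP": {"publicationDate": "publication date"},
}

def generate_examples(values):
    """Domain-general rule: <TypeNP> whose <RelNP> is <value>.
    Returns (canonical utterance, logical form) training pairs."""
    examples = []
    for type_id, type_np in seed_lexicon["TypeNP"].items():
        for rel_id, rel_np in seed_lexicon["RelNP"].items():
            for value in values:
                utterance = f"{type_np} whose {rel_np} is {value}"
                logical_form = f"filter(type.{type_id}, {rel_id} == {value!r})"
                examples.append((utterance, logical_form))
    return examples

pairs = generate_examples(["Oct 15"])
```

Each clunky-but-understandable utterance is then shown to crowd workers, and their paraphrases inherit the paired logical form as the training label.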

Evaluation

A parser for virtual assistant commands (44 skills) trained with synthetic data from an early version of Genie plus human paraphrase data [Campagna 2019]. Accuracy on four test sets:

• Paraphrase: 82.4
• Developer validation: 55.1 (test data provided by skill developers)
• Cheatsheet: 48.7 (data collected from crowd workers shown a cheat sheet of available skills, mimicking real virtual-assistant users)
• IFTTT: 46 (test data converted from IFTTT commands, an independent source)

Quiz

• What are the pros and cons of Overnight?


Pros and Cons of Overnight

• Cost
  • Cheaper than manual annotation, but still expensive to scale up
  • Workers always paraphrase the full sentence, not leveraging the compositionality of queries
• Coverage
  • Limited templates do not cover the space of queries
  • Limited variety in natural language: workers are biased toward the synthetic sentence and often make only minimal changes
• Incorrect annotation
  • Workers make mistakes, especially when the quality of the synthetic sentence is poor

Key takeaway: good for paraphrase test data, but does not work on real input!

Summary: Methodology

• Complete everything that the computer can do
• What if the computer is missing functionality?

John F. Kennedy: "Ask not what your country can do for you, but what you can do for your country."

Our motto: "Ask not what your user wants to know, but what the computer can tell the user."

Summary: ThingTalk Query Representation

• Completeness: by deliberately mapping SQL to ThingTalk
  • Drop rename, full union, full difference
• Matches the user's mental model: more natural for humans to understand
  • SQL is not the most intuitive
• Precise, unambiguous, canonicalized
  • Facilitates training accuracy
• Inter-operates with action APIs (to be discussed later)

Summary: Paraphrasing Methodology

• Big plus: reduces the cost of data acquisition
• Still expensive
• Lack of variety
• Inaccurate annotations
• Does not work with real input