
Page 1: Margit Bowler

Data Collection and Normalization for the Scenario-Based Lexical Knowledge Resource of a Text-to-Scene Conversion System

Margit Bowler

Page 2: Margit Bowler

Who I Am

Rising senior at Reed College in Portland
Linguistics major, concentration in Russian

Page 3: Margit Bowler

Overview

WordsEye & Scenario-Based Lexical Knowledge Resource (SBLR)
Use of Amazon’s Mechanical Turk (AMT) for data collection
Manual normalization of the AMT data and definition of semantic relations
Automatic normalization techniques for AMT data with respect to building the SBLR
Future automatic normalization techniques

Page 4: Margit Bowler

WordsEye Text-to-Scene Conversion

Input text: "the humongous white shiny bear is on the american mountain range. the mountain range is 100 feet tall. the ground is water. the sky is partly cloudy. the airplane is 90 feet in front of the nose of the bear. the airplane is facing right."

Page 5: Margit Bowler

Scenario-Based Lexical Knowledge Resource (SBLR)

Information on semantic categories of words
Semantic relations between predicates (verbs, nouns, adjectives, prepositions) and their arguments
Contextual, common-sense knowledge about the visual scenes in which various actions and items occur

Page 6: Margit Bowler

How to build the SBLR… efficiently?

Manual construction of the SBLR is time-consuming and expensive.
Past methods have included mining information from external semantic resources (e.g. WordNet, FrameNet, PropBank) and information extraction techniques applied to other corpora.

Page 7: Margit Bowler

Amazon’s Mechanical Turk (AMT)

Online marketplace for work
Anyone can work on AMT; however, it is possible to screen workers by various criteria. We screened ours by:
Located in the USA
99%+ approval rating

Page 8: Margit Bowler

AMT Tasks

In each task, we asked for up to 10 responses. A comment box was provided for >10 responses.

Task 1: Given the object X, name 10 locations where you would find X. (Locations)
Task 2: Given the object X, name 10 objects found near X. (Nearby Objects)
Task 3: Given the object X, list 10 parts of X. (Part-Whole)
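The slides describe the three prompts in prose only; a hypothetical sketch of how the prompt templates might be represented and filled per target object is shown below (the structure and names are illustrative assumptions, not the actual HIT design):

```python
# Hypothetical sketch of the three AMT prompt templates described above;
# the template strings and field names are illustrative, not the actual HIT design.
AMT_TASKS = {
    "Locations": "Given the object {X}, name 10 locations where you would find {X}.",
    "Nearby Objects": "Given the object {X}, name 10 objects found near {X}.",
    "Part-Whole": "Given the object {X}, list 10 parts of {X}.",
}

def render_prompt(task, target):
    """Fill a task template with a target object from the 3D library."""
    return AMT_TASKS[task].format(X=target)

print(render_prompt("Locations", "lizard"))
# Given the object lizard, name 10 locations where you would find lizard.
```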

Page 9: Margit Bowler

AMT Task Results

17,200 total responses
Spent $106.90 for all three tasks
It took approximately 5 days to complete each task

Task        Target Words   User Inputs   Reward
Locations   342            6,850         $0.05
Objects     245            3,500         $0.07
Parts       342            6,850         $0.05

Page 10: Margit Bowler

Goal: How can data collected from AMT be automatically normalized so that it is useful for building the Scenario-Based Lexical Knowledge Resource (SBLR)?

Page 11: Margit Bowler

Manual Normalization of AMT Data

Removal of uninformative target item-response item pairs between which no relevant semantic relationship held
Definition of the semantic relations held between the remaining target item-response item pairs
This manually normalized set of data was used as the standard against which we measured various automatic normalization techniques.

Page 12: Margit Bowler

Rejected Target-Response Pairs

Misinterpretation of an ambiguous target item (e.g. mobile)
Viable interpretation of the target item that was not contained within the SBLR (e.g. crawfish as food rather than as a living animal)
Overly generic responses (e.g. store in response to turntable)

Page 13: Margit Bowler

Examples of Approved AMT Responses

Locations:
mural - gallery
lizard - desert

Nearby Objects:
ambulance - stretcher
cauldron - fire

Part-Whole:
scissors - blade
monument - granite

Page 14: Margit Bowler

Semantic Relations

Defined a total of 34 relations
Focused on defining concrete, graphically depictable relationships
“Generic” relations accounted for most of the labeled pairs (e.g. containing.r, next-to.r)
Finer distinctions were made within these generic semantic relations (e.g. habitat.r, residence.r within the overarching containing.r relation)

Page 15: Margit Bowler

Example Semantic Relations

Locations:
mural - gallery - containing.r
lizard - desert - habitat.r

Nearby Objects:
ambulance - stretcher - next-to.r
cauldron - fire - above.r

Part-Whole:
scissors - blade - object-part.r
monument - granite - stuff-object.r
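The slides do not specify a data format for these labeled pairs; a minimal sketch of one way to hold them as (target, response, relation) triples is shown below. The tuple structure and names are assumptions for illustration, not the SBLR's actual storage format.

```python
# Illustrative sketch: labeled target-response pairs as (target, response, relation)
# triples, grouped by task. This is an assumed representation, not the SBLR format.
labeled_pairs = {
    "Locations": [
        ("mural", "gallery", "containing.r"),
        ("lizard", "desert", "habitat.r"),
    ],
    "Nearby Objects": [
        ("ambulance", "stretcher", "next-to.r"),
        ("cauldron", "fire", "above.r"),
    ],
    "Part-Whole": [
        ("scissors", "blade", "object-part.r"),
        ("monument", "granite", "stuff-object.r"),
    ],
}

# Example lookup: every pair labeled with the generic containing.r relation.
containing = [(t, r) for pairs in labeled_pairs.values()
              for (t, r, rel) in pairs if rel == "containing.r"]
print(containing)  # [('mural', 'gallery')]
```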

Page 16: Margit Bowler

Semantic Relations within Locations Task

We collected 6850 locations for 342 target objects from our 3D library.

Relation                  Number of occurrences   Percentage of total scored pairs
containing.r              1194                    38.01%
habitat.r                 346                     11.02%
on-surface.r              333                     10.6%
geographical-location.r   306                     9.74%
group.r                   183                     5.83%

Page 17: Margit Bowler

Semantic Relations within Nearby Objects Task

We collected 6850 nearby objects for 342 target objects from our 3D library.

Relation        Number of occurrences   Percentage of total scored pairs
next-to.r       4988                    75.66%
on-surface.r    375                     5.69%
containing.r    293                     4.44%
habitat.r       243                     3.69%
object-part.r   153                     2.32%

Page 18: Margit Bowler

Semantic Relations within Part-Whole Task

We collected 3500 parts of 245 objects.

Relation         Number of occurrences   Percentage of total scored pairs
object-part.r    2675                    79.12%
stuff-object.r   552                     16.33%
containing.r     50                      1.48%
habitat.r        36                      1.06%
stuff-mass.r     17                      0.5%

Page 19: Margit Bowler

Automatic Normalization Techniques

Collected AMT data was classified into higher-scoring versus lower-scoring sets by:
Log-likelihood and log-odds of sentential co-occurrences in the Gigaword English corpus
WordNet path similarity
Resnik similarity
WordNet average pair-wise similarity
WordNet matrix similarity

Accuracy was evaluated by comparison against the manually normalized data.
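The slides name the measures but give no code; a minimal sketch of how two of the WordNet-based scores could be computed with NLTK is shown below. The use of NLTK, the noun-only restriction, and the max-over-synsets strategy are assumptions for illustration, not the authors' implementation.

```python
# Sketch (not the authors' code): scoring a target-response pair with WordNet
# path similarity and Resnik similarity via NLTK. Requires the 'wordnet' and
# 'wordnet_ic' NLTK data packages.
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

# Information-content counts (here from the Brown corpus) needed for Resnik similarity.
brown_ic = wordnet_ic.ic('ic-brown.dat')

def best_synset_score(target, response, score_fn):
    """Return the maximum score over all noun-synset pairs for the two words."""
    best = 0.0
    for s1 in wn.synsets(target, pos=wn.NOUN):
        for s2 in wn.synsets(response, pos=wn.NOUN):
            score = score_fn(s1, s2)
            if score is not None and score > best:
                best = score
    return best

def path_similarity_score(target, response):
    return best_synset_score(target, response, lambda a, b: a.path_similarity(b))

def resnik_similarity_score(target, response):
    return best_synset_score(target, response, lambda a, b: a.res_similarity(b, brown_ic))

# Pairs scoring above some threshold would fall into the "higher-scoring" set.
print(path_similarity_score('lizard', 'desert'))
print(resnik_similarity_score('scissors', 'blade'))
```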

Page 20: Margit Bowler

Precision & Recall

AMT data is quite cheap to collect, so we were concerned predominantly with precision (obtaining highly accurate data) rather than recall (avoiding loss of some data).
In order to achieve more accurate data (high precision), we will lose a portion of our AMT data (low recall).
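For concreteness, a small sketch of how precision and recall can be computed for an automatic classifier's approved set against the manually normalized gold standard described earlier; the set-of-pairs representation and toy data are assumptions for illustration.

```python
# Sketch: precision and recall of an automatically approved set of target-response
# pairs against the manually normalized gold standard.
def precision_recall(approved_pairs, gold_approved_pairs):
    approved = set(approved_pairs)
    gold = set(gold_approved_pairs)
    true_positives = len(approved & gold)
    precision = true_positives / len(approved) if approved else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Toy (hypothetical) example:
auto = [('lizard', 'desert'), ('turntable', 'store')]
gold = [('lizard', 'desert'), ('mural', 'gallery')]
print(precision_recall(auto, gold))  # (0.5, 0.5)
```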

Page 21: Margit Bowler

Locations Task

Achieved best precision with log-odds.
Within the high-scoring set, responses that were too general (e.g. turntable - store) were rejected.
Within the low-scoring set, extremely specific locations that were unlikely to occur within a corpus or WordNet’s synsets were approved (e.g. caliper - architect’s briefcase).

            Baseline   Log-likel.   Log-odds   WN Path Sim.   Resnik   WN Avg. PW   WN Matrix Sim.
Precision   0.5527     0.7502       0.7715     0.5462         0.5562   0.6014       0.4782
Recall      1.0        0.7945       0.6486     0.9649         0.9678   0.3454       1.0
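The slides do not spell out the log-odds computation; one common log-odds-ratio association score over sentence-level co-occurrence counts (e.g. counted over Gigaword) is sketched below. This formulation and the smoothing constant are assumptions, not necessarily the exact measure used in these experiments.

```python
# Sketch of a log-odds-ratio association score from a 2x2 sentence co-occurrence table.
import math

def log_odds_ratio(n_both, n_target_only, n_response_only, n_neither, smoothing=0.5):
    """n_both          - sentences containing both target and response
    n_target_only   - sentences containing the target but not the response
    n_response_only - sentences containing the response but not the target
    n_neither       - sentences containing neither
    Add-0.5 smoothing avoids division by (or log of) zero."""
    a = n_both + smoothing
    b = n_target_only + smoothing
    c = n_response_only + smoothing
    d = n_neither + smoothing
    return math.log((a * d) / (b * c))

# Toy example: a pair that co-occurs often gets a large positive score.
print(log_odds_ratio(n_both=120, n_target_only=880, n_response_only=400, n_neither=98600))
```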

Page 22: Margit Bowler

Nearby Objects Task

Relatively few target-response pairs were discarded, resulting in high recall.
High precision was due to the open-ended nature of the task; responses often fell under a relation, if not next-to.r.

            Baseline   Log-likel.   Log-odds   WN Path Sim.   Resnik   WN Avg. PW   WN Matrix Sim.
Precision   0.8934     0.8947       0.9048     0.9076         0.9085   0.9764       0.8795
Recall      1.0        1.0          0.8917     1.0            1.0      0.2659       1.0

Page 23: Margit Bowler

Part-Whole Task

Rejected target-response pairs from the high-scoring set were often due to responses that named attributes, rather than parts, of the target item (e.g. croissant - flaky).
Approved pairs from the low-scoring set were mainly due to obvious, “common sense” responses that would usually be inferred, not explicitly stated (e.g. bunny - brain).

            Baseline   Log-likel.   Log-odds   WN Path Sim.   Resnik   WN Avg. PW   WN Matrix Sim.
Precision   0.7887     0.7832       0.8231     0.7963         0.7974   0.8823       0.8935
Recall      1.0        0.4129       0.4622     1.0            1.0      0.2621       0.2367

Page 24: Margit Bowler

Future Automatic Normalization Techniques

Computing word association measures on much larger corpora (e.g. Google’s 1 trillion word corpus)
WordNet synonyms and hypernyms
Latent Semantic Analysis to build word similarity matrices
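As a rough illustration of the last point, a minimal sketch of building a word similarity matrix via Latent Semantic Analysis is shown below, using scikit-learn's TruncatedSVD over a term-document count matrix. The toy corpus, component count, and use of scikit-learn are assumptions for illustration, not the planned pipeline.

```python
# Sketch: LSA word similarity matrix from a term-document count matrix.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import normalize

documents = [
    "the lizard basked in the desert sun",
    "an ambulance carried a stretcher to the hospital",
    "the mural hung in the gallery",
]  # placeholder corpus; a real run would use a much larger text collection

# Term-document matrix, then a low-rank SVD projection (the LSA step).
vectorizer = CountVectorizer()
term_doc = vectorizer.fit_transform(documents).T  # rows = terms, columns = documents
svd = TruncatedSVD(n_components=2, random_state=0)
term_vectors = normalize(svd.fit_transform(term_doc))

# Cosine similarity between every pair of terms gives the word similarity matrix.
similarity_matrix = term_vectors @ term_vectors.T
vocab = list(vectorizer.get_feature_names_out())
print(similarity_matrix[vocab.index("lizard"), vocab.index("desert")])
```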

Page 25: Margit Bowler

In Summary…

WordsEye & Scenario-Based Lexical Knowledge Resource (SBLR)
Amazon’s Mechanical Turk & our tasks
Manual normalization of AMT data
Automatic normalization techniques used on AMT data and results
Possible future automatic normalization methods

Page 26: Margit Bowler

Thanks to…

Richard Sproat
Masoud Rouhizadeh
All the CSLU interns

Page 27: Margit Bowler

Questions?