naturallanguageprocessingfor ...lucylin/docs/slides-ghtc-2018.pdf · naturallanguageprocessingfor...
TRANSCRIPT
![Page 1: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/1.jpg)
Natural Language Processing forAnalyzing Disaster Recovery TrendsExpressed in Large Text Corpora
Lucy H. Lin, Scott B. Miles, Noah A. SmithUniversity of Washington
19 October 2018
![Page 2: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/2.jpg)
Introduction
Problem: empirical data describing disaster recovery is scarce⇒ constrains pre-/post-event recovery planning
Available data source: text corpora (e.g., news articles)
Manual analysis of large text corpora is slow…⇒ + natural language processing
![Page 3: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/3.jpg)
Introduction
Proposition query:“Dealing with authorities is causing stress and anxiety.”
query corpus
Matched sentences:“Unfamiliar bureaucratic systems are causing the majority of the stress.”“Those in charge of recovery are making moves to appease the growinganger among homeowners.”
Frequency across time:
2011
2012
2013
2014
2015
freq
aggregate
![Page 4: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/4.jpg)
Outline
1. Introduction
2. Case study: 2010-2011 Canterbury earthquake disaster
3. NLP method for semantic matching
4. User evaluation
5. Qualitative/quantitative output
6. Conclusions
![Page 5: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/5.jpg)
Outline
1. Introduction
2. Case study: 2010-2011 Canterbury earthquake disaster
3. NLP method for semantic matching
4. User evaluation
5. Qualitative/quantitative output
6. Conclusions
![Page 6: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/6.jpg)
2010–2011 Canterbury earthquake disaster: timeline
September 2010:
• Epicenter: 35km west of Christchurch• Moderate damage
February 2011:
• Epicenter: 10km southeast of Christchurch• Extremely high ground acceleration• 185 deaths, thousands of felt aftershocks
![Page 7: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/7.jpg)
2010–2011 Canterbury earthquake disaster: impacts
Damages:
• Estimated $40 billion• Housing: 100k houses in need of repairs• Water, utilities, road infrastructure: extensive damage
Recovery groups:
• Government: CERA, SCIRT (sunset after 5 years)• Community: Regenerate Christchurch
Recovery still ongoing: public development projects,residential rezoning
![Page 8: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/8.jpg)
2010–2011 Canterbury earthquake disaster: text data
Corpus: 982 NZ news articles (2010–2015) post-earthquakes
• stuff.co.nz, nzherald.co.nz
Proposition queries: 20 queries, covering
• Community wellbeing• Infrastructure• Decision-making
e.g.: “The council should have consulted residents beforemaking decisions.”
![Page 9: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/9.jpg)
Outline
1. Introduction
2. Case study: 2010-2011 Canterbury earthquake disaster
3. NLP method for semantic matching
4. User evaluation
5. Qualitative/quantitative output
6. Conclusions
![Page 10: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/10.jpg)
Semantic matching
Goal: find sentences with similar meaning to the query.
• Needs to be more powerful than word/phrase-levelmatching.
• Related to information retrieval, but want all matches.
![Page 11: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/11.jpg)
Semantic matching: method overview
fast filter
corpus ofsentences
propositionquery
likely matches
syntax-based model
matched sentences
![Page 12: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/12.jpg)
Semantic matching: method overview
fast filter
corpus ofsentences
propositionquery
likely matches
syntax-based model
matched sentences
![Page 13: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/13.jpg)
Semantic matching: fast filter
Goal: quickly filter out unlikely matches.
Word vector based comparison between two sentences:
average
Unfamiliar ⊏⊐bureaucratic ⊏⊐
......
stress ⊏⊐
average
Dealing ⊏⊐with ⊏⊐...
...anxiety ⊏⊐
corpus sent. ⊏⊐
query sent. ⊏⊐
cosinesimilarity
![Page 14: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/14.jpg)
Semantic matching: method overview
fast filter
corpus ofsentences
propositionquery
likely matches
syntax-based model
matched sentences
![Page 15: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/15.jpg)
Semantic matching: syntax-based model
Finer-grained matching: take word order/syntax into account.
Intuition: transformation between sentences is indicative oftheir relationship.
![Page 16: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/16.jpg)
Semantic matching: syntax-based model
unfamiliar bureaucratic systems are causing stress
root
candidate
dealing with authorities is causing stress
rootquery
?
+delete(unfamiliar)+delete(bureaucratic)
+relabel(systems)+relabel(are)
+insert(authorities)+insert(with)
![Page 17: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/17.jpg)
Semantic matching: syntax-based model
unfamiliar bureaucratic systems are causing stress
root
candidate
dealing with authorities is causing stress
rootquery
?
systems are causing stress
root
+delete(unfamiliar)+delete(bureaucratic)
+relabel(systems)+relabel(are)
+insert(authorities)+insert(with)
![Page 18: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/18.jpg)
Semantic matching: syntax-based model
unfamiliar bureaucratic systems are causing stress
root
candidate
dealing with authorities is causing stress
rootquery
?
dealing are causing stress
root
+delete(unfamiliar)+delete(bureaucratic)
+relabel(systems)
+relabel(are)+insert(authorities)
+insert(with)
![Page 19: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/19.jpg)
Semantic matching: syntax-based model
unfamiliar bureaucratic systems are causing stress
root
candidate
dealing with authorities is causing stress
rootquery
?
dealing is causing stress
root
+delete(unfamiliar)+delete(bureaucratic)
+relabel(systems)+relabel(are)
+insert(authorities)+insert(with)
![Page 20: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/20.jpg)
Semantic matching: syntax-based model
unfamiliar bureaucratic systems are causing stress
root
candidate
dealing with authorities is causing stress
rootquery
dealing with authorities is causing stress
root
+delete(unfamiliar)+delete(bureaucratic)
+relabel(systems)+relabel(are)
+insert(authorities)+insert(with)
![Page 21: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/21.jpg)
Semantic matching: method overview
fast filter
corpus ofsentences
propositionquery
likely matches
syntax-based model
matched sentences
![Page 22: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/22.jpg)
Outline
1. Introduction
2. Case study: 2010-2011 Canterbury earthquake disaster
3. NLP method for semantic matching
4. User evaluation
5. Qualitative/quantitative output
6. Conclusions
![Page 23: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/23.jpg)
User evaluation
Questions:
• How good are the sentences matched by our method?• Do potential users think this kind of tool will be helpful?
User study: 20 emergency managers
![Page 24: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/24.jpg)
User evaluation: output quality
Rated output from 20 proposition queries:
• Different method variants
• Different parts of method:• Not selected by filter• Selected by filter, but not part of final output• Top-scoring output from filter• Method output (from syntax-based model)
• 1-5 scale (Krippendorf’s α = 0.784)
![Page 25: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/25.jpg)
User evaluation: example
Query: There is a shortage of construction workers.
“The quarterly report for Canterbury included analysis onGreater Christchurch Value of Work projections.”
(1: completely unrelated to the query)
![Page 26: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/26.jpg)
User evaluation: example
Query: There is a shortage of construction workers.
“The construction sector’s workload was expected to peak inDecember.”
(3: related to but does not adequately express the query)
![Page 27: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/27.jpg)
User evaluation: example
Query: There is a shortage of construction workers.
“Greater Christchurch’s labour supply for the rebuild was tightand was likely to remain that way.”
(5: expresses the query in its entirety)
![Page 28: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/28.jpg)
User evaluation: results
1
1.5
2
2.5
3
1.06
2.03
3.1 3.22
Averagescore
Best performing system
Not selected by filter Selected by filter (unmatched)
![Page 29: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/29.jpg)
User evaluation: results
1
1.5
2
2.5
3
1.06
2.03
3.1 3.22
Averagescore
Best performing system
Not selected by filter Selected by filter (unmatched)Highest-scoring by filter
![Page 30: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/30.jpg)
User evaluation: results
1
1.5
2
2.5
3
1.06
2.03
3.1 3.22
Averagescore
Best performing system
Not selected by filter Selected by filter (unmatched)Highest-scoring by filter Matched by method
![Page 31: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/31.jpg)
User evaluation: is this interesting?
Other feedback:
• 17/20 respondents interested in measuring ideas innews/other text corpora
![Page 32: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/32.jpg)
User evaluation: round two
Follow-up study:
• Participant-supplied queries (18)• 7 return participants• Replicated findings of first user study
![Page 33: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/33.jpg)
Outline
1. Introduction
2. Case study: 2010-2011 Canterbury earthquake disaster
3. NLP method for semantic matching
4. User evaluation
5. Qualitative/quantitative output
6. Conclusions
![Page 34: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/34.jpg)
Recovery trends: example #1
2010 2011 2012 2013 2014 2015
0
3
6
9
12
Frequency
The power system was fully restored quickly.
![Page 35: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/35.jpg)
Recovery trends: example #1
2010 2011 2012 2013 2014 2015
0
3
6
9
12
Frequency
The power system was fully restored quickly.
“Orion Energy CEO Roger Sutton says most of the west ofChristchurch now has fully restored power.”
![Page 36: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/36.jpg)
Recovery trends: example #1
2010 2011 2012 2013 2014 2015
0
3
6
9
12
Frequency
The power system was fully restored quickly.
“He had no water but power had been restored in his area.”
![Page 37: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/37.jpg)
Recovery trends: example #1
2010 2011 2012 2013 2014 2015
0
3
6
9
12
Frequency
The power system was fully restored quickly.
“TV3 reports that power has now been restored to 60 per centof Christchurch.”
![Page 38: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/38.jpg)
Recovery trends: example #1
2010 2011 2012 2013 2014 2015
0
3
6
9
12
Frequency
The power system was fully restored quickly.
“It had been unable to access the electricity network torestore power and the situation could remain for the next fewdays.”
![Page 39: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/39.jpg)
Recovery trends: example #2
2010 2011 2012 2013 2014 2015
0
2
4
6
Frequency
Dealing w/authorities is causing stress and anxiety.
![Page 40: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/40.jpg)
Recovery trends: example #2
2010 2011 2012 2013 2014 2015
0
2
4
6
Frequency
Dealing w/authorities is causing stress and anxiety.
“The initial trauma may be over but […] Christchurch residentswill endure at least six years of ‘man-made’ stressors as theregion battles bureaucracy.” (5)
![Page 41: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/41.jpg)
Recovery trends: example #2
2010 2011 2012 2013 2014 2015
0
2
4
6
Frequency
Dealing w/authorities is causing stress and anxiety.
“Add to this the growing frustration among the new, youthfulleaders of the community who emerged in the wake of thequakes.” (3)
![Page 42: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/42.jpg)
Caveats
• Expected topics generally expressed in the output, butnot necessarily relationships/quantities• Except some domain-specific entities, e.g., CERA & SCIRT
• Measurement plots best explored jointly w/text output(i.e., quantitative & qualitative)
• Small sample size (25 sentences per query)• Reliance on sentences as unit of match
![Page 43: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/43.jpg)
Outline
1. Introduction
2. Case study: 2010-2011 Canterbury earthquake disaster
3. NLP method for semantic matching
4. User evaluation
5. Qualitative/quantitative output
6. Conclusions
![Page 44: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/44.jpg)
Conclusion
New NLP method to measure propositions in text corpora
• Potential applications: long-term recovery planning,exploratory research
• User study with participant interest in method• Future work: richer models, further user engagement
![Page 45: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/45.jpg)
Thanks!
Contact: [email protected]
Website: homes.cs.washington.edu/~lucylin/research/semantic_matching.html
Funding: National Science Foundation(grant #1541025, graduate fellowship)
![Page 46: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/46.jpg)
(more slides)
![Page 47: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/47.jpg)
In relation to other NLP problems...
dynamics of languageacross a corpus(e.g., Blei & Lafferty, 2006)
paraphrase (Dolan et al., 2004),entailment (Dagan et al., 2006),semantic similarity (Agirre et al., 2012)
information retrieval,passage retrieval for QA(Tellex et al., 2003)
![Page 48: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/48.jpg)
Fast filter details
Pre-trained word vectors:
• word2vec (Mikolov et al., 2013), pre-trained on Google News• paraphrastic word vectors (Wieting et al., 2015), based off thePPDB
![Page 49: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/49.jpg)
Tree edit classifier details
Original model (Heilman and Smith, 2010):
• Extract 39 integer features from tree edit sequence:sequence length, counts of edit types
• Logistic regression (LR)→ m(sp, s)
![Page 50: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/50.jpg)
Tree edit classifier details
New variation: input tree edit sequence into a LSTM
Each operation in the sequence is vectorized as:
• One-hot encoding of the operation type• Word vector ∆ between the sentences pre- andpost-operation• insert→ word embedding of new word• relabel→ difference between word embeddings• delete→ negated word embedding of deleted word
![Page 51: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/51.jpg)
Tree edit classifier details
Training: SNLI corpus (Bowman et al., 2015)
• 570k pairs of sentences• labels: entailment, contradiction, neutral• e.g.,: “A soccer game with multiple males playing.”entails “Some men are playing a sport.”
Mapping to our problem:
• s→ premise, sp → hypothesis• match→ entailment,non-match→ contradiction/neutral
![Page 52: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/52.jpg)
Disaster recovery queries
Residents are frustrated by the slow pace of recovery.
The repair programme is on schedule to be completed.
Money for repairs is running out.
The council should have consulted residents before makingdecisions.
Mental health rates have been rising.
Dealing with authorities is causing stress and anxiety.
Most eligible property owners have accepted insurance offers.
Confidence in Cera has been trending downwards.
Water quality declined after the earthquakes.
The power system was fully restored quickly.
![Page 53: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/53.jpg)
Disaster recovery queries
Cera missed several recovery milestones.
Prices levelled off as more homes were fixed or rebuilt.
People are suffering because they’ve lost the intimacy of theirrelationships.
Coordination between rebuild groups has been problematic.
Few people said insurance companies had done a good job.
Having the art gallery back makes the city feel more whole.
Scirt has spent less money than predicted.
![Page 54: NaturalLanguageProcessingfor ...lucylin/docs/slides-ghtc-2018.pdf · NaturalLanguageProcessingfor AnalyzingDisasterRecoveryTrends ExpressedinLargeTextCorpora LucyH.Lin,ScottB.Miles,NoahA.Smith](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f95a0b31811b075b24e7e1a/html5/thumbnails/54.jpg)
Disaster recovery queries
Traffic congestion was severe due to road repairs.
Some of the businesses forced out by the earthquake arereturning.
Some of the burden on mental health services is caused by lackof housing.