“Is the Sky Pure Today?” AwkChecker: An Assistive Tool for Detecting and Correcting Collocation Errors
Taehyun Park, Edward Lank, Pascal Poupart, Michael Terry
David R. Cheriton School of Computer Science
University of Waterloo, Waterloo, ON, Canada, N2L 3G1
ACM UIST 2008
Motivation
Writing aids for non-native speakers
Non-native speakers can learn a foreign language's rules for spelling and grammar, but it is not easy to learn which word pairs go together.
Ex. “take their shoes down” vs. “take their shoes off” (the more common expression)
AwkChecker detects collocation errors and suggests alternatives.
Contributions
Defines collocation errors as a function of the relative frequency of phrase usage within a corpus.
Presents algorithms for suggesting alternatives based on the specific types of errors made by non-native speakers (NNSs):
1. Insertion (I went to home → I went home)
2. Deletion (I am student → I am a student)
3. Transposition (he’s talking with his full mouth → he’s talking with his mouth full)
4. Substitution (pure sky → clear sky)
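The four error types suggest a simple way to enumerate alternative phrases: apply one word-level edit that undoes each kind of error. The sketch below is only an illustration under that assumption, not AwkChecker's actual algorithm; the function name `one_edit_alternatives` and the tiny `vocab` list are hypothetical.

```python
def one_edit_alternatives(words, vocab):
    """Return phrases one word-level edit away from `words` (a list of tokens)."""
    alts = set()
    n = len(words)
    for i in range(n):
        # Undo an insertion error: drop the word at position i.
        alts.add(tuple(words[:i] + words[i + 1:]))
        # Undo a substitution error: replace the word at position i.
        for w in vocab:
            if w != words[i]:
                alts.add(tuple(words[:i] + [w] + words[i + 1:]))
    for i in range(n + 1):
        # Undo a deletion error: insert a vocabulary word at position i.
        for w in vocab:
            alts.add(tuple(words[:i] + [w] + words[i:]))
    for i in range(n - 1):
        # Undo a transposition error: swap two adjacent words.
        swapped = words[:i] + [words[i + 1], words[i]] + words[i + 2:]
        alts.add(tuple(swapped))
    # The input itself is not an alternative.
    alts.discard(tuple(words))
    return alts
```

In a real system each generated alternative would then be looked up in a corpus so that only frequent phrases are suggested.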
Detecting Collocation Errors
Acceptability A(e) of a phrase e
g(e): frequency of the input phrase
g(c): frequency of an alternative phrase
f(e,c): edit distance between e and c
If A(e) is less than a user-customizable threshold, the phrase e is flagged as a collocation error.
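The thresholding step can be sketched as follows. The formula for A(e) is not reproduced here, so the scoring function below is a stand-in assumption: the input phrase's share of the frequency mass among itself and its alternatives, with each alternative down-weighted by its edit distance. Only the comparison against a user-set threshold comes from the text; `acceptability`, `is_collocation_error`, the default threshold, and all counts are hypothetical.

```python
def acceptability(g_e, alternatives):
    """Stand-in score in [0, 1]; NOT the paper's A(e).

    g_e: corpus frequency of the input phrase e.
    alternatives: list of (g_c, f_ec) pairs, where g_c is the frequency of an
    alternative c and f_ec >= 1 is its edit distance from e.
    """
    weighted = sum(g_c / f_ec for g_c, f_ec in alternatives)
    total = g_e + weighted
    return g_e / total if total > 0 else 0.0


def is_collocation_error(g_e, alternatives, threshold=0.1):
    """Flag e when its acceptability falls below the user-set threshold."""
    return acceptability(g_e, alternatives) < threshold
```

Raising the threshold makes the checker stricter (more phrases are flagged); lowering it makes it more permissive.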
Evaluation
- User Testing
Five non-native speakers
Had never seen such a tool before
Positive reactions
Employed AwkChecker to check articles and prepositions, e.g. pass judgment (to/on) <noun>
Automatic Collocation Suggestion in Academic Writing
Jian-Cheng Wu (1), Yu-Chia Chang (1,*), Teruko Mitamura (2), Jason S. Chang (1)
(1) National Tsing Hua University, Hsinchu, Taiwan
(2) Carnegie Mellon University, Pittsburgh, United States
ACL 2010
Goals
Automate suggestions for verb-noun lexical collocations.
Verb-noun collocations are recognized as presenting the most challenge to students (Howarth, 1996; Liu, 2002). Within these collocations, the choice of verb is considered the most difficult part for learners to master (Liu, 2002; Chang, 2008).
Collocation Inspector
Algorithm for Producing Suggestions
Collocation Extraction
Ex. We introduce a novel method for learning to find documents on the web.
We proposed that the web-based model would be more effective than corpus-based one.
Use a dependency parser (Stanford Parser):
dobj(introduce-2, method-4)
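Once a parser has produced typed dependencies in the `relation(head-index, dependent-index)` form shown above, verb-noun pairs can be read off the direct-object relations. The sketch below assumes the dependencies are already available as strings and does not call the parser itself; `DEP_RE` and `verb_noun_pairs` are hypothetical names.

```python
import re

# Matches typed dependencies such as "dobj(introduce-2, method-4)".
DEP_RE = re.compile(r"(\w+)\s*\(\s*(\S+?)-(\d+)\s*,\s*(\S+?)-(\d+)\s*\)")


def verb_noun_pairs(dependencies):
    """Return (verb, noun) pairs from direct-object (dobj) relations."""
    pairs = []
    for dep in dependencies:
        m = DEP_RE.fullmatch(dep.strip())
        if m and m.group(1) == "dobj":
            # Group 2 is the governing verb, group 4 its object noun.
            pairs.append((m.group(2), m.group(4)))
    return pairs
```

For the example sentence, only the `dobj` relation contributes a pair, giving the collocation "introduce method".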
Using a Classifier for the Suggestion task
Effective Feature Selection
Training algorithm: Maximum Entropy
- Use contextual features (head, n-gram)
Ex: We introduce a novel method for learning to find documents on the web.
Example
Input:
There are many investigations about wireless network communication, especially it is important to add Internet transfer calculation speeds.
Result
Experiment
Training corpus: CiteSeer (20,306 abstracts, 95,650 sentences)
790 verbal collocates are identified as tagged classes.
Test data: 600 randomly selected sentences that do not overlap with the training set.
The YouTube Video Recommendation System
James Davidson, Benjamin Liebald, Junning Liu, Taylor Van Vleet, Palash Nandy
Google Inc.
ACM RecSys 2010
Personalized recommendations based on a user’s previous activity on the site
Goals
Help users find high-quality videos relevant to their interests.
Recommendations should be updated regularly and reflect a user’s recent activity on the site.
Maintain user privacy.
Challenges
Videos as they are uploaded by users often have no or very poor metadata (title, description).
Videos on YouTube are mostly short form (under 10 minutes in length).
Many of the interesting videos on YouTube have a short life cycle.
System Design
(Diagram labels: user, seed, Top-N candidates; videos are ranked using relevance and diversity.)
Input Data (seed)
1. videos that were watched (potentially beyond a certain threshold)
2. videos that were explicitly favorited, “liked”, rated or added to playlists
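Related Videos (candidates)
Relatedness score: r(vi, vj) = c_ij / f(vi, vj)
c_ij: co-visitation count of videos vi and vj
c_i, c_j: total occurrence counts across all sessions for videos vi and vj
f(vi, vj): normalization by the global popularity of videos vi and vj
Threshold: overall view count
Top-N candidates of vi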
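A relatedness score of this kind can be sketched from raw session data. The code below assumes the score is the number of sessions in which two videos were both watched, divided by the product of their individual session counts (one simple popularity normalization, chosen here as an assumption). The names `related_videos`, `top_n`, and `min_views`, and the way the view-count threshold is applied, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def related_videos(sessions, top_n=2, min_views=1):
    """Map each video to its top-N related videos by r = c_ij / (c_i * c_j)."""
    views = Counter()      # c_i: number of sessions containing video i
    co_views = Counter()   # c_ij: number of sessions containing both i and j
    for session in sessions:
        watched = sorted(set(session))
        views.update(watched)
        co_views.update(combinations(watched, 2))

    scores = {}
    for (vi, vj), c_ij in co_views.items():
        # Skip videos below the view-count threshold.
        if views[vi] < min_views or views[vj] < min_views:
            continue
        r = c_ij / (views[vi] * views[vj])
        scores.setdefault(vi, []).append((r, vj))
        scores.setdefault(vj, []).append((r, vi))

    # Highest score first; ties broken by video id for determinism.
    return {
        v: [other for _, other in sorted(pairs, key=lambda p: (-p[0], p[1]))[:top_n]]
        for v, pairs in scores.items()
    }
```

Dividing by the product of the counts keeps very popular videos from appearing related to everything simply because they occur in many sessions.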
Generating Recommendation Candidates
Candidate set:
S: seed set
R: related videos
n: distance from any video in the seed set
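One way to read these definitions is as a breadth-first expansion: start from the seed set, follow related-video links up to n steps, and drop the seeds from the result. The sketch below works under that reading; `expand_candidates` and the `related` mapping (video id to a list of related ids) are hypothetical.

```python
def expand_candidates(seeds, related, n):
    """Collect videos reachable within n related-video hops of the seed set."""
    seeds = set(seeds)
    visited = set(seeds)
    frontier = set(seeds)
    for _ in range(n):
        # Related videos of everything found in the previous step.
        frontier = {r for v in frontier for r in related.get(v, [])} - visited
        if not frontier:
            break
        visited |= frontier
    # Seed videos themselves are not recommended back to the user.
    return visited - seeds
```

A larger n widens the candidate pool and increases diversity, at the cost of candidates that are less directly tied to what the user watched.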
Ranking
Video quality (view count, commenting, sharing activity, …)
User specificity (consider properties of the seed video)
Diversification (videos that are too similar to each other are removed)
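The three ranking signals can be combined in many ways. The sketch below assumes a weighted sum of a quality score and a user-specificity score, followed by a greedy pass that drops any video too similar to one already kept. The weights, the dictionary keys, and the `similar` predicate are all hypothetical and are not taken from the system itself.

```python
def rank_videos(candidates, similar, w_quality=0.5, w_specificity=0.5):
    """Rank candidates by a weighted score, dropping near-duplicates.

    candidates: list of dicts with hypothetical keys "id", "quality",
    "specificity" (both already scaled to [0, 1]).
    similar: function (a, b) -> bool saying two videos are too alike.
    """
    def score(v):
        return w_quality * v["quality"] + w_specificity * v["specificity"]

    ranked = []
    for video in sorted(candidates, key=score, reverse=True):
        # Diversification: keep a video only if nothing kept so far is too similar.
        if not any(similar(video, kept) for kept in ranked):
            ranked.append(video)
    return ranked
```

For example, passing `similar=lambda a, b: a["channel"] == b["channel"]` would keep at most one video per channel.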
Evaluation
Text Cohesion Visualizer
Chakarida Nukoolkit, Praewphan Chansripiboon, Pornchai Mongkolnam, Richard Watson Todd*
Computer Science Program, School of Information Technology
School of Liberal Arts*
King Mongkut’s University of Technology Thonburi
Bangkok, 10140 Thailand
IEEE ICCSE 2011
Goals
Design a prototype system to help analyze the lexical coherence of essays.
Provide visualized output as writing feedback to users.
System Flowchart
Preprocessing (Stanford Part-Of-Speech tagger)
Matching keywords
Creating bond table
Matching keywords
Count the number of matched words (links) between any two sentences.
Four types of matching:
1. repetition
2. complex repetition
3. paraphrase (synonyms, hypernyms)
4. pronoun
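Link counting can be illustrated for the simplest of the four types, plain repetition. The sketch below counts content words shared by each pair of sentences. It deliberately omits complex repetition, paraphrase, and pronoun matching, which would need a stemmer, a thesaurus, and coreference handling; the stopword list and function names are hypothetical.

```python
import re

# A small, illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "it"}


def content_words(sentence):
    """Lowercased word set with stopwords removed."""
    return set(re.findall(r"[a-z]+", sentence.lower())) - STOPWORDS


def count_links(sentences):
    """Return {(i, j): number of shared content words} for every pair i < j."""
    words = [content_words(s) for s in sentences]
    return {
        (i, j): len(words[i] & words[j])
        for i in range(len(sentences))
        for j in range(i + 1, len(sentences))
    }
```

Each shared word counts as one link, so two sentences that both mention "cat" and "mouse" have two links between them.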
Creating bond table
The bond table indicates whether or not there is a bond between each pair of sentences.
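Given link counts for each sentence pair, a bond table can be built by thresholding. The sketch below assumes a bond exists when two sentences share at least `min_links` links; the default of three is an assumption, and `bond_table` and the input format (a dict keyed by sentence-index pairs) are hypothetical.

```python
def bond_table(links, n_sentences, min_links=3):
    """Build a symmetric table: True where two sentences share enough links."""
    table = [[False] * n_sentences for _ in range(n_sentences)]
    for (i, j), count in links.items():
        if count >= min_links:
            # Bonds are undirected, so fill both halves of the table.
            table[i][j] = table[j][i] = True
    return table
```

A sentence whose row contains no `True` entries is bonded to nothing else in the essay, which is the kind of weak spot a cohesion visualizer could highlight.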
six types
Conclusion
We proposed an application that detects cohesion errors in text in agreement with those indicated by experts. The system’s accuracy is at an acceptable level according to expert opinion.
In future work, we first plan to improve the process of matching keywords for more accurate results by augmenting the existing process with more specific linguistic rules.
See-To-Retrieve: Efficient Processing of Spatio-Visual Keyword Queries
Chao Zhang, Lidan Shou, Ke Chen, Gang Chen
College of Computer Science
Zhejiang University, China
ACM SIGIR 2012
Spatio-Visual Keyword
Ex. A user searches for introductory information about a distant grand church within her eyesight.
Goals
visually conspicuous
semantically relevant
document space, physical space
WYRIWYS (What-You-Retrieve-Is-What-You-See)
Motivation
State-of-the-art spatial retrieval methods are mostly distance-based but overlook the visibility of objects.
Italian food
Visibility Analysis
System Flowchart
Ranking Mechanism
Experiment
Data sets:
1. Street objects in Los Angeles (131,461 MBRs)
2. Gowalla (28,867 Web documents)
柏安
亞婷 家愷 冠中 ???