
DETECTING EMOTIVE SENTENCES WITH PATTERN-BASED
LANGUAGE MODELLING
Michal Ptaszynski
Fumito Masui
Rafal Rzepka
Kenji Araki
Kitami Institute of Technology
Hokkaido University

PRESENTATION OUTLINE
1. Problem definition (emotive vs. non-emotive)
2. Language Modeling
3. Pattern Based Language Modeling
4. Experiment setup
5. Results and discussion
6. Conclusions and Future work

PROBLEM DEFINITION
MAINSTREAM:
• positive vs. negative
• emotion types
DISREGARDED OR TREATED AS A SUBTASK:
• is the sentence emotional or neutral? (preselection)

PROBLEM DEFINITION
“Was the speaker in an emotional state?”
emotional / neutral
Easy to ask laypeople, because everyone thinks they are specialists in their own emotions.
1. Junko Minato, David B. Bracewell, Fuji Ren and Shingo Kuroiwa. 2006. Statistical Analysis
of a Japanese Emotion Corpus for Natural Language Processing. LNCS 4114, pp. 924-929
2. Saima Aman and Stan Szpakowicz. 2007. Identifying expressions of emotion in text. In
Proceedings of the 10th International Conference on Text, Speech, and Dialogue (TSD-
2007), Lecture Notes in Computer Science (LNCS), Springer-Verlag.
3. Alena Neviarouskaya, Helmut Prendinger and Mitsuru Ishizuka. 2011. Affect analysis
model: novel rule-based approach to affect sensing from text. Natural Language
Engineering, Vol. 17, No. 1 (2011), pp. 95-135.

PROBLEM DEFINITION
emotional / neutral
subjective / objective
“Was the speaker in an emotional state?”
“Did the speaker present the contents from a first-person-centric perspective or from no specified perspective?”
Subjectivity doesn’t have much to do with emotions; it is only that expressions of emotions also tend to be first-person-centric.
An opinion can be subjective but neutral:
• “I think it will rain tomorrow.”
• “In my opinion the government should have applied a different policy.”
(Note: this is not about positive/negative.)
1. Janyce M. Wiebe, Rebecca F. Bruce and Thomas P. O’Hara. 1999. Development
and use of a gold-standard data set for subjectivity classifications. In Proceedings
of the Association for Computational Linguistics (ACL-1999), pp. 246-253.
2. Theresa Wilson and Janyce Wiebe. 2005. Annotating Attributions and Private
States. Proceedings of the ACL Workshop on Frontiers in Corpus Annotation II, pp.
53-60.
3. Hong Yu and Vasileios Hatzivassiloglou. 2003. Towards answering opinion
questions: separating facts from opinions and identifying the polarity of opinion
sentences. In Proceedings of Conference on Empirical Methods in Natural Language
Processing (EMNLP-2003), pp. 129-136.
4. Vasileios Hatzivassiloglou and Janyce Wiebe. 2000. Effects of adjective orientation
and gradability on sentence subjectivity. In Proceedings of International
Conference on Computational Linguistics (COLING-2000), pp. 299-305.

PROBLEM DEFINITION
“Was the speaker in an emotional state?”
“Did the speaker present the contents from a first-person-centric perspective or from no specified perspective?”
“Was the sentence expressed with emphasis (distinguishable linguistic emotive features)?”
emotive / non-emotive
emotional / neutral
subjective / objective

PROBLEM DEFINITION
• All emotive sentences are emotional.
• All neutral and objective sentences are non-emotive.
• Some emotional sentences could be non-emotive.
• Subjective sentences could be emotive/non-emotive as well as emotional/neutral.

PROBLEM DEFINITION
Emotive/non-emotive classification could help distinguish between emotional/neutral and between subjective/objective.

PROBLEM DEFINITION
In linguistics:
Karl Bühler in 1934: 3 functions of language: descriptive, impressive, emotive.
Stevenson in 1937: emotiveness (with regard to morality as a concept influenced by emotions).
Roman Jakobson in 1960: 6 functions of language: +poetic, +phatic, +metalingual.
• Karl Bühler. 1990. Theory of Language. The Representational Function of Language. John Benjamins Publ. (reprint of: Karl Bühler. 1934. Sprachtheorie. Die Darstellungsfunktion der Sprache. Ullstein, Frankfurt a. Main, Berlin, Wien.)
• Stevenson, C. L. 1937. The Emotive Meaning of Ethical Terms. In Stevenson, C. L. Facts and Values. Yale University Press (published 1963). ISBN 0-8371-8212-3.
• Roman Jakobson. 1960. Closing Statement: Linguistics and Poetics. Style in Language, pp. 350-377, The MIT Press.

PROBLEM DEFINITION
In linguistics:
Emotive elements: interjections/exclamations (aah, ooh, whoa, great!), hypocoristics (endearments: dog → doggy), emotive punctuation (!, ??, …, ~), emoticons ( :-) , ^o^ ), onomatopoeic/mimetic expressions (gitaigo in Japanese).
Plus: combinations of elements.
ああ、今日はなんて気持ちいい日なんだ!(^O^)/ — Oh, what a pleasant day today, isn’t it?! (^O^)/
(annotated on the slide as: interjection, exclamative phrase, exclamative grammar, exclamation mark, emoticon)
Are there non-emotive elements??
How to extract them both?

PROBLEM DEFINITION
ああ、今日はなんて気持ちいい日なんだ!(Oh, what a pleasant day today, isn’t it?)
This sentence contains the pattern:
ああ * なんて * なんだ! (Oh, what a * isn’t it?)
1. This pattern cannot be discovered with an n-gram approach.
2. This pattern cannot be discovered if one doesn’t know what to look for.
Need to find a way to extract such frequent sophisticated patterns
from corpora.
*) pattern = something that frequently appears in a corpus (more than once).
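A minimal sketch of the point above: the pattern has two independent gaps of unrestricted length, so no fixed n-gram or fixed-skip skip-gram covers all three anchors at once; yet once the pattern is known, matching it is trivial. The helper pattern_to_regex below is our own illustration, not part of the presented method.

import re

# Two unrestricted gaps ("*") between three fixed elements.
pattern = "ああ * なんて * なんだ!"

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a wildcard sentence pattern into a regex: each literal
    chunk is escaped, each '*' becomes a non-greedy 'anything' gap.
    (Illustrative helper only.)"""
    chunks = [re.escape(c.strip()) for c in pattern.split("*")]
    return re.compile(".*?".join(chunks))

sentence = "ああ、今日はなんて気持ちいい日なんだ!"
print(bool(pattern_to_regex(pattern).search(sentence)))  # True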

LANGUAGE MODELING

INTRODUCTION
Language modelling: HOW YOU IMAGINE IT (image) … IN REALITY (image)

LANGUAGE MODELING
A statistical representation of a piece of language data.

LANGUAGE MODELING
1. Bag-of-words
2. N-gram
3. Skip-gram

LANGUAGE MODELING
1. Bag-of-words
2. N-gram
3. Skip-gram
Unordered set of words
The dog bit the man = The man bit the dog
- No grammar
- No word order
- Just a bag of words…
• Harris, Zellig. 1954. Distributional Structure. Word, 10 (2/3), pp. 146-162.
• E. Cambria and A. Hussain. 2012. Sentic Computing: Techniques, Tools, and Applications. Dordrecht, Netherlands: Springer.
• Yuanhua Lv and ChengXiang Zhai. 2009. Positional Language Models for Information Retrieval. In Proceedings of the 32nd
international ACM SIGIR conference on Research and development in information retrieval (SIGIR), pp. 299-306.
POPULAR IN MACHINE LEARNING
Modifications:
• Positional Language Model
• Bag-of-concepts
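A minimal sketch of the bag-of-words representation; it shows concretely why the two dog/man sentences above collapse to the same model.

from collections import Counter

def bag_of_words(sentence: str) -> Counter:
    # Order is discarded: only word identities and counts remain.
    return Counter(sentence.lower().split())

a = bag_of_words("The dog bit the man")
b = bag_of_words("The man bit the dog")
print(a == b)  # True: the model cannot tell the two sentences apart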

LANGUAGE MODELING
1. Bag-of-words
2. N-gram
3. Skip-gram
Sentence = set of n-long ordered sub-sequences
of words.
The dog bit the man
2grams:
the dog | dog bit | bit the | the man
3grams:
the dog bit | dog bit the | bit the man
4grams:
the dog bit the | dog bit the man
• C. E. Shannon. 1948. A Mathematical Theory of Communication. The Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656.
• A. A. Markov. Extension of the limit theorems of probability theory to a sum of variables connected in a chain. Reprinted in Appendix B of: R. Howard. 1971. Dynamic Probabilistic Systems, Vol. 1: Markov Chains. John Wiley and Sons.
POPULAR IN MACHINE TRANSLATION
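A minimal sketch of contiguous n-gram extraction, reproducing the 2-gram and 3-gram lists above.

def ngrams(tokens, n):
    # All contiguous n-long sub-sequences, in order.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the dog bit the man".split()
print(ngrams(tokens, 2))  # [('the','dog'), ('dog','bit'), ('bit','the'), ('the','man')]
print(ngrams(tokens, 3))  # [('the','dog','bit'), ('dog','bit','the'), ('bit','the','man')]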

LANGUAGE MODELING
1. Bag-of-words
2. N-gram
3. Skip-gram
Sentence = some words within an n-gram can be skipped over.
(1) John went to school today.
John went 👍   went to 👍
3gram: John went to → 1skip2gram: John _ to → John * to 👍
4gram: John went to school → 2skip2gram: John _ _ school → John * school 👍
To do this you need a skip-gram model with modified Kneser-Ney smoothing… and you still don’t get the whole picture:
(2) John went to this awful place many people tend to generously call school today.
John * to * today 👍 — but skip-grams cannot help extracting such patterns, because:
1. The “skip” can appear in only one place.
2. The same number of skips needs to be retained for each gap (full control of the skip-length):
w s{1} w s{1} w ≠ w s{1} w s{10} w
NOT SO POPULAR
• Xuedong Huang, Fileno Alleva, Hsiao-wuen Hon, Mei-yuh Hwang, Ronald Rosenfeld. 1992. The SPHINX-II Speech Recognition System: An Overview. Computer, Speech and Language, Vol. 7, pp. 137-148.
• Guthrie, D., Allison, B., Liu, W., Guthrie, L., & Wilks, Y. 2006. A closer look at skip-gram modelling. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC-2006), pp. 1-4.
• Rene Pickhardt, Thomas Gottron, Martin Korner, Paul Georg Wagner, Till Speicher, Steffen Staab. 2014. A Generalized Language Model as the Combination of Skipped n-grams and Modified Kneser Ney Smoothing. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), pp. 1145-1154.
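A minimal sketch of this limitation, using one common definition of skip-grams (at most k skipped tokens in total; Guthrie et al., 2006 — definitions vary). The helper is our illustration, not from the slides.

from itertools import combinations

def k_skip_n_grams(tokens, n, k):
    # All n tokens chosen in order, allowing at most k skipped
    # positions in total between the first and last chosen token.
    grams = set()
    for idx in combinations(range(len(tokens)), n):
        if idx[-1] - idx[0] <= (n - 1) + k:
            grams.add(tuple(tokens[i] for i in idx))
    return grams

s1 = "John went to school today".split()
s2 = ("John went to this awful place many people "
      "tend to generously call school today").split()

print(("John", "to") in k_skip_n_grams(s1, 2, 1))            # True
# "John ... to ... today" skips 11 tokens in sentence (2), so it
# only appears once k is raised to 11 -- and k is fixed model-wide:
print(("John", "to", "today") in k_skip_n_grams(s2, 3, 2))   # False
print(("John", "to", "today") in k_skip_n_grams(s2, 3, 11))  # True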

LANGUAGE MODELING
1. Bag-of-words
2. N-gram
3. Skip-gram
Solution &
simplification:

PATTERN BASED LANGUAGE MODELING
SPEC – Sentence Pattern Extraction arChitecture
Sentence pattern = an ordered, non-repeated combination of sentence elements.
For 1 ≤ k ≤ n there are all possible k-long patterns: a sentence of n elements yields C(n,k) patterns for each k, hence 2^n − 1 patterns in total.
Extract patterns from all sentences and calculate their occurrences.
Compute a normalized pattern weight and a score for each sentence; then classify/compare emotive sentences with non-emotive ones.
• Michal Ptaszynski, Rafal Rzepka, Kenji Araki and Yoshio Momouchi. 2011. Language combinatorics: A sentence pattern extraction architecture based on combinatorial explosion. International Journal of Computational Linguistics (IJCL), Vol. 2, Issue 1, pp. 24-36.
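The weight and score formulas were shown as images on the original slide and did not survive extraction. Below is a hedged LaTeX reconstruction consistent with the slide’s wording (normalized weight from emotive vs. non-emotive occurrences O_j; sentence score as a sum over matched patterns, with the length/occurrence awards tested later in the experiment). Treat the exact forms as an assumption, not the paper’s definitive equations.

% Number of patterns generated from an n-element sentence:
\[ \sum_{k=1}^{n} \binom{n}{k} = 2^{n} - 1 \]
% Normalized pattern weight (assumed form), w_j in [-1, 1]:
\[ w_j = \left( \frac{O_j^{\mathrm{emotive}}}{O_j^{\mathrm{emotive}} + O_j^{\mathrm{non\text{-}emotive}}} - 0.5 \right) \times 2 \]
% Score for one sentence: sum over matched patterns, optionally
% awarding pattern length k_j and occurrence O_j:
\[ \mathrm{score} = \sum_{j} w_j, \qquad
   \mathrm{score}_{\mathrm{len}} = \sum_{j} w_j\, k_j, \qquad
   \mathrm{score}_{\mathrm{len+occ}} = \sum_{j} w_j\, k_j\, O_j \]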

PATTERN BASED LANGUAGE MODELING
1. Bag-of-words
2. N-gram
3. Skip-gram
4. PBLM (all of the above + more)
Some words can be skipped over, in any number of places and with gaps of any length.
(1) John went to school today.
(2) John went to this awful place many people tend to generously call school today.
John went 👍   went to 👍
John * to 👍
John * school 👍
John * to * today 👍
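A minimal sketch of the generation step SPEC describes: for every k, all ordered, non-repeated combinations of the sentence’s n elements, with each skipped stretch collapsed into one “*” wildcard, followed by occurrence counting over the corpus. Names are ours; note the deliberate combinatorial explosion (2^n − 1 index choices per sentence).

from itertools import combinations
from collections import Counter

def spec_patterns(elements):
    """All ordered, non-repeated combinations of the sentence's
    elements; every skipped stretch is collapsed into one '*'."""
    n = len(elements)
    patterns = set()
    for k in range(1, n + 1):                      # 1 <= k <= n
        for idx in combinations(range(n), k):      # C(n, k) choices
            parts = [elements[idx[0]]]
            for prev, cur in zip(idx, idx[1:]):
                if cur - prev > 1:                 # a gap was skipped
                    parts.append("*")
                parts.append(elements[cur])
            patterns.add(" ".join(parts))
    return patterns                                # up to 2**n - 1 patterns

sent = ["ああ", "、", "今日", "は", "なんて", "気持ちいい", "日", "なんだ", "!"]
print("ああ * なんて * なんだ !" in spec_patterns(sent))     # True

# Occurrence counting over a (toy) corpus; "pattern" = appears more than once:
corpus = [sent, ["ああ", "、", "なんて", "きれい", "なんだ", "!"]]
counts = Counter(p for s in corpus for p in spec_patterns(s))
frequent = {p for p, c in counts.items() if c > 1}
print("ああ * なんて * なんだ !" in frequent)                 # True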

EXPERIMENT SETUP
DATASET
91 sentences close in meaning but differing in emotional load (50 emotive, 41 non-emotive), gathered in an anonymous survey of 30 people of different backgrounds (students, businessmen, housewives).
Examples:
Emotive:
高すぎるからね (Takasugiru kara ne) — ’Cause it’s just too expensive
すごくきれいな海だなあ (Sugoku kirei na umi da naa) — Oh, what a beautiful sea!
なんとあの人、結婚するらしいよ (Nanto ano hito, kekkon suru rashii yo) — Have you heard? She’s getting married!
Non-emotive:
高額なためです。 (Kougaku na tame desu.) — Due to high cost.
きれいな海です (Kirei na umi desu) — This is a beautiful sea.
あの人、結婚するらしいです (Ano hito, kekkon suru rashii desu) — They say she is getting married.

EXPERIMENT SETUP
Preprocessing
Sentence: 今日はなんて気持ちいい日なんだ!
Transliteration: Kyō wa nante kimochi ii hi nanda!
Translation: What a pleasant day it is today!
Preprocessing examples:
1. Tokens: Kyō wa nante kimochi ii hi nanda !
2. POS: N TOP ADV N ADJ N COP EXCL
3. Tokens+POS: Kyō[N] wa[TOP] nante[ADV] kimochi[N] ii[ADJ] hi[N] nanda[COP] ![EXCL]
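A minimal sketch of building the three representations from (token, POS) pairs. The tagging itself would come from a morphological analyser (e.g. MeCab for Japanese); here the pairs are given directly.

tagged = [("Kyō", "N"), ("wa", "TOP"), ("nante", "ADV"), ("kimochi", "N"),
          ("ii", "ADJ"), ("hi", "N"), ("nanda", "COP"), ("!", "EXCL")]

tokens    = [tok for tok, _ in tagged]                 # 1. Tokens
pos       = [p for _, p in tagged]                     # 2. POS
token_pos = [f"{tok}[{p}]" for tok, p in tagged]       # 3. Tokens+POS

print(" ".join(tokens))     # Kyō wa nante kimochi ii hi nanda !
print(" ".join(pos))        # N TOP ADV N ADJ N COP EXCL
print(" ".join(token_pos))  # Kyō[N] wa[TOP] ... ![EXCL]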

EXPERIMENT SETUP
Pattern List Modifications:
1. All patterns
2. Zero-patterns deleted
3. Ambiguous patterns deleted
All patterns vs. only n-grams
Weight Calculation Modifications:
1. Normalized
2. Award length
3. Award length and occurrence
Automatic threshold setting
10-fold cross-validation
One experiment = 280 runs

EXPERIMENT SETUP
Scores calculated as:
- Precision
- Recall
- Balanced F-score
- Accuracy
- Specificity
- Phi-coefficient
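For reference, a sketch of the six measures computed from one confusion matrix (emotive treated as the positive class); zero-division guards omitted for brevity.

def scores(tp, fp, fn, tn):
    """Six evaluation measures from one confusion matrix
    (emotive = positive class)."""
    precision   = tp / (tp + fp)
    recall      = tp / (tp + fn)                  # a.k.a. sensitivity
    f1          = 2 * precision * recall / (precision + recall)
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    specificity = tn / (tn + fp)
    phi = (tp * tn - fp * fn) / (
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return precision, recall, f1, accuracy, specificity, phi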

RESULTS AND DISCUSSION
(Results chart: scores for Tokenized vs. Tokens+POS preprocessing, with and without weight normalization and length awarding; figure not reproduced.)
• Specific elements are more effective.
• Patterns perform better than n-grams.
• Awarding pattern length yields higher results.
• Best result: F = 0.76, P = 0.64, R = 0.95.
• SPEC scored slightly worse than ML-Ask (F = 0.79, P = 0.8, R = 0.78), but SPEC is fully automatic where ML-Ask is handcrafted.
• SPEC is more efficient (the user does nothing), applicable to other languages, and can point out non-emotive elements.

RESULTS AND DISCUSSION
Examples of extracted patterns (Tokenized):

Emotive                  Non-emotive
freq.  example pattern   freq.  example pattern
14     、*た              11     い*。
12     で                 8      し*。
11     ん*。               7      です。
11     と                 6      は*です
11     ー                 6      まし*。
10     、*た*。            5      ました。
9      、*よ               5      ます
9      、*ん               5      い
8      し                 4      です*。
7      ない                3      この*は*。
7      !                  3      は*です。
6      ん*よ               3      て*ます
6      、*だ               3      が*た。
6      ちゃ                3      美味しい
6      よ。                3      た。
5      だ*。               2      た*、*。
5      に*よ               2      せ
5      が*よ               2      か
5      ん                 2      さ

Emotive: casual wording, prolongation marks, sentence-final particles (SFPs), subject particles.
Non-emotive: official forms (desu/masu), more periods, no exclamation marks, etc.

RESULTS AND DISCUSSION
EXAMPLE SENTENCES
Emotive:
Example 1. メガネ、そこにあったんだよ。 (Megane, soko ni atta n da yo.) — The glasses were over there!
Example 2. ううん、舞台が見えないよ。 (Uun, butai ga mienai yo.) — Ooh, I cannot see the stage!
Example 3. ああ、おなかがすいたよ。 (Aa, onaka ga suita yo.) — Ohh, I’m so hungry!
Non-emotive:
Example 4. 高額なためです。 (Kougaku na tame desu.) — Due to high cost.
Example 5. きれいな海です (Kirei na umi desu) — This is a beautiful sea.
Example 6. 今日は雪が降っています。 (Kyou wa yuki ga futte imasu.) — It is snowing today.

CONCLUSIONS & FUTURE WORK
Presented research on extracting emotive patterns.
Used SPEC, a method for automatic extraction of patterns from sentences.
Extracted patterns from a set of emotive and non-emotive sentences.
Classified sentences (test data) with those patterns.
Compared different preprocessing techniques (tokenization, POS, tokens+POS).
The best results were obtained by patterns containing both tokens and POS (F-score = 76%, Precision = 64%, Recall = 95%).
Results for POS alone were the lowest, which means the algorithm works better on less abstracted data.
The results of SPEC were compared to the ML-Ask affect analysis system. ML-Ask achieved better Precision, but lower Recall. However, SPEC is fully automatic and thus more efficient and language-independent.
Many of the automatically extracted patterns appear in the handcrafted databases of ML-Ask, which suggests it could be possible to improve ML-Ask’s performance by extracting additional patterns with SPEC.
In the future we will try to quantify the correlation between the emotive, subjective and emotional categories.

THANK YOU FOR YOUR ATTENTION!
Kitami Institute of Technology
ptaszynski@ieee.org
http://orion.cs.kitami-it.ac.jp/tipwiki/michal