

Lexical Modeling of Yamabuki (Japanese Kerria) in Classical Japanese Poetry

Hilofumi Yamamoto∗

Tokyo Institute of Technology
University of California, San Diego

Keywords: corpus linguistics, co-occurrence weight, visualization, Japanese literature, network modeling

Abstract
This project is a lexical study of classical Japanese poetic vocabulary through network analysis. The analysis is based on co-occurrence patterns, defined as any two words appearing together in a poem. We developed corpora of classical Japanese poetry from the Hachidaishū, the eight anthologies compiled under imperial order between ca. 905 and 1205. The co-occurrence weighting, cw, allows us to examine the patterns of poetic word constructions through mathematical modeling (Yamamoto, 2006). In general, we observed a main hub node derived from a topic word. We also encountered other hub nodes that correspond neither to topic words nor to entry items in a poetic dictionary. We conclude that a term such as yahe can appear as a hub node that connects a topic word with other peripheral words, and that it plays a supporting role in forming the poetic story of a poem even though it is not included in dictionaries of Japanese poetic words.

1 Introduction
This project is a lexical study of classical Japanese poetic vocabulary through network analysis based on graph theory. The analysis is mainly conducted with co-occurrence patterns, defined as any two words appearing together in a poem.
Many scholars of classical Japanese poetry have tried to explain the constructions of poetic vocabulary by relying on their own sensitivity and experience. Because scholars can only demonstrate constructions that they can consciously point out, those of which they are unconscious will never be demonstrated. In general, a dictionary writer records only what he or she is conscious of, so important knowledge held in the subconscious mind of researchers goes unstated. To describe this subconscious knowledge when developing a dictionary, it is important to adopt a computer-assisted description that does not depend on the researcher's intuition, sensitivity, or consciousness. To this end, we use co-occurrence weighting methods on corpora of classical Japanese poetry to select the lexical items to be described in a dictionary.

2 The flowers of Yamabuki (Japanese kerria)
Three words, yamabuki (Japanese kerria), kahazu (frog), and Ide (a proper noun denoting a place in Kyoto), are frequently used together in waka and haiku and are closely related to each other. They appear together not only in poetry but also in Japanese art such as ukiyoe (Figures 1 and 2). In the context of Japanese culture, frog and Ide are always used together with yamabuki. If this relationship is common knowledge in Japanese culture, it should be observable in poetic texts; we therefore analyze the texts and draw the three-word relation by visualization based on graph theory.

∗ [email protected]: W1-8 2-12-1 O-okayama, Meguro-ku, Tokyo, 152-8550, Japan

Fig. 1 Yamabuki-to-kahazu by Hiroshige Utagawa. (http://www.gekkanbijutsu.co.jp/shop/goods/030761011.htm)

Fig. 2 Yamashironokuni Ide no Tamagawa by Kuniyoshi Utagawa in 1847. (https://ukiyo-e.org/image/metro/5245-006-01(01)

3 Methods
The material of the present study is the Hachidaishū, the eight anthologies compiled by order of the emperors (ca. 905–1205), which contain about 9,500 poems. We developed corpora of these anthologies and a method of co-occurrence weighting, cw (Yamamoto, 2006), which calculates the weight of a pattern of any two words appearing in a poem in a manner similar to the tf-idf method (Sparck Jones, 1972; Robertson, 2004; Manning and Schütze, 1999).

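\[
\begin{aligned}
w(t, d) &= \bigl(1 + \log tf(t, d)\bigr) \cdot idf(t) \\
cw(t_1, t_2, d) &= \bigl(1 + \log ctf(t_1, t_2, d)\bigr) \cdot cidf(t_1, t_2) \\
cidf(t_1, t_2) &= \sqrt{idf(t_1) \cdot idf(t_2)} \\
idf(t) &= \log \frac{N}{df(t)}
\end{aligned}
\]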

Here w is the weight, t is a token, and N is the number of tokens. The function idf is called "inverse document frequency" (Sparck Jones, 1972; Robertson, 2004; Manning and Schütze, 1999). The function cw is called "co-occurrence weight"; it allows us to examine the patterns of poetic word constructions through mathematical modeling.
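As a rough illustration of how the cw formula might be evaluated, the following Python sketch computes idf, cidf, and cw on a tiny toy corpus. Several points are assumptions made only for this sketch and are not taken from the paper: N is set to the number of poems (the usual tf-idf convention), ctf(t1, t2, d) is taken as the smaller of the two term counts in poem d, the logarithm is the natural logarithm, and the four toy "poems" are invented token lists rather than Hachidaishū data.

```python
import math


def idf(term, poems):
    """idf(t) = log(N / df(t)).

    Assumption: N is the number of poems and df(t) is the number of
    poems that contain the term at least once.
    """
    n = len(poems)
    df = sum(1 for poem in poems if term in poem)
    return math.log(n / df)


def cw(t1, t2, poem, poems):
    """cw(t1, t2, d) = (1 + log ctf(t1, t2, d)) * sqrt(idf(t1) * idf(t2))."""
    # Assumption: ctf is the number of times the pair can be matched
    # inside the poem, i.e. the smaller of the two term counts.
    ctf = min(poem.count(t1), poem.count(t2))
    if ctf == 0:
        return 0.0  # the two words do not co-occur in this poem
    cidf = math.sqrt(idf(t1, poems) * idf(t2, poems))
    return (1 + math.log(ctf)) * cidf


# Invented toy corpus: each poem is a list of already-tokenized words.
poems = [
    ["kahazu", "naku", "ide", "yamabuki"],
    ["yamabuki", "hana"],
    ["kahazu", "naku"],
    ["hana", "chiru"],
]

print(cw("yamabuki", "kahazu", poems[0], poems))
```

In this toy corpus both words occur in two of the four poems, so each idf is log 2 and the pair weight in the first poem is also log 2. A pair that never co-occurs in the given poem receives a weight of zero.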


4 Results
When we draw a network model from co-occurrence patterns, we generally observe a main hub node derived from a topic word. We also encountered other hub nodes that correspond neither to topic words nor to words generally found in a poetic dictionary. We took yamabuki (Japanese kerria) as a topic word and drew its network model. As expected, we observed kahazu (frog) and Ide (place name, proper noun) as hub nodes. In addition, yahe (eightfold or double flower) appeared as a hub node, although it is not described in any poetic dictionary of classical Japanese.
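The procedure of drawing a network, pruning the topic node, and looking for the remaining hubs can be sketched in Python as follows. This is only a minimal sketch: the function names, the degree-based definition of a hub, and all pair weights are hypothetical values chosen for illustration and are not results from the corpus. The threshold of 2.5 mirrors the "cw > 2.50" setting shown in the figure headers.

```python
from collections import defaultdict


def build_graph(pair_weights, threshold=2.5):
    """Keep only word pairs whose (hypothetical) cw exceeds the threshold."""
    graph = defaultdict(set)
    for (a, b), weight in pair_weights.items():
        if weight > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def prune(graph, node):
    """Return a copy of the graph with one node and its edges removed."""
    return {n: nbrs - {node} for n, nbrs in graph.items() if n != node}


def hubs(graph, min_degree=2):
    """List nodes with at least min_degree neighbours, highest degree first."""
    found = [n for n, nbrs in graph.items() if len(nbrs) >= min_degree]
    return sorted(found, key=lambda n: (-len(graph[n]), n))


# Invented weights for illustration only; not values from the paper.
pair_weights = {
    ("yamabuki", "kahazu"): 3.0,
    ("yamabuki", "ide"): 3.1,
    ("yamabuki", "yahe"): 2.9,
    ("yamabuki", "hana"): 2.6,
    ("kahazu", "ide"): 3.2,
    ("kahazu", "naku"): 2.8,
    ("ide", "tamagawa"): 2.6,
    ("yahe", "kokonohe"): 3.0,
    ("yahe", "nanahe"): 2.7,
    ("yahe", "hitohe"): 2.7,
    ("hana", "chiru"): 2.0,  # below the threshold, so this edge is dropped
}

graph = build_graph(pair_weights)
pruned = prune(graph, "yamabuki")
print(hubs(pruned))
```

Pruning the topic word removes the edges that every neighbour shares with it, so the secondary hubs stand out by their own connections. With these toy weights, yahe remains connected to kokonohe, nanahe, and hitohe after yamabuki is removed.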

[Figure: network graph. Panel header: 蛙 (15/15/15, 6.45) cw > 2.50 K:1-8 U:1 L:0.00 M:7 Z:1.00]

Fig. 3 Graph model of kahazu (蛙, frog) before pruning node 蛙.

A minor term, yahe (eightfold), appears as a hub node that plays a major role in connecting a topic word with other peripheral words that support the story of a poem. Such minor words are not found in dictionaries of poetic terms.

'Ide', a place name in Tsuzuki-gun, Kyoto-fu, is famous for its kahazu (Tani, 2006, 25).

In the Heian period (794–1185), almost all poems that include kahazu (frog) are composed using the terms Ide and yamabuki (a flower, Japanese kerria), as in the following poem:

Kahazu naku / Ide no yamabuki / chirinikeri / hana no sakari ni / ahamashi monowo
(where frogs are crying / the flowers of Japanese kerria in Ide / have fallen / when the flowers were in full bloom / I wish I had been there to see them)

As we have seen in the explanation of 'Ide', "Ide no Tamagawa" is located in Yamashiro no kuni, which is currently called Ide-cho, Kyoto-fu. Kerria had originally been planted in the precincts of Itsutsumidera temple, which belonged to the Tachibana clan. Because of this, Ide became famous for its kerria.


[Figure: network graph. Panel header: 蛙 (15/15/15, 6.45) cw > 2.50 K:1-8 U:1 L:0.00 M:7 Z:1.00]

Fig. 4 Graph model of kahazu (蛙, frog) after pruning node 蛙.

5 Discussions
The terms yamabuki, kahazu, and Ide are contained in some poetic dictionaries as entry items or collocations. The term yahe, however, is not found in any poetic dictionary, even as a single term.

6 Conclusion
We conclude that a term such as yahe can appear as a hub node that plays an important role in connecting a topic word with other peripheral words such as kokonohe, nanahe, and hitohe, and that it plays a supporting role in forming the poetic story of a poem even though it is not included in a dictionary.
The finding of this study is that the modeling developed here allows us to 1) discern not only patterns described by experts but also patterns as yet undescribed, and 2) identify not only specific or tangible words but also abstract or conceptual words, which tend to be left out of dictionaries.

References
Manning, Christopher D. and Hinrich Schütze (1999) Foundations of Statistical Natural Language Processing, Cambridge, Massachusetts: The MIT Press.

Robertson, Stephen (2004) "Understanding inverse document frequency: on theoretical arguments for IDF", Journal of Documentation, Vol. 60, pp. 503–520.


[Figure: network graph with English glosses for the node labels. Panel header: 山吹 (44/44/44, 5.37) cw > 2.50 K:1-8 U:1 L:0.00 M:7 Z:1.00]

Fig. 5 A graph model of Yamabuki: a core node, 山吹 yamabuki, is pruned. kahazu (蛙, frog), Ide (井手, place name, proper name), and yahe (八重, eightfold or double flower) are observed as hub nodes.

Fig. 6 Variations of Japanese kerria (yamabuki): Single petal (left), white petal (center), and plena petal (right) of yamabuki. (http://mkfarm.blog118.fc2.com/blog-entry-27.html)


Sparck Jones, Karen (1972) "A Statistical Interpretation of Term Specificity and Its Application in Retrieval", Journal of Documentation, Vol. 28, pp. 11–21.

Tani, Tomoko (2006) Wakabungaku no kisochishiki, Kadokawa sensho: Kadokawa Gakugei Shuppan.

Yamamoto, Hilofumi (2006) "Konpyuta niyoru utamakura no bunseki / A Computer Analysis of Place Names in Classical Japanese Poetry", in Atti del Terzo Convegno di Linguistica e Didattica Della Lingua Giapponese, Roma 2005: Associazione Italiana Didattica Lingua Giapponese (AIDLG), pp. 373–382.