HowNet and Computation of Meaning Zhendong Dong dzd@keenage.com WWW.keenage.com GWC-06 Jeju, Korea 2006-01-22

Page 1: HowNet and Computation of Meaning Zhendong Dong dzd@keenage.com  GWC-06 Jeju, Korea 2006-01-22

HowNet and

Computation of Meaning

Zhendong Dong

dzd@keenage.com

WWW.keenage.com

GWC-06

Jeju, Korea

2006-01-22

Page 2

Outline

Bird’s-eye view of HowNet

Prominent features

Page 3

Bird’s-eye view of HowNet

What is HowNet?
History of HowNet
Statistics on latest version
Composition of HowNet

Page 4

What is HowNet?

HowNet is an on-line extralinguistic knowledge system for the computation of meaning in HLT.

HowNet unveils inter-concept relations and inter-attribute relations of the concepts as connoted in its Chinese-English lexicon.

Page 5

History of HowNet

1988 Basic research started

1999 1st version released

2000 Revision of KDML started

2002 New version released

Page 6

Statistics - general

Chinese word & expression 84102

English word & expression 80250

Chinese meaning 98530

English meaning 100071

Definition 25295

Record 161743

Page 7

A record in HowNet dictionary

NO.=076856

W_C=买主

G_C=N [mai3 zhu3]

E_C=

W_E=buyer

G_E=N

E_E=

DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
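
The record layout above is simple enough to read mechanically. The following is a minimal sketch, assuming one FIELD=value pair per line; `parse_record` is a hypothetical helper written for this illustration and is not part of the HowNet distribution.

```python
def parse_record(text: str) -> dict[str, str]:
    """Split a HowNet-style dictionary record into a {field: value} mapping."""
    record = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only, so the "=" signs inside DEF survive.
        key, _, value = line.partition("=")
        record[key] = value
    return record


# The "buyer" record from the slide, one field per line.
RECORD = """
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
"""

buyer = parse_record(RECORD)
print(buyer["W_E"], "->", buyer["DEF"])
```

Empty fields such as E_C and E_E come back as empty strings, which keeps the Chinese and English halves of the record aligned.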

Page 8

Statistics - semantic

                  Chinese   English
Thing             58153     58096
Component         7025      7023
Time              2238      2244
Space             1071      1071
Attribute         3776      4045
Attribute-value   9089      8478
Event             12634     10076

Page 9

Statistics – main syntactic categories

        Chinese   English
ADJ     11705     9576
ADV     1516      2084
VERB    25929     21017
NOUN    46867     48342
PRON    112       71
NUM     225       242
PREP    128       113
AUX     77        49
CLA     424       0

Page 10

Statistics – part of relations

Chinese
synset: Set = 13463, Word Form = 54312
antonym: Set = 12777
converse: Set = 6753

English
synset: Set = 18575, Word Form = 58488
antonym: Set = 12032
converse: Set = 6442

Page 11

Composition

Database
Tools for computation of meaning

Page 12

Database

Dictionary
Taxonomies
Axiomatic relations & role shifting

Page 13

Dictionary

Page 14

Taxonomies - 10

Entity
Event
Attribute
AttributeValue
Secondary features
Event roles
Typical actors of event roles
Event relations and role shifting
Antonymous sememe pairs
Converse sememe pairs

Page 15

Tools for computation of meaning

Browser
Secondary resources

Page 16

Prominent features

All syntactic classes of words included
Sememes and semantic roles
Defining concepts in KDML on the basis of sememes and semantic roles
Relations – the soul of HowNet
Relations obtained by computation rather than manual coding
Identical representation in various linguistic structures

Page 17

Sememes

Sememes 2099
Entity 151
  thing (physical, mental, fact)
  component (part, fitting)
  time
  space (direction, location)
Event (relation, state; action) 812
Attribute 247
AttributeValue 889

Secondary feature 121

Page 18

Semantic roles 91

(1) Main semantic roles
(a) principal semantic roles: 6
(b) affected semantic roles: 11

(2) Peripheral semantic roles
(a) time: 12
(b) space: 11
(c) resultant: 8
(d) manner: 11
(e) modifier: 16
(f) basis: 6
(g) comparison: 2
(h) coordination: 6
(i) commentary: 2
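
The totals on the Sememes and Semantic roles slides can be checked with a few lines of arithmetic. This is only a consistency check on the published figures; the dictionary names are invented for the sketch. One observation it supports: the four sememe classes alone add up to 2099, so the 121 secondary features appear to be counted separately from that total.

```python
# Counts copied from the Sememes slide.
SEMEME_COUNTS = {
    "Entity": 151,
    "Event": 812,
    "Attribute": 247,
    "AttributeValue": 889,
}
SECONDARY_FEATURES = 121  # listed on the slide, apparently outside the 2099 total

# Counts copied from the Semantic roles slide.
ROLE_COUNTS = {
    "principal": 6, "affected": 11,                 # main roles
    "time": 12, "space": 11, "resultant": 8,        # peripheral roles (a)-(c)
    "manner": 11, "modifier": 16, "basis": 6,       # peripheral roles (d)-(f)
    "comparison": 2, "coordination": 6, "commentary": 2,  # peripheral roles (g)-(i)
}

sememe_total = sum(SEMEME_COUNTS.values())
role_total = sum(ROLE_COUNTS.values())

print("sememes:", sememe_total)        # 2099
print("semantic roles:", role_total)   # 91
```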

Page 19

Defining concepts (1)

W_E=doctor
G_E=V
DEF={doctor|医治}

W_E=doctor
G_E=N
DEF={human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}

W_E=doctor
G_E=N
E_E=
DEF={human|人:{own|有:possession={Status|身分:domain={education|教育},modifier={HighRank|高等:degree={most|最}}},possessor={~}}}
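
A DEF string is a nested bracket expression, so it can be turned into a tree before any computation is done on it. The sketch below assumes a simplified grammar inferred from the examples on this slide: a concept is `{head}` or `{head:item,item,...}`, and each item is either `role={...}` or a bare `{...}`. The names `Concept` and `parse_def` are hypothetical, and the full KDML syntax (for example the `->` markers used elsewhere) is not covered.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    head: str                                         # e.g. "human|人" or "~"
    roles: dict[str, "Concept"] = field(default_factory=dict)    # role -> filler
    features: list["Concept"] = field(default_factory=list)      # unlabeled nested concepts


def parse_def(text: str) -> Concept:
    """Parse a DEF expression (simplified grammar) into a Concept tree."""
    text = "".join(text.split())          # drop stray whitespace from the slides
    concept, pos = _parse_concept(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return concept


def _parse_concept(s: str, i: int) -> tuple[Concept, int]:
    if s[i] != "{":
        raise ValueError(f"expected '{{' at position {i}")
    i += 1
    start = i
    while s[i] not in ":}":               # the head runs up to ':' or '}'
        i += 1
    node = Concept(head=s[start:i])
    if s[i] == ":":
        i += 1
        while True:
            if s[i] == "{":               # bare nested concept
                child, i = _parse_concept(s, i)
                node.features.append(child)
            else:                         # role={...}
                eq = s.index("=", i)
                child, end = _parse_concept(s, eq + 1)
                node.roles[s[i:eq]] = child
                i = end
            if s[i] != ",":
                break
            i += 1
    if s[i] != "}":
        raise ValueError(f"expected '}}' at position {i}")
    return node, i + 1


# The noun sense "doctor" (medical practitioner) from the slide.
doctor = parse_def(
    "{human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}"
)
print(doctor.head)                               # human|人
print(sorted(doctor.roles))                      # ['HostOf', 'domain']
print(doctor.features[0].roles["agent"].head)    # ~
```

In the result, `~` fills the agent role of the embedded event, which is how the definition says that the human being defined is the one who does the doctoring.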

Page 20

Defining concepts (2)

W_E=buy
G_E=V
DEF={buy|买}
cf. (WordNet) obtain by purchase; acquire by means of a financial transaction

W_E=buy
G_E=V
DEF={GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}
cf. (WordNet) make illegal payments to in exchange for favors or influence

Page 21

Relations – the soul of HowNet

Meaning is represented by relations
Computation of meaning is based on relations

Page 22

1. Event Frame ~ Verb frame

- {event|事件}
├ {static|静态} {event|事件}
│ ├ {relation|关系} {static|静态}
│ │ ├ {possession|领属关系} {relation|关系}
│ │ │ ├ {own|有} {possession|领属关系:possessor={*},possession={*}}
│ │ │ │ ├ {obtain|得到} {own|有:possessor={*},possession={*},source={*}}
└ {act|行动} {event|事件:agent={*}}
  ├ {ActGeneral|泛动} {act|行动:agent={*}}
  └ {ActSpecific|实动} {act|行动:agent={*}}
    └ {AlterSpecific|实变} {ActSpecific|实动:agent={*}}
      ├ {AlterRelation|变关系} {AlterSpecific|实变:agent={*}}
      │ ├ {AlterPossession|变领属} {AlterRelation|变关系:agent={*},possession={*}}
      │ │ ├ {take|取} {AlterPossession|变领属:agent={*},possession={*},source={*}}
      │ │ │ ├ {buy|买} {take|取:agent={*},possession={*},source={*},cost={*},beneficiary={*}}
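
One way to read the event taxonomy above: each entry names an event, then its hypernym together with the roles the event takes. Under that reading (an assumption of this sketch, not something the slide states), the role frame grows as one moves down the hierarchy. The table `FRAMES` and the two helper functions are illustrative names only.

```python
# event -> (hypernym, role frame), transcribed from the taxonomy slide
FRAMES = {
    "event": (None, []),
    "static": ("event", []),
    "relation": ("static", []),
    "possession": ("relation", []),
    "own": ("possession", ["possessor", "possession"]),
    "obtain": ("own", ["possessor", "possession", "source"]),
    "act": ("event", ["agent"]),
    "ActGeneral": ("act", ["agent"]),
    "ActSpecific": ("act", ["agent"]),
    "AlterSpecific": ("ActSpecific", ["agent"]),
    "AlterRelation": ("AlterSpecific", ["agent"]),
    "AlterPossession": ("AlterRelation", ["agent", "possession"]),
    "take": ("AlterPossession", ["agent", "possession", "source"]),
    "buy": ("take", ["agent", "possession", "source", "cost", "beneficiary"]),
}


def hypernym_chain(event: str) -> list[str]:
    """Return the event followed by all of its hypernyms up to the root."""
    chain = []
    while event is not None:
        chain.append(event)
        event = FRAMES[event][0]
    return chain


def new_roles(event: str) -> list[str]:
    """Roles an event has that its direct hypernym does not."""
    parent, roles = FRAMES[event]
    inherited = FRAMES[parent][1] if parent else []
    return [role for role in roles if role not in inherited]


print(" < ".join(hypernym_chain("buy")))
print(new_roles("take"))   # ['source']
print(new_roles("buy"))    # ['cost', 'beneficiary']
```

Seen this way, {buy|买} differs from {take|取} only by the cost and beneficiary roles, which is the kind of relation the slides describe as computed rather than hand-coded.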

Page 23

2. Typical actors of event roles ~ VerbNet

│ ├ {buy|买} {take|取:agent={human|人}{group|群体->},
        possession={artifact|人工物->},
        source={human|人}{InstitutePlace|场所},
        cost={money|货币},
        beneficiary={human|人}{group|群体->},
        domain={economy|经济}}

Page 24

Axiomatic Relations & Role Shifting - 1

{buy|买} <----> {obtain|得到} [consequence];
agent OF {buy|买}=possessor OF {obtain|得到};
possession OF {buy|买}=possession OF {obtain|得到}.

{buy|买} <----> {obtain|得到} [consequence];
beneficiary OF {buy|买}=possessor OF {obtain|得到};
possession OF {buy|买}=possession OF {obtain|得到}.

{buy|买} <----> {obtain|得到} [consequence];
source OF {buy|买}=source OF {obtain|得到};
possession OF {buy|买}=possession OF {obtain|得到}.

Page 25

Axiomatic Relations & Role Shifting - 2

{buy|买} [entailment] <----> {choose|选择};
agent OF {buy|买}=agent OF {choose|选择};
possession OF {buy|买}=content OF {choose|选择};
source OF {buy|买}=location OF {choose|选择}.

{buy|买} [entailment] <----> {pay|付};
agent OF {buy|买}=agent OF {pay|付};
cost OF {buy|买}=possession OF {pay|付};
source OF {buy|买}=target OF {pay|付}.

Page 26

Axiomatic Relations & Role Shifting - 3

{buy|买} (X) <----> {sell|卖} (Y) [mutual implication];
agent OF {buy|买}=target OF {sell|卖};
source OF {buy|买}=agent OF {sell|卖};
possession OF {buy|买}=possession OF {sell|卖};
cost OF {buy|买}=cost OF {sell|卖}.
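
The role-shifting axioms above amount to renaming the roles of one event to obtain the roles of a related event. A minimal sketch, assuming each axiom can be stored as a plain role-to-role mapping: `AXIOMS` and `shift_roles` are hypothetical names, and the fillers (Mary, a bookshop, and so on) are invented example data.

```python
# (source event, related event) -> {role of source event: role of related event}
AXIOMS = {
    ("buy", "sell"): {"agent": "target", "source": "agent",
                      "possession": "possession", "cost": "cost"},
    ("buy", "pay"): {"agent": "agent", "cost": "possession", "source": "target"},
    ("buy", "choose"): {"agent": "agent", "possession": "content",
                        "source": "location"},
}


def shift_roles(event: str, roles: dict[str, str], related: str) -> dict[str, str]:
    """Re-express the role fillers of `event` as role fillers of `related`."""
    mapping = AXIOMS[(event, related)]
    # Roles without a counterpart in the axiom (e.g. beneficiary) are dropped.
    return {mapping[role]: filler for role, filler in roles.items() if role in mapping}


purchase = {
    "agent": "Mary",
    "source": "bookshop",
    "possession": "novel",
    "cost": "10 dollars",
    "beneficiary": "Tom",
}

print(shift_roles("buy", purchase, "sell"))
print(shift_roles("buy", purchase, "pay"))
```

For the sell axiom, the buyer becomes the target of the sale and the source becomes the seller, while the goods and the cost keep their roles.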

Page 27

Identical representation - 1

W_E=smuggle
G_E=V
DEF={transport|运送:manner={guilty|有罪}}

W_E=drug
G_E=N
DEF={addictive|嗜好物:modifier={guilty|有罪}}

Page 28

Identical representation - 2

W_E=smuggling of drugs
G_E=N
DEF={fact|事情:CoEvent={transport|运送:manner={guilty|有罪},patient={addictive|嗜好物:modifier={guilty|有罪}}}}

W_E=drug smuggler
G_E=N
DEF={community|团体:{transport|运送:agent={~},manner={unlawful|非法},patient={addictive|嗜好物},purpose={sell|卖}}}
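
The point of identical representation is that a phrase is defined with the same building blocks as its component words. As an illustration, the DEF of "smuggling of drugs" can be assembled from the DEFs of "smuggle" and "drug" by plain string composition. This is a sketch of the idea only, with a hypothetical helper `add_role`; it is not how HowNet itself constructs definitions.

```python
def add_role(definition: str, role: str, filler: str) -> str:
    """Append `role=filler` inside the outermost braces of a DEF string."""
    body = definition[:-1]                       # drop the closing "}"
    separator = "," if ":" in body else ":"      # first role needs ":", later ones ","
    return body + separator + role + "=" + filler + "}"


# Word-level definitions from the "Identical representation - 1" slide.
SMUGGLE = "{transport|运送:manner={guilty|有罪}}"
DRUG = "{addictive|嗜好物:modifier={guilty|有罪}}"

# The event "smuggle" takes "drug" as its patient ...
event = add_role(SMUGGLE, "patient", DRUG)
# ... and the nominal phrase wraps that event as the CoEvent of a fact.
phrase = "{fact|事情:CoEvent=" + event + "}"

print(phrase)
```

The printed string matches the DEF given for "smuggling of drugs" on the slide, character for character once stray spaces are removed.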

Page 29

Types of relations

Page 30

Motivation to develop secondary resources

To check HowNet knowledge data from different angles for precision and consistency
To provide users with tools for application
Practicable for any sense of any word

Page 31

Secondary resources

Concept Relevance Calculator (CRC)
Concept Similarity Measure (CSM)
Query Expansion Tool (QET)
Chinese Morphological Processor (CMP)
Chinese Message Analyzer (CMA)

Page 35

Concept similarity

doctor 2 <> dentist   0.300000
doctor 1 <> dentist   0.883333
doctor 1 <> nurse 1   0.620000
doctor 1 <> nurse 2   0.454545
doctor 1 <> patient   0.203636

walk <> run    0.144444
walk <> jump   0.144444
walk <> swim   0.130159
walk <> fly    0.124444
walk <> buy    0.018605

Page 40

Conclusion

Extralinguistic knowledge is indispensable for HLT

The knowledge should form a computer-oriented system

It should be big enough; a toy example is useless

It can conduct computation of meaning

Page 42

Thank you

Welcome to www.keenage.com!

Download and try Mini-HowNet