HowNet and Computation of Meaning, Zhendong Dong, GWC-06, Jeju, Korea, 2006-01-22
HowNet and
Computation of Meaning
Zhendong Dong
WWW.keenage.com
GWC-06
Jeju, Korea
2006-01-22
Outline
Bird’s-eye view of HowNet
Prominent features
Bird’s-eye view of HowNet
What is HowNet?
History of HowNet
Statistics on latest version
Composition of HowNet
What is HowNet?
HowNet is an on-line extralinguistic knowledge system for the computation of meaning in human language technology (HLT).
HowNet unveils inter-concept relations and inter-attribute relations of the concepts as connoted in its Chinese-English lexicon.
History of HowNet
1988 Basic research started
1999 1st version released
2000 Revision of KDML started
2002 New version released
Statistics - general
Chinese word & expression 84102
English word & expression 80250
Chinese meaning 98530
English meaning 100071
Definition 25295
Record 161743
A record in HowNet dictionary
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
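A record is a short list of `KEY=value` fields, so it can be loaded into a dictionary with very little code. The following is a minimal Python sketch, assuming one field per line and splitting on the first `=` only; the function name `parse_record` is hypothetical and is not part of any HowNet tool.

```python
def parse_record(text):
    """Parse the 'KEY=value' lines of one dictionary record into a dict."""
    record = {}
    for line in text.strip().splitlines():
        # Split on the first '=' only, because DEF values contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '='
            record[key.strip()] = value.strip()
    return record


# The record shown on the slide, one field per line.
RECORD = """\
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
"""

record = parse_record(RECORD)
print(record["W_E"], record["G_E"], record["DEF"])
```

Empty fields such as `E_C=` are kept as empty strings, so every record exposes the same set of keys.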
Statistics - semantic
                 Chinese  English
Thing              58153    58096
Component           7025     7023
Time                2238     2244
Space               1071     1071
Attribute           3776     4045
Attribute-value     9089     8478
Event              12634    10076
Statistics – main syntactic categories
       Chinese  English
ADJ      11705     9576
ADV       1516     2084
VERB     25929    21017
NOUN     46867    48342
PRON       112       71
NUM        225      242
PREP       128      113
AUX         77       49
CLA        424        0
Statistics – selected relations
Chinese
synset: Set = 13463, Word Form = 54312
antonym: Set = 12777
converse: Set = 6753
English
synset: Set = 18575, Word Form = 58488
antonym: Set = 12032
converse: Set = 6442
Composition
Database
Tools for computation of meaning
Database
Dictionary
Taxonomies
Axiomatic relations & role shifting
Dictionary
Taxonomies - 10
Entity
Event
Attribute
AttributeValue
Secondary features
Event roles
Typical actors of event roles
Event relations and role shifting
Antonymous sememe pairs
Converse sememe pairs
Tools for computation of meaning
Browser
Secondary resources
Prominent features
All syntactic classes of words included
Sememes and semantic roles
Defining concepts in KDML on the basis of sememes and semantic roles
Relations – the soul of HowNet
Relations obtained by computing rather than manual coding
Identical representation in various linguistic structures
Sememes
Sememes 2099
Entity 151
  thing (physical, mental, fact)
  component (part, fitting)
  time
  space (direction, location)
Event (relation, state; action) 812
Attribute 247
AttributeValue 889
Secondary feature 121
Semantic roles 91
(1) Main semantic roles
(a) principal semantic roles: 6
(b) affected semantic roles: 11
(2) Peripheral semantic roles
(a) time: 12
(b) space: 11
(c) resultant: 8
(d) manner: 11
(e) modifier: 16
(f) basis: 6
(g) comparison: 2
(h) coordination: 6
(i) commentary: 2
Defining concepts (1)
W_E=doctor
G_E=V
DEF={doctor|医治}
W_E=doctor
G_E=N
DEF={human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}
W_E=doctor
G_E=N
E_E=
DEF={human|人:{own|有:possession={Status|身分:domain={education|教育},modifier={HighRank|高等:degree={most|最}}},possessor={~}}}
Defining concepts (2)
W_E=buy
G_E=V
DEF={buy|买}
cf. (WordNet) obtain by purchase; acquire by means of financial transaction
W_E=buy
G_E=V
DEF={GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}
cf. (WordNet) make illegal payments to in exchange for favors or influence
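The DEF strings above are nested brace expressions: a head sememe, optionally followed by a colon and a comma-separated list of `role={...}` items or bare `{...}` items. The following Python sketch parses that shape into a tree, assuming the grammar is exactly what these examples show; it is not the official KDML specification, and the names `parse_def` and `sememes` are hypothetical.

```python
def _parse_node(s, i):
    """Parse one '{head:item,item,...}' node starting at s[i]; return (node, next index)."""
    if s[i] != "{":
        raise ValueError(f"expected '{{' at position {i}")
    i += 1
    start = i
    while s[i] not in ":}":  # the head runs up to ':' or the closing brace
        i += 1
    node = {"head": s[start:i], "items": []}
    if s[i] == ":":
        i += 1
        while True:
            role = None  # bare '{...}' items carry no role name
            if s[i] != "{":
                eq = s.index("=", i)
                role = s[i:eq]
                i = eq + 1
            child, i = _parse_node(s, i)
            node["items"].append((role, child))
            if s[i] != ",":
                break
            i += 1
    if s[i] != "}":
        raise ValueError(f"expected '}}' at position {i}")
    return node, i + 1


def parse_def(text):
    """Parse a whole DEF string into nested dicts."""
    text = text.strip()
    node, end = _parse_node(text, 0)
    if end != len(text):
        raise ValueError("unexpected text after the closing brace")
    return node


def sememes(node):
    """List every sememe in the tree, depth first, skipping the '~' self-reference."""
    found = [] if node["head"] == "~" else [node["head"]]
    for _role, child in node["items"]:
        found.extend(sememes(child))
    return found


doctor_noun = parse_def(
    "{human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}"
)
bribe = parse_def("{GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}")

print(sememes(doctor_noun))
print([role for role, _child in bribe["items"]])
```

Once a DEF is a tree, the sememes and roles it uses can be listed or compared without any string matching.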
Relations – the soul of HowNet
Meaning is represented by relations
Computation of meaning is based on relations
1. Event Frame ~ Verb frame
- {event|事件}
  ├ {static|静态} {event|事件}
  │ ├ {relation|关系} {static|静态}
  │ │ ├ {possession|领属关系} {relation|关系}
  │ │ │ ├ {own|有} {possession|领属关系:possessor={*},possession={*}}
  │ │ │ │ ├ {obtain|得到} {own|有:possessor={*},possession={*},source={*}}
  └ {act|行动} {event|事件:agent={*}}
    ├ {ActGeneral|泛动} {act|行动:agent={*}}
    └ {ActSpecific|实动} {act|行动:agent={*}}
      └ {AlterSpecific|实变} {ActSpecific|实动:agent={*}}
        ├ {AlterRelation|变关系} {AlterSpecific|实变:agent={*}}
        │ ├ {AlterPossession|变领属} {AlterRelation|变关系:agent={*},possession={*}}
        │ │ ├ {take|取} {AlterPossession|变领属:agent={*},possession={*},source={*}}
        │ │ │ ├ {buy|买} {take|取:agent={*},possession={*},source={*},cost={*},beneficiary={*}}
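Each row of the excerpt above pairs a sememe with its hypernym and the roles of its frame, so the frame of {buy|买} can be traced up to {event|事件}. The following Python sketch stores the excerpt as a table and walks it, assuming that reading of the two columns. The names `EVENT_FRAMES`, `hypernym_chain` and `roles_added` are hypothetical, and only the English half of each sememe label is used as a key.

```python
# sememe -> (hypernym, roles of its frame), copied from the taxonomy excerpt
EVENT_FRAMES = {
    "event": (None, []),
    "static": ("event", []),
    "relation": ("static", []),
    "possession": ("relation", []),
    "own": ("possession", ["possessor", "possession"]),
    "obtain": ("own", ["possessor", "possession", "source"]),
    "act": ("event", ["agent"]),
    "ActGeneral": ("act", ["agent"]),
    "ActSpecific": ("act", ["agent"]),
    "AlterSpecific": ("ActSpecific", ["agent"]),
    "AlterRelation": ("AlterSpecific", ["agent"]),
    "AlterPossession": ("AlterRelation", ["agent", "possession"]),
    "take": ("AlterPossession", ["agent", "possession", "source"]),
    "buy": ("take", ["agent", "possession", "source", "cost", "beneficiary"]),
}


def hypernym_chain(sememe):
    """Return the sememe followed by all of its hypernyms, up to the root."""
    chain = []
    while sememe is not None:
        chain.append(sememe)
        sememe = EVENT_FRAMES[sememe][0]
    return chain


def roles_added(sememe):
    """Return the roles in a sememe's frame that its hypernym's frame lacks."""
    parent, roles = EVENT_FRAMES[sememe]
    inherited = EVENT_FRAMES[parent][1] if parent else []
    return [role for role in roles if role not in inherited]


print(hypernym_chain("buy"))
print(roles_added("take"), roles_added("buy"))
```

Read this way, each step down the hierarchy either keeps the frame of its hypernym or extends it with new roles.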
2. Typical actors of event roles ~ VerbNet
{buy|买} {take|取:agent={human|人}{group|群体->},
  possession={artifact|人工物->},
  source={human|人}{InstitutePlace|场所},
  cost={money|货币},
  beneficiary={human|人}{group|群体->},
  domain={economy|经济}}
Axiomatic Relations & Role Shifting - 1
{buy|买} <----> {obtain|得到} [consequence];
agent OF {buy|买}=possessor OF {obtain|得到};
possession OF {buy|买}=possession OF {obtain|得到}.
{buy|买} <----> {obtain|得到} [consequence];
beneficiary OF {buy|买}=possessor OF {obtain|得到};
possession OF {buy|买}=possession OF {obtain|得到}.
{buy|买} <----> {obtain|得到} [consequence];
source OF {buy|买}=source OF {obtain|得到};
possession OF {buy|买}=possession OF {obtain|得到}.
Axiomatic Relations & Role Shifting - 2
{buy|买} [entailment] <----> {choose|选择};
agent OF {buy|买}=agent OF {choose|选择};
possession OF {buy|买}=content OF {choose|选择};
source OF {buy|买}=location OF {choose|选择}.
{buy|买} [entailment] <----> {pay|付};
agent OF {buy|买}=agent OF {pay|付};
cost OF {buy|买}=possession OF {pay|付};
source OF {buy|买}=target OF {pay|付}.
Axiomatic Relations & Role Shifting - 3
{buy|买} (X) <----> {sell|卖} (Y) [mutual implication];
agent OF {buy|买}=target OF {sell|卖};
source OF {buy|买}=agent OF {sell|卖};
possession OF {buy|买}=possession OF {sell|卖};
cost OF {buy|买}=cost OF {sell|卖}.
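The role-shifting axioms on these slides can be read as role-renaming tables: given the role fillers of one event, they say which roles those fillers take in the related event. The following Python sketch applies two of them under that reading. The names `AXIOMS`, `shift_roles` and `invert` are hypothetical, and the role fillers in the example are invented for illustration.

```python
# (source event, related event) -> {role in source event: role in related event}
AXIOMS = {
    ("buy", "sell"): {          # mutual implication
        "agent": "target",
        "source": "agent",
        "possession": "possession",
        "cost": "cost",
    },
    ("buy", "pay"): {           # entailment
        "agent": "agent",
        "cost": "possession",
        "source": "target",
    },
}


def shift_roles(event, roles, related_event):
    """Rename the role fillers of `event` into the roles of `related_event`."""
    mapping = AXIOMS[(event, related_event)]
    return {mapping[role]: filler for role, filler in roles.items() if role in mapping}


def invert(mapping):
    """Reverse a one-to-one role mapping."""
    return {new: old for old, new in mapping.items()}


# Invented example: "Mary buys a book from the shop for 10 dollars."
buying = {
    "agent": "Mary",
    "source": "the shop",
    "possession": "a book",
    "cost": "10 dollars",
}

selling = shift_roles("buy", buying, "sell")
paying = shift_roles("buy", buying, "pay")
print(selling)
print(paying)

# Mutual implication: the inverted mapping recovers the buying event.
back = invert(AXIOMS[("buy", "sell")])
recovered = {back[role]: filler for role, filler in selling.items()}
print(recovered == buying)
```

The buy/sell table is one-to-one, which is why it can be inverted; the buy/pay table drops the possession role, so the payment alone does not recover the full buying event.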
Identical representation - 1
W_E=smuggle
G_E=V
DEF={transport|运送:manner={guilty|有罪}}
W_E=drug
G_E=N
DEF={addictive|嗜好物:modifier={guilty|有罪}}
Identical representation - 2
W_E=smuggling of drugs
G_E=N
DEF={fact|事情:CoEvent={transport|运送:manner={guilty|有罪},patient={addictive|嗜好物:modifier={guilty|有罪}}}}
W_E=drug smuggler
G_E=N
DEF={community|团体:{transport|运送:agent={~},manner={unlawful|非法},patient={addictive|嗜好物},purpose={sell|卖}}}
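The DEF of "smuggling of drugs" contains the DEFs of "smuggle" and "drug" unchanged, which is what identical representation across linguistic structures amounts to here. The following Python sketch rebuilds the phrase DEF from the two word DEFs by plain string composition. The helper names `add_role` and `nominalize` are hypothetical, and this is an illustration of the notation, not a HowNet tool.

```python
SMUGGLE = "{transport|运送:manner={guilty|有罪}}"
DRUG = "{addictive|嗜好物:modifier={guilty|有罪}}"


def add_role(event_def, role, filler_def):
    """Attach 'role=filler' inside the outermost braces of an event DEF."""
    # Use ':' if the DEF has no items yet, ',' if it already has some.
    separator = "," if ":" in event_def else ":"
    return f"{event_def[:-1]}{separator}{role}={filler_def}}}"


def nominalize(event_def):
    """Wrap an event DEF as the CoEvent of a {fact|事情} concept."""
    return f"{{fact|事情:CoEvent={event_def}}}"


phrase = nominalize(add_role(SMUGGLE, "patient", DRUG))
print(phrase)
```

The composed string matches the DEF given for "smuggling of drugs", with both word-level DEFs embedded verbatim.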
Types of relations
Motivation to develop secondary resources
To check HowNet knowledge data from different angles for precision and consistency
To provide users with tools for application
Practicable for any sense of any word
Secondary resources
Concept Relevance Calculator (CRC)
Concept Similarity Measure (CSM)
Query Expansion Tool (QET)
Chinese Morphological Processor (CMP)
Chinese Message Analyzer (CMA)
Concept similarity
doctor 2 <> dentist  0.300000
doctor 1 <> dentist  0.883333
doctor 1 <> nurse 1  0.620000
doctor 1 <> nurse 2  0.454545
doctor 1 <> patient  0.203636
walk <> run   0.144444
walk <> jump  0.144444
walk <> swim  0.130159
walk <> fly   0.124444
walk <> buy   0.018605
Conclusion
Extralinguistic knowledge is indispensable for HLT
The knowledge should be a computer-oriented system
It should be big enough; a toy example is useless
It can carry out computation of meaning
Thank you
Welcome to www.keenage.com!
Download and try Mini-HowNet