Mining Domain Specific Words from Hierarchical Web Documents
Jing-Shin Chang (張景新)
Department of Computer Science & Information Engineering
National Chi-Nan (暨南) University
1, Univ. Road, Puli, Nantou 545, Taiwan, ROC
[email protected]
CJNLP-04, 2004/11/10~15, City U., H.K.
TOC
Motivation
What are DSW's?
Why DSW Mining? (Applications)
  WSD with DSW's without a sense-tagged corpus
  Constructing a Hierarchical Lexicon Tree w/o Clustering
  Other applications
How to Mine DSW's from Hierarchical Web Documents
Preliminary Results
Error Sources
Remarks
Motivation
"Is there a quick and easy (engineering) way to construct a large-scale WordNet or things like that... now that everyone is talking about ontological knowledge sources and X-WordNet (whatever you call it)...?"
...this question triggers a new view for constructing a lexicon tree with hierarchical semantic links...
...DSW identification turns out to be a key to such construction...
...and can be used in various applications, including DSW-based WSD without using sense-tagged corpora...
What Are Domain Specific Words (DSW's)?
Words that appear frequently in some particular domains:
(a) Multiple-sense words that are frequently used with special meanings or usages in particular domains
  E.g., piston: "活塞" (mechanics) or "活塞隊" (the Pistons team, sports)
(b) Single-sense words that are used frequently in particular domains
  Suggesting that some words in the current document might be related to this particular sense
  Serving as "anchor words/tags" in the context for disambiguating other multiple-sense words
What to Do in DSW Mining
DSW Mining Task
  Find lists of words that occur frequently in the same domain, and associate each list (and the words within it) with a domain (implicit sense) tag
  E.g., entertainment: 'singer', 'pop songs', 'rock & roll', 'Chang Hui-Mei' ('Ah-Mei'), 'album', ...
  As a side effect, find the hierarchical or network-like relationships between adjacent sets of DSW's
    When applied to mining the DSW's associated with each node of a hierarchical directory/document tree
    Each node being annotated with a domain tag
DSW Applications (1)
Technical term extraction:
  W(d) = { w | w ∈ DSW(d) }, d ∈ {computer, traveling, food, ...}
DSW Applications (2)
Generic WSD based on DSW's
  argmax_s Σ_d P(s|d,W) P(d|W) = argmax_s Σ_d P(s|d,W) P(W|d) P(d)
  Applicable when a large-scale sense-tagged corpus is not available, which is often the case
Machine translation
  Helps select translation lexicon candidates
  E.g., money bank (when used with "payment", "loan", etc.), river bank, memory bank (in PC, Intel, MS Windows domains)
DSW Applications
Generic WSD based on DSW's

For a target word w_0 in context w_1..w_n:

  s*_0 = argmax_s P(s | w_0, w_1..n)
       = argmax_s P(w_0, w_1..n | s) P(s)                          [sense-based models]
       = argmax_s Σ_d P(s, d | w_0, w_1..n)
       = argmax_s Σ_d P(s | d, w_0, w_1..n) P(d | w_0, w_1..n)
       = argmax_s Σ_d P(s | d, w_0, w_1..n) P(w_1..n | d) P(d)     [domain-based models]

Sense-based models need sense-tagged corpora for training (not widely available); implicitly domain-tagged corpora are widely available on the web.
The sum runs over the domains where w_0 is a DSW; P(s | d, w_0, ..., w_n) is almost deterministic ("one sense per context").
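To make the domain-based rule concrete, here is a minimal, hypothetical sketch (not the paper's implementation). It assumes the mining step has produced a (word, domain) → sense table, the normalized frequencies f_ij, and domain priors, and it factors P(w_1..n | d) with a unigram independence assumption; all names and numbers below are illustrative.

```python
import math
from collections import defaultdict

def disambiguate(w0, context, dsw_sense, f, p_domain, eps=1e-9):
    """Domain-based WSD sketch: argmax_s sum_d P(s|d,w0) P(context|d) P(d).

    dsw_sense: {(word, domain): sense}  -- near-deterministic per domain
    f:         {(word, domain): f_ij}   -- normalized frequencies
    p_domain:  {domain: P(d)}           -- domain priors
    """
    scores = defaultdict(float)
    for (word, d), sense in dsw_sense.items():
        if word != w0:
            continue  # sum only over domains where w0 is a DSW
        # log P(context | d) under a unigram independence assumption
        log_ctx = sum(math.log(f.get((w, d), eps)) for w in context)
        scores[sense] += p_domain.get(d, eps) * math.exp(log_ctx)
    return max(scores, key=scores.get) if scores else None

# Hypothetical usage: "活塞" in a basketball-flavored context
dsw_sense = {("活塞", "car"): "piston", ("活塞", "basketball"): "Pistons"}
f = {("引擎", "car"): 0.01, ("防守", "basketball"): 0.02,
     ("後衛", "basketball"): 0.015}
p_domain = {"car": 0.5, "basketball": 0.5}
print(disambiguate("活塞", ["防守", "後衛"], dsw_sense, f, p_domain))  # Pistons
```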
DSW Applications (3)
Document classification
  N-class classification based on DSW's
Anti-spamming (two-class classification)
  Words in spam (uninteresting) mails vs. normal (interesting) mails help block spam
  Interesting domains vs. uninteresting domains:
  P(W|S)P(S) vs. P(W|~S)P(~S)
DSW Applications (3.a)
Document classification based on DSW's
  d: document class label
  w_1..w_n: bag of words in the document
  |D| ≥ 2: number of document classes
Anti-spamming based on DSW's
  |D| = 2 (two-class classification)

  d* = argmax_d P(d | w_1..w_n)
     = argmax_d P(w_1..w_n | d) P(d)    [class-based models]
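A minimal naive-Bayes sketch of the class-based rule above (two-class anti-spamming when |D| = 2); the toy training data and the Laplace smoothing choice are illustrative, not from the paper.

```python
import math
from collections import Counter

def train(docs_by_class):
    """docs_by_class: {class_label: [list of token lists]}."""
    priors, likelihoods = {}, {}
    total_docs = sum(len(ds) for ds in docs_by_class.values())
    for d, docs in docs_by_class.items():
        priors[d] = len(docs) / total_docs
        counts = Counter(w for doc in docs for w in doc)
        n, v = sum(counts.values()), len(counts)
        # Laplace-smoothed unigram likelihoods P(w | d)
        likelihoods[d] = {w: (c + 1) / (n + v + 1) for w, c in counts.items()}
        likelihoods[d]["<unk>"] = 1 / (n + v + 1)
    return priors, likelihoods

def classify(words, priors, likelihoods):
    def score(d):
        lk = likelihoods[d]
        return math.log(priors[d]) + sum(
            math.log(lk.get(w, lk["<unk>"])) for w in words)
    return max(priors, key=score)

# Hypothetical usage
spam_docs = [["free", "lotto", "win"], ["win", "cash"]]
ham_docs = [["meeting", "schedule"], ["project", "report"]]
priors, lk = train({"spam": spam_docs, "ham": ham_docs})
print(classify(["free", "cash"], priors, lk))  # -> 'spam'
```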
DSW Applications (4)
Building a large lexicon tree or wordnet-lookalike (semi-)automatically from hierarchical web documents
  Membership: semantic links among words of the same domain are close (context), similar (synonym, thesaurus), or negated concepts (antonym)
  Hierarchy: the hierarchy of the lexicon suggests some ontological relationships
Conventional Methods for Constructing Lexicon Trees
Construction by Clustering
  Collect words in a large corpus
  Evaluate word association as a distance (or closeness) measure for all word pairs
  Use clustering criteria to build the lexicon hierarchy
  Adjust the hierarchy and assign semantic/sense tags to nodes of the lexicon tree
    Thus assigning sense tags to the members of each node
Clustering Methods for Constructing Lexicon Trees
[Figure: a binary clustering tree built bottom-up over the flattened word lists {A0, A1, B}, {A0, A1, C}, {A0, A2, D}, {A0, A2, E}, merging them pairwise into internal nodes and a root whose members carry occurrence counts (A0 ×4, A1 ×2, A2 ×2)]
Clustering Methods for Constructing Lexicon Trees
Disadvantages
  Do not take advantage of the hierarchical information of the document tree (flattened when collecting words)
  Word association & clustering criteria are not related directly to human perception
    Most clustering algorithms conduct binary merging (or division) in each step, for simplicity
    The automatically generated semantic hierarchy may not reflect human perception
    Hierarchy boundaries are not clearly & automatically detected
  Adjustment of the hierarchy may not be easy (since human perception is not used to guide clustering)
  Pairwise association evaluation is costly
Hierarchical Information Loss when Collecting Words
[Figure: the document tree holds {A0^2, A1^2, B, C} and {A0^2, A2^2, D, E} at its internal nodes and {A0^4, A1^2, A2^2, B, C, D, E} at the root, but collecting words flattens the leaves {A0, A1, B}, {A0, A1, C}, {A0, A2, D}, {A0, A2, E} into plain word lists, discarding the hierarchy]
Clustering Methods for Constructing Lexicon Trees
[Figure: the same binary clustering tree over {A0, A1, B}, {A0, A1, C}, {A0, A2, D}, {A0, A2, E}, with every merge step marked "?" — Does it reflect human perception? Why binary? What hierarchy?]
Alternative View for Constructing Lexicon Trees
Construction by Retaining DSW's
  Preserve the hierarchical structure of web documents as the baseline of the semantic hierarchy, which is already mildly confirmed by webmasters
  Associate each node with DSW's as members, and tag each DSW with the directory/domain name
  Optionally adjust the tree hierarchy and the members of each node
Constructing Lexicon Trees by Preserving DSW's
[Figure: a document tree whose nodes carry word vectors with entries marked O (+DSW) or X (−DSW), e.g., root (O,O,O,O); internal nodes (X,O,X,O), (X,X,O,O); leaves (O,X,O,O), (O,X,O,X), (O,O,X,X), (O,O,X,O)]
Constructing Lexicon Trees by Preserving DSW's
[Figure: the same tree after dropping the X (−DSW) entries, so each node retains only its O (+DSW) words]
Constructing Lexicon Trees by Preserving DSW's
Advantages
  The hierarchy reflects human perception
    Adjustment could be easier, if necessary
  Directory names are highly correlated with sense tags
    Domain-based models can be used when sense-tagged corpora are not available
  Pairwise word association evaluation is replaced by computation of "domain specificity" against domains
    O(|W|×|W|) vs. O(|W|×|D|)
Requirements:
  A well-organized web site
  Mining DSW's from such a site
Constructing Lexicon Trees by Preserving DSW's
[Figure: the preserved directory tree — root {A0^4, A1^2, A2^2, B, C, D, E} over internal nodes {A0^2, A1^2, B, C} and {A0^2, A2^2, D, E} and leaves {A0, A1, B}, {A0, A1, C}, {A0, A2, D}, {A0, A2, E} — annotated with candidate lexical relationships: membership (closeness, similarity) within a node; is_a/hypernym links from a child node Y to its parent X (Y is_a X? e.g., B is_a X (or A1)); synonym and antonym links among members]
Alternative View for Constructing Lexicon Trees
Benefits:
  No similarity computation: closeness (incl. similarity) is already implicitly encoded by human judges
  No binary clustering: clustering is already done (implicitly) with human judgment
  Hierarchical links available: some well-developed relationships are already in place
  Although not perfect...
Proposed Method for Mining
Web hierarchy as a large document tree
  Each document was generated by applying DSW's to some generic document templates
Remove the non-specific words from the documents, leaving a lexicon tree with DSW's associated with each node
  Leaving only domain-specific words
  Forming a lexicon tree from a document tree
  Labeling domain-specific words
Characteristics:
  Get associated words by measuring domain specificity against a known and common domain, instead of measuring pairwise association plus clustering
Mining Criteria: Cross-Domain Entropy
Domain-independent terms tend to be distributed evenly in all domains.
Distributional "evenness" can be measured with the Cross-Domain Entropy (CDE), defined as follows:
  H_i = H(w_i) = -Σ_j P_ij log P_ij,  with  P_ij = f_ij / Σ_j f_ij
where P_ij is the probability of word i in domain j, and f_ij is its normalized frequency in domain j.
Mining Criteria: Cross-Domain Entropy
Example:
  w_i = "piston", with frequencies (normalized to [0,1]) in various domains:
  f_ij = (0.001, 0.62, 0.0003, 0.57, 0.0004)
  Domain-specific (unevenly distributed) in the 2nd and the 4th domains
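To make the numbers concrete, here is a short sketch that computes the CDE of the "piston" example; the slide does not specify the logarithm base, so base 2 is assumed here.

```python
import math

def cross_domain_entropy(f):
    """CDE: H(w) = -sum_j P_j log P_j, with P_j = f_j / sum_j f_j."""
    total = sum(f)
    probs = [x / total for x in f]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# The slide's 'piston' frequencies over five domains:
f_piston = (0.001, 0.62, 0.0003, 0.57, 0.0004)
print(cross_domain_entropy(f_piston))  # ~1.0 bits: unevenly distributed
print(math.log2(5))                    # ~2.32 bits for a perfectly even word
```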
Mining Algorithm – Step 1
Step 1 (Data Collection): Acquire a large collection of web documents using a web spider while preserving the directory hierarchy of the documents. Strip unused markup tags from the web pages.
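A minimal sketch of a directory-preserving fetch, assuming the site's URL paths mirror its topic hierarchy; the URL handling and the crude regex-based tag stripping are illustrative only, not the paper's spider.

```python
import os
import re
import urllib.request
from urllib.parse import urlparse

def save_preserving_hierarchy(url, out_root="corpus"):
    """Fetch one page and store its text under a path mirroring the URL."""
    path = urlparse(url).path.lstrip("/") or "index.html"
    local = os.path.join(out_root, path)      # keep /sports/baseball/x.html
    os.makedirs(os.path.dirname(local), exist_ok=True)
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    text = re.sub(r"<[^>]+>", " ", html)      # crude markup stripping
    with open(local, "w", encoding="utf-8") as fh:
        fh.write(text)
```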
Mining Algorithm – Step 2
Step 2 (Word Segmentation or Chunking): Identify word (or compound word) boundaries in the documents by applying a word segmentation process, such as (Chiang 92; Lin 93), to Chinese-like documents (where word boundaries are not explicit), or by applying a compound-word chunking algorithm to English-like documents, in order to identify the word entities of interest.
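For illustration only: the paper cites the segmenters of (Chiang 92; Lin 93), which are not publicly packaged, so this sketch substitutes the open-source jieba segmenter; the sample sentence and the shown output are hypothetical.

```python
import jieba  # pip install jieba

text = "日本職棒的投手表現出色"
words = list(jieba.cut(text))  # segment into a word sequence
print(words)  # e.g. ['日本', '職棒', '的', '投手', '表現', '出色']
```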
Mining Algorithm – Step 3
Step 3 (Acquiring Normalized Term Frequencies for All Words in Various Domains): For each subdirectory d_j, find the number of occurrences n_ij of each term w_i in all the documents, and derive the normalized term frequency f_ij = n_ij / N_j by normalizing n_ij with the total document size, N_j = Σ_i n_ij, in that directory. The directory is then associated with a set of <w_i, d_j, f_ij> tuples, where w_i is the i-th word of the complete word list for all documents, d_j is the j-th directory name (referred to as the domain hereafter), and f_ij is the normalized relative frequency of occurrence of w_i in domain d_j.
Mining Algorithm – Step 3
Input: <w_i, d_j, f_ij> (word, domain, normalized frequency) triples, where
  n_ij: frequency of w_i in domain d_j
  N_j = Σ_i n_ij: number of words in domain d_j
  f_ij = n_ij / N_j: normalized frequency of w_i in domain d_j
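A minimal sketch of Step 3, assuming the crawl from Step 1 left one whitespace-segmented text file per document under corpus/<domain>/...; treating only the top-level subdirectories as domains is a simplification (the paper also treats parent directories as domains), and all paths are hypothetical.

```python
import os
from collections import Counter, defaultdict

def term_frequencies(corpus_root="corpus"):
    """Return f[d][w] = f_ij = n_ij / N_j for each domain directory d."""
    n = defaultdict(Counter)                  # n[d][w] = n_ij
    for domain in os.listdir(corpus_root):
        ddir = os.path.join(corpus_root, domain)
        if not os.path.isdir(ddir):
            continue
        for root, _, files in os.walk(ddir):  # a domain covers its subtree
            for name in files:
                with open(os.path.join(root, name), encoding="utf-8") as fh:
                    n[domain].update(fh.read().split())
    f = {}
    for d, counts in n.items():
        N = sum(counts.values())              # N_j = sum_i n_ij
        f[d] = {w: c / N for w, c in counts.items()}
    return f
```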
Mining Algorithm – Step 4
Step 4 (Removing Domain-Independent Terms): Domain-independent terms are identified as those terms which are distributed evenly in all domains, that is, terms with a large Cross-Domain Entropy (CDE), defined as follows:
  H_i = H(w_i) = -Σ_j P_ij log P_ij,  with  P_ij = f_ij / Σ_j f_ij
Terms whose CDE is above a threshold can be removed from the lexicon tree, since such terms are unlikely to be associated with any domain closely. Terms with a low CDE are retained in the few domains with the highest normalized frequencies (e.g., top-1 and top-2).
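A minimal sketch of Step 4 under the same assumptions as the Step 3 sketch; the CDE threshold and the top-k retention count are free parameters here, not values fixed in the paper.

```python
import math
from collections import defaultdict

def mine_dsws(f, cde_threshold=1.5, top_k=2):
    """f: {domain: {word: f_ij}} -> {domain: set of retained DSW's}."""
    per_word = defaultdict(dict)              # per_word[w][d] = f_ij
    for d, freqs in f.items():
        for w, x in freqs.items():
            per_word[w][d] = x
    dsw = defaultdict(set)
    for w, by_dom in per_word.items():
        total = sum(by_dom.values())
        cde = -sum((x / total) * math.log2(x / total)
                   for x in by_dom.values() if x > 0)
        if cde > cde_threshold:
            continue                          # evenly spread: discard
        # keep w in its top-k domains by normalized frequency
        for d in sorted(by_dom, key=by_dom.get, reverse=True)[:top_k]:
            dsw[d].add(w)
    return dsw
```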
Experiments
Domains:
  News articles from a local news site
  138 distinct domains
    including leaf nodes of the directory tree and their parents
    leaves with the same name are considered to be in the same domain
  Examples: baseball, basketball, broadcasting, car, communication, culture, digital, edu(cation), entertainment (流星花園 'Meteor Garden'), finance, food (大魚大肉 'lavish fish-and-meat meals', 干貝 'dried scallops', 木耳 'wood-ear fungus', 錫箔紙 'tin foil', ...)...
Size: 200M bytes (HTML files)
  16K+ unique words after word segmentation
Domains (hierarchy not shown):
afternoon-news, all-baseball, america-topic, autumn, basketball, bnext, broadcasting-tv, buybooks, car, card, changhwa, chiayi, college, communication, culture, daily, day_starnews, digital, domestic, dswa.crp, east-taiwan, ec, edict, edu, entertainment, europe, europe2, family, finance, fish, focus, focusnews, food, fund-futures, game, global, golf, happy_worker, hardware, health-care, health-club, hot, hot-news, hot-topic, hot-topic2, hot-topic3, hsinchu, hwalen, ilan, important, important2, important3, important4, important5, infotech, insurance, interest-prose, internal-sport, international, international-sport, internet, japan, kaoshiung-city, kaoshiung-sentry, keelung, life, life-topic02, life-topic03, life-topic1, life_newtopic, lifestyle, listed-co, listed-elec, local-scene, lotto, main, mainland, management, medical, medical-news, miaoli, middle-taiwan, middlesouth-taiwan, miscellaneous, mixtravel, movie, music, nantou, national-travel, newbooks, north-taiwan, opinion, otc, out-activity, oversea-star, performance, personal, pintung, pl, politics, public-forum, readexcellent, readtopic, shopping, sitemap, sitemap_title, social-forum, society, south-taiwan, special, sport, star, stock, taichung-city, taichung-sentry, taiex, tainan, taipei-city, taipei-sentry, taitung, taiwan-china, taoyuan, tax-law, ti, todaynews, topic, topic2, trade, travel, travelwindow, udn, udn-supplement, udnbw, ue, usa-stock, world-econ, writers, yunlin
(<root> marks the root of the directory tree.)
Sample Output (4 Selected Domains)

baseball | broadcast-TV | basketball | car
日本職棒 | 有線電視 | 一分 | 千西西
棒球賽 | 東風 | 三秒 | 小型車
熱身 | 開工 | 女子組 | 中古
運動 | 節目中 | 包夾 | 引擎蓋
場次 | 廣電處 | 外線 | 水箱
價碼 | 收視 | 犯規 | 加裝
球團 | 和信 | 投籃 | 市場買氣
部長 | 新聞局 | 男子組 | 目的地
練球 | 開獎 | 防守 | 交車
興農 | 頻道 | 冠軍戰 | 同級
球場 | 電視 | 後衛 | 合作開發
投手 | 電影 | 活塞 | 安全系統
球季 | 熱門 | 國男 | 行李
賽程 | 影視 | 華勒 | 行李廂
太陽 | 娛樂 | 費城 | 西西

Table 1. Sampled domain-specific words with low entropies.
Preliminary Results
Domain-specific words and the assigned domain tags are well associated (e.g., "投手" (pitcher) is specifically used in the "baseball" domain.)
  Extraction with the cross-domain entropy (CDE) metric is well founded.
Domain-independent (or irrelevant) words (such as those for webmasters' advertisements) are correctly rejected as DSW candidates due to their high cross-domain entropy
DSW's are mostly nouns and verbs (open-class words)
Preliminary Results
Low cross-domain-entropy words (DSW's) in the respective domains are generally highly correlated (e.g., "日本職棒" (Japanese professional baseball), "部長" (minister))
New usages of words, such as "活塞" (the Pistons) with the "basketball" sense, could also be identified
  Both are good for WSD tasks that use the DSW's as contextual evidence
Error Sources
A single CDE metric may not be sufficient to capture all characteristics of "domain specificity"
  Type II error: some general (non-specific) words may have low entropy simply because they appear in only one domain (CDE = 0)
    Probably due to low occurrence counts (a kind of estimation error)
  Type I error: some multiple-sense words may have too many senses and thus be mis-recognized as non-specific in each domain (although the senses are unique in their respective domains)
Error Sources
The "well-organized website" assumption may not hold all the time
  The hierarchical directory tags may not be appropriate representatives for the document words within a website
  The hierarchies may not be consistent from website to website
Future Works
Use other knowledge sources, beyond the single CDE measure, to co-train the model in a manner similar to [Chang 97b, c]
  E.g., with other term-weighting metrics
  E.g., a stop-list acquisition metric for identifying common words (for Type II errors)
Explore methods and criteria to adjust the hierarchy of a single directory tree
Explore methods to merge directory trees from different sites
Concluding Remarks
A simple metric for automatic/semi-automatic identification of DSW's
  At low sense-tagging cost
    Rich web resources, almost free
    Implicit semantic tagging implied by the directory hierarchy (an imperfect hierarchy, but free)
A simple method to build semantic links and degrees of closeness among DSW's
  May be helpful for building large semantically tagged lexicon trees or network-linked x-wordnets
A good knowledge source for WSD-related applications
  WSD, machine translation, document classification, anti-spamming, ...
Thanks for your attention!!