![Page 1: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/1.jpg)
National Institute of Informatics
Kiyoko Uchiyama
A Study for Introductory Terms in Logical Structure of Scientific Papers
![Page 2: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/2.jpg)
Outline
1. Purpose, background, motivation
2. What are “introductory terms”?
3. Analysis of logical structure
4. Analysis of structural role
5. Apply to MIC theory
6. Future work
![Page 3: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/3.jpg)
Author-based logging in
![Page 4: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/4.jpg)
Result: the author’s publications, similar papers, and similar researchers
![Page 5: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/5.jpg)
Keyword-based logging in
![Page 6: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/6.jpg)
Result of keyword search by cosine similarity
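The slide names cosine similarity as the ranking measure for keyword search but does not show how it is computed. A minimal sketch, assuming plain term-frequency vectors over whitespace-separated tokens; the system's actual weighting and tokenization are not given, and `cosine_similarity`, `rank`, and the sample documents are hypothetical:

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank(query: str, docs: dict[str, str]) -> list[tuple[str, float]]:
    """Score every document against the query, best match first."""
    q = Counter(query.lower().split())
    scored = [
        (name, cosine_similarity(q, Counter(text.lower().split())))
        for name, text in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample documents, for illustration only.
docs = {
    "paper_a": "morphological analysis with hidden markov model",
    "paper_b": "syntactic analysis of japanese sentences",
    "paper_c": "speech recognition",
}
ranking = rank("morphological analysis", docs)
```

A document that shares no term with the query scores 0, and one whose term counts are proportional to the query's scores 1.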
![Page 7: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/7.jpg)
Select seed paper & several viewpoints
![Page 8: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/8.jpg)
Jump to CiNii & REO
![Page 9: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/9.jpg)
Purpose
• Investigate how introductory terms occur in the logical structure of textbooks, research papers, and encyclopedias
• Categorize each sentence that includes introductory terms into a structural role
• Analyze how introductory terms behave in the Introduction section
![Page 10: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/10.jpg)
Background
• Many technical terms exist in a specific domain
• It is difficult for novices to identify the most important terms in the target field
• Novices should learn the basic, necessary terms of the field first
![Page 11: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/11.jpg)
Our motivation
• Apply the findings to a method for advanced search
• Assume that introductory terms:
– play an important role in describing domain knowledge
– help novices understand the content of academic papers
![Page 12: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/12.jpg)
What are “introductory terms”?
• Essential and basic terms for a target field
• Terms that should be learned first in a target field
• Without them, the more difficult terms in the target field are hard to understand
![Page 13: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/13.jpg)
[Figure: terms and papers arranged along an axis of introductory degree, from high to low. Labels: novice; Tutorial paper, Paper A, Paper B, Paper C; Morphological analysis, Syntactic analysis, Semantic analysis; Hidden Markov Model, Conditional Random Field, Maximum entropy model; ChaSen, MeCab, JUMAN, KAKASI]
![Page 14: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/14.jpg)
Automatic definition
• Define introductory terms as the terms selected in common by many experts
• Experts in the specific field wrote or edited the following resources
– Textbooks
– Encyclopedia
– Research papers
![Page 15: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/15.jpg)
Priority (Frequency)
• Authors arrange the contents of their textbooks in an easy-to-understand order
• Authors of academic papers put important keywords in the title and in the author-assigned keywords
• The table of contents of an encyclopedia is edited by many experts
![Page 16: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/16.jpg)
Logical Structure
• How terms are distributed across the IMRD structure (Introduction, Method, Result, and Discussion) of a text might help identify introductory terms
• Assume that introductory terms are frequently used in the Introduction section
![Page 17: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/17.jpg)
Data set
Target field: NLP, Target language: Japanese
• Textbooks: 39 textbooks whose titles included “natural language processing”
• Encyclopedia: the Natural Language Processing Encyclopedia, written in Japanese
• Academic papers: 1421 papers from the NLP research group of the Information Processing Society of Japan, from 1993 to 2007
![Page 18: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/18.jpg)
Data collection
• Morphological analysis of the Japanese text with MeCab
• Extract sequences of consecutive nouns as term candidates from
– the textbooks (694 types)
– the table of contents of the encyclopedia (463 types)
– the titles, abstracts, and author-assigned keywords of the papers (13493 types)
• 90 terms appeared in all three resources
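The extraction step above can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be already tagged as (surface, part-of-speech) pairs, standing in for MeCab output, and the tag name `"noun"`, the function `noun_sequences`, and the tiny English sample resources are all hypothetical. The sketch collects maximal runs of consecutive nouns as term candidates, then keeps the candidates found in all three resources.

```python
from typing import Iterable

Token = tuple[str, str]  # (surface form, part-of-speech tag); assumed format


def noun_sequences(tokens: Iterable[Token]) -> set[str]:
    """Return every maximal run of consecutive nouns, joined into one candidate."""
    candidates: set[str] = set()
    run: list[str] = []
    for surface, pos in tokens:
        if pos == "noun":
            run.append(surface)
        else:
            if run:
                candidates.add(" ".join(run))
            run = []
    if run:  # flush a run that reaches the end of the input
        candidates.add(" ".join(run))
    return candidates


# Hypothetical pre-tagged samples standing in for the three resources.
textbook = [("morphological", "noun"), ("analysis", "noun"), ("is", "verb"),
            ("basic", "adj"), ("parsing", "noun")]
encyclopedia = [("morphological", "noun"), ("analysis", "noun"), ("and", "conj"),
                ("corpus", "noun")]
papers = [("we", "pron"), ("use", "verb"), ("morphological", "noun"),
          ("analysis", "noun"), ("for", "prep"), ("parsing", "noun")]

# Candidates shared by all three resources.
common = noun_sequences(textbook) & noun_sequences(encyclopedia) & noun_sequences(papers)
```

For Japanese text the nouns in a run would be concatenated without a space; the space here only suits the English sample.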
![Page 19: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/19.jpg)
Analysis of Logical Structure
• Use the full text of research papers in the NLP field
• Target papers that describe experiments and results
• Extract 100 papers that include expressions such as “experiment”, “evaluation”, “precision”, and “%”
• Divide each full text into 6 sections
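The selection of experiment-oriented papers could look roughly like this. The cue list mirrors the expressions named on the slide (the real data is Japanese, so the actual cues would be their Japanese counterparts). The threshold `min_hits`, the function name, and the sample texts are assumptions for illustration, since the slide does not say how many cues a paper had to contain.

```python
# Cue expressions named on the slide (English glosses).
CUES = ("experiment", "evaluation", "precision", "%")


def describes_experiment(full_text: str, min_hits: int = 2) -> bool:
    """True if at least `min_hits` distinct cue expressions occur in the text."""
    text = full_text.lower()
    hits = sum(1 for cue in CUES if cue in text)
    return hits >= min_hits


# Hypothetical sample texts, for illustration only.
papers = {
    "p1": "We ran an experiment and report precision of 91%.",
    "p2": "This survey reviews the history of dictionaries.",
}
selected = [pid for pid, text in papers.items() if describes_experiment(text)]
```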
![Page 20: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/20.jpg)
| Section | Number of sentences | Number of sentences including introductory terms | Rate |
|---|---|---|---|
| Abstract | 656 | 362 | 0.552 |
| Introduction | 2448 | 1284 | 0.525 |
| Experiment | 8931 | 2701 | 0.302 |
| Related works | 1222 | 542 | 0.444 |
| Conclusion | 805 | 394 | 0.489 |
| Others | 11965 | 3439 | 0.287 |
| Total | 26027 | 8722 | 0.376 |
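The rates can be rechecked from the counts. The sketch below assumes that the rate is the number of sentences including introductory terms divided by the number of sentences in that section; the slide does not state the formula, and the variable names are illustrative.

```python
# (sentences, sentences including introductory terms), copied from the slide
counts = {
    "Abstract": (656, 362),
    "Introduction": (2448, 1284),
    "Experiment": (8931, 2701),
    "Related works": (1222, 542),
    "Conclusion": (805, 394),
    "Others": (11965, 3439),
}

# Per-section rate, rounded to three decimals as on the slide.
rates = {section: round(hit / total, 3) for section, (total, hit) in counts.items()}

# Totals recomputed from the section rows.
total_sentences = sum(total for total, _ in counts.values())
total_hits = sum(hit for _, hit in counts.values())
overall = round(total_hits / total_sentences, 3)

# Print sections from the highest rate to the lowest, then the total.
for section, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{section:<14}{rate:.3f}")
print(f"{'Total':<14}{overall:.3f}")
```

Under this assumption the six section rates and both column totals agree with the slide, with Abstract and Introduction highest. The recomputed total rate comes out at about 0.335 rather than the 0.376 shown, and the slide does not say how its total rate was derived.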
![Page 21: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/21.jpg)
Analysis of structural role
• Extract sentences that include introductory terms from the Introduction
• The “Introduction” section has several kinds of sentences outlining the research
• Manually categorize each sentence into a structural role
• Analyze the sentences in terms of various features
![Page 22: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/22.jpg)
Structural role
1. Hypothesis
2. Motivation / Problem
3. Background
4. Goal
5. Object
6. Method (new/old)
7. Experiment
8. Model
9. Observation
10. Result
11. Conclusion
Based on the CoreSC annotation scheme (Soldatova & Liakata, 2007)
![Page 23: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/23.jpg)
Features in structural role
• Tense, aspect, modality
• Verbs
• Syntactic features
• Lexical features
![Page 24: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/24.jpg)
Tense, Aspect
• Background
– Recently, morphological analysis has been transitioning from methods based on heuristic knowledge to methods using probabilistic models. (近年、〜しつつある。)
• Related Works
– The authors are proposing / proposed a method for morphological analysis using rule-based paraphrasing. (提案している → 提案した)
![Page 25: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/25.jpg)
Modality, Verbs
• Modality
– A high level of language processing would be needed to assign semantic features to words
• 必要かもしれない (“might be necessary”)
• Verbs
– Specific verbs in the present tense tend to be used in Object
Ex. propose, intend, design, tackle
![Page 26: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/26.jpg)
Syntactic features
• Temporal expression (Background)
– Recently (近年), so far (これまで)
– Several studies have been done … (研究が行われてきた)
• Fixed expression (Motivation, Related-works)
– It is inevitable/necessary (〜必要である)
– The research has not been done … (〜の研究は行われていない)
– [Authors] are proposing … (提案している)
![Page 27: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/27.jpg)
Lexical features
• Keywords related to structural role
– Problem
• One of the main problems is that unknown words and new terms have been increasing day by day.
• It costs a lot of time …
– Experiment
• We conducted / carried out the experiment
• In order to evaluate our proposed method,
– Result
• We show the result of the experiment …
• We could obtain better precision …
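The lexical and syntactic cues on these slides suggest a simple rule-based tagger. The sketch below is hypothetical: the cue phrases are the English glosses shown in the examples (the study itself analyzed Japanese text), the role names follow the structural roles listed earlier, and the first-match-wins ordering is an arbitrary design choice, not the authors' method.

```python
from typing import Optional

# (role, cue phrases) pairs; earlier entries take priority. All cues are
# illustrative English glosses taken from the slide examples.
ROLE_CUES = [
    ("Problem", ("one of the main problems", "costs a lot of time")),
    ("Experiment", ("conducted the experiment", "in order to evaluate")),
    ("Result", ("result of the experiment", "better precision")),
    ("Background", ("recently", "so far")),
    ("Motivation", ("it is necessary", "has not been done")),
    ("Object", ("we propose",)),
]


def guess_role(sentence: str) -> Optional[str]:
    """Return the first role whose cue phrase occurs in the sentence, else None."""
    text = sentence.lower()
    for role, cues in ROLE_CUES:
        if any(cue in text for cue in cues):
            return role
    return None


examples = [
    "Recently, morphological analysis has been transitioning to probabilistic models.",
    "We show the result of the experiment.",
    "We propose a method for morphological analysis.",
]
roles = [guess_role(s) for s in examples]
```

Sentences that match no cue return `None`, which leaves room for the manual categorization described earlier.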
![Page 28: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/28.jpg)
Discussion
• Introductory terms are frequently used in sentences to position the proposed method in a target field
• Introductory terms, together with structural roles, introduce the basic domain knowledge needed to understand the main purpose of a paper
• It may be possible to classify each sentence into a specific structural role automatically
![Page 29: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/29.jpg)
Future work
• Automatically categorize sentences that include introductory terms into structural roles
• Analyze the words that collocate with introductory terms
– Syntactic information (subject, object, modifier, and so on)
– Semantic relations between the introductory terms and other terms (objective, method, target)
![Page 30: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/30.jpg)
Information types
[Diagram: information types. Labels: contents information, components of papers, semantic information, intensive expression, logical structure, informative expression, structural role, syntactic information, basic expression; tense, aspect, modality; introductory terms, author-assigned keywords]
![Page 31: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/31.jpg)
Apply to MIC theory
• Logical structure consists of structural roles
• Authors organize the discourse of their paper around their proposed model or method
• MIC theory could be applied at the sentence level and at the discourse level
• The ordering strategy of structural roles might relate to meta-information
![Page 32: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/32.jpg)
Analysis of Hierarchy
Sentence level
– There is no research on [METHOD]
Basic expression → informative expression
Discourse level
• Background: Recently, [METHOD] has been used in…
• Motivation: We need to consider [METHOD] for morphological analysis
• Objective-New: We propose [METHOD] ← Focus
![Page 33: National Institute of Informatics Kiyoko Uchiyama 1 A Study for Introductory Terms in Logical Structure of Scientific Papers](https://reader034.vdocuments.net/reader034/viewer/2022051316/56649e7c5503460f94b7ed9b/html5/thumbnails/33.jpg)
Conclusion
• It might be interesting to analyze introductory terms and their surrounding syntactic and semantic information from the viewpoint of MIC (I’m not sure…)
• We hope the results of the analysis will contribute to the understanding of academic papers