discourse annotation for arabic 3

Discourse Annotation for Arabic. Imam University, College of Computer and Information Systems. Prepared by: Al-harbi.A, Al-Gumlas.H, Al-Otaibi.E

Upload: arabicnlpimamu2013

Post on 11-Jun-2015


TRANSCRIPT

Page 1: Discourse annotation for arabic 3

Discourse Annotation for Arabic Imam University

College of Computer and Information Systems

Prepared by: Al-harbi.A

Al-Gumlas.H Al-Otaibi.E

Page 2: Discourse annotation for arabic 3

• Introduction: Discourse usually refers to a form of written text or spoken language.

A text is not merely a sequence of sentences or clauses; it is a coherent object with many cohesive devices linking its units (words, clauses and sentences).

Page 3: Discourse annotation for arabic 3

• Discourse Relations: There are two types of discourse relations: (i) relations that are signalled explicitly via so-called discourse connectives; (ii) relations that can be inferred from the context without any explicit signal.

Discourse relations are semantic relations.

Page 4: Discourse annotation for arabic 3

• Discourse Relations

Page 5: Discourse annotation for arabic 3

• Discourse Connective Types:

Simple Connectives. Ex: لأن 'because' - بعدما 'after' - و 'and'

Paired Connectives. Ex: إذا .. فـ 'if .. then' - بالرغم .. 'although' - ما لبث .. حتى

Modified Connectives. Ex: حتى لو 'even if' - وأيضا 'and also'

Combined Connectives. Ex: ولكن 'and but' - إلا بعد 'except after'

Page 6: Discourse annotation for arabic 3

Related Work:

. Several textual corpora of Arabic exist.

. Some of them are available with part-of-speech and syntactic annotation, such as the Arabic Treebank (ATB). The Prague Arabic Dependency Treebank (PADT), which is smaller in scale than the ATB, contains multilevel annotations, including morphological and analytical levels of linguistic representation.

. Also, a recent effort by Dukes and Habash (2010) has produced the Quranic Arabic Corpus, a free annotated linguistic resource which provides morphological annotation and syntactic analysis of the Holy Quran.

Page 7: Discourse annotation for arabic 3

• Collecting Arabic Connectives

. A large set of Arabic discourse connectives was collected using text analysis and corpus-based techniques.

Example:

A. [السيارة متطورة جداً.] Arg1 [لكنها] DC [باهظة الثمن] Arg2

B. [al-syārh mtṭwrh ǧdān.] Arg1 [lknhā] DC [bāhẓah alṯmn] Arg2

C. [The car is so modern.] Arg1 [but] DC [it is too expensive] Arg2.
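The bracketed notation used in the example can be read mechanically. The following is only a minimal sketch, assuming each span is written as "[text] Label" with the labels Arg1, DC and Arg2; the function name parse_bracketed is hypothetical and not part of any annotation tool mentioned in these slides.

```python
import re

# Matches one bracketed span followed by its label, e.g. "[but] DC".
SPAN = re.compile(r"\[([^\]]*)\]\s*(Arg1|Arg2|DC)")


def parse_bracketed(example: str) -> dict:
    """Return a mapping from label (Arg1, DC, Arg2) to the bracketed text."""
    return {label: text.strip() for text, label in SPAN.findall(example)}


# English gloss of the example (line C).
example = "[The car is so modern.] Arg1 [but] DC [it is too expensive] Arg2."
parsed = parse_bracketed(example)
print(parsed)
```

The same pattern applies to the Arabic and transliterated lines, since only the brackets and labels are matched.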

Page 8: Discourse annotation for arabic 3

• Annotation Scheme

. Annotation is based on lexicalized grammar theory.

1. The anchor of the annotation is the lexical item – a discourse connective (DC).

2. The Arg2 label is assigned to the argument with which the connective is syntactically associated.

3. The Arg1 label is assigned to the other argument, which can refer to an abstract object at any distance from the connective.
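The three points of the scheme can be pictured as a small record anchored on the connective. This is an illustrative sketch only: the class names Span and ConnectiveAnnotation, the helper span_of, and the use of character offsets are assumptions, not the storage format of the annotation tool described in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset

    def text(self, source: str) -> str:
        return source[self.start:self.end]


@dataclass(frozen=True)
class ConnectiveAnnotation:
    connective: Span  # the anchor of the annotation (DC)
    arg2: Span        # argument syntactically associated with the connective
    arg1: Span        # the other argument; may lie at any distance from the DC


def span_of(source: str, fragment: str) -> Span:
    """Locate the first occurrence of fragment in source (hypothetical helper)."""
    start = source.index(fragment)
    return Span(start, start + len(fragment))


# English gloss of the car example, written as running text.
text = "The car is so modern. But it is too expensive."
ann = ConnectiveAnnotation(
    connective=span_of(text, "But"),
    arg2=span_of(text, "it is too expensive"),
    arg1=span_of(text, "The car is so modern"),
)
print(ann.connective.text(text), "|", ann.arg1.text(text), "|", ann.arg2.text(text))
```

Storing offsets rather than copied strings keeps Arg1 free to point anywhere in the text, which matches point 3 of the scheme.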

Page 9: Discourse annotation for arabic 3

Theories of Discourse Structure

. Linguists have attempted to produce reasonably general theories to represent discourse structure.

. Theories of discourse structure differ in their focus according to the type of discourse (such as written text or dialogue) and the type of organization (such as intentional organization, i.e. the speaker’s plan).

. One of the most popular discourse theories is Rhetorical Structure Theory (RST).

Page 10: Discourse annotation for arabic 3

RST

. RST is a theory of how coherence in text is achieved.

. RST was originally developed as part of studies of computer-based text generation.

. RST is designed to explain the coherence of texts, seen as a kind of function, linking parts of a text to each other.

Page 11: Discourse annotation for arabic 3

RST

Relation Name | Nucleus | Satellite

Background | text whose understanding is being facilitated | text for facilitating understanding

Elaboration | basic information | additional information

Preparation | text to be presented | text which prepares the reader to expect and interpret the text to be presented
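The three relations listed above can be held in a small lookup so that each relation instance pairs a nucleus with a satellite. This is a hedged sketch: the names RELATION_ROLES and RSTRelation are hypothetical, and the two sentences in the usage example are invented for illustration, not taken from the slides.

```python
from dataclasses import dataclass

# Role descriptions for (nucleus, satellite), copied from the relation table.
RELATION_ROLES = {
    "Background": (
        "text whose understanding is being facilitated",
        "text for facilitating understanding",
    ),
    "Elaboration": ("basic information", "additional information"),
    "Preparation": (
        "text to be presented",
        "text which prepares the reader to expect and interpret "
        "the text to be presented",
    ),
}


@dataclass(frozen=True)
class RSTRelation:
    name: str
    nucleus: str    # the more central text span
    satellite: str  # the supporting text span

    def __post_init__(self):
        # Only the three relations from the table are accepted in this sketch.
        if self.name not in RELATION_ROLES:
            raise ValueError(f"unknown relation: {self.name}")

    def describe(self) -> str:
        nucleus_role, satellite_role = RELATION_ROLES[self.name]
        return (f"{self.name}: nucleus ({nucleus_role}) = {self.nucleus!r}; "
                f"satellite ({satellite_role}) = {self.satellite!r}")


# Invented example sentences, for illustration only.
rel = RSTRelation(
    name="Elaboration",
    nucleus="The corpus is annotated for discourse.",
    satellite="Each connective is labelled with its two arguments.",
)
print(rel.describe())
```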

Page 12: Discourse annotation for arabic 3

RST Example

With just those relations, we can illustrate the analysis of a text.

Page 13: Discourse annotation for arabic 3

Applications

• Question-Answering and Information Extraction systems.

• Speech Recognition.

• Text Generation.

• Essay Scoring.

• Text Summarization.

Page 14: Discourse annotation for arabic 3

Discourse Annotation Tool for English and Arabic

Page 15: Discourse annotation for arabic 3

Thank you