A Proposed Approach to Handling Unbounded Dependencies in Automatic Parsers


  • 8/13/2019 A Proposed Approach to handling unbounded dependencies in automatic parsers


    University of Alexandria

    Faculty of Arts

    English Language Department

    A Proposed Approach to Handling Unbounded

    Dependencies in Automatic Parsers

    A THESIS SUBMITTED TO THE ENGLISH LANGUAGE

    DEPARTMENT, FACULTY OF ARTS, THE UNIVERSITY OF

    ALEXANDRIA IN FULFILLMENT OF THE REQUIREMENTS FOR

    THE DEGREE OF MASTER OF ARTS IN COMPUTATIONAL

    LINGUISTICS

    By

    Ramy Muhammad Magdi Ragab Abdel Azim

    Supervised by:

    Dr. Sameh Al-Ansary

    Associate Professor of Computational

    Linguistics

    Department of Phonetics and Linguistics

    Faculty of Arts

    Alexandria University

    Dr. Heba Labib

    Assistant Professor of Linguistics

    Department of English Language and

    Literature

    Faculty of Arts

    Alexandria University


    to the memory of

    Professor Hassan Atiyya Taman

    (2010)


    Contents

    Abstract

    Acknowledgements

    Symbols and Abbreviations

    List of Figures

    List of Tables

1. INTRODUCTION 17
   1.1. Motivation 18
   1.2. The Problem 21
   1.3. Aims and Contributions 23
   1.4. Thesis Structure 25
   1.5. UDs Defined 27
   1.6. The Class of UDs 30
      1.6.1. Strong UDs 31
      1.6.2. Weak UDs 32
   1.7. Nomenclature 34
2. UDS AND SYNTACTIC FORMALISMS 37
   2.1. Derivational Approaches 38
   2.2. Generalized Phrase Structure Grammar (GPSG) 44
   2.3. Head-driven Phrase Structure Grammar (HPSG) 52
   2.4. Categorial Grammar (CG) 60
   2.5. Lexical Functional Grammar (LFG) 63


   2.6. Towards an Ontology of Gaps 70
      2.6.1. Gaps between Objects and Subjects 71
      2.6.2. The Distribution of Gaps 73
      2.6.3. The Ontology 77
3. Parsing and Formal Languages 81
   3.1. The Concept of a Formal Language 82
   3.2. Defining a Generative Grammar 83
   3.3. Formal Grammars and their Relation to Formal Languages 84
   3.4. The Chomsky Hierarchy 86
   3.5. Automata 89
   3.6. Parsing Theories and Strategies 91
   3.7. The Universal Parsing Problem 92
   3.8. Major Parsing Direction 93
   3.9. Top-down Parsing 95
   3.10. Bottom-up Parsing 96
   3.11. The Cocke-Kasami-Younger Algorithm 98
   3.12. The Earley Algorithm 100
   3.13. Statistical or Grammarless Parsing 103
   3.14. Text vs. Grammar Parsing: the Nivre Model 104
   3.15. Text Parsing and the Problem of UDs 105
4. UDs Parsing Complexity 108
   4.1. The Rimell-Clark-Steedman (RCS) Test 110
   4.2. The Parsers Set 112


    "In the beginning was the word. But by the time the second word was

    added to it, there was trouble. For with it came syntax, the thing that

    tripped up so many people."

John Simon, Paradigms Lost

    This is a fertile area of research, in which definitive answers have not

    yet been found.

Sag & Wasow, Syntactic Theory: A Formal Introduction


    Abstract

Unbounded dependencies (UDs) are a set of syntactic constructions in the English language that confront syntactic and computational analysis with a number of challenges. Unbounded dependencies cover such constructions as wh-questions, relative clauses, topicalized sentences, tough-movement clauses, it-clefts and many more. Though each of these constructions may have received considerable attention in the syntactic literature, awareness of the unity of all these constructions and of the like-minded behavior that makes them form a coherent whole was largely missing from such treatments.

This thesis explores the linguistic nature of UDs and how they have been handled within the current flurry of syntactic theories. The thesis provides analyses of UDs within the Principles & Parameters model (as representative of derivational approaches to syntax), Generalized Phrase Structure Grammar, Head-driven Phrase Structure Grammar, Lexical Functional Grammar, and Categorial Grammar (as representatives of non-derivational approaches). The thesis then offers a newly devised gaps-ontology that aims at gathering all the information and rules related to the behavior of gaps in unbounded dependencies into one integral theoretical entity that can be utilized in computational environments.

The thesis claims that the problem of parsing UDs is basically a computational problem, not a syntactic one; i.e., the solution lies in the parsing strategy and techniques used, not in the theoretical underpinnings of the different syntactic analyses available. Accordingly, the thesis proposes two types of solutions to the parsing


problem of UDs: the first introduces modifications to the architectural design of the universal parser, subscribing to the highly useful technique of modularity and thus devising what the thesis calls a Small-scale Latent Parser. The other proposes processing modifications, represented by the techniques of gap-threading and memoization. The thesis also claims that top-down parsing cannot be endorsed as a possible strategy for parsing UDs and thus favors bottom-up parsing strategies instead.


    Acknowledgments

My interest in computer science and the study of computational linguistics was triggered 13 years ago when I began my work with Dr. Nabil Ali. Dr. Ali, an engineer by training and the father of Arabic informatics and computational linguistics, brought to my attention many important works and gave me the opportunity to see what a real computational system looks like. The late Prof. Hasan Taman, the original supervisor of this thesis, is the one who should be credited with its current organization. He insisted, against my disposition to work on theoretical issues alone, on a problem-solving method that finds a problem and proposes solutions, which explains the title of the thesis itself (his exact phrasing). Prof. Taman's belief in me and in my academic abilities was crucial in infusing me with the spirit that made me work on this thesis and recover from so many bouts of despair. May his soul rest in peace.

Prof. Azza el-Khouly's and Prof. Sahar Hamouda's kindness and support made this thesis see the light of day. Dr. Sameh al-Ansary's patience, unflinching support and understanding also revitalized the hope of finishing this thesis. Without him I would not have been able to finish the thesis in the first place, not to mention his comments and suggestions that improved the outlook and organization of the work. My debt to him will always be remembered.

Also, Prof. Olga Matar's kind approval to be one of the examiners brought me such happiness, because she was the first one I hoped could supervise my work, even before Prof. Taman, but unfortunately at that time she was unable to slot me into her already full schedule of graduate thesis supervisions. Dr. Heba Labib's sweet kindness and


    Symbols and Abbreviations

_ Underscores represent the position(s) of gaps in a sentence.

/ Represents the SLASH feature, in which the slashed category on the right-hand side of the slash is missing.

e Null or empty categories.

⊕ Adding up (appending) in HPSG.

A↑B A category B with a category A missing somewhere within it, in Moortgat's version of CG.

↓ In LFG, a variable that refers to the lexical item being categorized.

↑=↓ In LFG, an equation meaning that the features of the nodes below and above are being shared.

λ Lambda, a symbol referring to a string consisting of zero elements.

L Language in formal language theory.

G Grammar in the theory of formal languages.

VN Nonterminal variables.

VT Terminal variables.

L(G) The language generated by a grammar G in formal language theory.

(N, Σ, S, P) Elements of a formal grammar G.

→ The left-hand side elements are rewritten as the right-hand side elements, e.g. S → NP VP.

x ∈ S x belongs to, or is a member of, S.

Σ Refers either to the root of a sentence or, in formal language theory, to the terminals of a sentence, in contrast to N, which refers to non-terminals.

• In the Earley parsing algorithm, the dot is used on the right-hand side of a grammar rule to tell us where the rule has reached, or to what extent it has progressed, e.g. S → • VP, [0, 0].

[n] Boxed numbers, or tags, in AVMs indicate structure sharing in HPSG.


    NLP Natural Language Processing

    UDs Unbounded Dependencies

    ST Syntactic Theory

    GPSG Generalized Phrase Structure Grammar

    LFG Lexical Functional Grammar

    HPSG Head-driven Phrase Structure Grammar

    CG(s) Categorial Grammar(s)

    CCG Combinatory Categorial Grammar

    TG Transformational Grammar

    ATN Augmented Transition Networks

    PSG Phrase Structure Grammar

    GB Government and Binding theory

    P&P Principles and Parameters theory

    MP Minimalist Program

    TP A clause consisting of an NP and a VP.

    C Complementizer within a P&P context.

    CP Complementizer phrase within a P&P context.

    DP Determiner Phrase within a P&P context.

    SPEC Specifier within a P&P context.

    CF-PSG Context-free Phrase Structure Grammar.

    FFP Foot Feature Principle within a GPSG context.

    ID Immediate Dominance rules within a GPSG context.

    LP Linear Precedence within a GPSG context.

    HFP Head Feature principle within a GPSG context.

CSLI Stanford University's Center for the Study of Language and Information.

QUE A feature of questions in HPSG.

    REL A feature of relative clauses in HPSG.

    INHER Inheritance feature in HPSG.

    AVM Attribute Value Matrix in HPSG and Unification Grammars.

    SYNSEM Syntax-semantics interface in HPSG.

    SPR Specifiers in HPSG.

    3sg Third person singular in HPSG.


    List of Figures and Tables

    Figures

(1) The Class of Unbounded Dependency Constructions.
(2) A derivational analysis of the sentence Who do you think Jim kissed?
(3) A derivational analysis of the sentence Who do you think Jim kissed? (modified).
(4) A derivational analysis of the sentence Who do you think Jim kissed? (modified).
(5) A derivational analysis of the sentence Who do you think Jim kissed? (modified).
(6) A derivational analysis of the sentence Which city did Ian visit?
(7) Tree geometry of the structure of a UD in GPSG.
(8) A GPSG analysis of the sentence Sandy we want to succeed.
(9) An HPSG analysis of the sentence Kim, we know Sandy claims Dana hates.
(10) An attribute value matrix (AVM) for the verb sees in HPSG.
(11) An HPSG structural description (SD) of gaps in UDs.
(12) A CG analysis of the sentences Whom do you think he loves? and Who do you think loves him?
(13) A CG analysis of the sentence Who Jo hits?
(14) An LFG analysis of the sentence What Rachel thinks Ross put on the shelf?
(15) The c-structure of What Rachel thinks Ross put on the table?
(16) The f-structure of What Rachel thinks Ross put on the table?
(17) C-structure of What did the strange, green entity seem to try to quickly hide? (Asudeh 2009)
(18) F-structure of What did the strange, green entity seem to try to quickly hide? (Asudeh 2009)
(19) A subject-predicate analysis of the topicalized sentence The others I know are genuine. (CGEL)
(20) A proposed gaps-ontology.
(21) GAPS AVM.
(22) The Chomsky Hierarchy and its corresponding automata.
(23) A top-down analysis of the sentence Book that flight.
(24) A bottom-up analysis of the sentence Book that flight.
(25) A CKY parsing of the sentence Book the flight through Houston.
(26) An illustration of an attachment ambiguity in the sentence I shot the elephant in my pajamas.


(27) Components of a language processing system.
(28) The structure of a compiler within a language processing system.
(29) The parser within the compiler.
(30) Small-scale Latent Parser.
(31) GAPS AVM.
(32) Flowchart of the UDs SPL algorithm.
(33) Gap-threading in the sentence John, Sally gave a book to.
(34) A parse of the sentence Who do you claim that you like? using Python.
(35) A parser blueprint incorporating all proposed modifications.

    Tables

(1) Position/function of gaps.
(2) Multi-locus gaps.
(3) Formal elements of a PSG.
(4) Chomsky hierarchy grammars and their corresponding automata.
(5) An Earley algorithm analysis of the sentence Book that flight.
(6) Examples of the seven types of UDs used in the RCS Test.
(7) Parser accuracy on the UDs corpus according to the RCS Test.


Chapter 1: Introduction


1.1. Motivation

From the beginning of 2000 until the end of 2003, I worked on natural

    language processing (NLP) solutions for two major companies in Egypt. My first-hand

    experience with actual large-scale parsers made me aware of some problems facing

    those parsers in the processing of certain grammatical constructions. I decided, back

then, to tackle one of the most difficult problems facing those parsers: unbounded

    dependencies.

    Complex syntactic phenomena stand out as a challenge to computational

    implementation in NLP applications. The challenge resides in the problematic nature of

these phenomena: they are syntactically rich in detail and, as a consequence of their complexity, they are interleaved with many other linguistic phenomena. In addition,

    they exhibit a sufficiently perplexing tendency towards being polymorphous and

    diverse. Unbounded dependencies (or, alternatively, long-distance dependencies, filler-

    gap constructions, wh-movement constructions, A-bar dependency constructions,

    extraction dependencies, etc.) are classic examples of how complex and theoretically as

    well as computationally challenging these syntactic phenomena can be. Terry

    Winograd (Winograd 1983) gives us an unequivocal statement about the significance of

UDs to the then-current syntactic theory. He says:


    The need to account for this phenomenon [UDs] is one of the major forces

    shaping grammar formalisms. It was one of the motivations for the original

    idea of transformations, and in some recent versions of TG, the only

    remaining transformations are those needed to handle it. The hold register

    in ATN grammars, the distant binding arrows of LFG, and the derived

    categories of PSG are other examples of special devices that have been

    added on top of simpler underlying mechanisms in order to handle it.

(Winograd 1983: 478)

Since the 1970s, it has been generally assumed that a number of grammatical constructions show such uniform behavior and architecture that they should be

    considered en masse. Chomsky (1977) notes that the rule of wh-movement has, inter

    alia, the following general characteristics:

1- it leaves a gap.
2- where there is a bridge, there is an apparent violation of subjacency.
3- it observes wh-islands. (Chomsky 1977: 86)

    Grammatical phenomena that fall under the rubric of UDs cover the following

    constructions: topicalization, wh-questions, wh-relatives, it-clefts, tough movement, etc.

The most important feature marking all these constructions is the existence of gaps, as

    Chomsky noted above.
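Each of these construction types leaves a gap. Purely for illustration (the example sentences below are my own, not drawn from the thesis), the class can be tabulated in code with `__` marking each gap site:

```python
# Illustrative examples of UD construction types; "__" marks the gap
# left by the displaced (or understood) constituent.
UD_EXAMPLES = {
    "topicalization": "Sam, I like __.",
    "wh-question": "Who did you see __?",
    "wh-relative": "the book which I read __",
    "it-cleft": "It was Sam that I saw __.",
    "tough movement": "This problem is tough to solve __.",
}

# Every UD construction, whatever its type, contains a gap site.
assert all("__" in example for example in UD_EXAMPLES.values())
```

The unifying property across the otherwise diverse constructions is precisely this shared gap site.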

    UDs represent a unique class of grammatical constructions that require some

    especially devised mechanisms in order to successfully process them syntactically and

computationally. A basic example of UDs is given in the following sentence:

(a) Sam, I think he told me he tried to understand __.

The above sentence can be represented in the following, largely theory-neutral, tree diagram:


[Tree diagram: a nested clausal structure (S dominating NP and VP at each level of embedding) linking the fronted NP Sam to the gap in the object position of understand.]

Sam, I think he told me he tried to understand ___

    Sentence (a) above is a topicalized sentence where the object of the sentence is

fronted to add emphasis to the intended message of the construction. The fronting of Sam, i.e. its displacement from the normal object position in English (an SVO language), left a trace in the position of the displaced object that tells us about the history, or the original constitution, of the structure before the displacement process. This trace is usually marked with a hyphen or a dash representing the displaced element. This account to some extent subscribes to a movement-based hypothesis that is part

    of the derivational approach to UDs evidenced in TG, GB, P&P and MP theories of

    syntax.1

1 The example above and the subsequent explanation should not be taken as a sign of the researcher's subscription to the Chomskyan model and its various manifestations and developments. On the contrary, the present work openly criticizes those approaches and spots many deficiencies in them, as will be seen in chapter 2.
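The filler-gap relation just described can be sketched very schematically in code. The following is a hypothetical, hand-rolled illustration (not the parser proposed later in this thesis): each clause is reduced to its verb plus a flag saying whether that verb's complement position is filled, and the fronted filler Sam is threaded inward until it is discharged at the verb whose object is missing.

```python
# Hypothetical sketch of gap handling for sentence (a).
# Each clause is (verb, complement_filled); the filler is passed
# inward and bound at the first verb missing its complement.

def thread_gap(clauses, filler):
    for verb, complement_filled in clauses:
        if not complement_filled:
            return (verb, filler)  # gap site found: discharge the filler here
    return None  # no gap: the filler cannot be licensed

clauses = [("think", True), ("told", True), ("tried", True), ("understand", False)]
print(thread_gap(clauses, "Sam"))  # ('understand', 'Sam')
```

This miniature anticipates, in spirit, the gap-threading technique the thesis later develops as a processing modification.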


1.2. The Problem

UDs represent a unique instance in the history of contemporary syntactic theory (henceforth: ST) and NLP. In fact, they became the raison d'être of a handful of

    extremely influential syntactic formalisms and a number of novel computational

    theorems and techniques. Robust syntactic formalisms such as Generalized Phrase

    Structure Grammar (henceforth: GPSG), Lexical Functional Grammar(s) (henceforth:

    LFG), Head-driven Phrase Structure Grammar(s) (henceforth: HPSG), and modern

    Categorial Grammar(s) all owe, some way or another, many of their formative concepts

    and notational devices to studies of UDs. Ivan Sag (Sag 1982) expresses this fact

    succinctly by saying that:

    Few linguists would take seriously a theory of grammar which did not

    address the fundamental problems of English that were dealt with in the

    framework of standard transformational grammar by such rules as There-

insertion, It-extraposition, Passive, Subject-Subject raising, and Subject-Object raising. (Sag 1982, p. 427)

    UDs happen to be one of those constructions. This is not the whole picture, though.

    UDs form an integrated component in most syntactic theories that have attained a

considerable degree of maturity. Their internal complexity and the sophistication needed to handle them formally and computationally made them a benchmark against which the validity, expressive power, and exhaustiveness of treatment of any given syntactic theory are gauged. Nonetheless, only a few works have paid attention to handling UDs in a uniform manner; works dealing with UDs as a uniform whole, surveying their


treatment in different syntactic formalisms that subscribe to different linguistic frameworks, are quite meager.1

    As regards the computational handling of UDs, there have been various attempts at

    unraveling their syntactic complexity through computers. The basic idea was to test the

    robustness of a particular grammar formalism or computational system (Oltmans 1999).

    The idea of robustness is of the essence here. A computational system is deemed robust

    if it exhibits graceful behavior in the presence of exceptional conditions. Robustness in

NLP is concerned with the system's behavior when input falls outside its initial coverage. For instance, if the system is fed with rules describing and specifying the behavior and structure of relative clauses in English, it will not be negatively affected if the input is not covered in full by these rules. But the question remains: why study UDs from a

    computational viewpoint? The answer to this question seems to be unanimous in the

    computational literature. UDs have been always identified in computational linguistics

    works as a problem. Charniak (1993) mentions the following concerning UDs:

Another standard problem with CFGs is long distance dependencies ... This problem can be solved within a CFG, although it gets a bit complicated. (Charniak 1993: 8-9)

In Mellish et al. (1994) the situation is even more clear-cut:

    The problem is more severe when we come to consider long distance

    dependencies, or more correctly unbounded dependencies in which two unrelated

    pieces of structure may be arbitrarily far apart and not in the same level in the tree.

    (Mellish et al. 1994: 129-130)

1 Only recently have Robert Levine and Thomas Hukari produced a uniform treatment of UDs in their full manifestations in: R. Levine & T. Hukari (2006) The Unity of Unbounded Dependency Constructions. CSLI Publications, Stanford University. Unfortunately, I was unable to secure a copy of the book, but I read a detailed academic review of it by Robert Borsley. However, the main thrust of the book is on the syntax-theoretic aspects of UDs within the framework of HPSG, without any reference to computational issues (see Borsley 2009).


Pereira (1981) finds that one of the most important benefits of connecting parsing with deduction is the "[h]andling of gaps and unbounded dependencies 'on the fly' without adding special mechanisms." Such an excessive interest and engagement with UDs gives us a clear, unhampered view of the status of UDs as a computational

    problem. There seems to be a common realization amongst computational linguists and

    syntacticians of the problematic nature of UDs; a fact that precipitated many of the

    current theoretical frameworks both in pure syntax and in computational linguistics.

Statistically speaking, there is a common belief that UDs and similar phenomena do not represent a sizable portion of any general large-scale corpus, and hence their treatment is ignored. Surprisingly, however, around three quarters of the Wall Street Journal corpus (WSJC) in the Penn Treebank (PTB) contain non-local dependencies, which happen to include UDs most of the time. The internal sophistication of UDs, their typological diversity, the existence of gaps, and their considerable corpus frequency lay bare UDs not only as an engaging problem (syntactically and computationally) but as a compelling one as well.

    1.3. Aims and Contributions:

The main goal of this thesis is to provide outlines of solutions to the problem of UDs as a computational problem. The overall aims of the thesis can be summarized in the

    following points:

• Placing UDs in their proper positions as regards simple, non-theoretic grammatical analysis.


• Uncovering the role of UDs in the formation and evolution of many syntactic theories and formalisms.

• Laying hands on the key element(s) that could enable us to unravel the grammatical complexity of UDs.

• Proposing a syntax-theoretic solution for UDs in terms of a proposed gaps-ontology.

• Highlighting the complexity of processing UDs computationally, i.e. UDs as a parsing problem.

• Proposing two types of solutions regarding the automatic parsing of UDs: the first has to do with the overall parser design (some tweaking and modifications of the parser architecture), while the second offers two parsing techniques that may enable the parser to process UDs in a robust and efficient way.

    However, before embarking on discussing the general outlines of my study, it is of

    paramount importance to examine a question of method which confronts the researcher

    at the outset. A linguist who has been trained on the dynamics and sophisticated details

    of the many linguistic theories currently available while hardly having any formal

    training in computability theory or computer science is unlikely to offer any detailed or

    profoundly technical treatment of a phenomenon such as UDs from a computational

viewpoint. Besides, in order to prepare a serious, computationally viable study of UDs, a linguist needs an intricate set of computational tools that can only be secured and afforded by large commercial/research entities (IBM, Microsoft, Carnegie Mellon, etc.), not to mention the academic and technical expertise that cannot be obviated.


Thus, what a linguist can do on their own is to adumbrate certain guidelines that relate to the interface between theoretical and computational linguistics. Many a research project has been marred because its author was unable to resist the temptation of going computational: a temptation that normally leads to a chaotic morass of computational nuances that, with the wisdom of hindsight, prove quite hard to disentangle. This aptitude towards things computational can be ascribed to the current hype surrounding anything that has to do with computers, without the researcher having any proper knowledge, training or experience to do so.

I have attempted to get around this dilemma by focusing on the theoretical syntactic issues that relate directly to computational parsing, offering a broad, semi-technical approach to solutions. As such, none of the arguments or proposals in the computational section of this work should be judged as technical; they are just a number of theoretical postulations, conjectures and refutations on how, in my opinion and according to my knowledge of computer science, these problems can be solved.

1.4. Thesis Structure

The thesis is broadly divided into three sections: the first focuses on the extensive

    theoretical backdrop of the phenomenon, providing an eclectic approach towards a

uniform view of one of the lynchpin components of both the theoretical division of the work and the computational one: gaps. The second represents a rough treatment of the computational and parsing problems involved, using the first section as a springboard. The third represents the researcher's contribution to the problem of UDs


parsing by giving two sets of proposed modifications at the architectural and processing levels of the parser.

The first section can be seen as the syntax-theoretic part that deals with the definitions, typology and grammatical analysis of UDs (Chapters 1 and 2) and with how a number of syntactic theories and formalisms dealt with them. In addition to the analytical exposition, this section is permeated with critiques of those theories and formalisms in their treatment of UDs, along with an attempt (a perfunctory one, though) at digging up their intellectual milieus and methodological underpinnings (Chapter 2). Section 2.6 proposes a gaps-ontology in which an eclectic, but hopefully harmonious, mélange of the theoretical component of gaps and gap handling is offered. This concludes the syntax-theoretic section of the thesis.

The second section of the thesis focuses on parsing theory and its roots in the study of formal languages (Chapter 3). Sections 3.6-3.14 discuss the various strategies and techniques of parsing available in the literature. Chapter 4 considers the complexity of UDs parsability as evidenced in a recent computational experiment. Sections 4.3-4.5 examine the architecture and design of mainstream parsers and how they are built.

    The final section of the thesis represents the contributions part of the work where the

proposed modifications mentioned earlier are found. Chapter 5 proposes architectural and design modifications to the universal parser by introducing the notion of modularity and by devising a Small-scale Latent Parser. Chapter 6 proposes the next set

    of modifications that relate to the processing of the parser itself. This final section

    concludes with a brief account of the conclusions of the thesis.


    1.5. UDs Defined

    The syntactic phenomenon of unbounded dependencies has been, as alluded to above, a

    major springboard for many theoretical proposals and syntactic formalisms. Naturally,

    this multiplicity of origins generated concomitantly a multiplicity of definitions and

    designations in the literature. First, I will look at the different definitions of UDs and

    how these differences can be accounted for. Then I will survey the various designations

    found in the relevant syntactic literature.

    The concept of "unbounded dependencies" was first introduced by Gerald Gazdar

    (1981) to refer to a set of syntactic structures handled within transformational

    frameworks in terms of movement or, more specifically, wh-movement. The use of the

    adjective "unbounded" in such contexts, however, goes back to J. Bresnan (1976)

    during the heyday of transformational approaches to grammatical analysis. Originally,

    however, the idea of "unboundedness" is a mathematical concept used in algebraic and

    computational studies of unbounded operators, set theory, number theory and

algorithmics (Gowers 2009). The mathematical undertones of the term will be discussed in the following section.

    Crystal (2008) defines an unbounded dependency as

    [a] term used in some theories of grammar (such as GPSG) to refer to a

    construction in which a syntactic relationship holds between two

    constituents such that there is no restriction on the structural distance

    between them (e.g. a restriction which would require that both be

    constituents of the same clause); also called a long-distance clause. In

    English, cleft sentences, topicalization, wh-questions and relative clauses

    have been proposed as examples of constructions which involve this kind

    of dependency; for instance, a wh-constituent may occur at the beginning


    of a main clause, while the construction with which it is connected may be

    one, two or more clauses away, as in What has John done?/What do they

    think John has done?/ What do they think we have said John has done?,

    etc. In GB theory, unbounded dependencies are analyzed in terms of

    movement. In GPSG, use is made of the feature SLASH. The term is

    increasingly used outside the generative context1. (Crystal 2008: 501)

Crystal's definition merits a moment of analytical contemplation. First, we need to establish that Crystal (2008) is a relatively basic specialized dictionary targeted at professional as well as lay readers. This means that encountering detailed argumentative analyses of linguistic phenomena in his work would be a rare occurrence.

    He establishes his definition of UDs upon an abstract postulate that describes UDs as

    having a syntactic relationship between two constituents "such that there is no

    restriction on the structural distance between them." The idea of having no restriction

on the structural distance between two constituents is a mathematically or logically oriented idea rather than a natural-language-based one. In other words, natural language

    cannot permit such infinitely continuous clausal concatenations. It has to have a bound

    (i.e. a sentence must end somewhere in a linguistic text). The idea of unboundedness is

    thus a potentiality rather than an actuality. Mathematically-oriented thinking about

    language, however, has a natural proclivity towards abstraction and higher-order

    1 The final two sentences in Crystal's definition are interesting from an error analysis viewpoint,

    however. First, he describes GB as handling UDs in terms of movement, which is essentially correct.

However, he continues his description by stipulating another fact about the handling of UDs in GPSG through the feature SLASH. The feature SLASH, as we will see later, is postulated in GPSG to account primarily for the existence of gaps in UDs, while movement is described only as the main technique for handling UDs in GB. This implicitly, and incorrectly, suggests that GB theory has no theorem for handling gaps. Second, Crystal describes GPSG, HPSG, LFG and CGs as theories "outside the generative context." In fact all these theories are "generative" in essence; they are only non-transformational.


language. A more linguistically realistic term would be "long-distance dependencies",

    which was later adopted by most non-transformational syntactic theories and syntactic

    formalisms handling the phenomenon of UDs.

    Trask (1993) defines UDs in a more poised manner. He notes how UDs present "a

    major headache for syntactic analysis," and that "all sorts of special machinery have

    been postulated to deal with them." He takes a more development-oriented approach to

the handling of the phenomenon: for example, he mentions that classical TG made

    liberal use of the theoretically problematic unbounded movement rules, and that GB

    and GPSG both reanalyzed UDs in terms of chains of local dependencies. GB used

    traces and GPSG came up with a feature SLASH. LFG, on the other hand, used arcs in

    its f-structures. I shall deal with all these formative concepts in more detail later in this

    work.

    Matthews (1997) defines the phenomenon of UDs as a "[r]elation between syntactic

    elements that is not subject to a restriction on the complexity of intervening structures."

    His definition is a restriction-based one, bearing in mind the formative concepts of

    island and cross-over constraints.

    Another definition based on psycho-syntactic realization of UDs is found in Slack

(1990). According to him, UDs represent a unique linguistic phenomenon. He writes:

    One linguistic phenomenon which, more than any other, focuses on the

    problem of addressing structural configurations is that of unbounded

dependency. Typically, in sentences like The boy who John gave the book to __ last week was Bill, the phrase The boy is taken as the filler for the missing argument, or gap, of the gave predicate, as indicated by the underline. At the level of constituent structure there are no constraints on the number of lexical items that can intervene between a filler and its corresponding gap. (Slack 1990: 268)

    Slack (1990) dissects the phenomenon of UDs in a more profound manner. He states

    that UDs belong to a class of linguistic phenomena in which the structural address of

    an element is determined by information which is only accessible over some arbitrary

distance in the structure. According to him, it is necessary to determine the address of the gap to which a filler belongs. The arbitrariness of the distance separating the gaps and their fillers in the input strings makes the specification of the set of potential

    predicate-argument relations that the filler can be involved in (and thus the

    identification of a direct address of the gap) quite an impossible task (ibid.).

The foregoing definitions can be classified as non-partisan, i.e. they do not subscribe to any particular syntactic theory, framework or formalism. Also, being mostly dictionary entries, they are naturally confined by the constraints of brevity, simplicity and neutrality. Beyond such encyclopedic definitions, it should be noted that the study of UDs was originally formulated within more arcane journal articles and research monographs. In that regard, Gazdar et al. (1985) present the first perspicuous and formally rigorous definition of UDs. I shall not dwell further here on GPSG and its treatment of UDs, for I have included a whole section dedicated to this classic and most influential treatment of UDs (see 2.2.).

    1.6. The Class of UDs:

    Any rigorous treatment of the phenomenon of Unbounded Dependencies should rest on

    a uniform, holistic comprehension of its nature. By "holistic" I refer to the necessity of


    treating UDs in an undivided manner; i.e. studying relative clauses, wh-questions or

    topicalized constructions separately will not shed enough light on the nature and

    dynamics of the phenomenon. The study of UDs should be applied to the complete set

    of constructions recognized and classified as unbounded dependency constructions.

    These constructions are included within the following two subsets: strong UDs and

    weak UDs.

    1.6.1. Strong UDs:

    In what sense is the first subset of UDs "strong"? "Strength" here is rather a misnomer

    for compatibility or isomorphism. They are strong because they require the filler and

the gap to be of the same syntactic category. According to Pollard & Sag (1994: 157-158), the first subset clearly represents strong UDs because there is an overt constituent

    in a non-argument position (sentences 1-5 group A) (normally the wh-phrase) that is

    strongly associated with the gap indicated by "_". Strong UDs include the following

    structures:

    GROUP (A)

Topicalization:

(1) This sort of problemi, my motherj is difficult to talk to _j about _i.1

Wh-questions:

(2) Which violini are these sonatasj difficult for them to play _j on _i?

Wh-relative clauses:

(3) This is the booki that the manj we told the story to _j bought _i.

It-clefts:

1 Underscores and small subscripts (j, i, etc.) in this and the following sentences represent gaps or empty elements (traces of nominal or pronominal antecedents); this is a notational convention found in the majority of syntactic analyses of UDs and similar grammatical constructions.


(4) It is Kim whoi Sandy loves _i.

Pseudo-clefts:

(5) This is whati Kim loves _i.

    1.6.2. Weak UDs:

    Weak UDs, on the other hand, have no overt filler in a non-argument position

    (sentences 1-4 group B); instead they have a constituent in an argument position that is

    "loosely" co-referential with the gap or the trace. Weak UDs include the following

    structures:

    GROUP (B)

Tough movement:

(1) Sandyi is hard to love _i.

Purpose infinitives:

(2) I bought iti for Sandy to eat _i.

Non-wh relatives:

(3) This is the politiciani Sandy loves _i.

Non-wh clefts:

(4) It's Kimi Sandy loves _i.

Two important points have to be mentioned here. First, UDs are indeed unbounded, which means that the dependency may, theoretically speaking, extend ad infinitum. Second, there is a syntactic category-matching condition between the filler and the gap, especially in strong UDs. The following examples illustrate these two points:

    (1)

    a) Kimi, Sandy trusts _i.

    b) [On Kim]i, Sandy depends _i.

    (2)

a) Kimi, Chris knows Sandy trusts _i.

b) [On Kim]i, Chris knows Sandy depends _i.


    (3)

a) Kimi, Dana believes Chris knows Sandy trusts _i.

    b) [On Kim]i, Dana believes Chris knows Sandy depends _i.

    In (1) the gap is an argument of the main clause, in (2) it is an argument of an

    embedded complement clause, and in (3) it is an argument of a complement clause

    within a complement clause. Mathematically speaking, there is no bound on the depth

of embedding. The following diagram represents the above classification more clearly.

    Figure (1) The Class of Unbounded Dependency Constructions
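The unboundedness illustrated in (1)-(3) above, where each added complement clause lengthens the filler-gap distance without any grammatical limit, can be mimicked by a small generator. The function below is a purely illustrative sketch of mine (the function name and word lists are hypothetical, not part of any formalism discussed in this thesis):

```python
# Illustrative sketch: topicalized sentences with arbitrarily deep embedding,
# modeled on examples (1)-(3) above. The function and word lists are
# hypothetical illustrations only.
def topicalized(depth):
    """Build 'Kim, (Dana believes / Chris knows ...)* Sandy trusts _.'"""
    embedders = ["Dana believes", "Chris knows"]
    middle = " ".join(embedders[i % 2] for i in range(depth))
    clause = (middle + " " if middle else "") + "Sandy trusts _"
    return "Kim, " + clause + "."

print(topicalized(0))  # Kim, Sandy trusts _.
print(topicalized(2))  # Kim, Dana believes Chris knows Sandy trusts _.
```

Nothing in the grammar caps the `depth` argument; only performance factors (memory, attention) bound actual sentences, which is precisely the sense in which unboundedness is a potentiality rather than an actuality.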

    Evidently, the class of UDs has a rich taxonomical structure that justifies its

    complexity. As noted above, studying each of the branches in the above tree diagram

on its own will yield only insubstantial insights into UDs. As a first approximation, the thing that gathers all these different syntactic constructions under a uniform category is the existence of a "gap" somewhere in the construction. Paradoxical as it might seem,


the existence of gaps or missing elements in the sentence is the common denominator that holds all the above branches under one node: UDs. That is why I allocate a

    special section for the handling of gaps in UDs later in this work (see Chapter 4). Also,

    I found out that an eclectic theory of gaps might be a step towards a better and more

    profound comprehension of the phenomenon of UDs and the more general

    phenomenon of gapping.

For the sake of brevity and clarity, the present work will focus mainly on strong UDs throughout the proposed analyses and critiques. Weak UDs will be mentioned sporadically throughout the work, though they will not receive a proper treatment in their own right. The partial exclusion of weak UDs from the work will hardly affect the treatment of the overall phenomenon. Strong UDs have all the features that we need in order to analyze UDs. Weak UDs, on the other hand, are more of a subset of strong UDs: a fact that makes omitting the handling of weak UDs a reasonable act in the footsteps of Ockham's razor.

    1.7. Nomenclature

    UDs have been variously termed in the literature. Y. Falk (2006: 316) recognized the

    following designations: extraction, long-distance dependencies, wh dependencies

    (or wh-movement), A' dependencies (or A' movement), syntactic binding,

    operator movement, and constituent control. The concept owes its multifarious

    terminological manifestations to different realizations of its nature and functions. Each

    linguistic school or syntactic formalism saw UDs according to its defining

    characteristics and theoretical grounding. Transformational theories (such as


    Chomsky's GB, P&P and MP), for instance, have essentially a dynamic, movement-

    based conception of most linguistic constructions; a fact which clearly explains the use

of such terms as "wh-movement, A' movement, syntactic binding," etc. By contrast,

    non-transformational theories (such as GPSG, HPSG, CG) proceed from a static

    monostratal1 conception of linguistic constructions, hence their use of such terms as

    "unbounded dependency constructions and long-distance dependencies."

    Terminologically speaking, the term "extraction" is the only common ground where

    transformational and non-transformational theories meet (on the use of extraction in

    non-transformational contexts see Sag 1994).

1 This term refers to the idea that syntactic structures are essentially monostratal, i.e. they consist of only one level of representation, which is an apparent surface level. The Chomskyan postulate of a deep structure is irrevocably repudiated within this monostratal framework. Gazdar et al. (1985) was the first unequivocal statement of this theorem, on which the whole frameworks of GPSG, HPSG and DCG are based. For more details see Horrocks (1987), Gazdar et al. (1985), Sag et al. (1994), Sag et al. (2003), Brown ed. (2006).


    Chapter 2:

    UDs and Syntactic Formalisms

    2.1. Derivational Approaches to UDs

    Most of the reviews of the literature I came across in academic theses and books

    dealing with UDs, from a historiographical viewpoint, seem to be a disparate collation

    of information that hardly precipitates profound understanding or evaluation of the

    intellectual context that spawned and fostered the growth of syntactic theory. This is

    not the case here as far as I hope. My faith is that syntactic theory (and its handling of

    UDs) can hardly be understood or profoundly appreciated without a firm belief in the

    utility of coming to grips with the intellectual milieu that made such scholarly feats

    possible. Fortunately, the historiography of UDs in both the syntactic and the

    computational realms is as much variegated as could help build a mosaic that is

    informative, insightful and sufficiently panoramic. I believe, thus, along with Tomalin

    (2006) that

    [i]t could hardly be claimed that to consider the aims and goals of

contemporary generative grammar, without first attempting to comprehend something of the intellectual context out of which the theory developed, is

    to labour in a penumbra of ineffectual superficiality. (Tomalin 2006: 20)

    Another important factor that necessitates this line of research has to do with UDs

    themselves. The study of UDs has been a major formative force in the field, a fact that

    made it a prerequisite (and a keepsake) for anyone embarking on a serious study of


    syntactic theory. The inherent complexity of unbounded dependency constructions and

the challenges they posed for syntacticians of different persuasions, and the various

    analytical strategies and tools proposed to handle them endowed these constructions

    with a level of significance unprecedented in the field. That is why I adopted a

    historical-cum-theoretical approach in studying them, because, as far as I can see, this

    is the approach that is the most felicitous and the most enlightening as well.

    Historically, UDs have been studied according to two different approaches: the

    transformational and the non-transformational.1 Transformational approaches analyze

UDs from a movement-based perspective. The original position of the filler of a UD is marked with an underscore (as in [a] below); the filler then changes its location through a series of movements till it reaches the leftmost position in the tree.

    (a)

    1. Which car does John think you should purchase_?

    2. That book you should read_.

    3. This is the car which_ John told me he thinks I should purchase_.

    4. Whom do you think Jim kissed _?

    Sentence (4) (see Carnie 2006: 325) can be represented according to a transformational

    (derivational) framework like the following2:

1 Transformational approaches have also been known as derivational approaches, because they depend on processes that derive, via transformations, the final output of a sentence from certain hypothesized deep structures to their final realizations as surface structures (see Bussmann 1996; Trask 1993; Radford 2003).
2 The version used here is a recent version of the transformational enterprise known as P&P (Principles and Parameters), which is the version before the last emendation stated in Chomsky's The Minimalist Program (1995).


    Figure (2)

According to TG analyses, this is the original deep representation of the sentence, where the wh-word is situated at the bottom of the tree. This means that in order to move whom to its proper position a number of movements have to be made. These movements can also be illustrated in the following tree (see Carnie 2006: 326):


    Figure (4)

The two arcs in the above figure represent the two "hops" Carnie just referred to. Now we can have the correct S-structure, where the wh-phrase will be situated at its rightful initial position in the tree, as shown in figure (5) (Carnie 2006: 328):


    Figure (5)

Fodor (1978) pointed out that the effects of Wh-movement are not strictly local. The s-structure position of a wh-phrase can be arbitrarily far from its d-structure position. The sentence Which city did Ian visit? can serve as an example:


    Figure (6)

The analysis proceeds by creating the appropriate CP structure, attaching the phrase which city in the [SPEC, CP] position. It then accounts for did by attaching it to the C position. Next, it handles the verb visit by identifying it as a verb that requires an NP. The analysis then recognizes an antecedent for which there is no argument position; here comes the role of the wh-trace (t), which is attached to a post-verbal NP node (Gorrell 1995: 132-133). The fundamental line of argument evident in transformational analyses proceeds from a psychological springboard entrenched in hypothetical reasoning that hardly accounts for the computational handling we aspire to study.
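The step-by-step procedure just described (attach the wh-phrase in [SPEC, CP], attach did to C, then posit a trace when visit finds no overt object) can be caricatured in a few lines of code. This is only a toy sketch under my own simplifying assumptions (a one-verb lexicon and a flat token list, with no real phrase structure), not an implementation of Gorrell's model:

```python
# Toy sketch of filler-gap resolution for "Which city did Ian visit?",
# loosely after Gorrell's (1995) step description. The lexicon and the
# flat token list are simplifying assumptions, not Gorrell's model.
TRANSITIVE = {"visit"}  # verbs that subcategorize for an NP object

def resolve_gap(tokens):
    """Bind each posited post-verbal trace to the stored wh-filler."""
    filler = None
    bindings = []
    for i, tok in enumerate(tokens):
        if tok.startswith("which"):          # wh-phrase: hold it as a filler
            filler = tok
        elif tok in TRANSITIVE and filler is not None:
            if i + 1 >= len(tokens):         # no overt object follows:
                bindings.append((tok, filler))  # posit a trace bound to filler
    return bindings

print(resolve_gap(["which city", "did", "Ian", "visit"]))
# [('visit', 'which city')]
```

Even this caricature shows why the dependency is nonlocal: the filler must be held in memory for an arbitrary number of intervening tokens before the gap that discharges it is found.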

    2.2. UDs in Generalized Phrase Structure Grammar (GPSG)

The domineering nature of Noam Chomsky's transformational grammar generated a sense of dissatisfaction among leading younger linguists during the early 1980s. Gerald Gazdar was one of those leading linguists. Back at that time, linguists began to call what is now GPSG "Gazdar Grammar". Gerald Gazdar, however, did not like that, nor did his collaborators: Ewan Klein, Geoffrey Pullum and Ivan Sag. Their main focus was on the study of PSGs (Phrase Structure Grammars), but they did not have a specific name for what they were doing. After attending a talk by Emmon Bach called Generalized


    It has to be noted that GPSG was something of a revolution against Chomskyan TG

(Gazdar et al. 1985; Horrocks 1987; Borsley 1999; Falk 2006). Since the class of

    UDs was one of the constructions that TG adherents used as proof of the inadequacy of

    the class of Phrase Structure Grammars (PSGs) in describing natural language syntax,

    Gazdar and his collaborators decided to show that this assumption was basically

    mistaken (Falk 2006).1 Thus the earliest work in GPSG dealt with UDs in greater

    detail.

Gazdar's paper opened up new avenues of research in theoretical linguistics and formal computer science, producing four years later the seminal and foundational work by Gazdar, Klein, Pullum and Sag (1985).2 In Gazdar et al. (1985) we encounter the first formally perspicuous exposition of the nature of UDs. According to Gazdar et al. (1985: 137), an unbounded dependency construction is one in which

(i) a syntactic relation of some kind holds between the substructures in the construction, and
(ii) the structural distance between these two substructures is not restricted to some finite domain (e.g. by a requirement that both be substructures of the same simple clause).
(iii) topicalization, relative clauses, constituent questions, free relatives, clefts, and various other constructions in English have been taken to involve a dependency of this kind.

1 GPSG was a frontal attack on transformational grammar. It not only attacked the lynchpins of the concept of transformations, but also showed how unfounded other sacrosanct concepts such as Deep Structure vs. Surface Structure are. Another attack was against the permeating psychologism of TG and its claim to universality. Against this backdrop, GPSG was founded on a monostratal model (a model that accepts no dualisms or hypothesized deep vs. surface dichotomies) with an intricate use of set-theoretic concepts, precisely to cleanse the syntactic model of any possible trace of psychologism. In spite of the rigorous nature of GKPS, the first chapter has the air of a revolutionary manifesto, and it is by far the authors' clearest statement on what GPSG is (see Gazdar et al. 1985: 1-16).
2 Sometimes abbreviated as GKPS, based on the authors' initials.

    According to Gazdar et al. (1985: 137), it is analytically useful to think of such

    constructions, conceptualized in terms of tree geometry (in the usual way, root up and

    leaves down), as having three parts: the top, the middle and the bottom. The top is the

    substructure which introduces the dependency, the middle is the domain of the structure

    that the dependency spans, and the bottom is the substructure in which the dependency

ends, or is eliminated. Gazdar et al. (1985: 138) illustrate their proposed tree geometry

    as follows:


    Figure (7) Tree geometry of the structure of a UD in GPSG

Gazdar et al.'s (1985: 138) theory of UDs claims that the principles which govern the bottom and the middle are completely general in character, in that all types of UDs

    receive the same treatment. The idea is that the proposed analysis of UDs will be

    focused on the middle of the construction which involves no more than the feature

    SLASH along with feature instantiation principles. Of these principles the Foot Feature

    Principle (FFP) is the most important.


The central claim of the GPSG analysis of unbounded dependencies is that these

    dependencies are simply a global consequence of a linked series of mother-daughter

    feature correspondences.

The main formative components of GPSG are a set of metarules that generate other rules, such as Immediate Dominance (ID) and Linear Precedence (LP) rules, along with feature instantiation principles, such as the FFP, the Head Feature Principle (HFP) and SLASH. The feature SLASH, however, is our mainstay in the analysis of UDs, because it represents and accounts for the behavior of the most significant element in an unbounded dependency construction: gaps. But what is a SLASH?

    When we write down in quasi-algebraic notation that we have, for instance, a set

    A/B, this means that the set A lacks or is missing the element B. The SLASH or [/] is

    originally an algebraic symbol for a missing element. The value of the SLASH feature

    will be a category corresponding to a gap dominated by the categories bearing a

    SLASH specification. A gap is created by some Immediate Dominance (ID) rule which

    introduces a constituent that has a SLASH feature; the feature-matching principles of

    GPSG push it down the head path of the category on which it first appears, and a

multiplicity of metarules allow it eventually to be cashed out as a gap at the bottom of a

    nonlocal tree structure (see Levine 1989: 124-5). The best way to come to grips with

    the effects of the FFP apropos slash categories is to inspect an example of its

    application. Consider the following ID rules:


    According to the above rules and according to feature instantiation principles, we can

    predict that the resulting structures will be the following:

Though the above notation seems a little difficult to follow, it is actually very straightforward. Rule (e.) above, for instance, refers to a verb phrase (VP) missing (/) a noun phrase, in this case an object (NP). This conforms with ID rule number (45),1 which deals with transitive verbs, such as approve of, that take a prepositional object as part of their subcategorization; the prepositional phrase itself lacks this object (PP/NP).

1 Numbers in square brackets refer to a list of rules provided as an appendix in Gazdar et al. (1985: 245-9).

    Now, we need to see an example illustrating all the formal nuances mentioned

    above. A topicalized sentence like (a) will suffice.

(a) Sandy we want to succeed.

The normal ordering of this sentence would read We want Sandy to succeed. However, a topicalized structure such as (a) within the framework of GPSG can be represented according to the following tree:

    Figure (8)

The basic idea in the GPSG analysis of UDs is that the constituent containing the gap has a missing-element feature (Falk 2006). This is represented by the [+NULL] e above. The constituent headed by want is a VP/NP (a verb phrase with a missing object). The e (empty) element is a pronominal that refers back to Sandy. The whole clausal constituent

    containing this VP/NP is S/NP, since it is missing the same NP as the VP it dominates.
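The sharing of the SLASH value just described, where the VP/NP and the S/NP bear the same missing-NP specification until the topicalized filler discharges it, can be sketched as a toy check. The 'X/Y' string notation and the function names below are my own illustration, not the FFP machinery itself:

```python
# Toy sketch of GPSG-style SLASH sharing: a mother category inherits the
# SLASH of a slashed daughter, and a filler discharges a matching SLASH.
# The 'X/Y' strings and helper names are illustrative assumptions only.
def percolate(daughters):
    """Return the SLASH value ('Y' in 'X/Y') shared up from the daughters."""
    for d in daughters:
        if "/" in d:
            return d.split("/", 1)[1]
    return None

def discharge(filler_cat, clause_cat):
    """True iff the filler's category matches the clause's SLASH value."""
    return "/" in clause_cat and clause_cat.split("/", 1)[1] == filler_cat

# "Sandy we want to succeed": the VP/NP passes SLASH:NP up to S/NP,
# and the topicalized NP filler Sandy discharges it.
print(percolate(["NP", "VP/NP"]))   # NP
print(discharge("NP", "S/NP"))      # True
```

The point of the sketch is that the dependency is carried entirely by local mother-daughter feature matching; no element ever moves.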

    As a result of the above feature sharing, the same element occupies the filler and gap

    positions at the same time, without any indication or sign of movement. This

    movement-less approach to UDs along with a solid formal apparatus (ID & LP rules,

    metarules, FCRs, FSDs and FIPs) catapulted GPSG as a suitable alternative to the

    much disputed TG framework. However, GPSG was short-lived: its sophisticated

    formalism and nuanced quasi-algebraic treatment of complex phenomena such as UDs

    made it forbidding to the majority of linguists during the 80s. But this was not the end

    of GPSG, though. For, it continued its existence, as we shall see in the next chapter, in

    a different guise, this time as the much more successful framework of HPSG (Head-

    driven Phrase Structure Grammar).

    2.3. UDs in Head-driven Phrase Structure Grammar (HPSG)

According to Sag et al. (1999: 435), HPSG was formulated in an intellectually eclectic environment at Stanford's Center for the Study of Language and Information (CSLI). During the 1980s, CSLI was incubating a number of theories, approaches and frameworks that aimed at formulating a kaleidoscopic view of language and its

    mechanisms. Sag and Pollard established their theory of HPSG on a variety of theories

    and formalisms: situation semantics, data type theory, TG, GPSG, CG and Unification

    Grammars. This eclectic formation endowed HPSG with an undeniable flexibility on

    the theoretical and formal levels.


There are three known hallmarks in the history of HPSG: the publications of Pollard and Sag (1987), Pollard and Sag (1994) and Sag and Wasow (1999). These are

    hallmarks in the sense that they marked some definitive changes in the views of the

    authors or the formal apparatus of HPSG in general.

Unlike GPSG, HPSG shifted its attention from rules to features. This is clearly manifested in its adoption of Unification Grammar's typed (or sorted) feature structures. A typed feature structure consists of features representing linguistic entities (words, phrases, sentences) and values that identify the dimensions of those features. For example, the feature PERSON in a given feature structure has three values: 1st, 2nd and 3rd. Accordingly, the word you has the property "second person", and this is represented by the feature-value pair [PERSON 2nd]. Pollard and Sag (1994: 8) suggest that the role of their proposed linguistic theory is to give a precise specification of which feature structures are to be considered admissible. Also according to their view, the types of linguistic entities that correspond to the admissible feature structures constitute the predictions of the theory.
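The notion of admissible feature structures can be made concrete with a minimal sketch. The feature inventory below is an illustrative fragment of my own devising, not HPSG's actual type signature:

```python
# Minimal sketch of feature-value pairs and an admissibility check, in the
# spirit of Pollard and Sag's description. APPROPRIATE is an illustrative
# fragment, not HPSG's real feature signature.
APPROPRIATE = {
    "PERSON": {"1st", "2nd", "3rd"},
    "NUMBER": {"sing", "plur"},
}

def admissible(fs):
    """A feature structure is admissible iff every feature bears a licensed value."""
    return all(f in APPROPRIATE and v in APPROPRIATE[f] for f, v in fs.items())

you = {"PERSON": "2nd"}              # the word "you": [PERSON 2nd]
print(admissible(you))                # True
print(admissible({"PERSON": "4th"}))  # False: no such value is licensed
```

On this view, the grammar's predictions are exactly the structures the check lets through: anything else is simply not a linguistic object of the theory.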

UDs have received considerable treatment within HPSG. This could be ascribed to two reasons: the first has to do with the theoretical status of UDs as a sophisticated syntactic phenomenon that many see as a testing ground for any proposed syntactic theory or formalism (see Winograd 1983; Falk 2006). The second has to do with the importance of UDs within HPSG's contributory progenitor, GPSG.¹ However, HPSG took the analysis of UDs some steps further. In HPSG, UDs receive more than the single wh-feature they used to receive in GPSG.

¹ It has to be noted that Ivan Sag, one of the original expositors of GPSG, later became the central figure in HPSG work through his 1987 and 1994 publications with Carl Pollard.


In HPSG they get two distinct features: QUE and REL, for questions and relative constructions respectively (Pollard and Sag 1994: 159). This separation can be accounted for on the ground that the only information that needs to be kept track of in an interrogative dependency is the nominal-object corresponding to the wh-phrase, while in a relative dependency the referential index of the relative pronoun is all that is required (see Pollard and Sag 1994).

Another difference relates to the realization of feature structures in GPSG and HPSG. In GPSG, foot features take the same kind of value, which is normally a syntactic category, while in HPSG, nonlocal features take sets as values.¹ According to Pollard and Sag (1994: 159), this strategy enables HPSG to deal with more sophisticated UDs, such as multiple UDs, as in the following sentences:

1- [A violin this well crafted]1, even [the most difficult sonata]2 will be easy to play _2 on _1.

2- This is a problem which1 John2 is difficult to talk to _2 about _1.

It is noteworthy that in HPSG, strong UDs are analyzed in terms of

a filler-gap conception. This peculiar conception underscores the centrality of the concept of the gap in any treatment of UDs. This is why I think that HPSG is ahead of most other syntactic theories in the analysis of UDs: because of this very gap-based analysis. This competitive edge will be accounted for more clearly later in this work (see ch. ?).

¹ Again, the mathematical, especially algebraic, influence on syntactic theory is very much manifested in this instance, where the use of sets is borrowed from Cantorian set theory.
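The set-valued bookkeeping described above can be sketched in miniature. The category labels and the discharge helper below are invented for illustration and greatly simplify HPSG's actual nonlocal feature machinery:

```python
# Illustrative sketch only: nonlocal dependencies tracked as a set, so a
# single node can carry two SLASH members at once (a multiple UD, as in
# the "violin"/"sonata" example). Labels are simplified stand-ins.

vp = {"CAT": "vp", "SLASH": {"np_sonata", "np_violin"}}  # ...play _2 on _1

def discharge(node, filler):
    """Bind off one member of the SLASH set when its filler is found."""
    remaining = set(node["SLASH"]) - {filler}
    return {**node, "SLASH": remaining}

s1 = discharge(vp, "np_sonata")   # topic 2 bound off
s2 = discharge(s1, "np_violin")   # topic 1 bound off
print(s2["SLASH"])                # set() -- all dependencies resolved
```

A single-valued feature, as in GPSG, could not keep the two pending dependencies apart; a set (or list) can.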


Take the following sentence (Pollard and Sag 1994: 160) as an example of how HPSG analyzes a topicalized clause of the strong UD type:

1- Kim1, we know Sandy claims Dana hates _1.

    Figure (8)

The analysis provided above looks similar, to a great extent, to Gazdar's bottom-middle-top model (see figure 6). In HPSG, the bottom of the arboreal skeleton is where the dependency is introduced, because at the bottom there exists the terminal node that triggers the whole unbounded dependency. This terminal node is associated with a special sign that must be nonempty. In interrogative dependencies, this sign is an interrogative pronoun (what, which, where, etc.) with a nonempty value for the QUE


nonlocal feature, while in relative dependencies, the sign is a relative word (e.g. who, which), also with a nonempty value for the REL nonlocal feature.

What really distinguishes HPSG from previous theories or formalisms is its reliance on associativity: it attempts to associate linguistic objects with each other via a number of concepts and techniques. Central to these is the concept of the inheritance hierarchy, the embodiment of which can be seen in the above tree diagram (figure 8). Instead of the crude movement transformations of all versions of Transformational Grammar, we get here a more computationally sound technique, whereby the traits of a certain linguistic object are inherited from one object to another. The SLASH category in the above tree, for example, is inherited from one stratum of analysis to the next by means of boxed numbers and the feature INHER. So the SLASH feature at the bottom of the dependency passes from daughter to mother up the tree, and the top is where the dependency is discharged or bound off (Pollard and Sag 1994: 160-161). As with GPSG, HPSG is strongly inclined towards computational implementation, because it originally availed itself of many computational models and procedures; and it has to be noted here that the concept of inheritance is a genuine computational procedure that HPSG incorporated into its theoretical architectonic.¹
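The daughter-to-mother passing of SLASH can be mimicked in a toy fashion. The tree shape, node labels, and the inherit_slash helper below are my own simplification, not Pollard and Sag's formal apparatus:

```python
# A toy sketch of SLASH inheritance: each mother's SLASH value is
# collected from its daughters and passed up the tree until a filler
# binds it off at the top. Labels are invented for illustration.

def inherit_slash(label, daughters, bind_off=frozenset()):
    """Mother's SLASH = union of daughters' SLASH, minus what is bound off."""
    slash = frozenset().union(*(d["SLASH"] for d in daughters)) - bind_off
    return {"LABEL": label, "SLASH": slash, "DTRS": daughters}

# "Kim, we know Sandy claims Dana hates __"
gap   = {"LABEL": "np-gap", "SLASH": frozenset({"Kim"})}
hates = {"LABEL": "hates",  "SLASH": frozenset()}
dana  = {"LABEL": "Dana",   "SLASH": frozenset()}

vp    = inherit_slash("vp", [hates, gap])       # SLASH passes up...
s_low = inherit_slash("s",  [dana, vp])         # ...stratum by stratum...
top   = inherit_slash("s",
                      [{"LABEL": "Kim (filler)", "SLASH": frozenset()}, s_low],
                      bind_off=frozenset({"Kim"}))  # ...and is discharged here

print(top["SLASH"])  # frozenset() -- the dependency is bound off at the top
```

The point of the sketch is only that no movement is involved: the same feature value is threaded through successive mothers until the filler discharges it.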

    HPSG uses a number of features to construct what it considers to be a complete

    description of a given linguistic entity. For the description of the syntax-semantics

    interface, for example, it employs a feature SYNSEM that represents the syntactic as

    well as the semantic content of a particular lexical item. This is realized via what HPSG

¹ The idea of inheritance is directly borrowed from computer science, especially from work on genetic algorithms, which resorts to biological jargon and concepts such as inheritance, evolution and survival of the fittest (see Dopico et al. 2009).


    Head-driven phrase structure grammar is a monostratal theory of natural

    language grammar, based on richly specified lexical descriptions which

    combine according to a small set of abstract combinatory principles stated as

    formulae in a constraint logic regulating, for the most part, the satisfaction of

    valence and other properties of syntactic heads. These constraints, applying

    locally, determine the flow of information, encoded as feature specifications,

    through arbitrarily complex syntactic representations, and capture all

    syntactic dependenciesboth local and non-local in elegant and compact

    form requiring no derivational apparatus.

This theoretically rich definition deserves an equally rich analysis. The first fact about HPSG in this definition is that it is monostratal, meaning that it does not subscribe to derivational or transformational theories of natural language grammar (see fn. 1 on p. 28 above). This, of course, reminds us of the early beginnings of GPSG (Gazdar 1982). The second feature that really characterizes the theory of HPSG is its lexicalism: as Levine (2003) puts it, HPSG is based on 'richly specified lexical descriptions'. This highlights HPSG's attention to the value of lexical items as bearers of information and as the glue that binds linguistic descriptions together. In fact, HPSG is head-driven because it relies on lexical heads, such as sees above, in its descriptions of linguistic entities. Finally, the definition gives us a hint concerning HPSG's recourse to mathematical and logico-mathematical jargon in its descriptions of local and non-local (UD) syntactic dependencies in an 'elegant and compact form'.¹ Implied here is

¹ Note here also the use of 'elegant and compact', which is a commonplace description in mathematical and logico-mathematical literature. A mathematical proof, for instance, has to be elegant and compact in the sense that it admits of no logical fallacies, internal inconsistencies or needlessly tortuous sub-proofs.


the idea that a derivational apparatus, as in the GB, P&P and MP formalisms, is essentially inelegant and incompact.

HPSG, then, looks at UDs as filler-gap constructions (Pollard & Sag 1994), or as constructions with gaps (GAPs) that can be resolved by detecting the sites or positions of those gaps and relating them to their fillers via inheritance. This is realized by stipulating what HPSG calls the GAP Principle (Sag & Wasow 1999; Carnie 2003). The GAP Principle states the following:

A well-formed phrase structure licensed by a headed rule other than the Head-Filler Rule must satisfy the following SD¹:

    Figure (11)

This means that the mother's GAP feature subsumes all the GAP values of its daughters. The symbol ⊕ in the diagram above simply refers to the arithmetical notion of adding up, but this time the entities added are not single linguistic objects but lists of linguistic objects (Sag & Wasow 1999: 351). The boxed n above is likewise the arithmetical indication of the idea of 'any number of'. Gaps in HPSG will be explored more thoroughly, and comparatively, later on, alongside other syntactic frameworks.
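The list-appending behavior of ⊕ can be illustrated with a small sketch; gap_principle is an invented name and the categories are simplified stand-ins for Sag & Wasow's feature structures:

```python
# A sketch of the GAP Principle's bookkeeping: the mother's GAP list is
# the append (the + operation below, standing in for ⊕) of its daughters'
# GAP lists. Unlike the set-based SLASH sketches, ⊕ here joins *lists*.

def gap_principle(daughter_gaps):
    """Mother GAP = daughter-1 GAP ⊕ ... ⊕ daughter-n GAP."""
    mother = []
    for g in daughter_gaps:
        mother = mother + g   # list append: the ⊕ operation
    return mother

# A gapless head combining with a complement containing one gap:
print(gap_principle([[], ["NP"]]))        # ['NP']
# Two gapped daughters, i.e. a multiple dependency:
print(gap_principle([["NP"], ["PP"]]))    # ['NP', 'PP']
```

Because lists, unlike sets, preserve order and multiplicity, ⊕ can keep two identical pending gaps distinct, which is exactly what the boxed n in the SD quantifies over.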

¹ SDs stand for Structural Descriptions: the amalgamation of constraints from lexical entries, grammar rules, and relevant grammatical principles. See Sag & Wasow (1999: 68).


    2.4. Categorial Grammar(s)

From a historical vantage point, Categorial Grammars (or CGs) antedate all generative theories of syntax. CG was first formulated against a strictly logical backdrop: it was Kasimir Ajdukiewicz, the famous Polish logician and algebraist of the Lvov-Warsaw school of logic and mathematics, who introduced the idea of functional syntax in his Die syntaktische Konnexität (1935). But Ajdukiewicz's treatment was strictly logico-mathematical, a fact which made his work quite forbidding for linguists.¹ Two decades later, Yehoshua Bar-Hillel (1953), also a logician, came along with a revived interest in Ajdukiewicz's CG, this time combining it with many insights and methods from American linguistics of the 1950s. This new combination of the ideas and methods of mathematical logic and structural linguistics spawned a novel interest in CG in the USA and on the Continent. The interesting thing about Bar-Hillel's revival of CG is his belief in the suitability of CG for machine translation purposes. That explains why computational linguists tend to prefer CG, and other like-minded formalisms, over syntactic theories bereft of such computational aptitude.

As an offshoot of advanced logical and formal studies, CG naturally emphasizes the semantics of natural languages. Unlike other formalisms and theories of syntax, CG has no separate module for semantic processing, for it sees semantics as an inextricable component of syntactic description. In other words, syntax and semantics in CG are one and the same thing: every rule of syntax is,

¹ Besides being excruciating reading even for those initiated in mathematical logic, Ajdukiewicz's paper 'appeared in a Polish philosophical journal and has therefore been unknown to most linguists' (Bar-Hillel 1953: 1).


inherently, a rule of semantics (Wood 1993: 3). CG has the following properties (Wood 1993: 3-5):

(1) It sees language in terms of functions and arguments rather than of constituent structure.

(2) Syntax and semantics are integral.

(3) It is monotonic (monostratal), i.e. it avoids destructive devices such as movement or deletion rules, which characterize transformational grammars.

(4) It takes to its logical extreme the move towards lexicalism, i.e. the syntactic behavior of any linguistic item is directly encoded in its lexical category specifications.

The other peculiar aspect of CG and UDs is the somewhat troubled relationship between the two. Ironically, Bar-Hillel lost faith in CG because he found that it was unable to process discontinuous constructions (such as UDs) (Wood 1993: 23, 104). But the theory of CG during the 1960s was not developed enough to handle such sophisticated syntactic constructions as UDs. Even at that early stage, the intractability of UDs was recognized as a processing fact that any syntactic theory or formalism has to account for efficiently and rigorously.

Classical CG did not offer any straightforward method of dealing with UDs (Wood 1993: 104). However, Ades and Steedman (1982) used the recursive power of generalized composition to reach what they called a 'derivational constituent', which can then be applied backwards to the fronted object, giving the correct semantic interpretation (Wood 1993: 105). A sentence like Who(m) do you think he loves? can be represented, following Ades and Steedman (1982), in the following way:

  • 8/13/2019 A Proposed Approach to handling unbounded dependencies in automatic parsers

    62/149

    62

    Figure (12)
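The composition rule that Ades and Steedman exploit can be illustrated in miniature. The string encoding of categories and the forward_compose helper below are my own toy rendering of the rule X/Y + Y/Z => X/Z, not their parser:

```python
# Toy sketch of (forward) function composition in CG: a category X/Y
# composed with Y/Z yields X/Z, provided the middle categories match.
# Categories are modeled as plain strings for illustration only.

def forward_compose(left, right):
    """(X/Y) composed with (Y/Z) => X/Z when Y matches; else None."""
    x, y1 = left.rsplit("/", 1)
    y2, z = right.split("/", 1)
    if y1 == y2:
        return f"{x}/{z}"
    return None

# Composing step by step, a chain like "do you think he loves" can
# collapse into a single s/np constituent, which the fronted "who(m)"
# then consumes:
print(forward_compose("s/s", "s/np"))    # 's/np'
print(forward_compose("s/np", "s/np"))   # None (middle categories differ)
```

Repeated composition of this kind is what lets the incomplete clause behave as one 'derivational constituent' missing an np, however deep the gap sits.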

Recent advances in CG have produced the more elaborate type-logical categorial grammar. What interests me most in this more advanced formalism is its proposal of a novel procedure for handling gaps in UDs. Bob Carpenter (1997) adopts Moortgat's approach to UDs to account for the existence of gaps and how they should be treated within a CG-based framework. As Carpenter (1997: 203) mentions, Moortgat's analysis rests on proposing an additional binary category constructor, ↑, that can be used to construct categories of the form A↑B. Such a category denotes an A missing a B somewhere within it. For instance, s↑np is a sentence from which a noun phrase has been extracted. The extraction constructor A↑B is a generic form for both A/B and A\B, and may be instantiated as follows:

s↑np = s/np or s\np

which indicate a sentence lacking a noun phrase on its right or left frontier. The use of the SLASH feature in CG is similar to that in GPSG and HPSG; the difference lies in the adoption of feature structures and AVMs in HPSG and the adoption of the Lambek


calculus (a semi-algebraic linear formalism) in CG. An example of how advanced CG handles a UD may be of use here. The phrase Who Jo hits is formally represented in CG according to the following schema (see Carpenter 1997: 206).

Figure (13) A representation of who Jo hits

The postulation of s↑np at the beginning of the relative or interrogative clause (under who) is the notational tool that unravels the unboundedness of the structure, by postulating that there is a missing noun phrase somewhere in the construction.
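The two instantiations of the extraction constructor can be sketched mechanically. In the toy code below, ^ stands in for ↑, and the instantiate helper is invented for illustration:

```python
# Toy rendering of Moortgat's extraction constructor: a category A↑B
# ("an A missing a B somewhere") may be instantiated as A/B (gap on the
# right edge) or A\B (gap on the left edge). '^' encodes '↑' here.

def instantiate(extraction):
    """Expand 'A^B' into its two edge instantiations, A/B and A\\B."""
    a, b = extraction.split("^")
    return {f"{a}/{b}", f"{a}\\{b}"}

print(sorted(instantiate("s^np")))   # ['s/np', 's\\np']
```

A gap in a non-peripheral position is precisely what the plain slashes cannot express and what the extra constructor is introduced to capture.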

    2.5. Lexical Functional Grammar (LFG)

This is the fourth syntactic theory through which I try to explain and unravel the nature of UDs. LFG is one of the most prominent theories of grammar belonging to the generative tradition. It is also one of the theories that subscribe to a non-transformational agenda. Being non-transformational has boosted the theory's potential for a rigorous treatment of UDs, owing to the fact that most non-transformational theories and formalisms are couched in mathematical or semi-mathematical terms. This is the case with LFG.

But in what sense is LFG different from the other theories mentioned above? It differs from GPSG and CG in that LFG is, in fact, a complete theory of language, with a separate explanatory module for the study of language acquisition, universals and cognitive aspects. This is not the case with GPSG or CG, because both of them, and especially GPSG, posed ruthless critiques of the psychologism prevalent in GB and P&P, and both are more devoted to such applications as computational linguistics and AI. LFG is similar to HPSG in that the latter also sustains certain claims to universality and psychological reality. But all of these theories share a staunch rejection of transformational rules and assumptions. They also share an avid interest in lexicalism: all four (GPSG, HPSG, CG, LFG) see the lexicon as the springboard for any viable and true grammatical analysis.

As opposed to GB and P&P, the non-transformational approaches mentioned above see lexical categories as the keys with which we can unravel syntactic riddles, especially the riddle of UDs. That also accounts for the high importance of analyses of UDs within the frameworks of all those theories. GPSG proposed the Head Feature Principle, which restores to lexical items their due powers instead of ascribing all powers to extra-linguistic features and movements, as is the case with transformational grammars (Falk 2001). HPSG, which is a more stringent framework than GPSG (Carnie 2008), bases the entire linguistic analysis on the head sign, which is an instantiation of a certain lexical item or word. CG is even more extreme on the issue of lexicalism; that is why it derives its analytical momentum from certain atomic lexical categories.

LFG is also lexical, or lexicalist, because the lexicon plays a major role in it. In LFG (Dalrymple 2006), the lexicon is richly structured, with lexical relations, rather than transformations or operations on phrase structure trees, as the means of capturing linguistic generalizations. Yehuda Falk (2003) adds to the major tenets of LFG what he calls the Lexical Integrity Principle, which states the following:

Words are the atoms out of which syntactic structure is built. Syntactic rules cannot create words or refer to the internal structures of words, and each terminal node (or leaf of the tree) is a word. (Falk 2003: 4)

The other aspect of LFG has to do with its emphasis on functionalism. The functional part of LFG means that grammatical functions (or grammatical relations) such as subject and object are primitives of the theory, not defined in terms of structural configurations or semantic roles¹ (Dalrymple 2006). LFG grants grammatical functions such as subject and object a universal character: such abstract grammatical functions are at play in the structure of all languages, no matter how dissimilar those languages might appear. The theory assumes that just as languages obey certain universal principles as regards abstract syntactic structures, they do the same regarding the principles of functional organization (Dalrymple 2001). So much, then, for LFG's nomenclature, i.e. the 'lexical' and 'functional' epithets.

¹ This is the standard view of transformational approaches. According to this view, subject and object are not part of the syntax vocabulary, i.e. they are extra-configurational. Those grammatical functions or relations derive from the phrase structure in which they happen to occur. If subjects, for example, can be controlled, this control is attributed to the structural lineaments of the position where the subject occurs. For a more in-depth discussion, see Falk (2003).


C-structure and F-structure:

The two divisions of the formal architecture of LFG are constituent structure (c-structure) and functional structure (f-structure). C-structure is concerned with the description of syntactic structure, while f-structure details the semantic-cum-functional structure of the linguistic entities concerned. The formal machinery of c-structure depends on X-bar syntax, with the addition of a number of techniques and concepts that characterize the LFG theory and its formalism. C-structure can be illustrated by the following figure (Falk 2003), analyzing the clause:

What Rachel thinks Ross put on the shelf

    Figure (14)

According to this description, the empty category (e) is tied to, or bound with, the antecedent filler by what LFG calls metavariables, represented by the up and down arrows. The double-arrow notation has been abandoned in the more recent versions of LFG,


which incorporate functional components into the tree; this can be illustrated in the following:

Figure (15) The c-structure of What Rachel thinks Ross put on the table?

The corresponding f-structure looks like the following:

The previous descriptions are classic representations of UDs, due to Kaplan and Bresnan (1982) and Kaplan and Zaenen (1989) respectively.
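An f-structure of this classic kind can be approximated as a nested attribute-value matrix. The Python dict below is my own rough rendering, in the spirit of figure (16), with the structure sharing between TOPIC and the embedded OBJ modeled by aliasing a single object; the attribute names follow common LFG practice but the details are illustrative:

```python
# Rough sketch of an f-structure as a nested attribute-value matrix.
# The unbounded dependency appears as reentrancy: TOPIC and the deeply
# embedded OBJ are one and the same structure (the same dict object).

what = {"PRED": "what"}

f_structure = {
    "TOPIC": what,                        # the displaced constituent...
    "PRED": "think<SUBJ, COMP>",
    "SUBJ": {"PRED": "Rachel"},
    "COMP": {
        "PRED": "put<SUBJ, OBJ, OBL>",
        "SUBJ": {"PRED": "Ross"},
        "OBJ": what,                      # ...reentrant with the gap's function
        "OBL": {"PRED": "on the shelf"},
    },
}

# Reentrancy check: TOPIC and the embedded OBJ are literally identical.
print(f_structure["TOPIC"] is f_structure["COMP"]["OBJ"])   # True
```

No empty c-structure node is needed on this view: the dependency is stated entirely at f-structure, as identity between two grammatical functions.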

More recent advances in LFG tend to be more detailed and hence more sophisticated; the following analysis from Asudeh (2009) is a case in point. The clause What did the strange, green entity seem to try to quickly hide? gets the following constituent and functional descriptions respectively:

Figure (17) C-structure of What did the strange, green entity seem to try to quickly hide? (Asudeh 2009)


Figure (18) F-structure of What did the strange, green entity seem to try to quickly hide? (Asudeh 2009)

The interesting thing about this clause, however, is that its analysis not only shows how LFG handles the phenomenon of UDs but also covers a host of other syntactic phenomena, such as adjunction, raising and control.

To sum up, early LFG (Kaplan and Bresnan 1982) analyzed UDs in terms of a c-structure that explicitly drew the relation between a displaced constituent and its corresponding gap via the double-arrow notation. However, Kaplan and Zaenen (1989) showed that this treatment was deficient in accounting for functional constraints on UDs (Dalrymple 2001). This led them to incorporate f-structure components into their analysis of UDs, thus abandoning the double-arrow notation, as seen in figure (15) above.

    2.6. Towards an Ontology of Gaps

The previous accounts pose a serious question as to the various treatments of UDs. But despite the many moot points among the theories and formalisms scantily described in the previous sections, the one thing all those theories tend to agree upon is that the key to unlocking the sophistication of unbounded constructions lies in providing a rigorous account of gaps (a.k.a. empty categories, null elements, missing elements, SLASH categories, traces). A correct and rigorous account of gaps will be the liaison between the purely theoretical treatment of UDs and computational implementation. This is due to the fact that dealing with gaps represents a crystallized problem, and all computational theorizing or implementation is based on problem-solving. Thus we first need to identify what might be called an ontology of