Post on 20-Dec-2015
RMRS
some background and current work

Talk overview

- RMRS: integrating processors via semantics
- Underspecified semantics from shallow processing
- Integration experiments with broad-coverage systems/grammars (LinGO ERG and RASP)
- Planned work

Integrating processing

- No single system can do everything: deep and shallow processing have inherent strengths and weaknesses
- Domain-dependent and domain-independent processing must be linked
- Parsers and generators
- Common representation for processing ‘above sentence level’ (e.g., anaphora)

Compositional semantics as a common representation

- Need a common representation language for systems: pairwise compatibility between systems is too limiting
- Syntax is theory-specific and unnecessarily language-specific
- Eventual goal should be semantics
- Core idea: shallow processing gives underspecified semantic representation, so deep and shallow systems can be integrated
- Full interlingua / common lexical semantics is too difficult (certainly currently), but can link predicates to ontologies, etc.

Shallow processing and underspecified semantics

- Integrated parsing: shallow parsed phrases incorporated into deep parsed structures
- Deep parsing invoked incrementally in response to information needs
- Reuse of knowledge sources: domain knowledge, recognition of named entities, transfer rules in MT
- Integrated generation
- Formal properties clearer, representations more generally usable
- Deep semantics taken as normative

RMRS approach: current and planned applications

- Question answering:
  - Cambridge CSTIT: deep parse questions, shallow parse answers
  - QA from structured knowledge: Frank et al.
- Information extraction:
  - Deep Thought
  - Chemistry texts (SciBorg (?))
- Dictionary definition parsing for Japanese and English
  - Bond and Flickinger
- Rhetorical structure, multi-document summarization, email response ...
- also LOGON: semantic transfer. MRSs from LFG used in HPSG generator.

RMRS: Extreme underspecification

- Goal is to split up semantic representation into minimal components (cf. Verbmobil VITs)
  - Scope underspecification (MRS)
  - Splitting up predicate argument structure
  - Explicit equalities
  - Hierarchies for predicates and sorts
- Compatibility with deep grammars:
  - Sorts and (some) closed class word information in SEM-I (API for grammar, more later)
  - No lexicon for shallow processing (apart from POS tags and possibly closed class words)

RMRS principles

- Split up information content as much as possible
- Accumulate information monotonically by simple operations
- Don’t represent what you don’t know but preserve everything you do know
- Use a flat representation to allow pieces to be accessed individually

Separating arguments

lb1:every(x,h9,h6), lb2:cat(x), lb5:dog1(y), lb4:some(y,h8,h7), lb3:chase(e,x,y), h9=lb2, h8=lb5

goes to:

lb1:every(x), RSTR(lb1,h9), BODY(lb1,h6), lb2:cat(x), lb5:dog1(y), lb4:some(y), RSTR(lb4,h8), BODY(lb4,h7), lb3:chase(e), ARG1(lb3,x), ARG2(lb3,y), h9=lb2, h8=lb5
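
The argument-splitting step above can be sketched in a few lines of Python. This is only an illustration, not the actual MRS -> RMRS converter: the `EP` tuple, the `split_arguments` function and the rule that quantifiers take RSTR/BODY while other predicates take ARG1, ARG2, ... are assumptions made for this example.

```python
from typing import NamedTuple


class EP(NamedTuple):
    """One MRS elementary predication (hypothetical container for this sketch)."""
    label: str
    pred: str
    args: tuple  # args[0] is kept inside the predication; the rest are split off


def split_arguments(eps, quantifiers=frozenset({"every", "some"})):
    """Return (unary predications, separate argument relations)."""
    preds, rargs = [], []
    for ep in eps:
        # Keep only the first argument inside the predicate itself.
        preds.append(f"{ep.label}:{ep.pred}({ep.args[0]})")
        rest = ep.args[1:]
        if ep.pred in quantifiers:
            names = ("RSTR", "BODY")  # assumed role names for quantifiers
        else:
            names = tuple(f"ARG{i}" for i in range(1, len(rest) + 1))
        # Every remaining argument becomes its own relation on the label.
        rargs += [f"{name}({ep.label},{val})" for name, val in zip(names, rest)]
    return preds, rargs


mrs = [
    EP("lb1", "every", ("x", "h9", "h6")),
    EP("lb2", "cat", ("x",)),
    EP("lb5", "dog1", ("y",)),
    EP("lb4", "some", ("y", "h8", "h7")),
    EP("lb3", "chase", ("e", "x", "y")),
]
preds, rargs = split_arguments(mrs)
print(", ".join(preds + rargs))
```

Because each argument is now a separate element, a shallower processor can emit the unary predications alone and leave the argument relations out.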

Naming conventions: predicate names without a lexicon

lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6),
lb2:_cat_n(x2sg),
lb5:_dog_n_1(x4sg),
lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7),
lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg),
h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg
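
A minimal sketch of how such predicate names could be produced from a lemma and a POS tag alone, with no lexicon lookup. The `TAG_TO_CLASS` table and the `pred_name` function are hypothetical; the tag labels are only illustrative and the real tag lexicon may differ.

```python
# Hypothetical mapping from tagger output to the one-letter class used in
# predicate names (n = noun, v = verb, q = quantifier).
TAG_TO_CLASS = {"NN1": "n", "VVD": "v", "AT1": "q", "DD": "q"}


def pred_name(lemma, tag):
    """Build a lexicon-free predicate name such as '_cat_n'."""
    return f"_{lemma.lower()}_{TAG_TO_CLASS[tag]}"


tagged = [
    ("every", "AT1"),
    ("cat", "NN1"),
    ("chase", "VVD"),
    ("some", "DD"),
    ("dog", "NN1"),
]
names = [pred_name(lemma, tag) for lemma, tag in tagged]
print(names)
```

A deep grammar can add a sense suffix (as in `_dog_n_1`) that a tagger cannot supply, which is why the convention keeps the lemma and class as a common prefix.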

POS output as underspecification

DEEP:
lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6), lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg), lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7), lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x4sg), h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

POS:
lb1:_every_q(x1), lb2:_cat_n(x2sg), lb3:_chase_v(epast), lb4:_some_q(x3), lb5:_dog_n(x4sg)

POS output as underspecification

DEEP:
lb1:_every_q(x1sg), RSTR(lb1,h9), BODY(lb1,h6), lb2:_cat_n(x2sg), lb5:_dog_n_1(x4sg), lb4:_some_q(x3sg), RSTR(lb4,h8), BODY(lb4,h7), lb3:_chase_v(esp), ARG1(lb3,x2sg), ARG2(lb3,x3sg), h9=lb2, h8=lb5, x1sg=x2sg, x3sg=x4sg

POS:
lb1:_every_q(x1), lb2:_cat_n(x2sg), lb3:_chase_v(epast), lb4:_some_q(x3), lb5:_dog_n(x4sg)
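
The relationship between the two outputs can be checked mechanically. The sketch below covers predicate names only and assumes that a shallow name underspecifies a deep name when it is identical or is the deep name without its trailing sense field. The `underspecifies` function is hypothetical, sorts on variables are ignored, and labels are assumed to be already aligned between the two analyses.

```python
def underspecifies(shallow_pred, deep_pred):
    """True if shallow_pred is deep_pred or deep_pred minus trailing fields."""
    return deep_pred == shallow_pred or deep_pred.startswith(shallow_pred + "_")


# Predicate names from the DEEP and POS outputs, keyed by label.
deep = {"lb1": "_every_q", "lb2": "_cat_n", "lb3": "_chase_v",
        "lb4": "_some_q", "lb5": "_dog_n_1"}
pos = {"lb1": "_every_q", "lb2": "_cat_n", "lb3": "_chase_v",
       "lb4": "_some_q", "lb5": "_dog_n"}

# The POS output is compatible if every predicate underspecifies its deep partner.
compatible = all(underspecifies(pos[label], deep[label]) for label in pos)
print(compatible)
```

The check is deliberately one-directional: `_dog_n` underspecifies `_dog_n_1`, but not the reverse.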

Semantics from RASP

- RASP: robust, domain-independent, statistical parsing (Briscoe and Carroll)
- can’t produce conventional semantics because no subcategorization
- can often identify arguments:
  - S -> NP VP: NP supplies ARG1 for V
- potential for partial identification:
  - VP -> V NP, S -> NP S: NP might be ARG2 or ARG3

Underspecification of arguments

Argument hierarchy, most general at the top:

ARGN
ARG1or2    ARG2or3
ARG1    ARG2    ARG3

- RASP arguments can be specified as ARGN, ARG2or3, etc.
- Also useful for Japanese deep parsing?
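
A small sketch of how this hierarchy could be used to test whether two argument labels are compatible. The `PARENTS` table encodes one reading of the diagram, with ARG2 placed under both ARG1or2 and ARG2or3; that placement and the function names are assumptions for illustration.

```python
# Each role maps to the more general roles directly above it (assumed layout).
PARENTS = {
    "ARG1": {"ARG1or2"},
    "ARG2": {"ARG1or2", "ARG2or3"},
    "ARG3": {"ARG2or3"},
    "ARG1or2": {"ARGN"},
    "ARG2or3": {"ARGN"},
    "ARGN": set(),
}
LEAVES = ("ARG1", "ARG2", "ARG3")


def ancestors(role):
    """Return the role itself plus every role above it in the hierarchy."""
    seen = {role}
    stack = [role]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific):
    """True if 'general' is 'specific' or lies above it."""
    return general in ancestors(specific)


def compatible(a, b):
    """Two labels are compatible if some fully specified ARGn lies under both."""
    return any(subsumes(a, leaf) and subsumes(b, leaf) for leaf in LEAVES)


print(compatible("ARG2or3", "ARG2"))  # a RASP ARG2or3 can match a deep ARG2
print(compatible("ARG1", "ARG2or3"))
```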

RMRS construction

- ERG etc.: uses MRS -> RMRS converter
  - argument splitting etc.
  - also RMRS -> MRS conversion
- POS-RMRS: tag lexicon
- RASP-RMRS: tag lexicon plus semantic rules associated with RASP rules
  - to match ERG
  - defaults when no rule RMRS specified

RMRS composition with non-lexicalized grammars

- MRS composition assumes a lexicalized approach: algebra defined in Copestake, Lascarides and Flickinger (2001)
- RMRS with non-lexicalized grammars: has similar basic algebra
  - without lexical subcategorization, rely on grammar rules to provide the ARGs
  - ‘anchors’ rather than slots, to ground the ARGs (single anchor for RASP)
- developed on basis of semantic test suite
- most rules written by Anna Ritchie

Some cat sleeps (in RASP)

sleeps: [h3,e], <h3>, {h3:_sleep(e)}

some cat: [h,x], <h1>, {h1:_some(x), RSTR(h1,h2), h2:_cat(x)}

S->NP VP: Head=VP, ARG1(<VP anchor>,<NP hook.index>)

some cat sleeps: [h3,e], <h3>, {h3:_sleep(e), ARG1(h3,x), h1:_some(x), RSTR(h1,h2), h2:_cat(x)}
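
The composition step for this example can be sketched as follows. The `Sem` class and `s_np_vp` function are hypothetical, and the sketch reads each structure as [hook label, hook index], <anchor>, {relations}; it covers only the single rule shown and is not the full algebra.

```python
from dataclasses import dataclass, field


@dataclass
class Sem:
    """Semantic structure: hook (label, index), an anchor, and a bag of relations."""
    label: str
    index: str
    anchor: str
    rels: list = field(default_factory=list)


def s_np_vp(np, vp):
    """S -> NP VP: the head is VP; add ARG1(<VP anchor>, <NP hook.index>)."""
    new_arg = f"ARG1({vp.anchor},{np.index})"
    # Hook and anchor come from the head daughter; relations are accumulated.
    return Sem(vp.label, vp.index, vp.anchor, vp.rels + [new_arg] + np.rels)


sleeps = Sem("h3", "e", "h3", ["h3:_sleep(e)"])
some_cat = Sem("h", "x", "h1", ["h1:_some(x)", "RSTR(h1,h2)", "h2:_cat(x)"])

sentence = s_np_vp(some_cat, sleeps)
print(f"[{sentence.label},{sentence.index}], <{sentence.anchor}>, "
      f"{{{', '.join(sentence.rels)}}}")
```

The rule, not the verb's lexical entry, supplies the ARG1 relation, which is what lets composition work without subcategorization information.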

Real rule ...

<rule>
  <name>S/np_vp</name>
  <dtrs><dtr>NP</dtr><dtr>VP</dtr></dtrs>
  <head>RULE</head>
  <semstruct>
    <hook><index>E</index><label>H1</label></hook>
    <slots><noanchor/></slots>
    <ep><gpred>PRPSTN_M_REL</gpred><label>H1</label><var>H2</var></ep>
    <rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg>
    <hcons hreln='qeq'><hi><var>H2</var></hi><lo><var>H</var></lo></hcons>
  </semstruct>
  <equalities><rv>X</rv><dh><dtr>NP</dtr><he>INDEX</he></dh></equalities>
  <equalities><rv>H</rv><dh><dtr>VP</dtr><he>LABEL</he></dh></equalities>
  <equalities><rv>H3</rv><dh><dtr>VP</dtr><he>ANCHOR</he></dh></equalities>
  <equalities><rv>E</rv><dh><dtr>VP</dtr><he>INDEX</he></dh></equalities>
</rule>
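
To show how the pieces of such a rule fit together, the sketch below reads the rule with Python's standard `xml.etree.ElementTree` and resolves the ARG1 relation through the equalities. It assumes the rule is well-formed XML with a `rule` root element; the variable names and the way the result is summarised are choices made for this illustration, not part of the rule format.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<rule><name>S/np_vp</name>
<dtrs><dtr>NP</dtr><dtr>VP</dtr></dtrs>
<head>RULE</head>
<semstruct>
 <hook><index>E</index><label>H1</label></hook>
 <slots><noanchor/></slots>
 <ep><gpred>PRPSTN_M_REL</gpred><label>H1</label><var>H2</var></ep>
 <rarg><rargname>ARG1</rargname><label>H3</label><var>X</var></rarg>
 <hcons hreln='qeq'><hi><var>H2</var></hi><lo><var>H</var></lo></hcons>
</semstruct>
<equalities><rv>X</rv><dh><dtr>NP</dtr><he>INDEX</he></dh></equalities>
<equalities><rv>H</rv><dh><dtr>VP</dtr><he>LABEL</he></dh></equalities>
<equalities><rv>H3</rv><dh><dtr>VP</dtr><he>ANCHOR</he></dh></equalities>
<equalities><rv>E</rv><dh><dtr>VP</dtr><he>INDEX</he></dh></equalities>
</rule>
"""

rule = ET.fromstring(RULE_XML.strip())

name = rule.findtext("name")
dtrs = [d.text for d in rule.findall("dtrs/dtr")]

# Each <equalities> element ties a rule variable to a daughter's hook element.
bindings = {
    eq.findtext("rv"): (eq.findtext("dh/dtr"), eq.findtext("dh/he"))
    for eq in rule.findall("equalities")
}

# Resolve the argument relation: which daughter supplies its label and value?
rarg = rule.find("semstruct/rarg")
arg = (
    rarg.findtext("rargname"),
    bindings[rarg.findtext("label")],
    bindings[rarg.findtext("var")],
)
print(name, dtrs, arg)
```

Read this way, the rule says the same thing as the simplified version: ARG1 attaches to the VP's anchor and takes the NP's index as its value.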
ERG-RMRS / RASP-RMRS
Inchoative
Infinitival subject (unbound in RASP-RMRS)
Ditransitive: missing ARG3
Mismatch: Expletive it
Mismatch: larger numbers

Comments on RASP-RMRS

- Fast enough (not significant compared to RASP processing time because no ambiguity)
- Too many RASP rules! Need to generalise over classes.
- Requires SEM-I: API for MRS/RMRS from deep grammar
- RASP and ERG may change:
  - compatible test suites: semi-automatic rule update?
  - alternative technique for composition?
- Parse selection: need to generalise over RMRSs
  - weighted intersections of RMRSs (cf. RASP grammatical relations)

SEM-I: semantic interface

- Meta-level: manually specified ‘grammar’ relations (constructions and closed-class)
- Object-level: linked to lexical database for deep grammars
  - Object-level SEM-I auto-generated from expanded lexical entries in deep grammars (because type can contribute relations)
- Validation of other lexicons
- Need closed class items for RMRS construction from shallow processing

Alignment and XML

- Comparing RMRSs for same text efficiently uses characterization
  - labels RMRSs according to their source in the text
  - currently characters, but byte offset?
  - Japanese etc?
- RMRS-XML
  - RMRS seen as levels of mark-up: standoff annotation
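
The character versus byte question matters as soon as the text is not plain ASCII. The sketch below uses a made-up mixed Japanese and English string to show how the same span gets different offsets under the two schemes; the `char_to_byte` helper and the `cfrom`/`cto` field names in the standoff record are assumptions for this example.

```python
def char_to_byte(text, offset, encoding="utf-8"):
    """Convert a character offset into a byte offset for the given encoding."""
    return len(text[:offset].encode(encoding))


text = "猫が眠る。Some cat sleeps."

# Character-based span of the word "cat".
start = text.index("cat")
end = start + len("cat")
char_span = (start, end)

# The same span expressed as UTF-8 byte offsets.
byte_span = (char_to_byte(text, start), char_to_byte(text, end))

# A standoff record points into the text instead of embedding the words.
annotation = {"pred": "_cat_n", "cfrom": char_span[0], "cto": char_span[1]}

print(char_span, byte_span)
print(text[annotation["cfrom"]:annotation["cto"]])
```

Two processors can only align their RMRSs for the same text if they agree on one of these conventions, since the offsets diverge after the first multi-byte character.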

SciBorg: Chemistry texts

- eScience project starting in October at Cambridge Computer Laboratory (Copestake, Teufel), Chemistry (Murray-Rust), CeSC (Parker)
- Aims:
  - Develop an NL markup language which will act as a platform for extraction of information. Link to semantic web languages.
  - Develop IE technology and core ontologies for use by publishers, researchers, readers, vendors and regulatory organisations.
  - Model scientific argumentation and citation purpose in order to support novel modes of information access.
  - Demonstrate the applicability of this infrastructure in a real-world eScience environment.

Research markup

- Chemistry: The primary aims of the present study are (i) the synthesis of an amino acid derivative that can be incorporated into proteins via standard solid-phase synthesis methods, and (ii) a test of the ability of the derivative to function as a photoswitch in a biological environment.
- Computational Linguistics: The goal of the work reported here is to develop a method that can automatically refine the Hidden Markov Models to produce a more accurate language model.

RMRS and research markup

- Specify cues in RMRS
- Deep process cues:
  - feasible because domain-independent
  - more general and reliable than shallow techniques
  - allows for complex interrelationships
- Use zones for advanced citation maps and other enhancements to repositories

Conclusions

- RMRS: semantic representation language allowing linking of deep and shallower processors
- RMRS construction: phrase-level compatibility between processors
- Many potential applications