the acl 2013 conference handbook is already available and can

ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS

4 – 9 August | Sofia, Bulgaria

ConferenceHandbook

Useful Information

The 51st Annual Mee�ng of the Associa�on for Computa�onal Linguistics (ACL 2013)

August 4 (Sun) to August 9 (Fri), 2013National Palace of Cul�re, Sofia, Bulgaria

The Association for Computational LinguisticsThe Department of Computa�onal Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences

ACL 2013 is held under the aegis of the President of the Republic of Bulgaria Mr. Rosen Plevneliev

For the first �me, the annual mee�ng of the Associa�on for Computa�onal Linguistics (ACL) takes place in Bulgaria. ACL 2013 will be held in Sofia, Bulgaria's capital, August 4-9, 2013. As in previous years, the program of the conference in�udes a poster session, �torials, work�ops and demonstra�ons in addi�on to the main conference.

ACL is the premier conference of the field of computa�onal linguistics, covering a broad �e�rum of diverse resear� areas that are concerned with computa�onal approa�es to na�ral language. An exci�ng new development this year is that the conference program will in�ude the presenta�on of papers that have been accepted at Transa�ions of the ACL (TACL), the new journal of the ACL.

On behalf of the organizing commi�ee I invite you to join us in Sofia for ACL 2013!

Hinri� S�uetzeGeneral Chair

WELCOME TO ACL 2013!

h�p://acl2013.org

For the first time, the annual meeting of the Association for Computational Linguistics (ACL) takes place in Bulgaria. ACL 2013 will be held in Sofia, Bulgaria's capital, August 4-9, 2013. As in previous years, the program of the conference includes a poster session, tutorials, workshops and demonstrations in addition to the main conference.

ACL is the premier conference of the field of computational linguistics, covering a broad spectrum of diverse research areas that are concerned with computational approaches to natural language. An exciting new development this year is that the conference program will include the presentation of papers that have been accepted at Transactions of the ACL (TACL), the new journal of the ACL.

On behalf of the organizing committee I invite you to join us in Sofia for ACL 2013!

Hinrich SchuetzeGeneral Chair

WELCOME TO ACL 2013!

Conference Committee

General ChairHinrich Schuetze, University of Munich

Program Co-ChairsPascale Fung, The Hong Kong University of Science and TechnologyMassimo Poesio, University of Essex

Local ChairSvetla Koeva, Bulgarian Academy of Sciences

Workshop Co-ChairsAoife Cahill, Educational Testing ServiceQun Liu, Dublin City University & Chinese Academy of Sciences

Tutorial Co-ChairsJohan Bos, University of GroningenKeith Hall, Google

Demo Co-ChairsMiriam Butt, University of KonstanzSarmad Hussain, Al-Khawarizmi Institute of Computer Science

Publication ChairsRoberto Navigli, Sapienza University of Rome (Chair)Jing-Shin Chang, National Chi Nan University (Co-Chair)

Faculty Advisors (Student Research Workshop)Steven Bethard, University of Colorado Boulder & KU LeuvenPreslav I. Nakov, Qatar Computing Research InstituteFeiyu Xu, DFKI, German Research Center for Artificial Intelligence

Student Chairs (Student Research Workshop)Anik Dey, The Hong Kong University of Science & TechnologyEva Vecchi, Università di TrentoSebastian Krause, German Research Center for Artificial IntelligenceIvelina Nikolova, Bulgarian Academy of Sciences

Mentoring ChairLeo Wanner, Universitat Pompeu Fabra

Publicity Co-ChairsAnisava Miltenova, Bulgarian Academy of SciencesIvan Derzhanski, Bulgarian Academy of SciencesAnna Korhonen, University of Cambridge

Business ManagerPriscilla Rasmussen, ACL

VenueThe National Palace of Culture, the largest convention centre in Bulgaria, prides itself on its unique architecture and great flexibility.

Located in the heart of Sofia and within walking distance of major hotels and tourist sites, it is the natural choice for an event of the scale of the ACL annual meeting.

The facilities include a conference hall seating 3600 and a number of customizable smaller halls and offices. The pedestrian plaza in front of the National Palace of Culture is a signature place for the Bulgarian capital.

51st ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS

Conference Committee

General ChairHinrich Schuetze, University of Munich

Program Co-ChairsPascale Fung, The Hong Kong University of Science and TechnologyMassimo Poesio, University of Essex

Local ChairSvetla Koeva, Bulgarian Academy of Sciences

Workshop Co-ChairsAoife Cahill, Educational Testing ServiceQun Liu, Dublin City University & Chinese Academy of Sciences

Tutorial Co-ChairsJohan Bos, University of GroningenKeith Hall, Google

Demo Co-ChairsMiriam Butt, University of KonstanzSarmad Hussain, Al-Khawarizmi Institute of Computer Science

Publication ChairsRoberto Navigli, Sapienza University of Rome (Chair)Jing-Shin Chang, National Chi Nan University (Co-Chair)

Faculty Advisors (Student Research Workshop)Steven Bethard, University of Colorado Boulder & KU LeuvenPreslav I. Nakov, Qatar Computing Research InstituteFeiyu Xu, DFKI, German Research Center for Artificial Intelligence

Student Chairs (Student Research Workshop)Anik Dey, The Hong Kong University of Science & TechnologyEva Vecchi, Università di TrentoSebastian Krause, German Research Center for Artificial IntelligenceIvelina Nikolova, Bulgarian Academy of Sciences

Mentoring ChairLeo Wanner, Universitat Pompeu Fabra

Publicity Co-ChairsAnisava Miltenova, Bulgarian Academy of SciencesIvan Derzhanski, Bulgarian Academy of SciencesAnna Korhonen, University of Cambridge

Business ManagerPriscilla Rasmussen, ACL

VenueThe National Palace of Culture, the largest convention centre in Bulgaria, prides itself on its unique architecture and great flexibility.

Located in the heart of Sofia and within walking distance of major hotels and tourist sites, it is the natural choice for an event of the scale of the ACL annual meeting.

The facilities include a conference hall seating 3600 and a number of customizable smaller halls and offices. The pedestrian plaza in front of the National Palace of Culture is a signature place for the Bulgarian capital.

Floor 0

Conference rooms situated on this floor – Hall 4, Hall 5, Hall 6

Main entrance

Hall 5Hall 4

Hall 6

Hall 1.4 Hall 1.5

Hall 1.2 Hall 1.7

Floor 1

Conference rooms situated on this floor – Halls 1.2, 1.4, 1.5, 1.7

Floor 0

Main entrance

Hall 5Hall 4

Hall 6

Hall 1.4 Hall 1.5

Hall 1.2 Hall 1.7

Floor 1

Conference rooms situated on this floor – Halls 1.2, 1.4, 1.5, 1.7

Floor 2

Poster sessions

POSTER SESSION AM – MultilingualityNLPCEE – NLP for Languages of Central and Eastern Europe and the BalkansNLPW – NLP for the Web and Social MediaSLP – Spoken Language ProcessingWS – Word Segmentation

POSTER SESSION BSA – Sentiment Analysis, Opinion Mining and Text ClassificationSMLM – Statistical and Machine Learning Methods in NLPTACL – Transactions of ACLTM – Text Mining and Information Extraction

POSTER SESSION ADCP – Discourse, Coreference and PragmaticsEM – Evaluation MethodsLSO – Lexical Semantics and OntologiesLRLP – Low Resource Language ProcessingNLPa – NLP ApplicationsTACL – Transactions of ACL

POSTER SESSION BS&P – Syntax and ParsingS – Semantics

Floor 3

Poster sessions

POSTER SESSION ACMP – Cognitive Modeling and PsycholinguisticsIR – Information RetrievalLR – Language ResourcesNLPc – NLP and Creativity

POSTER SESSION BQA – Question AnsweringS&G – Summarization & GenerationT&C – Tagging & Chunking

POSTER SESSION ADIS – Dialogue and Interactive SystemsMT:MAE – Machine Translation: Methods, Applications, Evaluation

POSTER SESSION BMT:SM – Machine Translation: Statistical Models

Floor 2

Poster sessions

POSTER SESSION AM – MultilingualityNLPCEE – NLP for Languages of Central and Eastern Europe and the BalkansNLPW – NLP for the Web and Social MediaSLP – Spoken Language ProcessingWS – Word Segmentation

POSTER SESSION BSA – Sentiment Analysis, Opinion Mining and Text ClassificationSMLM – Statistical and Machine Learning Methods in NLPTACL – Transactions of ACLTM – Text Mining and Information Extraction

POSTER SESSION ADCP – Discourse, Coreference and PragmaticsEM – Evaluation MethodsLSO – Lexical Semantics and OntologiesLRLP – Low Resource Language ProcessingNLPa – NLP ApplicationsTACL – Transactions of ACL

POSTER SESSION BS&P – Syntax and ParsingS – Semantics

Floor 3

Poster sessions

POSTER SESSION ACMP – Cognitive Modeling and PsycholinguisticsIR – Information RetrievalLR – Language ResourcesNLPc – NLP and Creativity

POSTER SESSION BQA – Question AnsweringS&G – Summarization & GenerationT&C – Tagging & Chunking

POSTER SESSION ADIS – Dialogue and Interactive SystemsMT:MAE – Machine Translation: Methods, Applications, Evaluation

POSTER SESSION BMT:SM – Machine Translation: Statistical Models

Floor 5

Hall 8Hall 7

Hall 9

Floor 7

Conference rooms situated on this floor – Hall 3

Hall 3

Floor 5

Hall 8Hall 7

Hall 9

Floor 7

Conference rooms situated on this floor – Hall 3

Hall 3

Floor 8

Conference rooms situated on this floor – Hall 3.1, Hall 3.2, Hall 10

Hall 3.2

Hall 10

Hall 3.1

thTutorials – Sun, August 4

Time Program Venue

8.00 – 18.00 Registration Floor 0

9.00 – 12.30 Morning Tutorials

Tutorial 1 Visual Features for Linguists: Basic image analysis techniques for multimodally-curious NLPers Hall 3.1

by Elia Bruni and Marco Baroni

Tutorial 2 Variational Inference for Structured NLP Models Hall 3.2

by David Burkett and Dan Klein

Tutorial 3 Decipherment Hall 4

by Kevin Knight

Tutorial 4 The mathematics of language learning Hall 1.5

by Andras Kornai, James Rogers, Gerald Penn and Anssi Yli-Jyrä

10.30 – 11.00 Coffee break Hall 9

12.30 – 14.00 Lunch break

14.00 – 17.30 Afternoon Tutorials

Tutorial 5 Exploiting Social Media for Natural Language Processing: Bridging the Gap between

Language-centric and Real-world Applications Hall 3.1

by Simone Paolo Ponzetto and Andrea Zielinski

Tutorial 6 Robust Automated Natural Language Processing with

Multiword Expressions and Collocations Hall 3.2

by Valia Kordoni and Markus Egg

Tutorial 7 Semantic Parsing with Combinatory Categorial Grammars Hall 4

by Yoav Artzi, Nicholas FitzGerald and Luke Zettlemoyer

15.30 – 16.00 Coffee break

18.30 – 21.00 Welcome Reception Sky Plaza

Floor 8

Conference rooms situated on this floor – Hall 3.1, Hall 3.2, Hall 10

Hall 3.2

Hall 10

Hall 3.1

thTutorials – Sun, August 4

Time Program Venue

9.00 – 12.30 Morning Tutorials

Tutorial 1 Visual Features for Linguists: Basic image analysis techniques for multimodally-curious NLPers Hall 3.1

by Elia Bruni and Marco Baroni

Tutorial 2 Variational Inference for Structured NLP Models Hall 3.2

by David Burkett and Dan Klein

Tutorial 3 Decipherment Hall 4

by Kevin Knight

10.30 – 11.00 Coffee break Hall 9

14.00 – 17.30 Afternoon Tutorials

Tutorial 5 Exploiting Social Media for Natural Language Processing: Bridging the Gap between

Language-centric and Real-world Applications Hall 3.1

by Simone Paolo Ponzetto and Andrea Zielinski

Tutorial 6 Robust Automated Natural Language Processing with

Multiword Expressions and Collocations Hall 3.2

by Valia Kordoni and Markus Egg

Tutorial 7 Semantic Parsing with Combinatory Categorial Grammars Hall 4

by Yoav Artzi, Nicholas FitzGerald and Luke Zettlemoyer

18.30 – 21.00 Welcome Reception Sky Plaza

VISUAL FEATURES FOR LINGUISTS: BASIC IMAGE ANALYSIS TECHNIQUES FOR MULTIMODALLY-CURIOUS NLPERSby Elia Bruni and Marco Baroni

thSunday, August 4 , 9.00 – 12.30 → Hall 3.1

Abstract

Features automatically extracted from images constitute a new and rich source of semantic knowledge that can complement information extracted from text. The convergence between vision- and text-based information can be exploited in scenarios where the two modalities must be combined to solve a target task (e.g., generating verbal descriptions of images, or finding the right images to illustrate a story). However, the potential applications for integrated visual features go beyond mixed-media scenarios. Because of their complementary nature with respect to language, visual features might provide perceptually grounded semantic information that can be exploited in purely linguistic domains.

The tutorial will first introduce basic techniques to encode image contents in terms of low-level features, such as the widely adopted SIFT descriptors. We will then show how these low-level descriptors are used to induce more abstract features, focusing on the well-established bags-of-visual-words method to represent images, but also briefly introducing more recent developments, that include capturing spatial information with pyramid representations, soft visual word clustering via Fisher encoding and attribute-based image representation. Next, we will discuss some example applications, andwe will conclude with a brief practical illustration of visual feature extraction using a software package we developed.

Presenters:

Elia BruniCenter for Mind/Brain Sciences, University of [email protected]

Marco BaroniCenter for Mind/Brain Sciences, University of [email protected]

Tutorial 1

VARIATIONAL INFERENCE FOR STRUCTURED NLP MODELSby David Burkett and Dan Klein

Abstract

Historically, key breakthroughs in structured NLP models, such as chain CRFs or PCFGs, have relied on imposing careful constraints on the locality of features in order to permit efficient dynamic programming for computing expectations or finding the highest-scoring structures. However, as modern structured models become more complex and seek to incorporate longer-range features, it is more and more often the case that performing exact inference is impossible (or at least impractical) and it is necessary to resort to some sort of approximation technique, such as beam search, pruning, or sampling. In the NLP community, one increasingly popular approach is the use of variational methods for computing approximate distributions.

The goal of the tutorial is to provide an introduction to variational methods for approximate inference, particularly mean field approximation and belief propagation. The intuition behind the mathematical derivation of variational methods is fairly simple: instead of trying to directly compute the distribution of interest, first consider some efficiently computable approximation of the original inference problem, then find the solution of the approximate inference problem that minimizes the distance to the true distribution. Though the full derivations can be some what tedious, the resulting procedures are quite straightforward, and typically consist of an iterative process of individually updating specific components of the model, conditioned on the rest. Although we will provide some theoretical background, the main goal of the tutorial is to provide a concrete procedural guide to using these approximate inference techniques, illustrated with detailed walkthroughs of examples from recent NLP literature.

Once both variational inference procedures have been described in detail, we'll provide a summary comparison of the two, along with some intuition about which approach is appropriate when. We'll also provide a guide to further exploration of the topic, briefly discussing other variational techniques, such as expectation propagation and convex relaxations, but concentrating mainly on providing pointers to additional resources for those who wish to learn more.

Presenters:

David BurkettComputer Science Division | Department of Electrical Engineering and Computer SciencesUniversity of California at [email protected]

Dan KleinComputer Science Division | Department of Electrical Engineering and Computer SciencesUniversity of California at [email protected]

Tutorial 2

VISUAL FEATURES FOR LINGUISTS: BASIC IMAGE ANALYSIS TECHNIQUES FOR MULTIMODALLY-CURIOUS NLPERSby Elia Bruni and Marco Baroni

Abstract

Features automatically extracted from images constitute a new and rich source of semantic knowledge that can complement information extracted from text. The convergence between vision- and text-based information can be exploited in scenarios where the two modalities must be combined to solve a target task (e.g., generating verbal descriptions of images, or finding the right images to illustrate a story). However, the potential applications for integrated visual features go beyond mixed-media scenarios. Because of their complementary nature with respect to language, visual features might provide perceptually grounded semantic information that can be exploited in purely linguistic domains.

The tutorial will first introduce basic techniques to encode image contents in terms of low-level features, such as the widely adopted SIFT descriptors. We will then show how these low-level descriptors are used to induce more abstract features, focusing on the well-established bags-of-visual-words method to represent images, but also briefly introducing more recent developments, that include capturing spatial information with pyramid representations, soft visual word clustering via Fisher encoding and attribute-based image representation. Next, we will discuss some example applications, andwe will conclude with a brief practical illustration of visual feature extraction using a software package we developed.

Presenters:

Elia BruniCenter for Mind/Brain Sciences, University of [email protected]

Marco BaroniCenter for Mind/Brain Sciences, University of [email protected]

Tutorial 1

VARIATIONAL INFERENCE FOR STRUCTURED NLP MODELSby David Burkett and Dan Klein

Abstract

Historically, key breakthroughs in structured NLP models, such as chain CRFs or PCFGs, have relied on imposing careful constraints on the locality of features in order to permit efficient dynamic programming for computing expectations or finding the highest-scoring structures. However, as modern structured models become more complex and seek to incorporate longer-range features, it is more and more often the case that performing exact inference is impossible (or at least impractical) and it is necessary to resort to some sort of approximation technique, such as beam search, pruning, or sampling. In the NLP community, one increasingly popular approach is the use of variational methods for computing approximate distributions.

The goal of the tutorial is to provide an introduction to variational methods for approximate inference, particularly mean field approximation and belief propagation. The intuition behind the mathematical derivation of variational methods is fairly simple: instead of trying to directly compute the distribution of interest, first consider some efficiently computable approximation of the original inference problem, then find the solution of the approximate inference problem that minimizes the distance to the true distribution. Though the full derivations can be some what tedious, the resulting procedures are quite straightforward, and typically consist of an iterative process of individually updating specific components of the model, conditioned on the rest. Although we will provide some theoretical background, the main goal of the tutorial is to provide a concrete procedural guide to using these approximate inference techniques, illustrated with detailed walkthroughs of examples from recent NLP literature.

Once both variational inference procedures have been described in detail, we'll provide a summary comparison of the two, along with some intuition about which approach is appropriate when. We'll also provide a guide to further exploration of the topic, briefly discussing other variational techniques, such as expectation propagation and convex relaxations, but concentrating mainly on providing pointers to additional resources for those who wish to learn more.

Presenters:

David BurkettComputer Science Division | Department of Electrical Engineering and Computer SciencesUniversity of California at [email protected]

Dan KleinComputer Science Division | Department of Electrical Engineering and Computer SciencesUniversity of California at [email protected]

Tutorial 2

DECIPHERMENTby Kevin Knight

thSunday, August 4 , 9.00 – 12.30 → Hall 4

Abstract

The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. Sixty years later, we have many more applications, including web search, question answering, summarization, speech recognition, and language translation. This tutorial explores connections between early decipherment research and today's NLP work. We find that many ideas from the earlier era have become core to the field, while others still remain to be picked up and developed.

We first cover classic military and diplomatic cipher types, including complex substitution ciphers implemented in the first electro-mechanical encryption machines. We look at mathematical tools (language recognition, frequency counting, smoothing) developed to decrypt such ciphers on proto-computers. We show algorithms and extensive empirical results for solving different types of ciphers, and we show the role of algorithms in recent decipherments of historical documents.

We then look at how foreign language can be viewed as a code for English, a concept developed by Alan Turing and Warren Weaver. We describe recently published work on building automatic translation systems from non-parallel data. We also demonstrate how some of the same algorithmic tools can be applied to natural language tasks like part-of-speech tagging and word alignment.

Turning back to historical ciphers, we explore a number of unsolved ciphers, giving results of initial computer experiments on several of them. Finally, we look briefly at writing as a way to encipher phoneme sequences, covering ancient scripts and modern applications.

Presenter:

Kevin KnightInformation Sciences Institute, University of Southern [email protected]

Tutorial 3

Over the past decade, attention has gradually shifted from the estimation of parameters to the learning of linguistic structure (for a survey see Smith 2011). The Mathematics of Language (MOL) SIG put together this tutorial, composed of three lectures, to highlight some alternative learning paradigms in speech, syntax, and semantics in the hopes of accelerating this trend.

Given the broad range of competing formal models such as templates in speech, PCFGs and various MCS models in syntax, logic-based and association-based models in semantics, it is somewhat surprising that the bulk of the applied work is still performed by HMMs. A particularly significant case in point is provided by PCFGs, which have not proved competitive with straight trigram models. Undergirding the practical failure of PCFGs is a more subtle theoretical problem, that the nonterminals in better PCFGs cannot be identified with the kind of nonterminal labels that grammarians assume, and conversely, PCFGs embodying some form of grammatical knowledge tend not to outperform flatly initialized models that make no use of such knowledge. A natural response to this outcome is to retrench and use less powerful formal models, and the first lecture will be spent in the subregular space of formal models even less powerful than finite state automata.

Compounding the enormous variety of formal models one may consider is the bewildering range of ML techniques one may bring to bear. In addition to the surprisingly useful classical techniques inherited from multivariate statistics such as Principal Component Analysis (PCA, Pearson 1901) and Linear Discriminant Analysis (LDA, Fisher 1936), computational linguists have experimented with a broad range of neural net, nearest neighbor, maxent, genetic/evolutionary, decision tree, max margin, boost, simulated annealing, and graphical model learners.

While many of these learners became standard in various domains of ML, within CL the basic HMM approach proved surprisingly resilient, and it is only very recently that deep learning techniques from neural computing are becoming competitive not just in speech, but also in OCR, paraphrase, sentiment analysis, parsing and vector-based semantic representations. The second lecture will provide a mathematical introduction to some of the fundamental techniques that lie beneath these linguistic applications of neural networks, such as: BFGS optimization, finite difference approximations of Hessians and Hessian-free optimization, contrastive divergence and variational inference.

In spite of the enormous progress brought by ML techniques, there remains a rather significant range of tasks where automated learners cannot yet get near human performance. One such is the unsupervised learning of word structure addressed by MorphoChallenge, another is the textual entailment task addressed by RTE. The third lecture recasts these and similar problems in terms of learning weighted edges in a sparse graph, and presents learning techniques that seem to have some potential to better find spare finite state and near-FS models than EM. We will provide a mathematical introduction to the Minimum Description Length (MDL) paradigm and spectral learning, and relate these to the better known L1 regularization technique and sparse overcomplete representations.

Presenters:

Andras KornaiBudapest Institute of Technology / Computer and Automation Research Institute, Hungarian Academy of [email protected], [email protected]

James RogersComputer Science Department, Earlham [email protected]

Gerald PennDepartment of Computer Science, University of Toronto / University of Trinity [email protected]

Anssi Yli-JyräDepartment of General Linguistics, University of [email protected]

THE MATHEMATICS OF LANGUAGE LEARNINGby Andras Kornai, James Rogers, Gerald Penn and Anssi Yli-Jyrä

Abstract

Tutorial 4

DECIPHERMENTby Kevin Knight

Abstract

The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. Sixty years later, we have many more applications, including web search, question answering, summarization, speech recognition, and language translation. This tutorial explores connections between early decipherment research and today's NLP work. We find that many ideas from the earlier era have become core to the field, while others still remain to be picked up and developed.

We first cover classic military and diplomatic cipher types, including complex substitution ciphers implemented in the first electro-mechanical encryption machines. We look at mathematical tools (language recognition, frequency counting, smoothing) developed to decrypt such ciphers on proto-computers. We show algorithms and extensive empirical results for solving different types of ciphers, and we show the role of algorithms in recent decipherments of historical documents.

We then look at how foreign language can be viewed as a code for English, a concept developed by Alan Turing and Warren Weaver. We describe recently published work on building automatic translation systems from non-parallel data. We also demonstrate how some of the same algorithmic tools can be applied to natural language tasks like part-of-speech tagging and word alignment.

Turning back to historical ciphers, we explore a number of unsolved ciphers, giving results of initial computer experiments on several of them. Finally, we look briefly at writing as a way to encipher phoneme sequences, covering ancient scripts and modern applications.

Presenter:

Kevin KnightInformation Sciences Institute, University of Southern [email protected]

Tutorial 3

Over the past decade, attention has gradually shifted from the estimation of parameters to the learning of linguistic structure (for a survey see Smith 2011). The Mathematics of Language (MOL) SIG put together this tutorial, composed of three lectures, to highlight some alternative learning paradigms in speech, syntax, and semantics in the hopes of accelerating this trend.

Given the broad range of competing formal models such as templates in speech, PCFGs and various MCS models in syntax, logic-based and association-based models in semantics, it is somewhat surprising that the bulk of the applied work is still performed by HMMs. A particularly significant case in point is provided by PCFGs, which have not proved competitive with straight trigram models. Undergirding the practical failure of PCFGs is a more subtle theoretical problem, that the nonterminals in better PCFGs cannot be identified with the kind of nonterminal labels that grammarians assume, and conversely, PCFGs embodying some form of grammatical knowledge tend not to outperform flatly initialized models that make no use of such knowledge. A natural response to this outcome is to retrench and use less powerful formal models, and the first lecture will be spent in the subregular space of formal models even less powerful than finite state automata.

Compounding the enormous variety of formal models one may consider is the bewildering range of ML techniques one may bring to bear. In addition to the surprisingly useful classical techniques inherited from multivariate statistics such as Principal Component Analysis (PCA, Pearson 1901) and Linear Discriminant Analysis (LDA, Fisher 1936), computational linguists have experimented with a broad range of neural net, nearest neighbor, maxent, genetic/evolutionary, decision tree, max margin, boost, simulated annealing, and graphical model learners.

While many of these learners became standard in various domains of ML, within CL the basic HMM approach proved surprisingly resilient, and it is only very recently that deep learning techniques from neural computing are becoming competitive not just in speech, but also in OCR, paraphrase, sentiment analysis, parsing and vector-based semantic representations. The second lecture will provide a mathematical introduction to some of the fundamental techniques that lie beneath these linguistic applications of neural networks, such as: BFGS optimization, finite difference approximations of Hessians and Hessian-free optimization, contrastive divergence and variational inference.

In spite of the enormous progress brought by ML techniques, there remains a rather significant range of tasks where automated learners cannot yet get near human performance. One such is the unsupervised learning of word structure addressed by MorphoChallenge, another is the textual entailment task addressed by RTE. The third lecture recasts these and similar problems in terms of learning weighted edges in a sparse graph, and presents learning techniques that seem to have some potential to better find spare finite state and near-FS models than EM. We will provide a mathematical introduction to the Minimum Description Length (MDL) paradigm and spectral learning, and relate these to the better known L1 regularization technique and sparse overcomplete representations.

Presenters:

Andras KornaiBudapest Institute of Technology / Computer and Automation Research Institute, Hungarian Academy of [email protected], [email protected]

James RogersComputer Science Department, Earlham [email protected]

Gerald PennDepartment of Computer Science, University of Toronto / University of Trinity [email protected]

Anssi Yli-JyräDepartment of General Linguistics, University of [email protected]

THE MATHEMATICS OF LANGUAGE LEARNINGby Andras Kornai, James Rogers, Gerald Penn and Anssi Yli-Jyrä

Abstract

Tutorial 4

EXPLOITING SOCIAL MEDIA FOR NATURAL LANGUAGE PROCESSING: BRIDGING THE GAP BETWEEN LANGUAGE-CENTRIC AND REAL-WORLD APPLICATIONSby Simone Paolo Ponzetto and Andrea Zielinski

Abstract

Social media like Twitter and micro-blogs provide a goldmine of text, shallow markup annotations and network structure. These information sources can all be exploited together in order to automatically acquire vast amounts of up-to-date, wide-coverage structured knowledge. This knowledge, in turn, can be used to measure the pulse of a variety of social phenomena like political events, activism and stock prices, as well as to detect emerging events such as natural disasters (earthquakes, tsunami, etc.).

The main purpose of this tutorial is to introduce social media as a resource to the Natural Language Processing (NLP) community both from a scientific and an application-oriented perspective. To this end, we focus on micro-blogs such as Twitter, and show how it can be successfully mined to perform complex NLP tasks such as the identification of events, topics and trends. Furthermore, this information can be used to build high-end socially intelligent applications that tap the wisdom of the crowd on a large scale, thus successfully bridging the gap between computational text analysis and real-world, mission-critical applications such as financial forecasting and natural crisis management.

Presenters:

Simone Paolo PonzettoResearch Group of Data and Web Science, Universität [email protected]

Andrea ZielinskiFraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung (IOSB)[email protected]

Tutorial 5

ROBUST AUTOMATED NATURAL LANGUAGE PROCESSING WITH MULTIWORD EXPRESSIONS AND COLLOCATIONSby Valia Kordoni and Markus Egg

Abstract

Multi Word Expressions (MWEs) are a key issue and a current weakness for NLP tasks that need some degree of semantic interpretation, with potential contributions to natural language parsing and generation, as well as applications such as machine translation, information retrieval and information extraction. Therefore, a tutorial on this topic is of great relevance for researchers in NLP and related areas.

Attendees to this tutorial should gain insight into linguistic and distributional characteristics of MWEs and, most importantly, their relevance for robust automated natural language processing and language technology, with a thorough overview of methods and resources that support their use. We will provide a theoretical and practical introduction to the topic, with demonstrations of resources and tools available, aiming to equip the attendees with some starting recipes for MWE processing, including tools for the identification and resource construction (e.g. NSP, UCS, mwetoolkit) and annotation (e.g. jMWE). Our target audience includes researchers and practitioners in Language Technology (not necessarily experts in MWEs) who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication.

Presenters:

Valia KordoniDepartment of English and American Studies, Humboldt University Berlin [email protected]

Markus EggDepartment of English and American Studies, Humboldt University Berlin [email protected]

Tutorial 6

EXPLOITING SOCIAL MEDIA FOR NATURAL LANGUAGE PROCESSING: BRIDGING THE GAP BETWEEN LANGUAGE-CENTRIC AND REAL-WORLD APPLICATIONSby Simone Paolo Ponzetto and Andrea Zielinski

Abstract

Social media like Twitter and micro-blogs provide a goldmine of text, shallow markup annotations and network structure. These information sources can all be exploited together in order to automatically acquire vast amounts of up-to-date, wide-coverage structured knowledge. This knowledge, in turn, can be used to measure the pulse of a variety of social phenomena like political events, activism and stock prices, as well as to detect emerging events such as natural disasters (earthquakes, tsunami, etc.).

The main purpose of this tutorial is to introduce social media as a resource to the Natural Language Processing (NLP) community both from a scientific and an application-oriented perspective. To this end, we focus on micro-blogs such as Twitter, and show how it can be successfully mined to perform complex NLP tasks such as the identification of events, topics and trends. Furthermore, this information can be used to build high-end socially intelligent applications that tap the wisdom of the crowd on a large scale, thus successfully bridging the gap between computational text analysis and real-world, mission-critical applications such as financial forecasting and natural crisis management.

Presenters:

Simone Paolo PonzettoResearch Group of Data and Web Science, Universität [email protected]

Andrea ZielinskiFraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung (IOSB)[email protected]

Tutorial 5

ROBUST AUTOMATED NATURAL LANGUAGE PROCESSING WITH MULTIWORD EXPRESSIONS AND COLLOCATIONSby Valia Kordoni and Markus Egg

Abstract

Multi Word Expressions (MWEs) are a key issue and a current weakness for NLP tasks that need some degree of semantic interpretation, with potential contributions to natural language parsing and generation, as well as applications such as machine translation, information retrieval and information extraction. Therefore, a tutorial on this topic is of great relevance for researchers in NLP and related areas.

Attendees to this tutorial should gain insight into linguistic and distributional characteristics of MWEs and, most importantly, their relevance for robust automated natural language processing and language technology, with a thorough overview of methods and resources that support their use. We will provide a theoretical and practical introduction to the topic, with demonstrations of resources and tools available, aiming to equip the attendees with some starting recipes for MWE processing, including tools for the identification and resource construction (e.g. NSP, UCS, mwetoolkit) and annotation (e.g. jMWE). Our target audience includes researchers and practitioners in Language Technology (not necessarily experts in MWEs) who are interested in tasks that involve or could benefit from considering MWEs as a pervasive phenomenon in human language and communication.

Presenters:

Valia KordoniDepartment of English and American Studies, Humboldt University Berlin [email protected]

Markus EggDepartment of English and American Studies, Humboldt University Berlin [email protected]

Tutorial 6

SEMANTIC PARSING WITH COMBINATORY CATEGORIAL GRAMMARSby Yoav Artzi, Nicholas FitzGerald and Luke Zettlemoyer

Abstract

Semantic parsers map natural language sentences to formal representations of their underlying meaning. Building accurate semantic parsers without prohibitive engineering costs is a long-standing, open research problem.

The tutorial will describe general principles for building semantic parsers. The presentation will be divided into two main parts: modeling and learning. The modeling section will include best practices for grammar design and choice of semantic representation. The discussion will be guided by examples from several domains. To illustrate the choices to be made and show how they can be approached within a real-life representation language, we will use lambda-calculus meaning representations. In the learning part, we will describe a unified approach for learning Combinatory Categorial Grammar (CCG) semantic parsers, that induces both a CCG lexicon and the parameters of a parsing model. The approach learns from data with labeled meaning representations, as well as from more easily gathered weak supervision. It also enables grounded learning where the semantic parser is used in an interactive environment, for example to read and execute instructions.

The ideas we will discuss are widely applicable. The semantic modeling approach, while implemented in lambda-calculus, could be applied to manyother formal languages. Similarly, the algorithms for inducing CCGs focus on tasks that are formalism independent, learning the meaning of words and estimating parsing parameters. No prior knowledge of CCGs is required. The tutorial will be backed by implementation and experiments in the University of Washington Semantic Parsing Framework (UW SPF -http://yoavartzi.com/spf).

Presenters:

Yoav ArtziDepartment of Computer Science & Engineering, University of [email protected]

Nicholas FitzGeraldDepartment of Computer Science & Engineering, University of [email protected]

Luke ZettlemoyerDepartment of Computer Science & Engineering, University of [email protected]

Tutorial 7

MAINCONFERENCEPROGRAM

SEMANTIC PARSING WITH COMBINATORY CATEGORIAL GRAMMARSby Yoav Artzi, Nicholas FitzGerald and Luke Zettlemoyer

Abstract

Semantic parsers map natural language sentences to formal representations of their underlying meaning. Building accurate semantic parsers without prohibitive engineering costs is a long-standing, open research problem.

The tutorial will describe general principles for building semantic parsers. The presentation will be divided into two main parts: modeling and learning. The modeling section will include best practices for grammar design and choice of semantic representation. The discussion will be guided by examples from several domains. To illustrate the choices to be made and show how they can be approached within a real-life representation language, we will use lambda-calculus meaning representations. In the learning part, we will describe a unified approach for learning Combinatory Categorial Grammar (CCG) semantic parsers, that induces both a CCG lexicon and the parameters of a parsing model. The approach learns from data with labeled meaning representations, as well as from more easily gathered weak supervision. It also enables grounded learning where the semantic parser is used in an interactive environment, for example to read and execute instructions.

The ideas we will discuss are widely applicable. The semantic modeling approach, while implemented in lambda-calculus, could be applied to manyother formal languages. Similarly, the algorithms for inducing CCGs focus on tasks that are formalism independent, learning the meaning of words and estimating parsing parameters. No prior knowledge of CCGs is required. The tutorial will be backed by implementation and experiments in the University of Washington Semantic Parsing Framework (UW SPF -http://yoavartzi.com/spf).

Presenters:

Yoav ArtziDepartment of Computer Science & Engineering, University of [email protected]

Nicholas FitzGeraldDepartment of Computer Science & Engineering, University of [email protected]

Luke ZettlemoyerDepartment of Computer Science & Engineering, University of [email protected]

Tutorial 7

MAINCONFERENCEPROGRAM

9.00 – 9.30 Opening Session Hall 3

9.30 – 10.30 Invited Talk: Harald Baayen Hall 3

th10.30 – 11.00 Coffee Break 5 Floor

11.00 – 12.15 Parallel Sessions Halls 3, 6, 7, 8, 10

12.15 – 13.45 Lunch Break 12.15 – 13.45 Student Lunch Continental Plaza

nd rd18.30 – 21.00 Poster Session + System Demonstrations + 2 and 3 Floors Student Research Workshop + Poster Dinner

Invited Talk: Harald Baayen (Tuebingen/Alberta)When Parsing Makes Things Worse: An Eye-tracking Study of English Compounds

thMonday, August 5 , 2013, 9.30 am – 10.30 am Hall 3→Short Bio: Prof. Dr. Rolf Harald Baayen is regarded as one of the best and most innovative researchers in the field of vocabulary research and quantitative linguistics. He is a pioneer of computer-assisted and empirical linguistic research and psycholinguistics, and has made fundamental contributions to the understanding of human speech and the role of the memory in language processing. Prof. Harald Baayen was born in the USA in 1958. He got a PhD in 1989 and was a postdoctoral researcher at Vrije Universiteit Amsterdam. He worked at the Max Planck Institute for Psycholinguistics in Nijmegen, the Netherlands (1990 –�1998), and at Radboud Universiteit Nijmegen, initially as an associate and subsequently as a full professor in Quantitative Linguistics (2006). Since 2007, he has been professor at the Department of Linguistics, University of Alberta, Edmonton, Canada. His honors include the PIONEER Award of the Netherlands Organization for Scientific Research in 1998, and the Alexander von Humboldt Professorship 2012.

th Main Conference: Mon, August 5

Overview

Time Activity Hall 3 Hall 6 Hall 7 Hall 8 Hall 10 Other

9.30 Invited Talk 1:

Harald Baayen Hall 3

10.30 Coffee Break

11.00 Papers LP 1a LP 1b LP 1c LP 1d LP 1e

Machine Statistical & Semantics I Discourse, Syntax &

Translation: Machine Learning Coreference & Parsing I

Statistical Methods in NLP I Pragmatics I

Models I

12.15 Lunch Break

Machine Statistical & Semantics II Discourse, Syntax &

Translation: Machine Learning Coreference & Parsing II

Statistical Methods in NLP II Pragmatics II

Models II

Machine Statistical & Semantics III Low Resource Syntax &

Translation: Machine Learning Language Parsing III

Statistical Methods in NLP III Processing

Models III NLP

Applications

16.15 Coffee Break

16.45 Papers SP 4a SP 4b SP 4c SP 4d SP 4e

Machine NLP Applications Semantics Discourse, Syntax &

Translation: Coreference & Parsing

Statistical Pragmatics

Models

18.30 Poster Session + 2nd & 3rd

System Floor s

Demonstrations +

Buffet

21.00 End

thProgram – Mon, August 5

9.30 – 10.30 Invited Talk: Harald Baayen Hall 3

12.15 – 13.45 Lunch Break 12.15 – 13.45 Student Lunch Continental Plaza

nd rd18.30 – 21.00 Poster Session + System Demonstrations + 2 and 3 Floors Student Research Workshop + Poster Dinner

Invited Talk: Harald Baayen (Tuebingen/Alberta)When Parsing Makes Things Worse: An Eye-tracking Study of English Compounds

thMonday, August 5 , 2013, 9.30 am – 10.30 am Hall 3→Short Bio: Prof. Dr. Rolf Harald Baayen is regarded as one of the best and most innovative researchers in the field of vocabulary research and quantitative linguistics. He is a pioneer of computer-assisted and empirical linguistic research and psycholinguistics, and has made fundamental contributions to the understanding of human speech and the role of the memory in language processing. Prof. Harald Baayen was born in the USA in 1958. He got a PhD in 1989 and was a postdoctoral researcher at Vrije Universiteit Amsterdam. He worked at the Max Planck Institute for Psycholinguistics in Nijmegen, the Netherlands (1990 –�1998), and at Radboud Universiteit Nijmegen, initially as an associate and subsequently as a full professor in Quantitative Linguistics (2006). Since 2007, he has been professor at the Department of Linguistics, University of Alberta, Edmonton, Canada. His honors include the PIONEER Award of the Netherlands Organization for Scientific Research in 1998, and the Alexander von Humboldt Professorship 2012.

th Main Conference: Mon, August 5

Overview

9.30 Invited Talk 1:

Harald Baayen Hall 3

10.30 Coffee Break

Machine Statistical & Semantics I Discourse, Syntax &

Translation: Machine Learning Coreference & Parsing I

Statistical Methods in NLP I Pragmatics I

Models I

12.15 Lunch Break

Machine Statistical & Semantics II Discourse, Syntax &

Translation: Machine Learning Coreference & Parsing II

Statistical Methods in NLP II Pragmatics II

Models II

Machine Statistical & Semantics III Low Resource Syntax &

Translation: Machine Learning Language Parsing III

Statistical Methods in NLP III Processing

Models III NLP

Applications

16.15 Coffee Break

16.45 Papers SP 4a SP 4b SP 4c SP 4d SP 4e

Machine NLP Applications Semantics Discourse, Syntax &

Translation: Coreference & Parsing

Statistical Pragmatics

Models

18.30 Poster Session + 2nd & 3rd

System Floor s

Demonstrations +

Buffet

21.00 End

thProgram – Mon, August 5

8.00 – 18.00 Registration Floor 0→

9.00 – 9.30 Opening Session Hall 3→

9.30 Invited Talk: Harald Baayen (Tuebingen/Alberta) When Parsing Makes Things Worse: An Eye-tracking Study of English Compounds Hall 3→

th10.30 Coffee Break 5 Floor→

LONG PAPERS, SHORT PAPERS, POSTERS, SYSTEM DEMONSTRATIONS, STUDENT RESEARCH WORKSHOP, TACL

Oral Presentations → Halls 3, 6, 7, 8, 10

LONG PAPERS

LP 1a Machine Translation: Statistical Models I → Hall 3

11.00 A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation Yang Liu

11.25 Integrating Translation Memory into Phrase-based Machine Translation during Decoding Kun Wang, Chengqing Zong and Keh-Yih Su 11.50 Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment Thomas Schoenemann LP 1b Statistical and Machine Learning Methods in NLP I → Hall 6 11.00 Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation Trevor Cohn and Lucia Specia 11.25 Smoothed Marginal Distribution Constraints for Language Modeling Brian Roark, Cyril Allauzen and Michael Riley 11.50 Grounded Language Learning from Videos Described with Sentences Haonan Yu and Jeffrey Mark Siskind

LP 1c Semantics I → Hall 7

11.00 Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation Mehdi Manshadi and James Allen

thExtended Daily Program – Mon, August 5

11.25 Joint Event Extraction via Structured Prediction with Global Features Qi Li, Heng Ji and Liang Huang

11.50 Language-Independent Discriminative Parsing of Temporal Expressions Gabor Angeli and Jakob Uszkoreit

LP 1d Discourse, Coreference and Pragmatics I → Hall 8

11.00 Graph-based Local Coherence Modeling Camille Guinaudeau and Michael Strube

11.25 Recognizing Rare Social Phenomena in Conversation: Empowerment Detection in Support Group Chatrooms Elijah Mayfield, David Adamson and Carolyn Penstein Rosé

11.50 Decentralized Entity-Level Modeling for Coreference Resolution Greg Durrett, David Hall and Dan Klein

LP 1e Syntax and Parsing I → Hall 10 11.00 Chinese Parsing Exploiting Characters Meishan Zhang, Yue Zhang, Wanxiang Che and Ting Liu

11.25 A Transition-based Dependency Parser Using a Dynamic Parsing Strategy Francesco Sartorio, Giorgio Satta and Joakim Nivre

11.50 General binarization for parsing and translation Matthias Büchse, Alexander Koller and Heiko Vogler 12.15 Lunch Break12.15 Student Lunch

LP 2a Machine Translation: Statistical Models II → Hall 3

13.45 Distortion Model Considering Rich Context for Statistical Machine Translation Isao Goto, Masao Utiyama, Eiichiro Sumita, Akihiro Tamura and Sadao Kurohashi

14.10 Word Alignment Modeling with Context Dependent Deep Neural Network Nan Yang, Shujie Liu, Mu Li, Ming Zhou and Nenghai Yu

14.35 Microblogs as Parallel Corpora Wang Ling, Guang Xiang, Chris Dyer, Alan Black and Isabel Trancoso

LP 2b Statistical and Machine Learning Methods in NLP II → Hall 6

13.45 Improved Bayesian Logistic Supervised Topic Models with Data Augmentation Jun Zhu, Xun Zheng and Bo Zhang

9.00 – 9.30 Opening Session Hall 3→

9.30 Invited Talk: Harald Baayen (Tuebingen/Alberta) When Parsing Makes Things Worse: An Eye-tracking Study of English Compounds Hall 3→

LONG PAPERS, SHORT PAPERS, POSTERS, SYSTEM DEMONSTRATIONS, STUDENT RESEARCH WORKSHOP, TACL

LONG PAPERS

LP 1a Machine Translation: Statistical Models I → Hall 3

11.00 A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation Yang Liu

11.25 Integrating Translation Memory into Phrase-based Machine Translation during Decoding Kun Wang, Chengqing Zong and Keh-Yih Su 11.50 Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment Thomas Schoenemann LP 1b Statistical and Machine Learning Methods in NLP I → Hall 6 11.00 Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation Trevor Cohn and Lucia Specia 11.25 Smoothed Marginal Distribution Constraints for Language Modeling Brian Roark, Cyril Allauzen and Michael Riley 11.50 Grounded Language Learning from Videos Described with Sentences Haonan Yu and Jeffrey Mark Siskind

LP 1c Semantics I → Hall 7

11.00 Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation Mehdi Manshadi and James Allen

thExtended Daily Program – Mon, August 5

11.25 Joint Event Extraction via Structured Prediction with Global Features Qi Li, Heng Ji and Liang Huang

11.50 Language-Independent Discriminative Parsing of Temporal Expressions Gabor Angeli and Jakob Uszkoreit

LP 1d Discourse, Coreference and Pragmatics I → Hall 8

11.00 Graph-based Local Coherence Modeling Camille Guinaudeau and Michael Strube

11.25 Recognizing Rare Social Phenomena in Conversation: Empowerment Detection in Support Group Chatrooms Elijah Mayfield, David Adamson and Carolyn Penstein Rosé

11.50 Decentralized Entity-Level Modeling for Coreference Resolution Greg Durrett, David Hall and Dan Klein

LP 1e Syntax and Parsing I → Hall 10 11.00 Chinese Parsing Exploiting Characters Meishan Zhang, Yue Zhang, Wanxiang Che and Ting Liu

11.25 A Transition-based Dependency Parser Using a Dynamic Parsing Strategy Francesco Sartorio, Giorgio Satta and Joakim Nivre

11.50 General binarization for parsing and translation Matthias Büchse, Alexander Koller and Heiko Vogler 12.15 Lunch Break12.15 Student Lunch

LP 2a Machine Translation: Statistical Models II → Hall 3

13.45 Distortion Model Considering Rich Context for Statistical Machine Translation Isao Goto, Masao Utiyama, Eiichiro Sumita, Akihiro Tamura and Sadao Kurohashi

14.10 Word Alignment Modeling with Context Dependent Deep Neural Network Nan Yang, Shujie Liu, Mu Li, Ming Zhou and Nenghai Yu

14.35 Microblogs as Parallel Corpora Wang Ling, Guang Xiang, Chris Dyer, Alan Black and Isabel Trancoso

LP 2b Statistical and Machine Learning Methods in NLP II → Hall 6

13.45 Improved Bayesian Logistic Supervised Topic Models with Data Augmentation Jun Zhu, Xun Zheng and Bo Zhang

14.10 Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning Miguel Almeida and Andre Martins

14.35 Unsupervised Transcription of Historical Documents Taylor Berg-Kirkpatrick, Greg Durrett and Dan Klein

LP 2c Semantics II → Hall 7

13.45 Adapting Discriminative Reranking to Grounded Language Learning Joohyun Kim and Raymond Mooney

14.10 Universal Conceptual Cognitive Annotation (UCCA) Omri Abend and Ari Rappoport

14.35 Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media Weiwei Guo, Hao Li, Heng Ji and Mona Diab

LP 2d Discourse, Coreference and Pragmatics II → Hall 8

13.45 A Computational Approach to Politeness with Application to Social Factors Cristian Danescu-Niculescu-Mizil, Moritz Sudhof, Dan Jurafsky, Jure Leskovec and Christopher Potts

14.10 Modeling Thesis Clarity in Student Essays Isaac Persing and Vincent Ng

14.35 Translating Italian Connectives into Italian Sign Language Camillo Lugaresi and Barbara Di Eugenio LP 2e Syntax and Parsing II → Hall 10 13.45 Stop-probability Estimates Computed on a Large Corpus Improve Unsupervised Dependency Parsing David Marecek and Milan Straka

14.10 Transfer Learning for Constituency-based Grammars Yuan Zhang, Regina Barzilay and Amir Globerson

14.35 A Context Free TAG Variant Ben Swanson, Elif Yamangil, Stuart Shieber and Eugene Charniak

LP 3a Machine Translation: Statistical Models III → Hall 3 15.00 Fast and Adaptive Online Training of Feature-Rich Translation Models Spence Green, Sida Wang, Daniel Cer and Christopher D. Manning

15.25 Advancements in Reordering Models for Statistical Machine Translation Minwei Feng, Jan-Thorsten Peter and Hermann Ney

THEXTENDED DAILY PROGRAM – MON, AUGUST 5

15.50 A Markov Model of Machine Translation using Non-parametric Bayesian Inference Yang Feng and Trevor Cohn

LP 3b Statistical and Machine Learning Methods in NLP III → Hall 6 15.00 Scaling Semi-supervised Naive Bayes with Feature Marginals Michael Lucas and Doug Downey

15.25 Learning Latent Personas of Film Characters David Bamman, Brendan O'Connor and Noah A. Smith

15.50 Scalable Decipherment for Machine Translation via Hash Sampling Sujith Ravi

LP 3c Semantics III → Hall 7 15.00 Automatic Interpretation of the English Possessive Stephen Tratz and Eduard Hovy

15.25 Is a 204 cm Man Tall or Small ? Acquisition of Numerical Common Sense from the Web Katsuma Narisawa, Yotaro Watanabe, Junta Mizuno, Naoaki Okazaki and Kentaro Inui

15.50 Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors Jackie Chi Kit Cheung and Gerald Penn

LP 3d Low Resource Language Processing NLP Applications → Hall 8

15.00 Extracting Bilingual Terminologies from Comparable Corpora Ahmet Aker, Monica Paramita and Rob Gaizauskas

15.25 The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis Kashyap Popat, Balamurali A.R, Pushpak Bhattacharyya and Gholamreza Haffari

15.50 Large-scale Semantic Parsing via Schema Matching and Lexicon Extension Qingqing Cai and Alexander Yates

LP 3e Syntax and Parsing III → Hall 10 15.00 Fast and Accurate Shift-Reduce Constituent Parsing Muhua Zhu, Yue Zhang, Wenliang Chen, Min Zhang and Jingbo Zhu

15.25 Nonconvex Global Optimization for Latent-Variable Models Matthew Gormley and Jason Eisner

15.50 Parsing with Compositional Vector Grammars Richard Socher, John Bauer, Christopher Manning and Andrew Ng 16.15 Coffee Break