An Overview on Portuguese Nominalizations




An Overview on Portuguese Nominalizations

Livy Real (1), Alexandre Rademaker (1, 2)

(1) IBM Research, Brazil

(2) FGV/EMAp, Brazil

TYTLES, ESSLLI, 2015

Livy Real, Alexandre Rademaker TYTLES, ESSLLI, Barcelona - 2015


Outline

1 Motivations

2 Lexical Resources and Corpora

3 -Ura nominalizations in Portuguese

4 First Results

5 Co-predication

6 General Conclusions


Motivations

Main purpose: to describe nominalizations formed by a specific morpheme in Portuguese, considering all its possible meanings

Empirical description: corpus-based + lexical resources analysis

Nominalizations formed by -ura

Brazilian and European Portuguese

Trying to get generalizations on type relations from a single word and from co-predication structures


Motivations - Why -ura?

Ambiguous suffix (Real, 2006, 2008).

Not productive anymore, so we have a finite set of words formed by -ura.

Straightforward counterparts in other neo-Latin languages (-ure in French: coupure; -ura in Catalan: obertura).

Previous literature (Sandmann, 1988; Rocha, 1999; Real, 2006).

We already know of co-predication structures with words formed by -ura.

Example

A assinatura levou três horas e estava ilegível.
‘The signature/signing took three hours and was unreadable.’


Lexical Resources and Corpora

Lexical resources - OpenWordNet-PT (de Paiva et al., 2012) http://wnpt.brlcloud.com/wn/

Dictionaries + Corpora + Google Engine

Corpus Brasileiro http://www.linguateca.pt/ACDC

Dictionaries:

Porto’s Dictionaries http://www.infopedia.pt

Caldas Aulete Dictionary http://www.aulete.com.br

Houaiss Dictionary http://www.houaiss.uol.com.br


Lexical Resources and Corpora

OpenWordnet-PT

Standard wordnet for Portuguese
Completely linked to Princeton’s WordNet
Freely available in RDF
Automatically created and manually curated

OpenWordnet-PT has more -ura entries than any available dictionary for Portuguese, and it is easier to search.


-Ura Nominalizations - Selection

Extraction of all nominals ending in the graphic form -ura (442 words) from OpenWordnet-PT

Manual selection of the words actually formed by the morpheme -ura (150 words)

Examples

dobradura, assinatura, brochura, semeadura, tesoura, queimadura, viatura, gordura, legislatura, púrpura, arquitetura, floricultura, armadura, manjedoura, Cingapura, jura
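The two selection steps can be sketched as a graphic filter followed by a hand-made exclusion list. This is only an illustration: the lemma list and the set of rejected words below are hypothetical stand-ins, not the actual OpenWordnet-PT export or the authors' manual decisions.

```python
import re

# Hypothetical sample of noun lemmas; the real input would be the noun
# entries of OpenWordnet-PT.
LEMMAS = ["assinatura", "tesoura", "gordura", "Cingapura",
          "jura", "casa", "abertura", "futuro"]

# Step 1: keep every lemma whose written form ends in "ura".
ENDS_IN_URA = re.compile(r"ura$", re.IGNORECASE)


def graphic_ura_candidates(lemmas):
    """Return the lemmas whose graphic form ends in -ura."""
    return [w for w in lemmas if ENDS_IN_URA.search(w)]


candidates = graphic_ura_candidates(LEMMAS)

# Step 2: manual selection. The rejected set is an assumed example of
# words that end in "ura" without being built with the suffix -ura.
MANUALLY_REJECTED = {"tesoura", "Cingapura", "jura"}
selected = [w for w in candidates if w not in MANUALLY_REJECTED]

print(candidates)
print(selected)
```

The graphic filter deliberately over-generates; only the manual step separates true -ura nominalizations from accidental matches.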


-Ura Nominalizations - Categorization

Started from the categorization proposed for eventive nouns (Real, 2014)

Checked the other possible readings of each word in all the dictionaries considered

Types

event, result, physical result, locative, collective, means, property, instrument, a given portion, rest, function, duration of a function, science/art.


-Ura Nominalizations - Examples

Event: Deduziu-se que a mãe lhe deu muita chicotada a cada travessura.
‘It was deduced that the mother gave him a lot of lashes at every trick (every time he misbehaved).’

Result: A análise do material revelou que, 30 dias após a microenxertia, ocorreu a soldadura parcial dos microenxertos.
‘The analysis of the material showed that, 30 days after micrografting, the partial welding of the micrografts occurred.’


-Ura Nominalizations - Examples

Physical result: A varredura mostra somente picos, como pode ser visto na Figura 8, onde o espelho de simetria de 0° é mostrado.
‘The scan shows only peaks, as can be seen in Figure 8, where the 0° symmetry mirror is shown.’

Locative: Meu certificado está na pasta com meus documentos na prefeitura, mas o prefeito não o reconheceu.
‘My certificate is in the folder with my documents at the city hall, but the mayor did not recognize it.’


-Ura Nominalizations - Examples

Collective: Uffizi tem o mais completo testemunho do século XV, um momento decisivo da história da arte, marcado pela passagem da tradição bizantina medieval para a pintura do Renascimento.
‘The Uffizi has the most complete record of the 15th century, a decisive moment in art history, marked by the passage from the medieval Byzantine tradition to Renaissance painting.’

Means: A narrativa é um cavalo: um meio de transporte cujo tipo de andadura, trote ou galope, depende do percurso a ser executado.
‘The narrative is a horse: a means of transportation whose type of gait, trot or gallop, depends on the route to be covered.’


-Ura Nominalizations - Examples


Property: Possui cerca de 48% de umidade e 24% de gordura.
‘It has around 48% humidity and 24% fat.’

Instrument: Caricaturizada, a gostosona desfila engravatada, com chapéu, abotoadura e tudo mais.
‘Caricatured, the hot girl parades wearing a tie, a hat, cufflinks and everything.’


-Ura Nominalizations - Examples

A given portion: Assim verificamos que os 587 pés que aquelas dez propriedades dos Calca Pereira comportam podiam render uma média de 23,5 moeduras, isto é, uns 940 alqueires de azeite, que valeriam, ao preço de 60 reais o alqueire, 5640 reais.
‘Thus we have verified that the 587 trees held by those ten properties of the Calca Pereira family could yield an average of 23.5 milling portions, that is, some 940 bushels of olive oil, which would be worth, at the price of 60 reais per bushel, 5640 reais.’


-Ura Nominalizations - Examples

Rest: O arroz-caril, confeccionado com especiarias e moedura de coco, era característico de Goa e estava muito difundido em Moçambique.
‘Curry rice, made with spices and coconut grindings, was characteristic of Goa and was widespread in Mozambique.’

Function: Mário renunciou à magistratura em novembro.
‘Mário resigned from the magistracy in November.’


-Ura Nominalizations - Examples

Duration of a function: Para a legislatura de 1995-1998, os dados provêm do Brasil.
‘For the 1995-1998 legislative term, the data come from Brazil.’

Science/Art: A Itália exprimiu-se, durante certos séculos, pela arquitetura, escultura, pintura.
‘Italy expressed itself, for some centuries, through architecture, sculpture, painting.’


First Results

150 words formed by -ura

33 lexicalized and idiosyncratic senses

2 types available to action nominals in Portuguese are not possible (or not frequent) for words formed by -ura: resultative state and abstract result.


First Results

Type                      Number of senses
Event                     74
Result                    62
Physical result           43
Property                  38
Instrument                24
Collective                21
Science/Art               20
Locative                  8
Means                     8
A given portion           7
Function                  6
Rest                      5
Duration of a function    3
Lexicalized               33
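As a quick arithmetic check on the table, the counts can be summed and related to the 150 selected words. This is a sketch only; counting the lexicalized senses together with the typed senses is an assumption made here, not something the slides state.

```python
# Sense counts per type, copied from the table above.
SENSE_COUNTS = {
    "Event": 74, "Result": 62, "Physical result": 43, "Property": 38,
    "Instrument": 24, "Collective": 21, "Science/Art": 20, "Locative": 8,
    "Means": 8, "A given portion": 7, "Function": 6, "Rest": 5,
    "Duration of a function": 3, "Lexicalized": 33,
}
WORDS = 150  # words formed by the morpheme -ura

# Assumption: lexicalized senses are counted alongside the typed senses.
total_senses = sum(SENSE_COUNTS.values())
senses_per_word = total_senses / WORDS

# Types ordered from most to least frequent.
ranked = sorted(SENSE_COUNTS.items(), key=lambda kv: kv[1], reverse=True)

print(total_senses, round(senses_per_word, 2))
print(ranked[:3])
```

Under that assumption the table adds up to 352 senses, a little over two senses per word on average, with event and result readings clearly dominant.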


First Results

• Trying to find generalizations about the types a word can assume:


First Results - Generalizations

A nominal form that has the type ‘rest’ always belongs to the type ‘event’ (as lavadura ‘washing’ and varredura ‘scan’);

A noun that belongs to the type ‘a given portion’ (as moedura ‘milling’ and semeadura ‘sowing’) always has the following types: ‘event’, ‘result’, ‘event.result’;

Every noun that belongs to ‘duration of a function’ also holds the ‘function’ type;

Nouns that belong to the type ‘means’ do not belong to any other type.
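The generalizations above are implications between types, so they can be checked mechanically once each word is mapped to its set of types. The sketch below uses a small hypothetical annotation table, assembled only from the words mentioned on these slides and not taken from the authors' full data; the helper names are also made up for illustration.

```python
# Hypothetical word -> types annotations, for illustration only.
ANNOTATIONS = {
    "lavadura": {"event", "rest"},
    "varredura": {"event", "rest", "physical result"},
    "moedura": {"event", "result", "a given portion", "rest"},
    "semeadura": {"event", "result", "a given portion"},
    "legislatura": {"function", "duration of a function"},
    "magistratura": {"function"},
    "andadura": {"means"},
}


def implies(annotations, antecedent, consequents):
    """True if every word that has `antecedent` also has all `consequents`."""
    return all(consequents <= types
               for types in annotations.values()
               if antecedent in types)


def exclusive(annotations, type_name):
    """True if every word that has `type_name` has no other type."""
    return all(types == {type_name}
               for types in annotations.values()
               if type_name in types)


print(implies(ANNOTATIONS, "rest", {"event"}))
print(implies(ANNOTATIONS, "a given portion", {"event", "result"}))
print(implies(ANNOTATIONS, "duration of a function", {"function"}))
print(exclusive(ANNOTATIONS, "means"))
```

On a full annotation table the same two helpers could be run over every pair of types to surface candidate generalizations, which would then still need manual inspection.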


Co-predication - Corpus Brasileiro

Searched for co-predications with all the words in Corpus Brasileiro (more than 1 billion words from various textual genres);

Manually checked 2000 random sentences for very common words, such as assinatura (70699 sentences) and gordura (29874 sentences);

Manually checked all the sentences for rare words, such as enxaguadura (12 sentences) and andadura (35 sentences);

Standard co-predications not found!
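The checking procedure described on this slide (every sentence for rare words, a random sample of 2000 for common ones) can be sketched as follows. The function name, the fixed seed, and the placeholder sentence lists are assumptions for illustration; the slides do not say how the random sample was drawn.

```python
import random

SAMPLE_SIZE = 2000  # sentences checked manually for very common words


def sentences_to_check(sentences, sample_size=SAMPLE_SIZE, seed=0):
    """Return all sentences for rare words, or a random sample otherwise."""
    if len(sentences) <= sample_size:
        return list(sentences)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(sentences, sample_size)


# Placeholder sentence lists with the sizes reported on the slide.
rare = [f"enxaguadura sentence {i}" for i in range(12)]
common = [f"assinatura sentence {i}" for i in range(70699)]

print(len(sentences_to_check(rare)))
print(len(sentences_to_check(common)))
```

Fixing the seed is a design choice that lets a second annotator review exactly the same 2000 sentences.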


Co-predication - Examples from Corpus Brasileiro

Essa assinatura que eu dei nesse relatório da oposição eu estou confirmando.
‘This signature that I put on this opposition report, I am confirming.’

A assinatura que ocorreu na tarde, foi mostrado na noite nos telejornais RedeTV!
‘The signature/signing that took place in the afternoon was shown in the evening on the RedeTV! news programs.’


Co-predication - Google

Search on Google with regular expressions following previously proposed structures (Jezek & Melloni, 2011)

Constraints on co-predication with action nominals (ANs)

i. Split co-predication between main clause and subordinate clause;
ii. Temporal disjunction between the two predications;
iii. Omission of the internal argument.

Example: "assinatura *, que *,"
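A minimal sketch of how such search strings could be produced for each word, together with a rough local equivalent for filtering already-downloaded sentences. Both function names are hypothetical, and the regular expression only approximates the wildcard query (for instance, it also accepts the relative clause directly after the noun).

```python
import re


def google_query(word):
    """Build the quoted wildcard query used as the example above."""
    return f'"{word} *, que *,"'


def local_pattern(word):
    """Approximate the query as a regex: noun, optional material,
    then a relative clause introduced by ', que' and closed by a comma."""
    return re.compile(rf"\b{re.escape(word)}\b[^,.]*, que [^,.]+,",
                      re.IGNORECASE)


print(google_query("assinatura"))

pattern = local_pattern("abertura")
print(bool(pattern.search("A abertura, que levou uma hora, ficou otima.")))
print(bool(pattern.search("A abertura ficou otima.")))
```

A match only signals the split main clause / relative clause structure; whether the two predicates really select different types still has to be judged by hand.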


Co-predication - Google

Again, we did not find any standard co-predication!

Example

Acho que fica mais escuro e brilhante que com a opção de pintura metalizada (que custa quase 1000 euros).
‘I think it gets darker and shinier than with the metallic paint option (which costs almost 1000 euros).’


Co-predication - Again!

We produced sentences with rare and very frequent words formed by -ura and tested them with 3 native speakers (not related to linguistic fields):

A enxaguadura, que levou uma hora, ficou imunda.
‘The rinse, which took an hour, was filthy.’

A abertura, que levou uma hora, ficou ótima.
‘The opening, which took an hour, was great.’

A arquitetura, que levou 5 anos, é quase toda produzida por Neymer.
‘The architecture, which took five years, is almost entirely produced by Neymer.’


Co-predication

Surprise again!!! Almost none of the sentences were accepted by the three speakers!


Co-predication - Speakers’ evaluation

?!! A abertura, que levou uma hora, ficou ótima.
‘The opening, which took an hour, was great.’

??! A arquitetura, que levou 5 anos, é quase toda produzida por Neymer.
‘The architecture, which took five years, is almost entirely produced by Neymer.’

??? A enxaguadura, que levou uma hora, ficou imunda.
‘The rinse, which took an hour, was filthy.’

Word           Frequency in Corpus Brasileiro
Abertura       70699
Arquitetura    25669
Enxaguadura    12


Co-predication - Hypothesis

Hypothesis: there seems to be a relation between frequency of use and the grammaticality of co-predications.
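The hypothesis can be made concrete with the three test sentences and their corpus frequencies. The sketch below rests on a loud assumption that the slides do not confirm: that each mark in a judgment string stands for one speaker, with "!" meaning the sentence was accepted and "?" meaning it was doubted. With only three words this is an illustration of the reasoning, not evidence.

```python
# Judgment strings and corpus frequencies as reported on the slides.
JUDGMENTS = {"abertura": "?!!", "arquitetura": "??!", "enxaguadura": "???"}
FREQUENCY = {"abertura": 70699, "arquitetura": 25669, "enxaguadura": 12}


def acceptances(marks):
    """Count '!' marks (ASSUMED to mean 'accepted by one speaker')."""
    return marks.count("!")


# Words from most to least frequent, with their acceptance counts.
by_frequency = sorted(FREQUENCY, key=FREQUENCY.get, reverse=True)
scores = [acceptances(JUDGMENTS[w]) for w in by_frequency]

# The hypothesis predicts acceptance never rises as frequency falls.
monotonic = all(a >= b for a, b in zip(scores, scores[1:]))

print(by_frequency)
print(scores, monotonic)
```

Under that reading, acceptance falls in step with frequency for these three words, which is the pattern the hypothesis describes.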


General conclusions

A nominal form that has the type ‘rest’ also belongs to the type ‘event’ (as lavadura ‘washing’ and varredura ‘scan’), but co-predications between the two are impossible;

A noun that belongs to the type ‘a given portion’ (as moedura ‘milling’ and semeadura ‘sowing’) always has the following types: ‘event’, ‘result’, ‘event.result’, but any co-predication with ‘a given portion’ is blocked;

No lexicalized sense can be co-predicated with any other type.


General Remarks

It seems that how often a given word occurs in the everyday lexicon has a big effect on the grammaticality of co-predication. Is this a cognitive issue? How should we deal with it?

If so, formal approaches that consider co-predication only from a syntactic-semantic perspective are losing very important features of this phenomenon, as pointed out by Real & Retoré (2014). Now we have stronger evidence to argue that co-predication has an idiosyncratic nature.


General Conclusions

Obrigada! Thank you!


More Remarks

(?) I cannot believe this crappy construction took three years!

Não acredito que esta construção de merda levou três anos!

(?) How did such an ugly signature take three minutes?

Como uma assinatura tão feia levou três minutos?


More Remarks

Should we start to search for the non-formal variables that surround co-predications?

If type (mis)match is more related to pragmatics and context than to truly formal variables, proposals that understand each noun as a type are not so problematic.
