An Overview on Portuguese Nominalizations
Motivations Lexical Resources and Corpora -Ura nominalizations in Portuguese First Results Co-predication General Conclusions
An Overview on Portuguese Nominalizations
Livy Real (1), Alexandre Rademaker (1, 2)
(1) IBM Research, Brazil
(2) FGV/EMAp, Brazil
TYTLES, ESSLLI, 2015
Livy Real, Alexandre Rademaker TYTLES, ESSLLI, Barcelona - 2015
Outline
1 Motivations
2 Lexical Resources and Corpora
3 -Ura nominalizations in Portuguese
4 First Results
5 Co-predication
6 General Conclusions
Motivations
Main purpose: to describe nominalizations formed by a specific morpheme in Portuguese, considering all its possible meanings
Empirical description: corpus-based + lexical resources analysis
Nominalizations formed by -ura
Brazilian and European Portuguese
Seeking generalizations about type relations from single words and from co-predication structures
Motivations - Why -ura?
Ambiguous suffix (Real, 2006, 2008).
No longer productive, so there is a finite set of words formed by -ura.
Straightforward counterparts in other neo-Latin languages (-ure in French: coupure; -ura in Catalan: obertura).
Previous literature (Sandmann, 1988; Rocha, 1999; Real, 2006).
We already know of co-predication structures with words formed by -ura.
Example
A assinatura levou três horas e estava ilegível.
‘The signature/signing took three hours and was unreadable.’
Lexical Resources and Corpora
Lexical resources - OpenWordNet-PT (de Paiva et al., 2012) http://wnpt.brlcloud.com/wn/
Dictionaries + Corpora + Google Engine
Corpus Brasileiro http://www.linguateca.pt/ACDC
Dictionaries
Porto’s Dictionaries http://www.infopedia.pt
Caldas Aulete Dictionary http://www.aulete.com.br
Houaiss Dictionary http://www.houaiss.uol.com.br
Lexical Resources and Corpora
OpenWordnet-PT
Standard wordnet for Portuguese
Completely linked to Princeton’s Wordnet
Freely available in RDF
Automatically created and manually curated
OpenWordnet-PT has more -ura entries than any available dictionary for Portuguese, and it is easier to search.
-Ura Nominalizations - Selection
Extraction of all nominals ending in the graphic form -ura (442 words) from OpenWordnet-PT
Manual selection of the words formed by the morpheme -ura (150 words)
Examples
dobradura, assinatura, brochura, semeadura, tesoura, queimadura, viatura, gordura, legislatura, púrpura, arquitetura, floricultura, armadura, manjedoura, Cingapura, jura
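The two selection steps above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the sample word list and the hand-rejected set are assumptions, and the real input would be the nominal entries of OpenWordnet-PT. The graphic filter is automatic, while the morphological decision stays manual.

```python
from typing import List, Set


def ends_in_ura(word: str) -> bool:
    """True if the written form ends in the graphic string -ura."""
    return word.lower().endswith("ura")


# Hypothetical sample; the study started from OpenWordnet-PT nominals.
sample_nouns: List[str] = ["assinatura", "gordura", "Cingapura", "casa", "livro"]

# Step 1: automatic extraction by graphic form.
candidates: List[str] = [w for w in sample_nouns if ends_in_ura(w)]

# Step 2: manual selection. This set stands in for the human judgment that a
# word merely ends in the letters "ura" (here a place name) and is not formed
# by the morpheme -ura. It is an illustrative assumption.
rejected_by_hand: Set[str] = {"Cingapura"}
selected: List[str] = [w for w in candidates if w not in rejected_by_hand]

print(candidates)
print(selected)
```

The split into two lists mirrors the 442 graphic matches versus the 150 manually retained words reported on the slide.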
-Ura Nominalizations - Categorization
Started from the categorization proposed for eventive nouns (Real, 2014)
Checked other possible readings for each word in all the dictionaries considered
Types
event, result, physical result, locative, collective, means, property, instrument, a given portion, rest, function, duration of a function, science/art.
-Ura Nominalizations - Examples
Event: Deduziu-se que a mãe lhe deu muita chicotada a cada travessura.
‘It was deduced that the mother gave him a lot of whiplashes at every trick (every time he misbehaved).’
Result: A análise do material revelou que, 30 dias após a microenxertia, ocorreu a soldadura parcial dos microenxertos.
‘The analysis of the material showed that, 30 days after micrografting, the partial welding of the micro-grafts occurred.’
-Ura Nominalizations - Examples
Physical Result: A varredura mostra somente picos, como pode ser visto na Figura 8, onde o espelho de simetria de 0° é mostrado.
‘The scan shows only peaks, as can be seen in Figure 8, where the symmetry mirror of 0° is shown.’
Locative: Meu certificado está na pasta com meus documentos na prefeitura, mas o prefeito não o reconheceu.
‘My certificate is in the folder with my documents at the city hall, but the mayor did not recognize it.’
-Ura Nominalizations - Examples
Collective: Uffizi tem o mais completo testemunho do século XV, um momento decisivo da história da arte, marcado pela passagem da tradição bizantina medieval para a pintura do Renascimento.
‘The Uffizi has the most complete testimony of the 15th century, a decisive moment in art history, marked by the passage from the medieval Byzantine tradition to Renaissance painting.’
Means: A narrativa é um cavalo: um meio de transporte cujo tipo de andadura, trote ou galope, depende do percurso a ser executado.
‘The narrative is a horse: a means of transportation whose type of gait, trot or gallop, depends on the route to be run.’
-Ura Nominalizations - Examples
Property: Possui cerca de 48% de umidade e 24% de gordura.
‘It has around 48% humidity and 24% fat.’
Instrument: Caricaturizada, a gostosona desfila engravatada, com chapéu, abotoadura e tudo mais.
‘Caricatured, the hot girl parades wearing a tie, a hat, cufflinks and everything.’
-Ura Nominalizations - Examples
A given portion: Assim verificamos que os 587 pés que aquelas dez propriedades dos Calca Pereira comportam podiam render uma média de 23,5 moeduras, isto é, uns 940 alqueires de azeite, que valeriam, ao preço de 60 reais o alqueire, 5640 reais.
‘Thus we verified that the 587 trees that those ten properties of the Calca Pereira family hold could yield an average of 23.5 milling portions, that is, some 940 bushels of olive oil, which would be worth, at the price of 60 reais per bushel, 5640 reais.’
-Ura Nominalizations - Examples
Rest: O arroz-caril, confeccionado com especiarias e moedura de coco, era característico de Goa e estava muito difundido em Moçambique.
‘The rice curry, made with spices and coconut grinding, was characteristic of Goa and was widespread in Mozambique.’
Function: Mário renunciou à magistratura em novembro.
‘Mário resigned from the magistracy in November.’
-Ura Nominalizations - Examples
Duration of a function: Para a legislatura de 1995-1998, os dados provêm do Brasil.
‘For the 1995-1998 legislative term, the data come from Brazil.’
Science/Art: A Itália exprimiu-se, durante certos séculos, pela arquitetura, escultura, pintura.
‘Italy expressed herself, for some centuries, through architecture, sculpture, painting.’
First Results
150 words formed by -ura
33 lexicalized and idiosyncratic senses
Two types available to action nominals in Portuguese are not possible (or not frequent) for words formed by -ura: resultative state and abstract result.
First Results
Type                     Number of Senses
Event                    74
Result                   62
Physical result          43
Property                 38
Instrument               24
Collective               21
Science/Art              20
Locative                 8
Means                    8
A given portion          7
Function                 6
Rest                     5
Duration of a function   3
Lexicalized              33
First Results
Trying to find generalizations about the types a word can assume:
First Results - Generalizations
A nominal form that has the type ‘rest’ always also belongs to the type ‘event’ (e.g. lavadura ‘washing’ and varredura ‘scan’);
A noun that belongs to the type ‘a given portion’ (e.g. moedura ‘milling’ and semeadura ‘sowing’) always also has the following types: ‘event’, ‘result’, ‘event.result’;
Every noun that belongs to ‘duration of a function’ also holds the ‘function’ type;
Nouns that belong to the type ‘means’ do not belong to any other type.
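The four generalizations above are implications over the set of types each word carries, so they can be checked mechanically once the annotation is in machine-readable form. The sketch below is only an illustration: the toy lexicon and its type sets are assumptions built from the examples in these slides, not the study's annotation, and the dot type ‘event.result’ is approximated as "has both ‘event’ and ‘result’".

```python
from typing import Dict, List, Set

Lexicon = Dict[str, Set[str]]

# Toy lexicon: illustrative assumption, not the authors' data.
LEXICON: Lexicon = {
    "lavadura": {"rest", "event"},
    "varredura": {"rest", "event", "physical result"},
    "moedura": {"a given portion", "rest", "event", "result"},
    "semeadura": {"a given portion", "event", "result"},
    "legislatura": {"duration of a function", "function"},
    "andadura": {"means"},
}


def counterexamples(lexicon: Lexicon, trigger: str, required: Set[str]) -> List[str]:
    """Words that carry `trigger` but lack at least one type in `required`."""
    return sorted(
        word for word, types in lexicon.items()
        if trigger in types and not required <= types
    )


def non_exclusive(lexicon: Lexicon, trigger: str) -> List[str]:
    """Words that carry `trigger` together with some other type."""
    return sorted(
        word for word, types in lexicon.items()
        if trigger in types and len(types) > 1
    )


checks = {
    "rest -> event": counterexamples(LEXICON, "rest", {"event"}),
    "a given portion -> event, result": counterexamples(
        LEXICON, "a given portion", {"event", "result"}
    ),
    "duration of a function -> function": counterexamples(
        LEXICON, "duration of a function", {"function"}
    ),
    "means is exclusive": non_exclusive(LEXICON, "means"),
}

for name, bad in checks.items():
    print(f"{name}: {'holds' if not bad else 'fails for ' + ', '.join(bad)}")
```

Each check returns the list of offending words, so a failed generalization points directly at the entries that need to be re-examined.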
Co-predication - Corpus Brasileiro
Searched for co-predications with all the words in Corpus Brasileiro (more than 1 billion words from various text genres);
Manually checked 2000 random sentences for very common words, such as assinatura (70699 sentences) and gordura (29874 sentences);
Manually checked all the sentences with rare words, such as enxaguadura (12 sentences) and andadura (35 sentences);
Standard co-predications not found!
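The review protocol described above (a random sample of 2000 sentences for very common words, every sentence for rare ones) can be sketched as below. The function name, the fixed seed, and the placeholder sentence lists are assumptions made for illustration; only the limit of 2000 and the sentence counts come from the slide.

```python
import random
from typing import List, Sequence


def select_for_review(sentences: Sequence[str], limit: int = 2000, seed: int = 0) -> List[str]:
    """Return every sentence if there are at most `limit`, else a random sample of `limit`."""
    if len(sentences) <= limit:
        return list(sentences)
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(list(sentences), limit)


# Placeholder data standing in for corpus hits (hypothetical).
frequent = [f"sentence {i} with assinatura" for i in range(70699)]
rare = [f"sentence {i} with enxaguadura" for i in range(12)]

print(len(select_for_review(frequent)))  # 2000
print(len(select_for_review(rare)))      # 12
```

Fixing the seed is a design choice of this sketch: it lets a second annotator re-draw exactly the same sample.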
Co-predication - Examples from Corpus Brasileiro
Essa assinatura que eu dei nesse relatório da oposição eu estou confirmando.
‘This signature that I gave on this opposition report, I am confirming.’
A assinatura que ocorreu na tarde, foi mostrado na noite nos telejornais RedeTV!
‘The signature/signing that took place in the afternoon was shown in the evening on the RedeTV! news programs.’
Co-predication - Google
Searched Google with regular expressions following previously proposed structures (Jezek & Melloni, 2011)
Constraints on co-predication with action nominals (ANs)
i. Split co-predication between main clause and subordinate clause;
ii. Temporal disjunction between the two predications;
iii. Omission of the internal argument.
Example: "assinatura *, que *,"
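For text that is already available offline, the query shape in the example above (noun, some words, a comma, a relative clause introduced by que, another comma) can be approximated with a regular expression. This is a sketch under assumptions: the helper name is hypothetical, the pattern is an approximation of the wildcard query rather than what the search engine does, and it also accepts an empty gap between the noun and the first comma.

```python
import re


def copredication_pattern(noun: str):
    """Compile a pattern for: NOUN <words>, que <words>,"""
    gap = r"[^,.;!?]*"  # any run of characters that stays inside one clause
    return re.compile(
        rf"\b{re.escape(noun)}\b{gap},\s*que\b{gap},",
        re.IGNORECASE,
    )


pattern = copredication_pattern("assinatura")

# Constructed test sentences (not corpus data).
hit = "A assinatura do contrato, que levou três horas, ficou ilegível."
miss = "A assinatura levou três horas e estava ilegível."

print(bool(pattern.search(hit)))   # True
print(bool(pattern.search(miss)))  # False
```

A match only identifies a candidate: whether the main clause and the relative clause select different types still has to be judged by hand.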
Co-predication - Google
Again, we did not find any standard co-predication!
Example
Acho que fica mais escuro e brilhante que com a opção de pintura metalizada (que custa quase 1000 euros).
‘I think it gets darker and shinier than with the metallic paint option (which costs almost 1000 euros).’
Co-predication - Again!
We constructed sentences with rare and with very frequent words formed by -ura and tested them with 3 native speakers (not related to linguistic fields):
A enxaguadura, que levou uma hora, ficou imunda.
‘The rinse, which took an hour, was filthy.’
A abertura, que levou uma hora, ficou ótima.
‘The opening, which took an hour, was great.’
A arquitetura, que levou 5 anos, é quase toda produzida por Neymer.
‘The architecture, which took five years, is almost entirely produced by Neymer.’
Co-predication
Surprise again!!! Almost none of the sentences were accepted by the three speakers!
Co-predication - Speakers’ evaluation
?!! A abertura, que levou uma hora, ficou ótima.
‘The opening, which took an hour, was great.’
??! A arquitetura, que levou 5 anos, é quase toda produzida por Neymer.
‘The architecture, which took five years, is almost entirely produced by Neymer.’
??? A enxaguadura, que levou uma hora, ficou imunda.
‘The rinse, which took an hour, was filthy.’

Word          Frequency in Corpus Brasileiro
Abertura      70699
Arquitetura   25669
Enxaguadura   12
Co-predication - Hypothesis
Hypothesis: there seems to be a relation between frequency of use and the grammaticality of co-predications.
General conclusions
A nominal form that has the type ‘rest’ also belongs to the type ‘event’ (e.g. lavadura ‘washing’ and varredura ‘scan’), but co-predications between the two types are impossible;
A noun that belongs to the type ‘a given portion’ (e.g. moedura ‘milling’ and semeadura ‘sowing’) always also has the following types: ‘event’, ‘result’, ‘event.result’, but any co-predication with ‘a given portion’ is blocked;
No lexicalized sense can be co-predicated with any other type.
General Remarks
It seems that how often a given word occurs in the everyday lexicon has a big effect on the grammaticality of co-predication. Is this a cognitive issue? How should we deal with it?
If so, the formal approaches that consider co-predication only from a syntactic-semantic perspective are losing very important features of this phenomenon, as pointed out by Real & Retoré (2014). We now have stronger evidence to argue that co-predication has an idiosyncratic nature.
General Conclusions
Obrigada! Thank you!
More Remarks
(?) I cannot believe this lousy construction took three years!
Não acredito que esta construção de merda levou três anos!
(?) How did such an ugly signature take three minutes?
Como uma assinatura tão feia levou três minutos?
More Remarks
Should we start looking for the non-formal variables that surround co-predications?
If type (mis)match is related more to pragmatics and context than to truly formal variables, proposals that treat each noun as a type are not so problematic.