Agora: putting museum objects into their art-historic context


Post on 15-Jan-2015






The digital era has presented big challenges, but also great opportunities, for the museum world. One of these opportunities is the way museums can open up their collections to the public. Many museums are now actively exploring ways to present their collections online for visitors who cannot come to the museum, or to show objects for which they do not have space in the exhibition halls. Often they put together themed Web sites for online exhibitions in which objects are presented in a certain context. However, these themed Web sites usually cover only a small part of the collection. For the majority of the objects, the context is not made explicit. In the Agora project, we aim to make this context explicit automatically in order to help users understand and interpret museum objects. We do this by linking museum objects to historical events and explicitly presenting these links in an event-driven browsing environment. In the first part of my talk, I will explain the theoretical framework we have developed in the Agora project to represent historical contexts, as well as the general challenges of the project. In the second part of my talk, I will focus on the particular challenges in information extraction for building the event thesaurus and linking museum objects. These slides are from a presentation given at the EURECOM seminar on July 20, 2012.
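As a rough illustration of what "linking museum objects to historical events" can look like in data, here is a minimal sketch using plain subject-predicate-object tuples. The property names and labels are taken from the Event Dimension slide in the transcript; which node each edge connects is an assumption, because the slide's graph layout did not survive extraction, and the `obj:` and `ev:` identifiers are hypothetical.

```python
# Minimal sketch: triples as plain (subject, predicate, object) tuples.
# Property names come from the slides; the edge assignments and the
# "obj:" / "ev:" identifiers are assumptions made for illustration.
TRIPLES = [
    ("obj:three_fighter_aircraft", "rma:maker", "Mohammed Toha"),
    ("obj:three_fighter_aircraft", "rma:creationDate", "19/12/1948"),
    ("obj:three_fighter_aircraft", "agora:depictsEvent", "ev:attack_on_yogyakarta"),
    ("ev:attack_on_yogyakarta", "rdf:type", "sem:Event"),
    ("ev:attack_on_yogyakarta", "sem:hasPlace", "Yogyakarta"),
    ("ev:attack_on_yogyakarta", "sem:hasActor", "Netherlands"),
]


def objects_of(subject, predicate, triples=TRIPLES):
    """Return every object value for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def event_context(object_id, triples=TRIPLES):
    """Collect the properties of every event that a museum object depicts."""
    context = {}
    for event in objects_of(object_id, "agora:depictsEvent", triples):
        context[event] = [(p, o) for s, p, o in triples if s == event]
    return context


if __name__ == "__main__":
    for event, properties in event_context("obj:three_fighter_aircraft").items():
        print(event)
        for predicate, value in properties:
            print("  ", predicate, value)
```

Following `agora:depictsEvent` from an object to an event, and then reading the event's place and actors, is the kind of hop an event-driven browser would make; a real system would use an RDF store rather than a Python list.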


1. Agora: putting museum objects into their art-historic context
  • Marieke van
  • EURECOM, July 2012

2. Introduction
  • BA, MA & PhD Computational Linguistics/Information Extraction @ Tilburg University
  • Since 2009: SemWeb group @ VU University Amsterdam

3. Overview
  • The Agora Project
  • Digital Hermeneutics
  • Building an Event Thesaurus for Dutch
  • Experiments & Results
  • Outlook

4. The Agora Project
  • Collaboration between the VU CS & History departments, the Netherlands Institute for Sound and Vision and Rijksmuseum Amsterdam
  • Facilitate and investigate digitally mediated public history

5. Digitising Heritage
  • Galleries, libraries, archives and museums (GLAMs) are digitising their data and presenting it online
  • This changes the role of GLAMs from information interpreters to information providers
  • In the online setting, objects can easily start to lead their own lives

6. Digital Hermeneutics
  • An object on its own has no meaning; event descriptions provide historical context
  • A single event only gives part of the historical context; chains of events (narratives) provide a more complete overview

7. Event Dimension
  [Figure: RDF graph linking a museum object to events]
  • Object: Painting "Three Fighter Aircraft in the Sky"; rma:maker Mohammed Toha; rma:creationDate 19/12/1948; rma:creationPlace Yogyakarta
  • Events (rdf:type sem:Event): "The Attack on Yogyakarta"; "Mohammed Toha Paints 'Three Fighter Aircraft in the Sky'"
  • Object-to-event properties: agora:depictsEvent, agora:createsEvent
  • Event properties: sem:hasActor, sem:hasPlace, sem:hasBeginTimeStamp
  • Other nodes: Mohammed Toha and Netherlands (rdf:type sem:Actor); Yogyakarta (rdf:type sem:Place)

8. Narratives
  [Figure: three events chained by agora:hasBiographicalRelation]
  • The Attack on Yogyakarta: sem:hasTimeStamp 1945 - 1946; sem:eventType Armed Conflict; sem:hasPlace Indonesia; sem:hasActor KNIL
  • Operation Crow: sem:hasTimeStamp 19/12/1948 - 31/12/1948; sem:eventType Armed Conflict; sem:hasPlace Sumatra; sem:hasActor KNIL
  • The Attack on Yogyakarta: sem:hasTimeStamp 01/03/1949; sem:eventType Attack; sem:hasPlace Yogyakarta; sem:hasActor KNIL

9. Event-driven Browsing

10. Event-driven Browsing

11. Event-driven Browsing

12. Building an Event Thesaurus
  • There are no extensive structured event descriptions
  • Rijksmuseum Amsterdam has a flat list of 1,693 events: only names, and very much focused on 17th century Holland
  • Our goal: create a list of historically relevant events; provide actors, locations, times & types for each event

13. First Attempt
  • Pattern-based event-name extraction
  • In Dutch Wikipedia we found 2,444 event candidates: 1209 (56.3%) correct, 169 (13.9%) partially correct
  • Off-the-shelf named entity recognition (P/R/F1): Person 77/77/77; Location 75/58/66; Organisation 32/37/34

14. First Attempt
  • Co-occurrence based event-relation finder
  • Only actor, location and/or date found for 392 events
  • 49.6% actor is correct
  • 41.1% location is correct
  • 51.5% date is correct

15. First Attempt
  Problems in event element recognition:
  • Shallow grammatical processing ("post-war rebuilding and during the North Sea flood" recognised as 1 event)
  • Missing locations ("Battle of LOC" pattern fails)
  • No distinction between entities and action nouns ("German Occupancy" vs "German Occupants" look the same for the approach)
  • Named Entity Recogniser not suited for domain

16. First Attempt
  Problems with the event relation finder:
  • Relies on redundancy in the data, only works for popular events
  • Too coarse-grained (who were the actors/locations in WWII?)
  • Evaluation is hard!

17. Back to the drawing board...
  Analysis of event names:
  • Combinations of sortal nouns with a PP and a named entity, e.g., Battle of Stalingrad, Death of John Lennon
  • Combinations of nominalized verbs with a PP and a named entity, e.g., Excavation of Troy, Election of Obama
  • Combinations of a referential adjective with an event type and named entity, e.g., the American invasion of Iraq
  • Transparent proper names: Great War
  • Opaque proper names: event names that cannot be decomposed on morphological grounds, e.g., Holocaust, Spanish Fury

18. Back to the drawing board...
  Improve Named Entity Recognition:
  • Add gazetteers for historical names
  • Post-processing for titles and improved NE boundaries

19. Back to the drawing board...
  Finding Event Relations:
  • Use the structure of Wikipedia/DBpedia
  • Shallow parsing
  • Hierarchies of actors & locations

20. Current Work
                   Spotlight (P/R/F1)    Stanford (P/R/F1)    Freire (P/R/F1)
    Person         54.05/7.52/13.20      58.60/34.46/43.40    79.17/71.16/74.95
    Location       64.52/30.77/41.67     67.19/66.15/66.67    80.00/61.54/69.57
    Organisation   0/0/0                 9.78/25.71/14.17     89.66/74.29/81.25
  • Still some work to be done, but Freire et al. (2012) shows that smart features can work with small amounts of training data
  • Combine classifiers
  • Add post-processing
  • MISC class remains to be done...

21. Current Work
  Token-per-line format [CoNLL2003]:
    Word      POS   CHUNK   NER
    U.N.      NNP   I-NP    I-ORG
    official  NN    I-NP    O
    Ekeus     NNP   I-NP    I-PER
    heads     VBZ   I-VP    O
    for       IN    I-PP    O
    Baghdad   NNP   I-NP    I-LOC
    .         .     O       O

  Feature vectors [Freire et al. 2012]:
    focus,minthree,mintwo,minone,plusone,plustwo,fnfreq,lnfreq,ncfreq,orgfreq,geo,n,v,a,adv,pn,cap,allcaps,beg,end,length,capfreq,class
    "is","wood",")","and","painted","dark",0,0,0,2.45253198865684,0,0,0,1,0,0,0,0,0,0,2,0,"O"
    "painted",")","and","is","dark","grey",0,0,0,0,0,0,0,0,1,0,0,0,0,0,7,0,"O"
    "dark","and","is","painted","grey",".",0,0,0,0.493875418347986,0,0,1,0,1,0,0,0,0,0,4,0,"O"
    "grey","is","painted","dark",".","William",0,0,0,0.0768052510316108,0,1,1,1,1,0,0,0,0,0,4,0,"O"
    ".","painted","dark","grey","William","Herschel",0,0,0,2.36647279037729,0,0,0,0,0,0,0,1,0,0,1,0,"O"
    "William","dark","grey",".","Herschel","made",8.2034429051892,3.27892030900003,0,4.67158565874127,0,0,0,0,0,0,1,0,0,0,7,0,"B-PER"
    "Herschel","grey",".","William","made","many",2.36726761611533,2.39936346938848,0,0.443930767784,0,1,1,0,0,0,1,0,0,0,8,0,"I-PER"
    "made",".","William","Herschel","many","telescopes",0,0,0,0.493875418347986,0,0,0,1,1,0,0,0,0,0,4,0,"O"
    "many","William","Herschel","made","telescopes","of",0,0,0,0.0768052510316108,0,0,0,0,1,0,0,0,0,0,4,0,"O"
    "telescopes","Herschel","made","many","of","this",0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,"O"

22. Current Work
  • Build smarter extractors for event names
  • First focus on regular event names (e.g., Battle of LOC, War of YEAR)
  • Use knowledge about action nouns vs static nouns (WordNet)

23. The Story So Far
  • It takes time to learn to communicate in an interdisciplinary project
  • Don't try to solve too much in one go
  • Cycles of error analysis
  • Domain adaptation is difficult: optimise for precision

24. Outlook
  • Redesign of Agora demo (new version autumn/winter)
  • Include different perspectives (together with Semantics of History)
  • Ship model use case
  • Historical Named Entity Recognition for English & Dutch
  • 2nd round user studies (spring 2013)

25. Questions?
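The "regular event names" mentioned under Current Work (Battle of LOC, War of YEAR) lend themselves to a simple pattern matcher. The following is a minimal sketch of that idea, not the Agora project's extractor: the short list of event nouns, the capitalisation heuristic standing in for a real LOC or PER tagger, and the function name `extract_event_candidates` are all assumptions made for illustration.

```python
import re

# Hypothetical seed list of event nouns; a real system would use a much
# larger lexicon (the slides mention WordNet for action vs static nouns).
EVENT_NOUNS = ("Battle", "Siege", "War", "Death", "Election", "Excavation")
NOUN_ALTERNATION = "|".join(EVENT_NOUNS)

# "<event noun> of <capitalised name>", e.g. "Battle of Stalingrad".
# Capitalised words are a crude stand-in for named entity recognition.
NAME_PATTERN = re.compile(
    r"\b(?:" + NOUN_ALTERNATION + r") of [A-Z][\w'-]*(?: [A-Z][\w'-]*)*"
)

# "<event noun> of <four-digit year>", e.g. "War of 1812".
YEAR_PATTERN = re.compile(r"\b(?:" + NOUN_ALTERNATION + r") of \d{4}\b")


def extract_event_candidates(text):
    """Return event-name candidates in order of appearance in the text."""
    hits = []
    for pattern in (NAME_PATTERN, YEAR_PATTERN):
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0)))
    return [name for _, name in sorted(hits)]


if __name__ == "__main__":
    sample = (
        "The Battle of Stalingrad ended in 1943. The War of 1812 and the "
        "Death of John Lennon are event names, unlike a battle of wits."
    )
    print(extract_event_candidates(sample))
```

Because the matcher is case-sensitive and requires a capitalised name or a year after "of", it favours precision over recall, which is in line with the "optimise for precision" lesson from the slides. It will still miss opaque names such as "Holocaust" and any name whose location is not capitalised or not recognised.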