1/47
Serbia-Forum Cultural Heritage Digitization Project with Emphasis on Semantic Indexing
Aleksandar Mihajlović, Vladisav Jelisavčić, Bojan Marinković, Zoran Ognjanović, Veljko Milutinović, Zoran Marković
Mathematical Institute of the Serbian Academy of Arts and Sciences
Knez Mihailova 36, 11000 Belgrade, Serbia
2/47
What is Serbia-Forum?
• Two characters
  – Credible encyclopedic articles written by credible authors (similar to existing e-encyclopedias)
  – Digitization of cultural and national heritage
    • Books
    • Archive documents
    • Images of culturally significant Serbian works of art
    • 3D scanning of culturally significant places
      – Churches
      – Monasteries
      – Museums
      – Homes of famous Serbs
    • Etc.
• (quick digression on the next slide)
3/47
Motivation: Why Serbia-Forum?
• Research
  – Semantic searching
    • Semantic text and image search
• Presence
  – Credible source of information about the cultural heritage of Serbia
  – Free, centralized and easily accessible
• Preservation
  – Prolong the life of articles of cultural heritage for the generations to come
  – Backup of historically and culturally significant documents
    • Natural disasters
    • War
    • Examples on the next slide
4/47
Motivation for Digitization
• Why is “Preservation” a top priority?
5/47
What is Serbia-Forum?
• Axioms holding Serbia-Forum together
  – Primary axioms (fixed to the Serbia-Forum project):
    • Content is selected and controlled by government-funded and government-owned cultural and academic institutions
    • Each document is protected by its own license
    • Quality, NOT quantity
  – Secondary axioms (implementable by other projects):
    • Semantic search
    • Version tracking of every document
    • Information about the author of each document is supplied (biography)
6/47
What already exists
• Wikipedia
• Europeana
• Austria-Forum . . . Europaea-Forum
• …
• Why Serbia-Forum?
7/47
Wikipedia
8/47
Wikipedia
• One product of Wikimedia
9/47
10/47
Wikipedia
• A broad collection of written knowledge in the form of articles (knowledge concerning the whole world)
  – Free encyclopedia
  – Distributed organization
    • Product localized by language region
      – Wikipedia Serbia, Wikipedia Poland, Wikipedia Italy, etc.
11/47
Wikipedia
12/47
Wikipedia
• Wikipedia is based on free authorship under the “CC license” (Creative Commons license)
  – Every article written for Wikipedia may be used, printed, sold and changed freely without breaking the law
  – One of the main factors for the success and expansion of Wikipedia
  – Every user can write an article about any topic
  – Every article is open to changes
13/47
Wikipedia
• Articles are available in many languages
  – Wikipedia in English contains about 3 500 000 articles
  – Wikipedia Germany about 1 385 000
  – Wikipedia Spain about 880 000, etc.
  – Wikipedia Serbia contains over 156 000 articles in Serbian
14/47
15/47
Wikipedia
• Even though a significant number of articles exist in Serbian, a significantly smaller number of them cover concepts related to Serbian cultural heritage.
• Given this insufficient representation, a need arises for the systematic collection and presentation of concepts, significant historical figures and events in Serbian history and culture.
• One solution is: http://www.serbia-forum.org
16/47
Wikipedia
• Strategic constraints of Wikipedia (1/2)
  – Encyclopedia of knowledge concerning the whole world, not only Serbia
    • Serbia and Wikipedia are connected by only one thread: the Serbian language
  – Tracking article changes is a hassle
  – There is only one valid license: “Creative Commons”
    • Which makes things a bit inflexible
  – The free authorship approach sometimes doesn’t yield satisfactory results (credibility)
    • No editorial board to determine the credibility of author and article
  – Articles can contain stereotypes, falsehoods and biased information
17/47
Wikipedia
• Strategic constraints of Wikipedia (2/2)
  – Contains only encyclopedic articles
    • Weak assortment of books
    • Weak assortment of documents
    • No archive contents
  – Wikimedia Commons? Wikisource? Wikibooks?
    • Repeat: only one valid license: “Creative Commons”
    • Out of context
18/47
Europeana
• Collection of information concerning European cultural heritage
  – Access to millions of books, pictures, museum pieces, movies and archive data
  – It’s not an encyclopedia
• Under the supervision of the Europeana Foundation
  – Over 2000 institutions all over Europe
  – Every institution is individually responsible for the selection and presentation of its contents
  – Contribution is exclusively reserved for institutions
19/47
20/47
Europeana
• Advantages?
  – Rights are protected
  – Credibility is ensured
  – Content is diverse
• Disadvantages?
  – It is just a portal
  – Collection of documents vs encyclopedia
21/47
Austria-Forum
22/47
Austria-Forum
• Collection of knowledge about Austria
23/47
24/47
25/47
Austria-Forum
• Over 20,000 units of content
  – Indexing of content
  – Biographies of the most renowned Austrians
26/47
Austria-Forum
• Biographies
27/47
Austria-Forum
• Over 20,000 units of content
  – Indexing of content
  – Biographies
  – Homeland Lexicon “Heimatlexikon”
    • Popular themes are depicted through short films
28/47
Austria-Forum
• Homeland Lexicon “Heimatlexikon”
29/47
Austria-Forum• Over 20.000 units of content
30/47
Austria-Forum
• Web books Austria
31/47
Austria-Forum
• Over 20,000 units of content
  – Indexing of content
  – Biographies
  – Homeland Lexicon “Heimatlexikon”
  – Web books Austria
    • Digitized books
    • Web books: an internet application for reading digitized books
32/47
Austria-Forum
• Web books Austria
33/47
Austria-Forum
• Over 20,000 units of content
  – Indexing of content
  – Biographies
  – Homeland Lexicon “Heimatlexikon”
  – Web books Austria
  – Austria-Forum Society
    • Every member can make his/her own personal homepage and can personally contribute written articles to the Austria-Forum article collection
    • Every member can contribute to the development of a single article
      – Similar to Wikipedia
    • Every change made in the article is documented, thus all changes can be tracked
34/47
Serbia-Forum
http://www.serbia-forum.org
35/47
Serbia-Forum
• “The first and unofficial” version of the web presentation of “Serbia-Forum” is already online.
36/47
Serbia-Forum
37/47
Serbia-Forum
• Current content (current state)
  – Digitized documents
  – Digitized books
  – Photo gallery
  – Articles underway
    • Selected authors
    • Authors from various trusted societies & organizations
38/47
Serbia-Forum
• Digitized documents
39/47
Serbia-Forum
• Digitized books
40/47
Serbia-Forum
• Photo gallery
41/47
Serbia-Forum
• Semantic indexing of content (1/2)
  – Smart text content searching
    • Broad search queries lead to specific results
    • Example: the queries “1850 – 1860”, “Austro-Hungary” and “New York, 1943” lead to Nikola Tesla (born 1856 in Austro-Hungary, died in New York City in 1943)
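The idea that a broad query (a range of years, a place) resolves to a specific entry can be illustrated with a minimal Python sketch. This is not the Serbia-Forum implementation: the `Entry` record, the `INDEX` list and the `search` function are hypothetical names introduced only for illustration, and the single indexed entry uses the Nikola Tesla facts given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical index record: a name plus a few structured facts."""
    name: str
    born: int
    died: int
    places: frozenset


# Toy index holding only the example from the slide.
INDEX = [
    Entry("Nikola Tesla", 1856, 1943, frozenset({"Austro-Hungary", "New York"})),
]


def search(index, years=None, place=None):
    """Return names of entries matching every constraint that is given.

    years: optional (start, end) tuple; matches if the birth or death
           year falls inside the inclusive range.
    place: optional place name, compared case-insensitively.
    """
    results = []
    for entry in index:
        if years is not None:
            start, end = years
            if not any(start <= y <= end for y in (entry.born, entry.died)):
                continue
        if place is not None:
            if place.lower() not in {p.lower() for p in entry.places}:
                continue
        results.append(entry.name)
    return results


if __name__ == "__main__":
    print(search(INDEX, years=(1850, 1860)))
    print(search(INDEX, place="Austro-Hungary"))
    print(search(INDEX, years=(1943, 1943), place="New York"))
```

The point of the sketch is that the query never mentions the name: the match comes from facts attached to the entry during indexing, which is what separates this kind of search from plain keyword matching.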
42/47
Serbia-Forum
• Semantic indexing of content (2/2)
  – Smart image content searching
    • Correlating an image to similar images
    • Example images: General Pavle Jurišić Sturm; Marshal Josip Broz Tito
43/47
Serbia-Forum
• Articles
44/47
Serbia-Forum
• Primary character
  – Content is under the supervision of credible government institutions whose purpose is to preserve all aspects of heritage in Serbia
  – Infrastructure which will dictate how each document will be protected and by which license (cultural heritage is rich in content that cannot be covered by the CC license)
  – Quality and not quantity (small number of pearls)
  – Translation is adapted to the region of the user
45/47
Serbia-Forum
• Secondary character
  – Semantic search
  – Version tracking of each document
  – Information about each article author(s) is available
  – Editorial board
46/47
Active Participating Institutions in Serbia
• Archives of Serbia
• SANU (Serbian Academy of Arts and Sciences)
• National Library of Serbia
• Historical Archive of Belgrade
• Philological Faculty of the University of Belgrade, Serbia
• Faculty of Political Science of the University of Belgrade, Serbia
• And more…
47/47
www.serbia-forum.org
Aleksandar Dedic, December 2010