XML and General Dutch Dictionary (ANW)
Van der Kamp, Lexical databases and digital tools, April 29th, 2005
Peter van der Kamp
kamp@inl.nl
www.inl.nl
Topics
• Characteristics
• Schema
• XML Dictionary Editor
• Problems to be solved
Characteristics
Online dictionary, no printed version
Dutch language (incl. Flanders) from 1970 to 2018
Based on a corpus of 100 million words
Elaborate microstructure
XML
Schema characteristics
• Divided into 12 subschemas
• Currently every element except headword may occur zero or more times
• Currently 186 atomic elements
• Many enumerations (378, to be used as controlled vocabulary)
• Some elements allowed at different levels
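
The schema properties listed above can be illustrated with a small check on a single entry. This is a minimal sketch, not the ANW schema itself: the element names `entry`, `headword` and `pos`, the values in `POS_VALUES`, and the reading that a headword occurs exactly once are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary for a hypothetical <pos> element.
POS_VALUES = {"noun", "verb", "adjective"}


def check_entry(entry):
    """Return a list of problems found in one <entry> element."""
    problems = []
    # Assumption: the headword is the only element that must be present.
    if len(entry.findall("headword")) != 1:
        problems.append("entry needs exactly one headword")
    # Enumerated elements may only hold values from their predefined list.
    for pos in entry.iter("pos"):
        if pos.text not in POS_VALUES:
            problems.append(f"unknown pos value: {pos.text!r}")
    return problems


good = ET.fromstring("<entry><headword>fiets</headword><pos>noun</pos></entry>")
bad = ET.fromstring("<entry><pos>nuon</pos></entry>")
print(check_entry(good))
print(check_entry(bad))
```

In practice an XML Schema validator would enforce both rules; the sketch only shows what the enumerations and the mandatory headword mean for an individual entry.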
Schema
[Schema diagram; recoverable node labels: Entry, PoS, Sense]
XML Dictionary Editor
User requirements:
• Don’t want to work with tags
• Tags invisible
XML Dictionary Editor (cont’d)
User requirements:
• Form-like input
• Use of predefined lists (controlled vocabulary)
XML Dictionary Editor (cont’d)
User requirements:
• Inserting, adding and removing elements must be easy
XML Dictionary Editor (cont’d)
User requirements:
• Hide/show elements
Technical requirements:
• Subschema enabled
XML Dictionary Editor (cont’d)
XML editor, but…
…which one?
XMLWriter
XML Dictionary Editor (cont’d)
Currently the best possible solution:
Authentic (free XML content editor from Altova)
StyleVision (e-forms and stylesheet designer from Altova)
(http://www.altova.com)
XML Dictionary Editor: problems
Problem: hide element = delete element
Hide element important due to size of entry
Solution (to be implemented):
• Extra element <hide> in schema
• Checkbox as ‘data entry device’
• When unchecked: perform hide
Disadvantage:
<hide> is noise in dictionary entry
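
One way to limit this noise would be to strip the markers when entries are exported for publication. The slides do not say how `<hide>` would be handled downstream, so the following is only a sketch under that assumption; the function name `strip_hide_markers` and the sample entry are hypothetical.

```python
import xml.etree.ElementTree as ET


def strip_hide_markers(root):
    """Remove every <hide> child in place, leaving all other content untouched."""
    # Take a snapshot of the elements first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for marker in parent.findall("hide"):
            parent.remove(marker)
    return root


# Hypothetical entry in which the editor has stored a <hide> marker.
entry = ET.fromstring(
    "<entry><headword>fiets</headword>"
    "<sense><hide/><definition>tweewieler</definition></sense></entry>"
)
strip_hide_markers(entry)
print(ET.tostring(entry, encoding="unicode"))
```

The editing copy keeps its `<hide>` elements, so the show/hide state survives between sessions, while the exported entry contains only lexicographic content.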
XML Dictionary Editor: problems
Problem: visualize difference between container elements and atomic elements.
Current implementation requires some schema knowledge
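
The distinction itself is easy to state in code: a container element holds other elements, an atomic element holds only text. The sketch below derives the distinction from a sample entry rather than from the schema, which is a simplification; the function name `classify` and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def classify(root):
    """Map each element tag to 'container' or 'atomic' based on the instance."""
    kinds = {}
    for elem in root.iter():
        kind = "container" if len(elem) > 0 else "atomic"
        # A tag seen once as a container stays a container.
        if kinds.get(elem.tag) != "container":
            kinds[elem.tag] = kind
    return kinds


sample = ET.fromstring(
    "<entry><headword>fiets</headword>"
    "<sense><definition>tweewieler</definition></sense></entry>"
)
print(classify(sample))
```

A form designer could use such a mapping to give the two kinds of element a different visual style, so that the user does not need to know the schema.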
Conclusion / future work
Developing forms is easy
Current implementation is satisfactory
Database solution (relational vs. XML)
Retrieval
Easy use of (X)query language
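
As an illustration of what retrieval over XML entries might look like, the sketch below selects headwords by part of speech. It uses the limited XPath subset in Python's standard library rather than XQuery, and the element names and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dictionary; the real ANW structure is far richer.
DICTIONARY = """
<dictionary>
  <entry><headword>fiets</headword><pos>noun</pos></entry>
  <entry><headword>fietsen</headword><pos>verb</pos></entry>
  <entry><headword>huis</headword><pos>noun</pos></entry>
</dictionary>
"""


def headwords_with_pos(root, pos):
    """Return the headwords of all entries whose <pos> child equals pos."""
    matches = root.findall(f".//entry[pos='{pos}']")
    return [entry.findtext("headword") for entry in matches]


root = ET.fromstring(DICTIONARY)
print(headwords_with_pos(root, "noun"))
```

Wrapping the path expression in a function like this is one way to offer retrieval without exposing the query language to lexicographers.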