leiden university. the university to discover. dmt 2009 week 2 adriaan van der weel
Post on 19-Dec-2015
Leiden University. The university to discover.
DMT 2009 Week 2
Adriaan van der Weel
The digital lifecycle
DMT 2009 Project
Erven F. Bohn correspondence
- Transcription
- Proofing
- ‘Enrichment’: encoding in TEI XML
- Storage
- Retrieval
- Publication
Recapitulation
The computer as a medium 1
- From the book to digital ways of ordering information
- Media as knowledge machines
- Characteristics of media determine the nature of the knowledge machine
The computer as a medium 2
- Publishing:
1. Text (like books, newspapers, etc.)
2. Cultural heritage
- Born-digital vs digitised materials
- Profit vs non-profit
More than a medium
- The computer is
1. A medium (replacement for the book)
2. A Universal Machine
- As a Universal Machine the computer is the vehicle for ‘humanities computing’
- No clear division
Digital ‘knowledge machine’
- More than a digital ‘book’:
  - Making text intelligent (main focus)
  - Using intelligent agents to mine text
  - People (Web 2.0)
- Markup
  - Documents
  - Rules
  - Styles
The markup application
HTML
- The WWW’s chief markup language
- Markup application:
  - Documents
  - Rules
  - Styles
- Homework
- Seminar
Document analysis
- What relevant knowable things exist
- How can we classify them
- What is their relationship
- Different categories of text have different relevant classes of knowable things, e.g.:
  - Correspondence
  - Drama
  - Poetry
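As a rough sketch of what document analysis might yield for correspondence, the fragment below names a few "knowable things" in a letter and expresses their relationship through nesting and order. All element names here are hypothetical choices made for illustration; they are not taken from TEI or from the course project.

```xml
<!-- Hypothetical classes of things found in a letter; names are illustrative only. -->
<letter>
  <dateline>Place and date of writing</dateline>
  <salutation>Opening greeting</salutation>
  <body>
    <!-- The body contains one or more paragraphs. -->
    <p>First paragraph of the letter.</p>
  </body>
  <closing>Closing formula</closing>
  <signature>Name of the sender</signature>
</letter>
```

A play or a poem would call for a different set of classes (for instance speeches and stage directions, or stanzas and lines), which is the point of doing the analysis per category of text.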
From typography to markup
- Homo typographicus
- We are conditioned by books
- We live in the ‘Order of the Book’
- Books order and structure:
  - The ‘libroverse’ contains books
  - Inside each book
- Typography is an implicit structuring and ordering device
- Computers need explicit instructions:
  - Markup: HTML, XML, TEI, etc.
XML basics
1. Application
- Document instance
- Validation (DTD, Schema)
- Styles/transformations (Week 5-8)
2. The markup language
- Elements
- Attributes
- Entities
- ASCII and Unicode (next week)
1. The markup application
2. The markup language
XML Declaration
Document Type Declaration
Meta information
Text
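The four parts listed above can be seen together in a minimal document instance. This is only a sketch: the element names (`letter`, `header`, `title`, `text`) and the file name `letter.dtd` are assumptions made for illustration, not the markup used in the course.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The line above is the XML declaration: XML version and character encoding. -->

<!-- Document type declaration: names the root element and points to the rules.
     "letter.dtd" is a hypothetical file name. -->
<!DOCTYPE letter SYSTEM "letter.dtd">

<letter>
  <!-- Meta information: data about the document rather than the text itself. -->
  <header>
    <title>Sample letter</title>
  </header>
  <!-- Text: the transcribed content. -->
  <text>
    <p>This is the first paragraph of the letter.</p>
  </text>
</letter>
```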
Elements
- Are like nouns
- Recognizable structural components in the text
- Have a name: an opening and a closing tag
- Have content
- Examples:
  - <p>This is the first lecture of Digital Media Technology.</p>
  - <name>William Shakespeare</name>
- NB: empty elements (<lb/>)
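Elements can also be nested inside one another, and an empty element can sit in the middle of running text. A small sketch that combines the element names from the examples above (the sentence itself is made up for illustration):

```xml
<!-- A paragraph element whose content mixes text with other elements. -->
<p>
  This lecture mentions <name>William Shakespeare</name>,<lb/>
  and the line-break element before this line is empty: it has no content
  and no separate closing tag.
</p>
```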
Attributes
- Are like adjectives
- Indicate properties of an element
- Have a name and a value
- Examples:
- <p n="1">This is the first lecture of Digital Media Technology.</p>
- <name type="person">William Shakespeare</name>
Entities
- Entities indicate the occurrence of a special text string that will be replaced by a different literal value.
- They are used for:
  - Special characters (i.e. characters that are not part of the ASCII character table)
    • Example: &copy; will be rendered as ©
  - Strings
    • Example: &BDMS; could be rendered as Book and Digital Media Studies
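In XML, entities such as these have to be declared before they can be used. The sketch below declares both in an internal DTD subset; the surrounding `p` element and its content are assumptions added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE p [
  <!ELEMENT p (#PCDATA)>
  <!-- A string entity: the name BDMS stands for the longer phrase. -->
  <!ENTITY BDMS "Book and Digital Media Studies">
  <!-- A character entity: copy stands for the copyright sign (U+00A9). -->
  <!ENTITY copy "&#169;">
]>
<p>&BDMS;, &copy; 2009</p>
```

A parser that reads the declarations would replace the two references, so the paragraph content becomes "Book and Digital Media Studies, © 2009".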
Document Type Definition
- The ‘rules’
- Also: schema
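To give an impression of what such rules look like, here is a small hypothetical DTD for a very simple letter. The element and attribute names are illustrative assumptions (loosely following the examples on the earlier slides), not the DTD used in the course.

```xml
<!-- Hypothetical rules for a very simple letter. -->
<!ELEMENT letter (header, text)>          <!-- a header followed by the text -->
<!ELEMENT header (title)>
<!ELEMENT title  (#PCDATA)>               <!-- character data only -->
<!ELEMENT text   (p+)>                    <!-- one or more paragraphs -->
<!ELEMENT p      (#PCDATA | name | lb)*>  <!-- text mixed with names and line breaks -->
<!ELEMENT name   (#PCDATA)>
<!ELEMENT lb     EMPTY>                   <!-- an empty element -->
<!ATTLIST p    n    CDATA #IMPLIED>       <!-- optional attribute n on p -->
<!ATTLIST name type (person | place) #IMPLIED>
```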
Validation
- Checking the document instance against the rules laid down in the Document Type Definition (DTD) or schema
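The difference between a document that is merely well-formed and one that is valid can be shown with a small sketch. The rules and element names below are hypothetical; the rules are placed in an internal DTD subset so the example is self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE letter [
  <!-- The rules: a letter consists of one or more paragraphs. -->
  <!ELEMENT letter (p+)>
  <!ELEMENT p (#PCDATA)>
]>
<letter>
  <p>A paragraph is allowed here.</p>
  <!-- Well-formed, but not valid: "note" is not declared in the rules. -->
  <note>This element is not declared in the rules.</note>
</letter>
```

Every tag is properly opened and closed, so the document is well-formed, but a validating parser should report the `note` element as an error because the rules do not allow it inside `letter`.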
The markup application