xml 20111006 hurd
Wendy Hagenmaier & Caris Hurd, Fall 2011
Introduction to XML
The Machine is Us/ing Us
What is XML?
• stands for “eXtensible Markup Language”
• is a language for documents containing structured information
• structured information – contains both content (words, pictures, etc.) and some indication of what role that content plays
• XML vs. HTML: XML was designed to transport and store data. HTML was designed to display data.
Tags?! What are those?
<?xml version="1.0"?>
<example>
  <quote type="movie">The dude abides.</quote>
</example>
More info
<?xml version="1.0"?>
<example>
  <quote>The dude abides.</quote>
</example>
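To see the two halves of "structured information" (content plus role), the example above can be read with a parser. This is a minimal sketch using Python's standard library; the variable names are our own, and Python is just one of many tools that read XML.

```python
import xml.etree.ElementTree as ET

# The slide's example, typed with straight quotes so a parser accepts it.
doc = """<?xml version="1.0"?>
<example>
  <quote type="movie">The dude abides.</quote>
</example>"""

root = ET.fromstring(doc)

# Walk every element: the tag name is the role, the text is the content.
pairs = [(el.tag, (el.text or "").strip()) for el in root.iter()]
for role, content in pairs:
    print(f"role={role!r} content={content!r}")
```

The outer example element has no text of its own, so its content prints as an empty string; the quote element carries the words.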
F.I.L.O.: First In, Last Out (the first tag you open is the last tag you close)
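The nesting rule is easy to test: a parser rejects tags that close out of order. A small sketch, assuming Python's standard library; the helper name is_well_formed is ours, not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# <example> was opened first, so it is closed last.
nested_ok = "<example><quote>The dude abides.</quote></example>"
# Here </example> closes before </quote>, so the tags overlap.
nested_bad = "<example><quote>The dude abides.</example></quote>"

print(is_well_formed(nested_ok))   # True
print(is_well_formed(nested_bad))  # False
```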
We’ve got the M; let’s return to the X
X stands for “extensible” in XML – remember?
• Extensible means there is NO predefined tag set
• XML is pretty general, in that it allows users to define their own tags and the relationships between them
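Because there is no predefined tag set, any names will do as long as the document stays well formed. The sketch below builds a tiny document with tags we made up for illustration (recipe, title, ingredient); none of them come from XML itself or from the slides.

```python
import xml.etree.ElementTree as ET

# These tag names are invented for this example; XML defines none of them.
recipe = ET.Element("recipe")
ET.SubElement(recipe, "title").text = "White Russian"
ET.SubElement(recipe, "ingredient").text = "vodka"
ET.SubElement(recipe, "ingredient").text = "coffee liqueur"

# Serialize the tree back to text to see the markup we just defined.
xml_text = ET.tostring(recipe, encoding="unicode")
print(xml_text)
```

The parser has no opinion about whether "ingredient" is a sensible tag. Agreeing on that is the job of a shared set of rules.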
DTDs Make Rules for Tags
• Without rules, your document and tags will be worthless to others
• Rules are set with a DTD
DTD = Document Type Definition
Your DTD defines all kinds of things!
• what tags can be nested inside of other tags
• what kind of information can be stored inside certain tags
Sample DTD #1
<?xml version="1.0" encoding="US-ASCII"?>   … our XML declaration statement
<!DOCTYPE customerorder [
  <!ELEMENT Customer (Name, Email)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Email (#PCDATA)>
  <!ELEMENT PhysicalAddress (street, unit*, city, state, zipcode)>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT unit (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT zipcode (#PCDATA)>
  <!ELEMENT Shipment (ShipDate, ShipMode)>
  <!ELEMENT ShipDate (#PCDATA)>
  <!ELEMENT ShipMode (#PCDATA)>
]>
Parent and Child Elements
• Child elements are the only ones that can be nested inside of Parent elements
• An example of DTD’s rule making abilities!
• Parent = Customer
• Children =
  • Name
  • Email
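The parent/child rule can be imitated in a few lines. This is a hand-rolled check, not a real DTD validator (Python's standard library does not validate against DTDs); the table ALLOWED_CHILDREN, the helper children_match, and the sample names and addresses are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Copied by hand from Sample DTD #1: <!ELEMENT Customer (Name, Email)>
ALLOWED_CHILDREN = {"Customer": ["Name", "Email"]}

def children_match(parent):
    """True if the parent's child tags are exactly the declared sequence."""
    return [child.tag for child in parent] == ALLOWED_CHILDREN[parent.tag]

good = ET.fromstring(
    "<Customer><Name>J. Lebowski</Name><Email>dude@example.com</Email></Customer>"
)
# Phone is not a declared child of Customer, so this one breaks the rule.
bad = ET.fromstring(
    "<Customer><Name>J. Lebowski</Name><Phone>555-0100</Phone></Customer>"
)

print(children_match(good))  # True
print(children_match(bad))   # False
```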
Sample DTD #2
<!DOCTYPE TVSCHEDULE [
  <!ELEMENT TVSCHEDULE (CHANNEL+)>
  <!ELEMENT CHANNEL (BANNER,DAY+)>
  <!ELEMENT BANNER (#PCDATA)>
  <!ELEMENT DAY (DATE,(HOLIDAY | PROGRAMSLOT+)+)>
  <!ELEMENT HOLIDAY (#PCDATA)>
  <!ELEMENT DATE (#PCDATA)>
  <!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>
  <!ELEMENT TIME (#PCDATA)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT DESCRIPTION (#PCDATA)>
  <!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
  <!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
  <!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
  <!ATTLIST TITLE RATING CDATA #IMPLIED>
  <!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
]>
Element Type Declarations
Element Type Declaration    Definition
EMPTY                       No content allowed
ANY                         Any content allowed
a,b                         Specific order (a, followed by b)
X|Y                         Either/or (X or Y)
a,b,(X|Y)                   Groups (a, then b, then X or Y)
*                           Zero or more elements allowed
+                           One or more elements allowed
?                           Zero or one element allowed
(no symbol)                 One and only one allowed
#PCDATA                     Parsed character data
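The symbols in this table behave much like the same symbols in regular expressions. The sketch below rewrites two content models from Sample DTD #2 as regular expressions by hand and matches them against an element's list of child tags. It is an analogy for learning the symbols, not how a real validator works; the names PROGRAMSLOT_MODEL, DAY_MODEL, and matches_model, and the sample show titles, are our own.

```python
import re
import xml.etree.ElementTree as ET

# <!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>
PROGRAMSLOT_MODEL = re.compile(r"(TIME )(TITLE )(DESCRIPTION )?")
# <!ELEMENT DAY (DATE,(HOLIDAY | PROGRAMSLOT+)+)>
DAY_MODEL = re.compile(r"(DATE )((HOLIDAY )|(PROGRAMSLOT )+)+")

def matches_model(element, model):
    """Join the child tag names ("TIME TITLE ") and match the whole string."""
    child_names = "".join(child.tag + " " for child in element)
    return model.fullmatch(child_names) is not None

with_desc = ET.fromstring(
    "<PROGRAMSLOT><TIME>20:00</TIME><TITLE>News</TITLE>"
    "<DESCRIPTION>Evening edition</DESCRIPTION></PROGRAMSLOT>"
)
no_desc = ET.fromstring("<PROGRAMSLOT><TIME>20:00</TIME><TITLE>News</TITLE></PROGRAMSLOT>")
two_titles = ET.fromstring(
    "<PROGRAMSLOT><TIME>20:00</TIME><TITLE>News</TITLE><TITLE>Extra</TITLE></PROGRAMSLOT>"
)
empty_day = ET.fromstring("<DAY><DATE>2011-10-06</DATE></DAY>")

print(matches_model(with_desc, PROGRAMSLOT_MODEL))   # True: DESCRIPTION? allows one
print(matches_model(no_desc, PROGRAMSLOT_MODEL))     # True: DESCRIPTION? allows zero
print(matches_model(two_titles, PROGRAMSLOT_MODEL))  # False: TITLE has no symbol, so exactly one
print(matches_model(empty_day, DAY_MODEL))           # False: the + group needs at least one item
```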
Sample DTD #2 (again)
<!DOCTYPE TVSCHEDULE [
  <!ELEMENT TVSCHEDULE (CHANNEL+)>
  <!ELEMENT CHANNEL (BANNER,DAY+)>
  <!ELEMENT BANNER (#PCDATA)>
  <!ELEMENT DAY (DATE,(HOLIDAY | PROGRAMSLOT+)+)>
  <!ELEMENT HOLIDAY (#PCDATA)>
  <!ELEMENT DATE (#PCDATA)>
  <!ELEMENT PROGRAMSLOT (TIME,TITLE,DESCRIPTION?)>
  <!ELEMENT TIME (#PCDATA)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT DESCRIPTION (#PCDATA)>
  <!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
  <!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
  <!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
  <!ATTLIST TITLE RATING CDATA #IMPLIED>
  <!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
]>
Attribute List
Attribute example in XML:
<example>
  <quote type="movie">The dude abides.</quote>
</example>
Attribute example defined in DTD:
<!ELEMENT quote (#PCDATA)>
<!ATTLIST quote type CDATA #REQUIRED>
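Once a document is parsed, an attribute is just a named value attached to its element. The sketch below reads the type attribute and then imitates the #REQUIRED rule by hand, since Python's standard library does not enforce DTD rules itself. The helper name quotes_missing_type is our own.

```python
import xml.etree.ElementTree as ET

def quotes_missing_type(root):
    """Return the quote elements that lack the type attribute (#REQUIRED in the DTD)."""
    return [q for q in root.iter("quote") if q.get("type") is None]

with_attr = ET.fromstring('<example><quote type="movie">The dude abides.</quote></example>')
without_attr = ET.fromstring("<example><quote>The dude abides.</quote></example>")

# .get() returns the attribute's value, or None when it is absent.
quote_type = with_attr.find("quote").get("type")
print(quote_type)                               # movie
print(len(quotes_missing_type(with_attr)))      # 0
print(len(quotes_missing_type(without_attr)))   # 1
```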
http://tinyurl.com/3r5o9y2
Now let’s try doing it!
1. Download two files to your desktop
2. Save them in a new folder called “xml_practice”
3. Open them in Oxygen
XSD versus DTD
Applying XML
EAD!
• stands for Encoded Archival Description
• is an XSD (but all the rules are already made for you!)
• started in 1993 as part of The Berkeley Project at the University of California, Berkeley
• based on commonalities between finding aids sent in by archivists around the world
• The prototype of EAD was released on February 26, 1996!
• Today the official EAD standards are maintained by The Library of Congress and the Society of American Archivists.
• If you’re in Archival Enterprise I and have any questions about the EAD assignment, feel free to come to the lab and ask us questions!
http://www.loc.gov/ead/
http://www.archivists.org/saagroups/ead/
Applying XML (cont’d)
There are hundreds, but just to name a few…
• XHTML http://www.w3.org/TR/xhtml1/
• CML http://www.xml-cml.org/
• WML http://www.openmobilealliance.org/Technical/wapindex.aspx
• ThML http://www.ccel.org/ThML/
• WebML http://webml.org/webml/page1.do
• LegalXML http://www.legalxml.org/committees/index.shtml
• Text Encoding Initiative (TEI) http://www.tei-c.org/index.xml
Resources
• http://www.w3.org/XML/
• http://www.w3schools.com/xml/default.asp
• http://www.w3schools.com/dtd/default.asp
• http://www.tizag.com/xmlTutorial/
• http://www.loc.gov/ead/
• http://www.archivists.org/saagroups/ead/
• XML syntax validator: http://www.w3schools.com/xml/xml_validator.asp
• More about ATTLIST: http://www.w3schools.com/dtd/dtd_attributes.asp