2 DTD - Validating XML Documents
XML – A Quick Refresher
DTD | Atul Kahate 3
Exercise
  Create an XML document to store student information in the following format:
    Roll Number
    Name
    Marks
      Web Technologies
      XML
      J2EE
      Network Programming
    Rank
Exercise
  We have a book titled Web Technologies. Create an XML document that stores information about this book in the following format:
    Name: Web Technologies
    Part (1-3)
      Chapters (Chapter 1-6)
        Sections (Section 1-3)
    Summary
Exercise
  Create an XML document to store employee information in the following format:
    Employee Number
    Name
    Department
    Manager Name
    Projects assigned
      Project 1
      Project 2
      Project 3
    Designation
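XML Naming rules
  Must start with a letter (a-z or A-Z) or an underscore; a name cannot start with a digit, a hyphen, or a full stop.
  Can contain letters, digits, hyphens, underscores, and full stops, but no spaces.
  Names are case-sensitive: <Project> and <project> are different elements.
  The colon is reserved for namespaces and should not be used in ordinary names.
Comments
  Enclose between <!-- and -->, e.g. <!-- This is a comment -->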
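As a small illustration of the naming rules above, the following sketch uses made-up element names (they do not come from the slides) that all satisfy the rules, together with one comment.

```xml
<?xml version="1.0"?>
<!-- Every element name below starts with a letter or an underscore. -->
<student_record>
  <roll.number>42</roll.number>
  <_nickname>Sam</_nickname>
  <first-name>Samir</first-name>
  <Marks2024>81</Marks2024>
</student_record>
```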
Exercise
  Identify valid and invalid element names from the list below:
    <Project05> <PROJECT05> <Project.05> <_05Project> <project05> <project_05>
  Identify valid and invalid element names from the list below:
    <Project=05> <PROJECT:05> <Project 5> <Project%05> <05project> <.project.05>
Some XML Terminology
  DTD
  Elements
  Attributes
  Entities
  Markup
Terminology Snapshot
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books_list.xsl"?>
<BOOKS>
  <BOOK pubyear="1929">
    <BOOK_TITLE>Look Homeward, Angel</BOOK_TITLE>
    <AUTHOR>Wolfe, Thomas</AUTHOR>
  </BOOK>
  <BOOK pubyear="1973">
    <BOOK_TITLE>Gravity's Rainbow</BOOK_TITLE>
    <AUTHOR>Pynchon, Thomas</AUTHOR>
  </BOOK>
  <BOOK pubyear="1977">
    <BOOK_TITLE>Cards as Weapons</BOOK_TITLE>
    <AUTHOR>Jay, Ricky</AUTHOR>
  </BOOK>
  <BOOK pubyear="2001">
    <BOOK_TITLE>Computer Networks</BOOK_TITLE>
    <AUTHOR>Tanenbaum, Andrew</AUTHOR>
  </BOOK>
</BOOKS>
Callouts in the figure: XML tag, root element, element name, element value, end element indicator, start element indicator, attribute name, attribute value
Document Type Definition (DTD)
  Describes the components and guidelines in an XML document
  Lists
    Elements
    Attributes and their possible values
    Entities
    Interaction of all of the above
  "Rule book" for an XML document
DTD Example
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE EMPLOYEE [
  <!ELEMENT EMPLOYEE (ORG, NAME, DEPT, SALARY)>
  <!ELEMENT ORG (#PCDATA)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT DEPT (#PCDATA)>
  <!ELEMENT SALARY (#PCDATA)>
]>
<EMPLOYEE>
  <ORG>test </ORG>
  <NAME>Parag</NAME>
  <DEPT>J2EE</DEPT>
  <SALARY>10000</SALARY>
</EMPLOYEE>
Simple XML Example
<?xml version="1.0"?>
<message>
  <header>
    <date> 25 September 2004 </date>
    <from> Me </from>
    <to> You </to>
    <subject> Test message </subject>
  </header>
  <body> Hello World</body>
  <signature> XYZ company </signature>
</message>
DTD for this XML Example
<!ELEMENT message (header, body, signature)>
<!ELEMENT header (date, from, to, subject) >
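The two declarations above cover only the container elements. A complete DTD also needs a declaration for every leaf element. Assuming that each leaf holds plain text only (the slide does not say so explicitly), the full DTD for the message example could look like this sketch:

```xml
<!ELEMENT message (header, body, signature)>
<!ELEMENT header (date, from, to, subject)>
<!-- Assumption: every leaf element carries text only. -->
<!ELEMENT date (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ELEMENT signature (#PCDATA)>
```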
Elements
  Used to organize information in an XML document
  Similar to HTML elements (e.g. <P> </P>)
  Every XML document must have exactly one root element
  Empty elements are shown by using <empty_element/>
Attributes
  Part of the element tag
  Provide additional information about elements
  Optional
  Can be specified only in the element start tag (or in an empty-element tag), never in the end tag
DTD Types
Types of DTD
  External
    DTD and XML document are physically different documents
    More common for professional documents
  Internal
    DTD is declared inside the XML document
    Useful for simple documents
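External DTD
  Declaration inside an XML document
    <!DOCTYPE book SYSTEM "http://www.example.com/dtd/book.dtd">
  DTD declaration types
    SYSTEM – Specifies the URL from where the parser can obtain the actual DTD
    PUBLIC – Identifies the DTD by a well-known public identifier rather than by a particular location; XML still requires a system identifier (URL) after it, which the parser can use as a fallback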
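For comparison with the SYSTEM form, here is a sketch of a PUBLIC declaration. It uses the XHTML 1.0 Strict DTD purely as a familiar real-world case; it is not part of the original slides. The first quoted string is the public identifier and the second is the system identifier.

```xml
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Public identifier example</title></head>
  <body><p>Hello</p></body>
</html>
```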
Example of External DTD
Addressbook.xml
<?xml version="1.0"?>
<!DOCTYPE ADDRESSBOOK SYSTEM "addressbook.dtd">
<ADDRESSBOOK>
  <CONTACT>
    <NAME> Name 1 </NAME>
    <ADDRESS> Address 1 </ADDRESS>
    <CITY> City 1 </CITY>
    <PIN> Pin 1 </PIN>
    <PHONE> Phone 1 </PHONE>
  </CONTACT>
</ADDRESSBOOK>
Addressbook.dtd
<!ELEMENT ADDRESSBOOK (CONTACT)>
<!ELEMENT CONTACT (NAME, ADDRESS, CITY, PIN, PHONE)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT CITY (#PCDATA)>
<!ELEMENT PIN (#PCDATA)>
<!ELEMENT PHONE (#PCDATA)>
Note: the name after <!DOCTYPE must match the root element, here ADDRESSBOOK.
Internal DTD Example
<?xml version="1.0"?>
<!DOCTYPE EXAMPLE [
  <!ELEMENT EXAMPLE (#PCDATA)>
]>
<EXAMPLE> Insert the comment:
</EXAMPLE>
Which DTD Type to Use?
  External DTDs
    Allow sharing of one DTD among many XML documents
    Allow keeping the structure (DTD) and data (XML) separate
    Updates can be centralized
    Unnecessary duplications can be avoided
  Internal DTDs
    Simpler to try out and test at first
Standalone XML Documents
  Do not have any external DTD
  May have an internal DTD
  Use of standalone keyword
    <?xml version="1.0" standalone="yes"?>
    …
  Use of the standalone keyword is optional
Main Keywords used in DTD
  ELEMENT
    Describes XML element type name and its permissible child elements
  ATTLIST
    Declares XML element attribute names, plus permissible/default values
  ENTITY
    Declares special character references, text macros, or repetitive content
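A minimal sketch that uses all three keywords in one internal DTD may help tie them together. The element name course, the attribute code, and the entity wt are invented for this illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE course [
  <!-- ELEMENT: course must contain exactly one title -->
  <!ELEMENT course (title)>
  <!ELEMENT title (#PCDATA)>
  <!-- ATTLIST: course may carry an optional code attribute -->
  <!ATTLIST course code CDATA #IMPLIED>
  <!-- ENTITY: a reusable text macro -->
  <!ENTITY wt "Web Technologies">
]>
<course code="WT101">
  <title>&wt;</title>
</course>
```

A validating parser would expand &wt; to the text Web Technologies inside the title element.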
Declaring Elements in a DTD
Element markup
  Symbol     Name        Description
  <email>    Start tag   At the start of an element, the opening tag
  Meeting    Content     In the middle of an element, its content
  </email>   End tag     At the end of an element, the closing tag
Element Markup – Note
  The element name in the start tag and the end tag must match exactly, including case
  e.g. the following is wrong
    <simple.text> This is wrong! </simple.Text>
ELEMENT Declarations in a DTD – 1
  ELEMENT tags are used to describe XML elements in a DTD document
  The ELEMENT declaration can have one of the following two forms
    <!ELEMENT name content_category>
    <!ELEMENT name (content_model)>
  content_category and content_model specify what kind of content can appear inside a given XML element
ELEMENT Declarations in a DTD – 2
  content_category
    Five types
      Any – Any well-formed XML data
      None (or Empty) – Cannot contain text or child elements, but can contain attributes
      Text only – Can contain text, but no child elements
      Element only – Can contain child elements, but no text
      Mixed – Can contain a mixture of child elements and text
  All these categories allow declarations of attributes by using the ATTLIST tag
ELEMENT Declarations in a DTD – 3
  ANY and EMPTY elements
    Follow the first form of declaration, i.e. <!ELEMENT name content_category>
  ANY – Allows anything well-formed
    Example DTD declaration
      <!ELEMENT AnythingAllowed ANY>
    Corresponding XML
      <?xml version="1.0" encoding="utf-8" ?>
      <AnythingAllowed>
        <AChildElement>Hello</AChildElement>
        <AnotherChild>
          <ChildWithinChild>test</ChildWithinChild>
        </AnotherChild>
      </AnythingAllowed>
    Or this
      <?xml version="1.0" encoding="utf-8" ?>
      <AnythingAllowed/>
ELEMENT Declarations in a DTD – 4
  ANY and EMPTY elements … contd …
  EMPTY – Cannot have text or sub-elements, but can contain attributes
    Example DTD declarations
      <!ELEMENT employee EMPTY>
      <!ELEMENT building EMPTY>
    Corresponding XML elements
      <employee></employee>
      <employee stillinservice = "true"/>
      <building name = "Main center"></building>
ELEMENT Declarations in a DTD – 5
  Other categories
    Element, Mixed, or PCDATA are used
    Syntax: <!ELEMENT name (content_model)cardinality>
      The cardinality symbol (?, * or +), if present, follows the closing parenthesis directly
  Content models
    Text only, Element only, or mixed
    Examples of each type follow
  No content model is needed for ANY or EMPTY categories
ELEMENT Declarations in a DTD – 6
  Content models
    Text only
      <!ELEMENT name (#PCDATA)>
    Element only
      <!ELEMENT name ((child1, child2) | (child3, child4))>
    Mixed
      <!ELEMENT name (#PCDATA | child1 | child2)*>
  Two kinds of lists can appear within content models
    Sequence lists: Child elements must appear in the specified order, using a comma to separate the element names as shown above
    Choice lists: List of mutually exclusive child elements, separated by the pipe symbol as shown above
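To make the element-only model concrete, the sketch below combines a choice list with two sequence lists. The element names (contact, phone, extension, email, nickname) are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE contact [
  <!-- Either the pair (phone, extension) or the pair (email, nickname), never a mix -->
  <!ELEMENT contact ((phone, extension) | (email, nickname))>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT extension (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
  <!ELEMENT nickname (#PCDATA)>
]>
<contact>
  <phone>25530833</phone>
  <extension>12</extension>
</contact>
```

A contact holding phone followed by email would be rejected by a validating parser, because it matches neither sequence.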
Sequence and Choice Lists – Another Example
  Choice Example
    <!ELEMENT color (red | yellow | green)>
    Specifies that:
      The color element must contain a red element, a yellow element, or a green element
      Only one of these options can be chosen
    e.g. My favorite color is <color> <red> </red> </color> and not so favorite color is <color> <yellow> </yellow> </color>.
  Sequence and choice example:
    <!ELEMENT PersonName ((Mr | Ms | Dr), FirstName, MiddleName, LastName)>
ELEMENT Declarations in a DTD – 7
  Text only (PCDATA) content
    Only text data is allowed in the XML content
    DTD specifies this with the #PCDATA keyword
  Example
    DTD
      <!ELEMENT name (#PCDATA)>
    XML
      <name>Atul Kahate</name>
ELEMENT Declarations in a DTD – 8
  Element only content
    Can contain child elements, but no text
  Example
    DTD
      <!ELEMENT name (first, last)>
    XML
      <name>
        <first>Atul</first>
        <last>Kahate</last>
      </name>
    The element name must contain exactly two child elements, first and last, and they must appear exactly once, in the specified sequence
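ELEMENT Declarations in a DTD – 9
  Mixed content
    Allows text interleaved with child elements
    Must be written as a repeated choice that starts with #PCDATA; a DTD cannot fix the order or the number of the child elements in mixed content
  Example
    DTD
      <!ELEMENT name (#PCDATA | first | last)*>
    Allowed XMLs
      <name>Atul Kahate</name>
    OR
      <name>
        <first>Atul</first>
        <last>Kahate</last>
      </name>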
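In a DTD, mixed content is declared as a repeated choice that begins with #PCDATA. The following sketch shows a complete document in which text and child elements are interleaved inside name; the surrounding wording ("Written by", "in 2004") is invented for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!-- Text and the listed children may appear in any order, any number of times -->
  <!ELEMENT name (#PCDATA | first | last)*>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>
<name>Written by <first>Atul</first> <last>Kahate</last> in 2004.</name>
```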
Specifying Cardinality
  Used to specify how often an element or an element group can repeat or be omitted
  ? specifies zero or one occurrence
    e.g. <!ELEMENT testing (one, two?, three)>
    Means that two can occur only once or not at all inside the testing element
  * specifies zero or more occurrences
    e.g. <!ELEMENT match (result, round*)>
    Means that round can occur any number of times or not at all inside the match element
  + specifies one or more occurrences
    e.g. <!ELEMENT match (result, round+)>
    Means that round must occur at least once inside the match element
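The three cardinality symbols can be combined in one declaration. In this sketch the elements umpire and remark are hypothetical additions to the match example, used only to show ? and * next to +.

```xml
<?xml version="1.0"?>
<!DOCTYPE match [
  <!-- result: exactly once; umpire: at most once; round: at least once; remark: any number -->
  <!ELEMENT match (result, umpire?, round+, remark*)>
  <!ELEMENT result (#PCDATA)>
  <!ELEMENT umpire (#PCDATA)>
  <!ELEMENT round (#PCDATA)>
  <!ELEMENT remark (#PCDATA)>
]>
<match>
  <result>won</result>
  <round>1</round>
  <round>2</round>
</match>
```

The document is valid although umpire and remark are absent; removing both round elements would make it invalid.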
Examples
<!ELEMENT PersonName ((Mr | Ms | Dr)?, FirstName, MiddleName*, LastName)>
<!ELEMENT PersonName (SingleName | ((Mr | Ms | Dr)?, FirstName, MiddleName*, LastName))>
Corresponding XML data
<PersonName>
  <Mr />
  <FirstName>Sachin</FirstName>
  <LastName>Tendulkar</LastName>
</PersonName>
<PersonName>
  <FirstName>Sachin</FirstName>
  <LastName>Tendulkar</LastName>
</PersonName>
<PersonName>
  <SingleName>Tendulkar</SingleName>
</PersonName>
Exercise
  For the following XML document, create a DTD
<?xml version="1.0"?>
<ADDRESSBOOK>
  <CONTACT>
    <NAME> Name 1 </NAME>
    <ADDRESS> Address 1 </ADDRESS>
    <CITY> City 1 </CITY>
    <PIN> Pin 1 </PIN>
    <PHONE> Phone 1 </PHONE>
  </CONTACT>
</ADDRESSBOOK>
Exercise Solution: DTD
<!ELEMENT ADDRESSBOOK (CONTACT)>
<!ELEMENT CONTACT (NAME, ADDRESS, CITY, PIN, PHONE)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT CITY (#PCDATA)>
<!ELEMENT PIN (#PCDATA)>
<!ELEMENT PHONE (#PCDATA)>
Exercises – 1
  We want to keep the following information regarding cricket scores. Suggest a DTD structure.
    Batting Team
    Opposition Team
    Innings (1 or 2)
    Batting position (1 to 11)
    Batsman Name
    How Out? (e.g. caught Hayden, or not out)
    Bowler (e.g. McGrath, or not applicable)
    Runs scored
Exercises – 2
  A college wants to maintain the following information about its students. Design a DTD.
    Roll number of student
    Student name (Composed of first, middle, and last names; or as a single name without any split)
    Trimester Number
    Subject Type (Compulsory or Elective)
    Maximum Marks
    Marks Obtained
    Total Maximum Marks
    Total Marks Obtained
    Percentage
    Result
    Rank
Exercises – 3
  Create a DTD for this XML:
<?xml version="1.0"?>
<order>
  <items>
    <item>
      <item_code>A001</item_code>
      <item_name>Book</item_name>
      <item_quantity>2</item_quantity>
      <item_rate>100</item_rate>
    </item>
    <item>
      <item_code>B001</item_code>
      <item_name>Watch</item_name>
      <item_quantity>3</item_quantity>
      <item_rate>75</item_rate>
    </item>
  </items>
  <contact>
    <customer_code>G6171612</customer_code>
    <customer_name> test test</customer_name>
    <customer_address>
      <address_1>43 Navi Peth</address_1>
      <address_2>Main building</address_2>
      <city>Pune</city>
      <state/>
      <country>India</country>
      <pin>411001</pin>
      <phone_details>
        <home_landline>25530833</home_landline>
        <work>22981011</work>
        <mobile>98111-32111</mobile>
      </phone_details>
    </customer_address>
  </contact>
  <payment_details>
    <payment_method>credit card</payment_method>
    <card_number>191921000102101188</card_number>
    <brand>visa</brand>
    <expiry_date>02-11</expiry_date>
    <cheque_number/>
    <cheque_issuing_bank/>
    <amount>500</amount>
  </payment_details>
</order>
Entities
  Unit of data
  Can contain binary data, images, textual information, etc.
  Included inside an XML document by reference: an & symbol, the entity name, and a semicolon (e.g. &name;)
  Generally contain
    Frequently used phrases
    Text strings
    Chunks of text
Entity Example
DTD
<!ELEMENT list (name*)>
<!ELEMENT name (#PCDATA)>
<!ENTITY prof "Professor">
XML
<?xml version="1.0"?>
<!DOCTYPE list SYSTEM "professor.dtd">
<list>
  <name>&prof; Douglas Comer</name>
  <name>&prof; Andrew Tanenbaum</name>
</list>
Pre-defined Entities
  Available as a default
    amp: ampersand (&)
    apos: apostrophe (')
    gt: greater than (>)
    lt: less than (<)
    quot: quotation mark (")
  Using the default entities
    Written in the XML source: Please make sure that your offer is &gt; $500
    Seen by the application: Please make sure that your offer is > $500
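The five predefined entities need no declaration and can be used in both element content and attribute values. A small sketch, with an invented offer element and note attribute:

```xml
<?xml version="1.0"?>
<!-- No DTD is needed: amp, apos, gt, lt and quot are built in. -->
<offer note="Say &quot;yes&quot; or &apos;no&apos;">
  Offers must be &gt; $500 &amp; &lt; $900.
</offer>
```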
Attributes
Usage of Attributes
  Example without attributes
    <Person>
      <FirstName>Maithili</FirstName>
      <LastName>Shetty</LastName>
      <Department>Software</Department>
    </Person>
  Same example, with all child elements changed to attributes
    <Person FirstName="Maithili" LastName="Shetty" Department="Software" />
Specifying an Attribute
  Symbol           Description
  <                Start tag open delimiter
  element.name     Element name
  attribute.name   Attribute name
  =                Value indicator
  '                Attribute value start
  Value            Value of the attribute
  '                Attribute value end
  >                Start tag close delimiter
Attribute Declarations
  Used to describe the attributes inside an element
  Syntax: <!ATTLIST element.name attribute.definitions>
  Significance
    Declares the names of the allowed attributes
    States the type of each attribute
    Makes it possible to specify a default value for each attribute
  Each attribute declaration is as follows:
    attribute.name attribute.type attribute.default
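As an illustration of the ATTLIST syntax above, this sketch declares three attributes for one element in a single ATTLIST. The attribute names isbn, cover and lang are assumptions made for the example, not declarations taken from the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (#PCDATA)>
  <!-- One definition per line: name, type, default -->
  <!ATTLIST book
    isbn   CDATA                  #REQUIRED
    cover  (HARDBACK | PAPERBACK) "PAPERBACK"
    lang   CDATA                  #IMPLIED>
]>
<book isbn="0-07-066789-X" cover="HARDBACK">Computer Networks</book>
```

Here isbn must be present, cover falls back to PAPERBACK when omitted, and lang may be left out entirely.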
Example Containing Entities and an Attribute
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Invoice [
  <!ENTITY copy "&#169;">
  <!ENTITY Vendor "i-flex solutions limited">
  <!ENTITY Disclaimer "No warranty! &copy; 2007 &Vendor;">
  <!ENTITY char_A "&#65;">
  <!ELEMENT Invoice (Notice*)>
  <!ATTLIST Invoice
    name CDATA #REQUIRED
  >
  <!ELEMENT Notice (#PCDATA)>
]>
<Invoice name="&Vendor;">
  <Notice>&Disclaimer;</Notice>
  <Notice>And here is &char_A;</Notice>
</Invoice>
Declaring Multiple Attributes
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PersonName [
  <!ELEMENT PersonName EMPTY>
  <!ATTLIST PersonName
    title CDATA #REQUIRED
    first CDATA #REQUIRED
    middle CDATA #REQUIRED
    last CDATA #REQUIRED
  >
]>
<PersonName title="Mr" first="test" middle="test" last="test"/>
Attribute Defaults
  We can specify if an attribute is mandatory
  Attribute Default                    Description
  #REQUIRED                            Must appear
  #IMPLIED                             Optional
  #FIXED                               Optional, but if it appears, it must have the default value
  Default value without any keyword    Optional, but if it appears, it can have any value conforming to its data type
Attributes - #REQUIRED
  DTD
    <!ATTLIST Employee Height CDATA #REQUIRED>
  Invalid XML
    <Employee>Hiten</Employee>
  Valid XML
    <Employee Height="165">Hiten</Employee>
Attributes - #IMPLIED
  DTD
    <!ATTLIST Employee Height CDATA #IMPLIED>
  Valid XML
    <Employee>Hiten</Employee>
  Valid XML
    <Employee Height="165">Hiten</Employee>
Attributes - #FIXED
  DTD
    <!ELEMENT Employee (#PCDATA)>
    <!ATTLIST Employee Height CDATA #FIXED "160">
  Invalid XML
    <Employee Height="165">Hiten</Employee>
  Valid XML
    <Employee Height="160">Hiten</Employee>
  Valid XML
    <Employee>Hiten</Employee>
Attributes – Default Values
  DTD
    <!ELEMENT Employee (#PCDATA)>
    <!ATTLIST Employee Height CDATA "160">
  Valid XML
    <Employee Height="165">Hiten</Employee>
  Valid XML
    <Employee Height="160">Hiten</Employee>
  Valid XML
    <Employee>Hiten</Employee>
Important Attribute Types
  CDATA – Character data (Simple text string)
  Enumerated values – One from a list of values
CDATA Attribute Type
  Strings of characters
  An attribute for which no declaration has been read is treated as CDATA (a plain string) by a non-validating parser
  Example
    DTD: <!ATTLIST book owner CDATA #IMPLIED>
    XML: <book owner="British library">
Enumerated Attribute Type
  Lists of possible values
  Example
    <!ELEMENT fruit EMPTY>
    <!ATTLIST fruit COLOR (RED | GREEN | PINK) "RED">
    RED is the default value
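To show the enumerated type and its default in a complete document, the sketch below wraps fruit in a hypothetical basket element and assumes fruit is declared EMPTY.

```xml
<?xml version="1.0"?>
<!DOCTYPE basket [
  <!ELEMENT basket (fruit+)>
  <!-- Assumption: fruit has no content, only the COLOR attribute -->
  <!ELEMENT fruit EMPTY>
  <!ATTLIST fruit COLOR (RED | GREEN | PINK) "RED">
]>
<basket>
  <fruit COLOR="GREEN"/>
  <!-- COLOR is omitted here, so a validating parser supplies RED -->
  <fruit/>
</basket>
```

A value outside the list, such as COLOR="BLUE", would make the document invalid.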
Exercise
  Create a DTD for the following XML example:
<?xml version = "1.0"?>
<!DOCTYPE letter SYSTEM "letter.dtd">
<letter>
  <contact type = "sender">
    <name>Nitin Pathak</name>
    <address>PO Box 1230</address>
    <address>Nigdi Post Office</address>
    <city>Pune</city>
    <pin>411001</pin>
    <state>Maharashtra</state>
    <flag gender = "M" />
  </contact>
  <contact type = "receiver">
    <name>Leena Mohan</name>
    <address>PO Box 6171</address>
    <address>Thane Post Office</address>
    <city>Thane</city>
    <pin>400602</pin>
    <state>Maharashtra</state>
    <flag gender = "F" />
  </contact>
  <salutation>Dear madam:</salutation>
  <paragraph>We are pleased to inform you that you have been selected for the position of assistant programmer</paragraph>
  <paragraph>Please confirm your acceptance via a return letter</paragraph>
  <closing>Sincerely,</closing>
  <signature>Nitin Pathak - General Manager</signature>
</letter>
Solution
<!ELEMENT letter (contact+, salutation, paragraph+, closing, signature)>
<!ELEMENT contact (name, address+, city, pin, state, flag)>
<!ELEMENT salutation (#PCDATA)>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT closing (#PCDATA)>
<!ELEMENT signature (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT pin (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT flag EMPTY>
<!ATTLIST contact type CDATA #IMPLIED>
<!ATTLIST flag gender (M | F) "M">
Case Study – Book information
  LIBRARY.DTD file describes the document
  Element      Description
  BOOK         Identifies a book record
  TITLE        Describes a book's title
  AUTHOR       Author of the book
  PUBLISHER    Publisher of the book
  COVER        Hardback or paperback
  CATEGORY     Fiction, Fantasy, Sci-fi, etc.
  ISBN         ISBN number
  RATING       Scale of 1-5
  COMMENTS     Comments
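Case Study – Elements
  First pass: only the element names and the EMPTY elements are fixed at this stage; these declarations are not yet valid DTD syntax until each one has a content model
<!ELEMENT book>
<!ELEMENT title>
<!ELEMENT author>
<!ELEMENT publisher>
<!ELEMENT cover EMPTY>
<!ELEMENT category EMPTY>
<!ELEMENT isbn>
<!ELEMENT rating EMPTY>
<!ELEMENT comments>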
Case Study – Attributes
<!ATTLIST cover TYPE (HARDBACK | PAPERBACK) "PAPERBACK">
<!ATTLIST category CLASS (FICTION | FANTASY | SCIFI | MYSTERY | HORROR | NONFICTION | HISTORICAL | BIOGRAPHY) "FICTION">
<!ATTLIST rating NUMBER (1 | 2 | 3 | 4 | 5) "3">
  Note: an attribute takes either a default value or #REQUIRED, never both, so only the default values are kept here.
Case Study – Modified Elements Declaration (Nesting)
<!ELEMENT book (title, author, publisher, cover, category, isbn, rating, comments?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT cover EMPTY>
<!ELEMENT category EMPTY>
<!ELEMENT isbn (#PCDATA)>
<!ELEMENT rating EMPTY>
<!ELEMENT comments (#PCDATA)>
Case Study – Let us put it all together
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE book [
<!ELEMENT book (title, author, publisher, cover, category, isbn, rating, comments?)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT cover EMPTY>
<!ATTLIST cover type (HARDBACK | PAPERBACK) "PAPERBACK">
<!ELEMENT category EMPTY>
<!ATTLIST category type (FICTION | FANTASY | SCIFI | MYSTERY | HORROR | NONFICTION |
HISTORICAL | BIOGRAPHY) "FICTION">
<!ELEMENT isbn (#PCDATA)>
<!ELEMENT rating EMPTY>
<!ATTLIST rating type (1 | 2 | 3 | 4 | 5) "3">
<!ELEMENT comments (#PCDATA)>
]>
<book>
<title>Computer Networks</title>
<author>Andrew Tanenbaum</author>
<publisher>Pearson Education</publisher>
<cover type="PAPERBACK"></cover>
<category></category>
<isbn>0-07-066789-X</isbn>
<rating type="5"></rating>
<comments> Easily the best of the best!</comments>
</book>
Thank you!
Any Questions?