XML I

Upload: brooke-chavez

Post on 28-Mar-2015

Page 1: XML I. Learning Objectives What is XML Features of XML Uses of XML Structure of an XML document Document Type Declaration Document Type Definitions (DTDs)

XML I

Learning Objectives

What is XML
Features of XML
Uses of XML
Structure of an XML document
Document Type Declaration
Document Type Definitions (DTDs)

What is XML?

XML stands for Extensible Markup Language. It is NOT a version of HTML.

It is derived from SGML (Standard Generalized Markup Language), which was established in 1986 as a standard for generalized electronic document exchange.

It has 3 main features: structure, extensibility and validation.

XML defines a framework for transmitting structured data, so an XML document is essentially a structured document for storing information.

It allows the creation of custom mark-up tags for describing virtually anything.

XML documents are processed by an XML processor.

Uses of XML

XML is applied wherever structured data must be stored and exchanged between applications, which is the core of many systems.

Examples of XML applications are Chemical Markup Language (CML), Extensible Financial Reporting Markup Language (XFRML), and Mathematical Markup Language (MathML).

It is used in e-commerce to store and transmit product and other data, including financial information.

It is used in Open Financial eXchange (OFX).

It is used in search engines to store and search data.

It is applied in virtually every sector.

By including or referencing a Document Type Definition (DTD), XML documents can be validated.

XML Syntax Fundamentals

XML syntax describes the constructs used to define the structure and layout of an XML document, as well as the constraints involved.

An XML processor is a software module that reads an XML document and provides access to its content and structure.

XML processors typically process documents on behalf of applications, and are readily available as software plug-ins.

IE 5.0 is an example of an XML application that processes and displays XML documents.
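
As a rough illustration of what "provides access to its content and structure" means, the sketch below uses the parser from Python's standard library as the XML processor. The sample document and its element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A small document held in a string; the element names are illustrative only.
document = "<contact><name>John</name><city>London</city></contact>"

# The processor reads the text and exposes its structure as a tree of elements.
root = ET.fromstring(document)

print(root.tag)                        # name of the root element
for child in root:                     # iterate over the child elements
    print(child.tag, "=", child.text)  # element name and its character data
```

The application never touches the angle brackets itself; it only sees element names and character data handed over by the processor.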

Entity: The basic building block of an XML document. It contains either parsed or unparsed data.

Parsed data consists of characters that are treated as either character data or mark-up, and are processed by an XML processor.

Unparsed data is handled as raw text and is not processed.

E.g. in <name>John</name>, <name> and </name> are mark-up, while John is character data.

Markup: Used to describe a document’s storage structure (entities) and logical structure (elements).

Elements: Describe the logical structure. They have a start tag, e.g. <name>, and an end tag (</name>), or a single empty tag (<name/>).
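
The three tag forms can be compared with a short sketch, again assuming Python's standard-library parser as the processor. It suggests that a start tag followed directly by an end tag and a single empty tag are reported to the application in the same way.

```python
import xml.etree.ElementTree as ET

with_tags = ET.fromstring("<name></name>")   # start tag followed by end tag
empty_tag = ET.fromstring("<name/>")         # single empty tag
named = ET.fromstring("<name>John</name>")   # start tag, character data, end tag

for element in (with_tags, empty_tag, named):
    # element.text is None when there is no character data;
    # len(element) counts child elements.
    print(element.tag, repr(element.text), len(element))
```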

XML mark-up components include:

1. Tags: The most obvious component in XML syntax, used to describe elements.

2. Processing instructions: Passed by the parser to the application. They begin with <? and end with ?>. The XML declaration uses the same delimiters, e.g. <?xml version="1.0"?> indicates that the document is based on XML version 1.0.

3. Document type declarations: Used to specify information about the document, including the document’s root element and the Document Type Definition (DTD). The declaration must appear after the XML declaration, but before the root element, e.g.

<?xml version="1.0"?>
<!DOCTYPE addressbook SYSTEM "Addressbook.dtd">
<addressbook>
<contact>

The name addressbook declared in line 2 must correspond to <addressbook> in line 3, the root element of the document.
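
The correspondence between the declared name and the root element can be inspected with a small sketch. It assumes Python's xml.dom.minidom, which is non-validating and does not fetch Addressbook.dtd; it only records what the declaration says. The shortened document body is for illustration.

```python
from xml.dom import minidom

document = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE addressbook SYSTEM "Addressbook.dtd">\n'
    '<addressbook><contact><name>Tony Benn</name></contact></addressbook>'
)

doc = minidom.parseString(document)

doctype_name = doc.doctype.name          # name given in the DOCTYPE declaration
dtd_reference = doc.doctype.systemId     # the external DTD reference
root_name = doc.documentElement.tagName  # the actual root element

print(doctype_name, dtd_reference, root_name)
print("names agree:", doctype_name == root_name)
```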

4. Entity references: Used to assign aliases to pieces of data. They are written between an ampersand (&) and a semicolon (;). E.g. &apos; corresponds to an apostrophe (') while &amp; corresponds to '&'.

5. Comments: Used to present information that is technically not part of the document’s content. They begin with <!-- and end with -->.

6. Marked (CDATA) sections: Used to block off text that is to be sidestepped by the parser. A section is defined by enclosing the text within <![CDATA[ and ]]>. E.g. <![CDATA[<name>John</name>]]>. In this example, the name element is not recognized as mark-up and John is not recognized as parsed character data.

It is common to use CDATA sections to quote a piece of XML code, e.g. in a tutorial.
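
A brief sketch of how entity references, comments and CDATA sections look from the application's side, assuming Python's standard-library parser. The notes, motto and sample element names are invented for the example.

```python
import xml.etree.ElementTree as ET

document = (
    "<notes>"
    "<!-- this comment is not part of the content -->"
    "<motto>Tom &amp; Jerry&apos;s</motto>"
    "<sample><![CDATA[<name>John</name>]]></sample>"
    "</notes>"
)

root = ET.fromstring(document)
motto = root.find("motto").text
sample = root.find("sample")

print(motto)        # entity references arrive already replaced
print(sample.text)  # the CDATA text is kept as plain character data
print(len(sample))  # no child element was created from the quoted <name>
print(len(root))    # the comment is not counted among the children
```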

Styling XML for display

Accomplished in 2 ways:

With CSS.
With XSL, which is more complex and advanced than CSS.

Parsing XML

Parsers can be validating or non-validating. Validating parsers validate XML documents against a DTD or XML Schema.

Examples of XML parsers are the Lark and Larval XML parsers for Java, Sun’s Project X Parser for Java, IBM’s XML Parser for Java, Oracle XML parser for Java, and IBM’s XML Parser for C++.

Example of an XML Document

<?xml version="1.0"?>
<!DOCTYPE addressbook SYSTEM "Addressbook.dtd">
<addressbook>
  <contact>
    <name>Tony Benn</name>
    <address>210 Temple road</address>
    <city>London</city>
    <postcode>NW9 0RT</postcode>
    <phone>02082049565</phone>
  </contact>
  <contact>
    <name>Peter Bloggs</name>
    <address>230 The Vale</address>
    <city>London</city>
    <postcode>NW6 2BT</postcode>
    <phone>02082029517</phone>
  </contact>
</addressbook>

The above example is a well-formed XML document used to store contact information. However, it is not valid yet, because it has not been checked against a DTD.

Note that the root element (<addressbook>) has nested child elements, each defined with an opening and a closing tag.
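
To make the nesting concrete, here is a minimal sketch that reads the address book with Python's standard-library parser and walks from the root element down to the children of each contact. The parser used here is non-validating, so Addressbook.dtd is neither needed nor consulted.

```python
import xml.etree.ElementTree as ET

ADDRESS_BOOK = """<?xml version="1.0"?>
<!DOCTYPE addressbook SYSTEM "Addressbook.dtd">
<addressbook>
  <contact>
    <name>Tony Benn</name>
    <address>210 Temple road</address>
    <city>London</city>
    <postcode>NW9 0RT</postcode>
    <phone>02082049565</phone>
  </contact>
  <contact>
    <name>Peter Bloggs</name>
    <address>230 The Vale</address>
    <city>London</city>
    <postcode>NW6 2BT</postcode>
    <phone>02082029517</phone>
  </contact>
</addressbook>"""

root = ET.fromstring(ADDRESS_BOOK)

# Each <contact> is a child of the root; name and postcode are its children.
contacts = [
    (contact.findtext("name"), contact.findtext("postcode"))
    for contact in root.findall("contact")
]

for name, postcode in contacts:
    print(name, "-", postcode)
```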

XML Data Modelling

Involves describing the structure of XML documents, for the purpose of validation.

After defining a data model, you can create structured XML documents that must adhere to that model to be valid.

Valid vs well-formed XML: It is perfectly legal to create an XML document without a data model, in which case the document can be well-formed but is not valid.

There are 2 approaches to creating data models:

DTDs (Document Type Definitions)
XML Schemas

The data model (DTD or XML Schema) defines the arrangement of mark-up and character data within a valid XML document, i.e. the order and nesting of the elements.
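
Well-formedness can be tested without any data model. The helper below is a small sketch, assuming Python's standard-library parser; is_well_formed is a name chosen for this example. It says nothing about validity, which needs a validating parser and a DTD or XML Schema.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

print(is_well_formed("<contact><name>John</name></contact>"))  # properly nested
print(is_well_formed("<contact><name>John</contact></name>"))  # tags overlap
```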

Modelling Data with DTDs

DTDs (Document Type Definitions) rely on a specialized syntax for describing the structure of an XML vocabulary (class of document).

DTDs can be broken down into 2 subsets:

Internal or local DTD: Mark-up declarations are contained in the prolog (the section of the document preceding the root element) of the same document.

External DTD: External mark-up declarations that can be referenced by one or more documents.

The 2 subsets may be combined, with the internal subset having higher precedence.

The DTD declares every element, attribute and entity used in the XML document.

The DTD must be declared or referenced in the document type declaration.

Example: Addressbook.dtd

<!ELEMENT addressbook (contact)+>
<!ELEMENT contact (name, address, city, postcode, phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT postcode (#PCDATA)>
<!ELEMENT phone (#PCDATA)>

The start of a document that follows this DTD:

<addressbook>
  <contact>
    <name>Tony Benn</name>
    <address>210 Temple road</address>
    <city>London</city>
    <postcode>NW9 0RT</postcode>
    <phone>02082049565</phone>
  </contact>
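
The rules expressed by Addressbook.dtd can be spelled out by hand. The sketch below is not a DTD validator; it hard-codes the same constraints in Python (one or more contact elements, each with the five children in the declared order, each child holding only text) to show what a validating parser would check. The function name and sample strings are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Order required by: <!ELEMENT contact (name, address, city, postcode, phone)>
REQUIRED_ORDER = ["name", "address", "city", "postcode", "phone"]

def matches_addressbook_dtd(text: str) -> bool:
    root = ET.fromstring(text)
    # <!ELEMENT addressbook (contact)+> : at least one child, all named contact
    if root.tag != "addressbook" or len(root) == 0:
        return False
    for contact in root:
        if contact.tag != "contact":
            return False
        if [child.tag for child in contact] != REQUIRED_ORDER:
            return False
        # (#PCDATA) children hold character data only, no nested elements
        if any(len(child) > 0 for child in contact):
            return False
    return True

GOOD = (
    "<addressbook><contact><name>Tony Benn</name>"
    "<address>210 Temple road</address><city>London</city>"
    "<postcode>NW9 0RT</postcode><phone>02082049565</phone>"
    "</contact></addressbook>"
)
# Same data, but city and address are swapped, which breaks the sequence.
BAD = GOOD.replace(
    "<address>210 Temple road</address><city>London</city>",
    "<city>London</city><address>210 Temple road</address>",
)

print(matches_addressbook_dtd(GOOD))
print(matches_addressbook_dtd(BAD))
```

Both sample documents are well-formed; only the first follows the content model.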

Document type declaration syntax:

<!DOCTYPE rootElem SYSTEM ExtDTDRef [InternalDTDDecl]>

where rootElem is the root element, ExtDTDRef is the external DTD reference, and InternalDTDDecl is the internal DTD declaration.

Illustration:

<!DOCTYPE movies SYSTEM "Movies.dtd" [
  <!ELEMENT actor (#PCDATA)>
]>
<movies>
  <title>Lord of the rings</title>
  <!-- the other child elements go here -->

External DTDs are more commonly used, and are especially useful when you are creating multiple documents of the same class, when you would like to use an existing DTD, or when you want to make your document as concise as possible.

Internal DTDs are preferable in situations where you’re creating only one document, or to reduce the overhead associated with your documents.

Elements and Attributes

The primary contents described in a DTD are elements and attributes.

Think of an element as a logical unit of information, and an attribute as a characteristic of that information.

By looking at a document as a group of information objects, it is usually possible to associate each object with an element. Any leftover information would usually be represented as attributes.

Another approach is to consider the type of information and how it will be used.

Attributes provide tighter constraints on information, while elements are very loosely constrained and are better suited to long strings of text.

Attributes can be constrained against a predefined list of values, and can have default values.

Attributes are very concise and easier to parse. However, they cannot contain nested information.
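
The trade-off can be seen by modelling the same fact both ways. In this sketch the player, team, bio, name and city names and their values are purely illustrative, and Python's standard-library parser is assumed.

```python
import xml.etree.ElementTree as ET

# Team as an attribute: concise, one flat string value.
as_attribute = ET.fromstring(
    '<player team="Arsenal"><bio>Joined in 2001.</bio></player>'
)

# Team as an element: more verbose, but it can carry nested information.
as_element = ET.fromstring(
    "<player><team><name>Arsenal</name><city>London</city></team></player>"
)

team_from_attribute = as_attribute.get("team")
city_from_element = as_element.find("team/city").text

print(team_from_attribute)
print(city_from_element)
```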

Elements

Declared with element declarations in the DTD.

Syntax: <!ELEMENT ElementName Type>

ElementName corresponds to the tag used to mark up that element in the XML document. Type specifies the content. 4 types are supported in XML:

1. Empty type: The element doesn’t contain any content, but may have attributes. In the DTD, it is declared in the form: <!ELEMENT ElementName EMPTY>

E.g. <!ELEMENT img EMPTY>

Empty elements are written in the XML document in 2 ways:
a) a start tag immediately followed by an end tag, with nothing in between, e.g. <img src="pic.gif"></img>
b) an empty tag, e.g. <img/> or <img src="pic.gif"/>

2. Element-only type: The element contains only child elements. Declared as <!ELEMENT ElementName contentModel>

The content model is specified using a combination of special element declaration symbols and child element names.

The symbols represent the relationship of the child to the container element.

Table of Special Symbols

Symbol              Usage
Parentheses ( )     Enclose a sequence or choice group of child elements.
Comma (,)           Separates the items in a sequence and establishes the order in which they must appear.
Pipe (|)            Separates the items in a choice group of elements.
No symbol           The child element must appear exactly once.
Question mark (?)   The child element may appear once or not at all.
Asterisk (*)        The child element may appear any number of times, including not at all.
Plus sign (+)       The child element must appear at least once.

Example:

<!ELEMENT resume (intro, (education | experience+)+, hobbies?, references*)>
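
The symbols in the table behave much like regular-expression operators. As a sketch only (this is a hand translation, not how a validating parser is implemented), the resume content model can be written as a Python regular expression over the sequence of child element names. RESUME_MODEL and children_match_model are names invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of:
#   (intro, (education | experience+)+, hobbies?, references*)
# Each child name is followed by one space so that names cannot run together.
RESUME_MODEL = re.compile(
    r"intro "                             # no symbol: exactly once
    r"(?:education |(?:experience )+)+"   # choice group, repeated at least once
    r"(?:hobbies )?"                      # ?: once or not at all
    r"(?:references )*"                   # *: any number of times
)

def children_match_model(xml_text: str) -> bool:
    root = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in root)
    return RESUME_MODEL.fullmatch(sequence) is not None

ok = "<resume><intro/><education/><experience/><experience/><references/></resume>"
missing_group = "<resume><intro/><hobbies/></resume>"
wrong_order = "<resume><education/><intro/></resume>"

print(children_match_model(ok))
print(children_match_model(missing_group))
print(children_match_model(wrong_order))
```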

Mixed Elements

Contain both character data and child elements. The simplest mixed element is one declared to contain only character data.

They take the following form: <!ELEMENT ElementName (#PCDATA)>

E.g. <!ELEMENT city (#PCDATA)>
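
A sketch of how mixed content reaches the application, assuming Python's standard-library parser. The para and name elements are hypothetical; in a DTD such content would be declared along the lines of <!ELEMENT para (#PCDATA | name)*>.

```python
import xml.etree.ElementTree as ET

# Character data and a child element side by side inside one parent.
para = ET.fromstring("<para>Call <name>John</name> today.</para>")
name = para.find("name")

print(repr(para.text))  # character data before the child element
print(name.text)        # character data inside the child element
print(repr(name.tail))  # character data after the child's end tag
```

ElementTree stores the text that follows a child element on that child's tail attribute, which is why the three pieces are read from two different objects.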

ANY Elements

The ANY element, so named because it is declared with the symbol ANY, can contain any type of element, or a combination of elements.

Due to its lack of structure, you should avoid using it. It is typically used during development of a DTD, but should not appear in a production DTD.

Form: <!ELEMENT ElementName ANY>

Attributes

Used to specify additional information about elements.

Within an element, attributes form name/value pairs that describe a particular property of the element.

They are declared in a DTD with attribute list declarations, which take the form:

<!ATTLIST ElementName AttrName AttrType Default>

There are 4 kinds of default that can be specified:

#REQUIRED: The attribute is required.
#IMPLIED: The attribute is optional.
#FIXED value: The attribute has a fixed value.
default: The default value of the attribute.

#REQUIRED implies that you must define that attribute whenever you use the element.
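
The four kinds of default can be read as rules applied to the attributes actually written on an element. The following is a hand-written sketch of those rules in Python, not the behaviour of any particular parser; apply_defaults, the rule tuples and the sample attribute names are assumptions made for the example.

```python
def apply_defaults(declared: dict, supplied: dict) -> dict:
    """Apply DTD-style default rules to the attributes supplied on an element.

    declared maps an attribute name to one of:
      ("#REQUIRED",), ("#IMPLIED",), ("#FIXED", value), ("default", value)
    """
    result = dict(supplied)
    for name, rule in declared.items():
        kind = rule[0]
        if kind == "#REQUIRED":
            if name not in supplied:
                raise ValueError(f"required attribute {name!r} is missing")
        elif kind == "#FIXED":
            if name in supplied and supplied[name] != rule[1]:
                raise ValueError(f"attribute {name!r} must be {rule[1]!r}")
            result[name] = rule[1]
        elif kind == "default":
            result.setdefault(name, rule[1])  # used only when not supplied
        # "#IMPLIED": optional, nothing to add or check
    return result

DECLARED = {
    "team": ("#REQUIRED",),
    "nickname": ("#IMPLIED",),
    "sport": ("#FIXED", "football"),
    "status": ("default", "active"),
}

print(apply_defaults(DECLARED, {"team": "Arsenal"}))
```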

Attribute Type

Must be specified, in addition to the attribute default value. XML supports 10 attribute types:

CDATA: Character data (a plain string)
Enumerated: One of a series of string values
NOTATION: A notation declared somewhere else in the DTD
ENTITY: An external binary (unparsed) entity
ENTITIES: Multiple external binary entities separated by whitespace
ID: A unique identifier
IDREF: A reference to an ID declared somewhere else in the document
IDREFS: Multiple references to IDs declared somewhere else in the document
NMTOKEN: A name consisting of XML token characters (letters, numbers, periods, dashes, colons and underscores)
NMTOKENS: Multiple names consisting of XML token characters
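
As an illustration of how a type constrains a value, the sketch below checks NMTOKEN and NMTOKENS values using only the characters listed above. It is a simplification that assumes ASCII letters and digits (the full XML rules admit many more characters), and is_nmtoken and is_nmtokens are names made up for the example.

```python
import re

# Letters, numbers, periods, dashes, colons and underscores (ASCII only here).
NMTOKEN = re.compile(r"[A-Za-z0-9._:-]+")

def is_nmtoken(value: str) -> bool:
    return NMTOKEN.fullmatch(value) is not None

def is_nmtokens(value: str) -> bool:
    # NMTOKENS: one or more name tokens separated by whitespace.
    tokens = value.split()
    return bool(tokens) and all(is_nmtoken(token) for token in tokens)

print(is_nmtoken("item-1.a"))
print(is_nmtoken("two words"))
print(is_nmtokens("a b:c _d"))
```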

String Attributes

The most commonly used attribute type. Example:

<!ATTLIST player team CDATA #REQUIRED>

In the above example, the team to which a player belongs is a required character data attribute that must be defined in the player element.

<!ATTLIST player team CDATA #IMPLIED> would have made the team optional.

Another example, this time with an enumerated attribute:

<!ELEMENT movie (Producer, Director, Actor, Writer+, Duration)>

<!ATTLIST movie type (comedy | thriller) #REQUIRED>

In this example, the movie element contains the child elements listed, and it also has a mandatory attribute called type, which has 2 possible values.
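
The constraint on the type attribute can be checked by hand. This is a sketch only: Python's standard-library parser does not validate, so the allowed values are repeated in the code, and movie_type_is_valid is a name chosen for the example. Child elements are left out to keep the samples short.

```python
import xml.etree.ElementTree as ET

# Values allowed by: <!ATTLIST movie type (comedy | thriller) #REQUIRED>
ALLOWED_TYPES = {"comedy", "thriller"}

def movie_type_is_valid(xml_text: str) -> bool:
    movie = ET.fromstring(xml_text)
    # get() returns None when the attribute is absent, which also fails
    # the check because the attribute is #REQUIRED.
    return movie.get("type") in ALLOWED_TYPES

print(movie_type_is_valid('<movie type="comedy"/>'))
print(movie_type_is_valid('<movie type="drama"/>'))
print(movie_type_is_valid("<movie/>"))
```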