xml (extensible markup language)

Company Confidential 1 XML (eXtensible Markup Language) Prepared for: *Stars* New Horizons Certified Professional Course

Upload: jillian-lara

Post on 15-Mar-2016



Page 1: XML (eXtensible Markup Language)

Company Confidential

XML (eXtensible Markup Language)

Prepared for: *Stars* New Horizons Certified Professional Course

Page 2: XML (eXtensible Markup Language)

Session Objective

• What is a Markup Language?
• What is XML?
• HTML and XML
• Why do we need XML?
• What is XML Used For?
• Is XML just for Programmers?
• XML & Its Features
• XML is Case Sensitive
• XML Processors
• XML-Related Technologies
• XML Evolution
• XML building blocks

Page 3: XML (eXtensible Markup Language)

Session Objective (cont…)

• The General Structure of XML
• Structure of an XML Document
• XML Tags
• Elements and sub elements
• XML Attributes
• Elements and attributes
• XML documents
• XML as a Tree
• Anatomy of an XML Document
• XML Terminology
• Well formed XML documents
• Well formed XML

Page 4: XML (eXtensible Markup Language)

Session Objective (cont…)

• Entities
• Comments
• Names in XML
• Namespaces
• Namespaces and URIs
• Namespace syntax
• Valid XML documents
• XML: The DTD
• XML Syntax Rules
• Presenting XML documents
• Viewing XML
• Viewing XML Files

Page 5: XML (eXtensible Markup Language)

Session Objective (cont…)

• Displaying XML with CSS
• Displaying XML with XSLT
• XML Parser
• The XML DOM
• XML to HTML
• XML Application
• Working with XmlReader & XmlWriter
• Overview of XmlReader
  – Reading an XML File
• Overview of XmlWriter
  – Writing an XML File
• Extended document standards
• Vocabulary

Page 6: XML (eXtensible Markup Language)

What is a Markup Language?

Markup refers to characters placed within a piece of information so that the information can be processed or identified in a particular way.

Example: the following document uses less-than and greater-than symbols (angle brackets) to delimit markup elements, or tags, each of which has a specific purpose.

<HTML>
  <HEAD>
    <TITLE>Title Page</TITLE>
  </HEAD>
  <BODY>
    <H1>This is text using a heading tag</H1>
    This is normal text.
  </BODY>
</HTML>

Page 7: XML (eXtensible Markup Language)

What is a Markup Language?

• If the above HTML document were viewed in a browser, the browser would interpret the markup elements and display the content to reflect the author’s intentions.

• For example, the <H1> and </H1> tags are used to display the text in a large font, whereas the text immediately following it is displayed in the browser’s standard font.

• There are a number of different markup languages and types. Let's review three of the most common: SGML, HTML, and XML.

Page 8: XML (eXtensible Markup Language)

What is XML?

XML is an acronym for eXtensible Markup Language.

• XML was designed to transport and store data, not to display it.
• XML is a markup language much like HTML.
• XML tags are not predefined. We must define our own tags.
• XML is designed to be self-descriptive.
• XML is a W3C Recommendation.
• Like its predecessor HTML, it has its roots in a standard known as the Standard Generalized Markup Language, or SGML.

To understand why we need XML, let’s take a look at both SGML and HTML.

Page 9: XML (eXtensible Markup Language)

What is XML ? (cont…)

• Simpler SGML
  – XML is a meta-language.
  – A meta-language is a language that is used to define other languages.
  – We can use XML, for instance, to define a language like WML.
  – XML is a smaller version of SGML.
  – It is easy to master, which is a major advantage over SGML, a very complex meta-language.

Page 10: XML (eXtensible Markup Language)

What is XML ? (cont…)

SGML
• Standard Generalized Markup Language (SGML) was designed as a standard way to store data independent of any software application or platform.

• SGML is often referred to as a meta-language. Meta-languages are languages that are used for describing markup languages. HTML is a derivative of SGML and is therefore called an SGML application.

• There are a number of languages based on SGML. There are also a number of standard data formats based on SGML.

• The real power behind SGML is its ability to declare Document Type Definitions, or DTDs.

Page 11: XML (eXtensible Markup Language)

What is XML ? (cont…)

SGML
• SGML provides the ability to define the contents of the document, its markup characteristics, and its information model.

• The downside to SGML is that it is all-encompassing and has a lot of rules. It has so many aspects to it that it is almost impossible to implement all of them.

• For this reason, SGML is rarely used by itself. Instead, subsets have been created that target niche applications and needs.

Page 12: XML (eXtensible Markup Language)

What is XML ? (cont…)

HTML
• Hyper Text Markup Language (HTML) is the first internationally accepted derivative of SGML.

• HTML is really the document language of the World Wide Web.

• HTML was originally designed to represent document based data within a browser in very basic form.

• It has evolved over time to support application features through the use of JavaScript, Java Applets, and the inclusion of client-side plug-ins.

Page 13: XML (eXtensible Markup Language)

What is XML ? (cont…)

HTML
• It is not a good language for applications to store and share data. One of the primary reasons for this is its lack of support for DTDs.

• DTDs are declarations, stored inside the document or in an external file, that define the contents and structure of the data.

• HTML’s structure is extremely limited compared to its predecessor SGML.

• HTML was designed as a document language, not a data language. This is where XML enters the picture.

Page 14: XML (eXtensible Markup Language)

What is XML? (cont…)

• XML: What it can do

• With XML we can:
  – Define data structures
  – Make these structures platform independent
  – Process XML-defined data automatically
  – Define our own tags

• With XML we cannot:
  – Define how our data is shown. To show data, we need other techniques.

Page 15: XML (eXtensible Markup Language)

XML Does Not DO Anything
• Maybe it is a little hard to understand, but XML does not DO anything.
• XML was created to structure, store, and transport information.
• The following example is a note to Tove from Jani, stored as XML:

<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

What is XML? (cont…)

Page 16: XML (eXtensible Markup Language)

• The note above is quite self-descriptive.
• It has sender and receiver information; it also has a heading and a message body.
• But still, this XML document does not DO anything.
• It is just pure information wrapped in tags.
• Someone must write a piece of software to send, receive or display it.

What is XML? (cont…)
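As a minimal sketch of the point that someone must write software to do something with the note, the Python snippet below reads the Tove/Jani note with the standard library's xml.etree.ElementTree and turns it into a one-line summary. The function name describe_note and the summary format are assumptions made for illustration; they are not part of the course material.

```python
import xml.etree.ElementTree as ET

# The note from the slide, stored as plain text.
NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def describe_note(xml_text: str) -> str:
    """Hypothetical helper: parse the note and build a readable summary."""
    note = ET.fromstring(xml_text)      # parse the text into an element tree
    receiver = note.findtext("to")      # text inside <to>
    sender = note.findtext("from")      # text inside <from>
    heading = note.findtext("heading")
    body = note.findtext("body")
    return f"{heading}: {sender} -> {receiver}: {body}"


print(describe_note(NOTE_XML))
# prints: Reminder: Jani -> Tove: Don't forget me this weekend!
```

The XML itself stays passive; all behaviour lives in the program that reads it.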

Page 17: XML (eXtensible Markup Language)

XML is Just Plain Text

• XML is nothing special. It is just plain text.
• Software that can handle plain text can also handle XML.
• However, XML-aware applications can handle the XML tags specially.
• The functional meaning of the tags depends on the nature of the application.

What is XML? (cont…)

Page 18: XML (eXtensible Markup Language)

With XML You Invent Your Own Tags

• The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.

• That is because the XML language has no predefined tags.

• The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only use tags defined in the HTML standard (like <p>, <h1>, etc.).

• XML allows the author to define his own tags and his own document structure.

What is XML? (cont…)

Page 19: XML (eXtensible Markup Language)

XML is Not a Replacement for HTML
• XML is a complement to HTML.
• It is important to understand that XML is not a replacement for HTML.
• In most web applications, XML is used to transport data, while HTML is used to format and display the data.
• XML is a software- and hardware-independent tool for carrying information.

What is XML? (cont…)

Page 20: XML (eXtensible Markup Language)

XML is Everywhere
• It has been amazing to see how quickly the XML standard has developed, and how quickly a large number of software vendors have adopted the standard.
• XML is now as important for the Web as HTML was to the foundation of the Web.
• XML is everywhere.
• It is the most common tool for data transmissions between all sorts of applications.
• It is becoming more and more popular in the area of storing and describing information.

What is XML? (cont…)

Page 21: XML (eXtensible Markup Language)

HTML and XML

XML stands for eXtensible Markup Language

HTML is used to mark up text so it can be displayed to users

XML is used to mark up data so it can be processed by computers

HTML describes both structure (e.g. <p>, <h2>, <em>) and appearance (e.g. <br>, <font>, <i>)

XML describes only content, or “meaning”

HTML uses a fixed, unchangeable set of tags

In XML, you make up your own tags

Page 22: XML (eXtensible Markup Language)

HTML and XML (cont…)

• HTML and XML look similar, because they are both SGML languages (SGML = Standard Generalized Markup Language)
  – Both HTML and XML use elements enclosed in tags (e.g. <body>This is an element</body>)
  – Both use tag attributes (e.g., <font face="Verdana" size="+1" color="red">)
  – Both use entities (&lt;, &gt;, &amp;, &quot;, &apos;)

• More precisely,
  – HTML is defined in SGML
  – XML is a (very small) subset of SGML

Page 23: XML (eXtensible Markup Language)

HTML and XML (cont…)

• HTML is for humans
  – HTML describes web pages
  – You don’t want to see error messages about the web pages you visit
  – Browsers ignore and/or correct as many HTML errors as they can, so HTML is often sloppy

• XML is for computers
  – XML describes data
  – The rules are strict and errors are not allowed
    • In this way, XML is like a programming language
  – Current versions of most browsers can display XML
    • However, browser support of XML is spotty at best

Page 24: XML (eXtensible Markup Language)

Why do we need XML?

• Data-exchange

- XML is used to aid the exchange of data. It makes it possible to define data in a clear way.

- Both the sending and the receiving party will use XML to understand the kind of data that's been sent.

- By using XML everybody knows that the same interpretation of the data is used.

Page 25: XML (eXtensible Markup Language)

Why do we need XML? (cont…)

• Replacement for EDI - EDI (Electronic Data Interchange) has been for several years the way to exchange data between businesses.

- EDI is expensive, it uses a dedicated communication infrastructure.

- XML is a good replacement for EDI. It uses the Internet for the data exchange. And it's very flexible.

Page 26: XML (eXtensible Markup Language)

Why do we need XML? (cont…)

• More possibilities

- XML makes communication easy. It's a great tool for transactions between businesses.

- We can define other languages with XML.

– A good example is WML (Wireless Markup Language), the language used in WAP-communications.

- WML is just an XML dialect.

Page 27: XML (eXtensible Markup Language)

What is XML Used For?

• XML has created a quiet revolution on the Internet.
• It is the first truly portable data format that was designed for Internet and multi-language support.
• The number of applications for XML is limitless.
• Here are just a few of the areas in which XML has gained momentum.

• Business to Business E-Commerce
  – This is one of the areas in which XML has moved most rapidly. Getting businesses to talk the same data “language” has always been difficult.

  – Differences in software and hardware platforms have always been big issues for companies that want to communicate electronically.

Page 28: XML (eXtensible Markup Language)

XML Vs. EDI (Electronic Data Interchange)

– Differences in data representation between organizations were another major hurdle. Having a standard for content modeling that could be interrogated by software was not part of the EDI model.

– This required that the software using the EDI output understand its content model inherently. Each company had one or more content models that it would create for different things (e.g. purchase orders, proposals, etc.). Things got very complicated very fast. A good analogy is that of the telephone.

– The problem with EDI was that companies that wanted to talk to each other had to reinvent the telephone for each type of discussion they wanted to have. Needless to say, EDI was not the panacea that many envisioned it to be.

What is XML Used For?

Page 29: XML (eXtensible Markup Language)

– XML addresses these issues in the way that it was designed.

– It is platform independent, it is structured, and it has a mechanism for strict content modeling that can be interrogated by any software using a standard XML processor.

– The result is that companies can communicate with each other using XML and engage in B2B E-Commerce without having to worry about hardware & software platform issues, or content modeling issues.

– Many large organizations are now using XML for such things as requests for proposals, purchase orders and transaction records.

– Since XML is designed for use with Internet protocols, the Internet has become the medium for transferring this XML data back and forth.

What is XML Used For?

Page 30: XML (eXtensible Markup Language)

• Catalogs
  – XML makes the perfect storage and transfer mechanism for information such as catalogs. This includes not only catalogs like those used for e-commerce, but also things like parts and inventory.

– Companies can use XML to keep lists of information (catalogs) that can be shared and transferred between multiple departments, managers, outside vendors, etc.

What is XML Used For? (Cont...)

Page 31: XML (eXtensible Markup Language)

• Data Warehousing and Archiving
  – Storing large amounts of information for access by multiple applications is known as Data Warehousing.

– XML can be used to store such information in usable chunks (documents) that can be retrieved from anywhere on the network or Internet, allowing users to get the information from anywhere.

– Another use of XML is that of data archiving. A good example is the archiving of relational database data.

What is XML Used For? (Cont...)

Page 32: XML (eXtensible Markup Language)

• Data Migration

– The problem however is that each system must understand the format of the data if they are going to be able to share such information.

– XML is the perfect mechanism to be used for data migration. By storing application data in XML, other applications can interrogate and query such information using XML’s open standards.

– Each application needs to know how to read one type of data storage, namely XML. This makes it much easier for application developers as they can use XML to export and import data.

What is XML Used For? (Contd...)
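The export/import idea behind data migration can be sketched in a few lines of Python. The rows, the column names and the helper names rows_to_xml and xml_to_rows are all hypothetical, chosen only to illustrate one application writing records as XML and another reading them back; a real migration would also need to handle data types, missing values and a agreed schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, as they might come out of a relational table.
ROWS = [
    {"id": "1", "first": "John", "last": "Doe"},
    {"id": "2", "first": "Jane", "last": "Roe"},
]


def rows_to_xml(rows, root_tag="employees", row_tag="employee"):
    """Export: one element per row, one sub-element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for column, value in row.items():
            ET.SubElement(item, column).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(xml_text):
    """Import: rebuild the list of row dictionaries from the XML text."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in item} for item in root]


exported = rows_to_xml(ROWS)
print(exported)
print(xml_to_rows(exported) == ROWS)  # the round trip preserves the rows
```

The receiving application only needs an XML parser and knowledge of the element names, not of the sender's storage format.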

Page 33: XML (eXtensible Markup Language)

• XML Separates Data from HTML
• XML Simplifies Data Sharing
• XML Simplifies Data Transport
• XML Simplifies Platform Changes
• XML Makes Your Data More Available
• XML is Used to Create New Internet Languages
• If developers have sense, future applications will exchange their data in XML.

What is XML Used For? (Cont...)

Page 34: XML (eXtensible Markup Language)

Is XML just for Programmers?

• No.

• XML is making its way into all kinds of Internet technologies, including HTML.

• Microsoft is leading the way in incorporating XML into its browser.

• Starting with Microsoft IE4.0, we can create XML data islands within our HTML that can be used to dynamically update data within the document instead of having to retrieve it from the server.

Page 35: XML (eXtensible Markup Language)

• It also incorporated XML support inside of the Document Object Model (DOM), making it easy to request XML documents from the server without having to constantly refresh our HTML pages.

• It is expected that both Netscape Navigator and IE are going to increase their support for XML as new versions of their respective browsers become available.

Is XML just for Programmers?

Page 36: XML (eXtensible Markup Language)

Is XML just for Programmers? (cont…)

• Before long, XML will be as common to Web content providers and authors as HTML currently is.

• Since XML is truly extensible, do not be surprised to see XML versions of HTML documents supported in the future.

• XML is becoming the standard for all kinds of data storage.

Page 37: XML (eXtensible Markup Language)

• New word processors and spreadsheets are outputting their data in XML, making it much easier to import and export data between platforms and applications.

• XML is not just for programmers, even though most users will never directly interact with it.

• It is a world wide standard that makes data storage and transfer much easier and reliable than ever before.

Is XML just for Programmers? (cont…)

Page 38: XML (eXtensible Markup Language)

XML & Its Features

• XML incorporates many of the features of SGML, while learning from the limitations of HTML.

• Like SGML, XML utilizes DTD’s, making it flexible and extensible.

• The goals of XML were more focused than those of SGML, making it much easier to implement. These goals included:-

• XML could be used with existing Internet protocols (HTTP, MIME, etc.). This makes it the ideal format for sharing information on the Internet.

Page 39: XML (eXtensible Markup Language)

XML & Its Features

• XML support is application independent. Any application can utilize and support XML documents.

• XML is platform independent. Its use of technologies such as Unicode makes it portable across machine types.

• XML is license free. It is a W3C Recommendation, maintained by the World Wide Web Consortium rather than by a single vendor. This means that it isn’t going to cost you anything to use it.

• XML is compatible with SGML.

Page 40: XML (eXtensible Markup Language)

XML & Its Features (cont)

• The feature set of XML was kept to a minimum so that applications could support it. Compare this goal with that of SGML.

• XML is a family of technologies. XML has already evolved to include support for such things as style sheets, hyperlinks, and the Document Object Model (DOM).

• XML takes the best of SGML (structured data definition capabilities) and the best of HTML (web addressing) .

Page 41: XML (eXtensible Markup Language)

• The result is a portable, highly usable, markup language that can be used by any number of applications to store and share structured data.

• Applications that will benefit or are already benefiting from XML include:
  – Office applications (word processors, spreadsheets, etc.)
  – Web applications (browsers, e-mail, etc.)
  – Server applications (database servers, e-mail servers, etc.)

• At its core, XML appears very simple. However, the implications of its use are very complex.

XML & Its Features (cont)

Page 42: XML (eXtensible Markup Language)

XML is Case Sensitive
• XML is case sensitive, and this is a very important point for creating well-formed documents and for portability.

• For example, the following will result in an error because the start tag <name> and the end tag </Name> are not recognized as the same.

<name><first>John</first><last>Doe</last></Name>

• The above would result in an error regarding unmatched tags.

• The following screen captures show how both XML-Notepad and IE5 react to the error.

Page 43: XML (eXtensible Markup Language)

• The reason that XML is case sensitive is very interesting.

• Many ASCII-based text systems convert text to upper case so that case sensitivity is not an issue.

• However, XML is a portable standard, and by portable we mean language independent.

• This is the very reason why XML is built on Unicode rather than on ASCII alone.

• With Unicode it is impossible to do any case conversion with confidence, since case mappings differ between languages and may behave differently than expected.

• XML therefore performs no case conversion: names are compared exactly as written. The names used in the XML declaration itself (xml, version, encoding) are lower case.

• It is recommended that we always use lower case as well.

XML is Case Sensitive (cont…)
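The case-sensitivity rule can be observed directly with a parser. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in for the tools named on the slides (it is not XML Notepad or IE5); the helper name parses and the trip document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


mismatched = "<name><first>John</first><last>Doe</last></Name>"
matched = "<name><first>John</first><last>Doe</last></name>"

print(parses(mismatched))  # prints: False  (<name> is not closed by </Name>)
print(parses(matched))     # prints: True

# Names that differ only in case are simply different elements.
trip = ET.fromstring("<trip><Travel>a</Travel><travel>b</travel></trip>")
print([child.tag for child in trip])  # prints: ['Travel', 'travel']
```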

Page 44: XML (eXtensible Markup Language)

XML Processors

• XML Processor is a piece of software, either an application or a library, that can process XML.

• A good example of an XML processor is XML document validation software.

• There are a number of such packages available for free and for sale on the Internet. Such applications can be used to check that an XML document is well-formed and to validate its contents.

• A well-formed document is one that adheres to the syntax rules of XML. A valid document is well-formed and also conforms to its associated DTD.

• Other good examples include XML document viewers and XML document code libraries that can be used by software that you create to manipulate XML documents.
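A very small XML processor can be sketched with a code library, here Python's standard xml.etree.ElementTree. The function name check_well_formed is hypothetical. Note the limits of the sketch: this parser checks well-formedness only and does not validate a document against a DTD, so validity checking would need a different tool.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str):
    """Return (True, None) if the text is well-formed XML,
    otherwise (False, (line, column)) locating the first error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        return False, err.position
    return True, None


print(check_well_formed("<a><b></b></a>"))  # prints: (True, None)

ok, position = check_well_formed("<a><b></a>")  # <b> is never closed
print(ok)        # prints: False
print(position)  # where the parser gave up
```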

Page 45: XML (eXtensible Markup Language)

XML-Related Technologies

• DTD (Document Type Definition) and XML Schemas are used to define legal XML tags and their attributes for particular purposes

• CSS (Cascading Style Sheets) describe how to display HTML or XML in a browser

• XSLT (eXtensible Stylesheet Language Transformations) and XPath are used to translate from one form of XML to another

• DOM (Document Object Model), SAX (Simple API for XML), and JAXP (Java API for XML Processing) are all APIs for XML parsing

Page 46: XML (eXtensible Markup Language)

XML Evolution

• XML is still evolving.

• There are a number of derivatives and extensions to XML that are being used today. Some of these include:

• eXtensible Stylesheet Language (XSL/XSLT)

• XSL is a technology by which we can describe how XML data is to be transformed and presented, for example by having the browser turn an XML document into an HTML page.

• This is a very powerful technology, although not many browsers currently support it.

• This is one of the most exciting XML implementations as it directly affects the way that data can be presented on the World Wide Web.

Page 47: XML (eXtensible Markup Language)

XML building blocks

• Aside from the directives, an XML document is built from:
  – elements: high in <high scale="F">103</high>
  – tags, in pairs: <high scale="F"> and </high>
  – attributes: scale="F" in <high scale="F">103</high>
  – entities: &amp; in <afternoon>Sunny &amp; hot</afternoon>
  – character data, which may be:
    • parsed (processed as XML); this is the default
    • unparsed (all characters stand for themselves)
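The building blocks listed above map onto what a parser hands back. As a small illustration, assuming Python's standard xml.etree.ElementTree as the parser, the two sample fragments from the slide can be taken apart like this:

```python
import xml.etree.ElementTree as ET

high = ET.fromstring('<high scale="F">103</high>')
print(high.tag)     # prints: high            (the element name)
print(high.attrib)  # prints: {'scale': 'F'}  (the attributes)
print(high.text)    # prints: 103             (the character data)

# In parsed character data, the entity reference &amp; is replaced
# by the character it stands for.
afternoon = ET.fromstring("<afternoon>Sunny &amp; hot</afternoon>")
print(afternoon.text)  # prints: Sunny & hot
```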

Page 48: XML (eXtensible Markup Language)

The General Structure of XML

• Define our own tags
  – In XML, we define our own tags.
  – If we need a tag <TUTORIAL> or <STOCKRATE>, that's no problem.

• DTD or Schema
  – If we want to use a tag, we'll have to define its meaning.
  – This definition is stored in a DTD (Document Type Definition).
  – We can define our own DTD or use an existing one.
  – Defining a DTD actually means defining an XML language.
  – An alternative to a DTD is an XML Schema.

Page 49: XML (eXtensible Markup Language)

The General Structure of XML (cont…)

• Showing the results
  – Often it's not necessary to display the data in an XML document.
  – It's possible, for instance, to store the data in a database right away.
  – If we want to show the data, we can.
  – XML itself is not capable of doing so.
  – But XML documents can be made visible with the aid of a language that defines the presentation.
  – XSL (eXtensible Stylesheet Language) was created for this purpose. But the presentation can also be defined with CSS (Cascading Style Sheets).

Page 50: XML (eXtensible Markup Language)

Structure of an XML Document

Let’s get started by breaking our example document down and seeing how it really works.

• The XML Declaration
• The Root Element
• The Logical Structure
  – Parents
  – Child
  – Siblings
• XML and Databases
• The Physical Structure
• Synchronous Structures

Page 51: XML (eXtensible Markup Language)

The XML Declaration
• The first line in the example document is the XML declaration. It is written in the style of a processing instruction and identifies the document as an XML document.
• Processing instructions are special instructions addressed to the software that reads the document.
• Let’s break the line down into parts.
• <?xml version="1.0"?>

Structure of an XML Document (Contd…)

Page 52: XML (eXtensible Markup Language)

• The first part is the processing instruction delimiter. This is the <? and ?> characters. The ? is the identifier that is used to specify this markup type.
• The second point of note is the “xml” name.
• The third is the “version” attribute, which defines the version of XML that this document complies with.
• The declaration can also carry a “standalone” attribute, which specifies whether the document is stand-alone (as is the case with this example, which relies on no external DTD), or requires a separate DTD in order to make sense of the data contained therein.

Structure of an XML Document (Contd…)

Page 53: XML (eXtensible Markup Language)

The Root Element

• Each XML document must have a root element, and there can be only one.

• A root element is the element that encapsulates all other elements in the XML document.

• In our example, <employees> is the root element. Notice that all other elements are located within the <employees> and </employees> tags.

<?xml version="1.0"?>
<employees>
  . . . (rest of document omitted)
</employees>

Structure of an XML Document (Contd…)

Page 54: XML (eXtensible Markup Language)

The Logical Structure
• In theory, there are two types of structure in XML.
• The first is the logical structure, the second is the physical structure.

• The logical structure has nothing to do with the physical entities associated in an XML document, but instead has to do with the order of the elements that it contains.

• The logical structure is independent of the physical structure because the logical structure includes all external entities that may be referenced by an XML document.

Employee.xml Logical Diagram

Structure of an XML Document (Contd…)

Page 55: XML (eXtensible Markup Language)

The Logical Structure
• The following diagram illustrates the logical structure of our employee example.
• A key concept in XML is the idea of element relationship.

• Elements are said to be related to each other in one of three ways: parent, child, or sibling.

• Relationship is relative to the way that you are referring to an element.

Structure of an XML Document (Contd…)

Page 56: XML (eXtensible Markup Language)

Parents
• Parent elements are elements that contain other elements.
• An example of a parent is the <name> element.
• <name> has two children: <first> and <last>. An element is a parent to the elements that it contains.

<name>
  <first>John</first>
  <last>Doe</last>
</name>

Structure of an XML Document (Contd…)

Page 57: XML (eXtensible Markup Language)

Child
• Child elements are elements that are contained within a parent element.
• For example, both <first> and <last> are child elements of <name>.

<name>
  <first>John</first>
  <last>Doe</last>
</name>

Siblings
• Elements that share a parent are called siblings.
• While <first> and <last> are children of <name>, they are also siblings to each other.
• Another example of siblings includes <name>, <position>, <address> and <phone>.

Structure of an XML Document (Contd…)
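The root, parent, child and sibling relationships can be explored programmatically. The sketch below is an illustration only: the full employee document is not reproduced in these slides, so the record shown here (including the <employee> wrapper and the sample values) is a hypothetical reconstruction using the element names the slides mention. Python's xml.etree.ElementTree keeps no parent pointers, so the sketch builds its own child-to-parent map.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee record; only the element names come from the slides.
EMPLOYEES_XML = """<?xml version="1.0"?>
<employees>
  <employee>
    <name><first>John</first><last>Doe</last></name>
    <position>Engineer</position>
    <address>1 Example Street</address>
    <phone>555-0100</phone>
  </employee>
</employees>"""

root = ET.fromstring(EMPLOYEES_XML)

# Map every element to its parent by walking the whole tree once.
parent_of = {child: parent for parent in root.iter() for child in parent}


def sibling_tags(element):
    """Tags of the other elements that share this element's parent."""
    parent = parent_of.get(element)
    if parent is None:          # the root element has no parent
        return []
    return [other.tag for other in parent if other is not element]


name = root.find("employee/name")
first = name.find("first")

print(root.tag)                       # prints: employees
print([child.tag for child in name])  # prints: ['first', 'last']
print(parent_of[first].tag)           # prints: name
print(sibling_tags(first))            # prints: ['last']
print(sibling_tags(name))             # prints: ['position', 'address', 'phone']
```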

Page 58: XML (eXtensible Markup Language)

XML and Databases

• XML is hierarchical in nature.

• The concepts of parent, child, and sibling elements are not new to data storage and databases in general.

• Those who have used hierarchical databases such as IMS will easily understand the logical structure of an XML document.

• For those who are familiar with newer databases that utilize the relational model, XML may seem a little odd.

• When we get into content modeling, you will see that it is easy to model both types of databases using XML.

Structure of an XML Document (Contd…)

Page 59: XML (eXtensible Markup Language)

The Physical Structure

• Understanding the logical structure is fundamental to using XML within your software applications.

• Understanding how to manipulate the physical structure is fundamental to creating and maintaining XML documents.

• The physical storage of an XML document is called an entity.

• An entity can reference another entity (i.e. one XML document references another XML document) and the result is a single logical structure of elements.

• This means that we could physically break up our employee example into separate XML documents, make reference to them from a primary document, and treat them as a single logical entity.

Structure of an XML Document (Contd…)

Page 60: XML (eXtensible Markup Language)

The Physical Structure (Contd…)

• Being able to break a document’s logical structure into multiple physical parts and reference them individually or from other documents is extremely powerful.

• It can also get you into a lot of trouble.

• Let’s say for argument's sake that Position.xml contained a </employees> tag. This would ruin the logical structure and cause the XML processor to be unable to process the document.

• We need to be very careful in defining the contents of physical structure so that our logical structure can remain intact.

Structure of an XML Document (Contd…)

Page 61: XML (eXtensible Markup Language)

XML Tags

• Tags
  – XML tags are created like HTML tags.
  – There's a start tag and a closing tag: <TAG>content</TAG>
  – The closing tag uses a slash after the opening bracket, just like in HTML.
  – The text between the brackets is the element name; the start tag, the content and the closing tag together form an element.

• Syntax
  – Tags are case sensitive.
  – The tag <TRAVEL> differs from the tags <Travel> and <travel>.
  – Starting tags always need a closing tag.
  – All tags must be nested properly.
  – Comments can be used like in HTML: <!-- Comments -->
  – Between the starting tag and the end tag XML expects the content.
  – <amount>135</amount> is a valid element named amount that has the content 135.

Page 62: XML (eXtensible Markup Language)

XML Tags (cont…)

• Review of XML rules
  – Start with <?xml version="1.0"?>
  – XML is case sensitive
  – You must have exactly one root element that encloses all the rest of the XML
  – Every element must have a closing tag
  – Elements must be properly nested
  – Attribute values must be enclosed in double or single quotation marks
  – There are only five predeclared entities: &lt;, &gt;, &amp;, &quot; and &apos;
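The five predeclared entities matter whenever text containing markup characters has to be placed inside a document. A minimal sketch using Python's standard xml.sax.saxutils follows; its escape function handles &, < and > by default, and the two quote entities are supplied here through an extra mapping. The sample string and the variable names are assumptions made for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; quotes need an explicit mapping.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARS = {"&quot;": '"', "&apos;": "'"}

raw = 'a < b & "c"'
escaped = escape(raw, QUOTE_ENTITIES)
print(escaped)  # prints: a &lt; b &amp; &quot;c&quot;

# unescape() reverses the substitution and returns the original text.
print(unescape(escaped, QUOTE_CHARS) == raw)  # prints: True
```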

Page 63: XML (eXtensible Markup Language)

XML Tags (cont…)

• Empty Tags
  – Besides a starting tag and a closing tag, we can use an empty tag.
  – An empty tag does not have a closing tag.
  – The syntax differs from HTML: <TAG/>
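To a parser, the empty-tag shorthand and an explicit start/end pair with nothing in between describe the same empty element. A short check, assuming Python's standard xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<TAG/>")
explicit_form = ET.fromstring("<TAG></TAG>")

# Both forms yield an element with the same name, no text and no children.
print(short_form.tag == explicit_form.tag)     # prints: True
print(short_form.text, explicit_form.text)     # prints: None None
print(len(short_form), len(explicit_form))     # prints: 0 0

# When writing, this library emits the short form for empty elements.
print(ET.tostring(explicit_form, encoding="unicode"))  # prints: <TAG />
```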

Page 64: XML (eXtensible Markup Language)

Elements & sub elements

• Elements and children

– With XML tags we define the type of data. But often data is more complex. It can consist of several parts.

– To describe the element car with a single value, we can write <car>mercedes</car>.

– A model that describes the car in several parts might look like this:

– <car>
    <brand>volvo</brand>
    <type>v40</type>
    <color>green</color>
  </car>

Page 65: XML (eXtensible Markup Language)

Elements & sub elements (cont…)

– Besides the element car three other elements are used: brand, type and color.

– Brand, type and color are sub-elements of the element car.

– In the XML-code the tags of the sub-elements are enclosed within the tags of the element car.

– Sub-elements are also called children.
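The car model with its three sub-elements can also be built from a program rather than typed by hand. This is a minimal sketch using Python's standard xml.etree.ElementTree; it is one possible way to produce the markup, not something prescribed by the slides.

```python
import xml.etree.ElementTree as ET

# Create the parent element, then attach each sub-element (child) to it.
car = ET.Element("car")
for tag, value in (("brand", "volvo"), ("type", "v40"), ("color", "green")):
    ET.SubElement(car, tag).text = value

print(ET.tostring(car, encoding="unicode"))
# prints: <car><brand>volvo</brand><type>v40</type><color>green</color></car>

print([child.tag for child in car])  # prints: ['brand', 'type', 'color']
```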

Page 66: XML (eXtensible Markup Language)

XML Attributes

• Attributes
  – Elements in XML can use attributes. The syntax is: <element attribute-name="attribute-value">....</element>
  – The value of an attribute needs to be quoted, even if it contains only numbers: <car color="green">volvo</car>
  – The same information can also be defined without using attributes:
  – <car>
      <brand>volvo</brand>
      <color>green</color>
    </car>

• Avoid attributes
  – When possible, try to avoid attributes.
  – Data structures are more easily described in XML tags.
  – Software that checks XML documents can do a better job with tags than with attributes.

Page 67: XML (eXtensible Markup Language)

Elements and attributes

• Attributes and elements are somewhat interchangeable
• Example using just elements:
<name> <first>David</first> <last>Matuszek</last> </name>
• Example using attributes:
<name first="David" last="Matuszek"></name>
• You will find that elements are easier to use in your programs; this is a good reason to prefer them
• Attributes often contain metadata, such as unique IDs
• Generally speaking, browsers display only elements (values enclosed by tags), not tags and attributes
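
To compare the two styles from a program's point of view, here is a small sketch assuming Python's standard xml.etree.ElementTree module (an assumption of this example, not something the slides use). Element content is read with findtext(), attribute values with get():

```python
import xml.etree.ElementTree as ET

as_elements = ET.fromstring(
    "<name><first>David</first><last>Matuszek</last></name>"
)
as_attributes = ET.fromstring('<name first="David" last="Matuszek"></name>')

# Child element content is reached by tag name...
first_from_element = as_elements.findtext("first")
# ...while attribute values are looked up on the element itself.
first_from_attribute = as_attributes.get("first")

print(first_from_element, first_from_attribute)
```

Either form carries the same data; the element form can later grow sub-elements, which an attribute value cannot.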

Page 68: XML (eXtensible Markup Language)

XML documents

• The XML declaration
– The first line of an XML document is the XML declaration.
– It's a special kind of tag: <?xml version="1.0"?>
– Version 1.0 is the current version of XML.
– The XML declaration makes clear that we're talking XML and also which version is used.
– The version identification will become important after new versions of XML are used.

• The root element
– All XML documents must have a root element.
– All other elements in the same document are children (or further descendants) of this root element.
– The root element is the top level of the structure in an XML document.

Page 69: XML (eXtensible Markup Language)

XML documents (cont…)

• Structure of an XML page

– <?xml version="1.0"?>
<root>
  <element>
    <sub-element> content </sub-element>
    <sub-element> content </sub-element>
  </element>
</root>

– All elements must be nested.
– The level of nesting can be arbitrarily deep.

Page 70: XML (eXtensible Markup Language)

XML documents (cont…)

• A real XML page

Page 71: XML (eXtensible Markup Language)

XML documents (cont…)

Another well-structured example

<novel>
  <foreword>
    <paragraph>This is the great American novel.</paragraph>
  </foreword>
  <chapter number="1">
    <paragraph>It was a dark and stormy night.</paragraph>
    <paragraph>Suddenly, a shot rang out!</paragraph>
  </chapter>
</novel>

Page 72: XML (eXtensible Markup Language)

XML as a Tree

• An XML document represents a hierarchy; a hierarchy is a tree

novel
  foreword
    paragraph: This is the great American novel.
  chapter (number="1")
    paragraph: It was a dark and stormy night.
    paragraph: Suddenly, a shot rang out!

Page 73: XML (eXtensible Markup Language)

XML as a Tree (cont…)

• XML documents form a tree structure that starts at "the root" and branches to "the leaves".

• An Example XML Document

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 74: XML (eXtensible Markup Language)

XML as a Tree (cont…)

• The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin-1/West European character set).

• The next line describes the root element of the document (like saying: "this document is a note"):

• The next 4 lines describe 4 child elements of the root (to, from, heading, and body):

• And finally the last line defines the end of the root element:

• From this example we can see that the XML document contains a note to Tove from Jani.

• XML is pretty self-descriptive.

Page 75: XML (eXtensible Markup Language)

XML as a Tree (cont…)

XML Documents Form a Tree Structure
• XML documents must contain a root element. This element is "the parent" of all other elements.
• The elements in an XML document form a document tree.
• The tree starts at the root and branches to the lowest level of the tree.
• All elements can have sub elements (child elements):

<root> <child> <subchild>.....</subchild> </child> </root>

• The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have children. Children on the same level are called siblings (brothers or sisters).

• All elements can have text content and attributes (just like in HTML).

Page 76: XML (eXtensible Markup Language)

XML Tree (Contd…)

Example: The image above represents one book in the XML below:

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

• The root element in the example is <bookstore>.
• All <book> elements in the document are contained within <bookstore>.
• The <book> element has 4 children: <title>, <author>, <year>, <price>.
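
The tree relationships in the bookstore example can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree module (not part of the original slides) and uses the same bookstore data:

```python
import xml.etree.ElementTree as ET

BOOKSTORE = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE)

# The root element is the parent of every <book>.
titles = [book.findtext("title") for book in root.findall("book")]

# The children of the first <book> are siblings of one another.
first_book_children = [child.tag for child in root[0]]

# Attributes live on the element, next to its children.
categories = [book.get("category") for book in root]

print(root.tag, titles, first_book_children, categories)
```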

Page 77: XML (eXtensible Markup Language)

Anatomy of an XML Document

• The following XML document is a sample that stores employee information.

<?xml version="1.0"?>
<employees>
  <employee id="A1234">
    <name> <first>John</first> <last>Doe</last> </name>
    <position>Programmer</position>
    <address>
      <street>123 Main Street</street>
      <city>Anywhere</city>
      <state>CA</state>
      <zip>92000</zip>
    </address>
    <phone>
      <main>(714) 555-1000</main>
      <fax>(714) 555-1001</fax>
    </phone>
  </employee>
</employees>

• As we can see, XML is very easy to read and understand.

• XML is both similar to and very different from HTML.

• It is similar in the fact that it is a markup language and uses tags. The tags however are not HTML tags.

• Unlike HTML, we can create our own tags as we have done here.

Page 78: XML (eXtensible Markup Language)

XML Terminology

Tags
• A tag is an identifier to an element.
• Just like HTML, tags are identified using less-than (<) and greater-than (>) symbols. These symbols are considered markup and the text between them is known as the tag.

• HTML example:
<title>This is the title of my HTML page</title>

• The tag we are using here is the "title" tag. Notice that it has a start tag (<title>), and a different version of the tag at the end (</title>).

• The slash in the second tag marks it as the end (terminator) tag. The difference between HTML tags and XML tags is that in XML we can name the tags what we want (provided we follow some basic rules).

• In our example XML document, we have defined lots of tags. In the following example, there are three different tags that are defined: <name>, <first> and <last>.
<name><first>John</first><last>Doe</last></name>

Page 79: XML (eXtensible Markup Language)

XML Terminology (cont…)

Tags
• In XML we have control of the tag names. There are only a few rules that we have to follow when it comes to naming tags.

1. The tag name must begin with a letter (A-Z or a-z), an underscore ( _ ) or a colon ( : ).

2. The tag name can contain digits, but they cannot be the first character.

3. A colon is legal in a name, but it should be used only for namespace prefixes.

4. Spaces and/or tabs are not allowed in tag names.

5. The only punctuation signs that we can use in a tag name are the hyphen ( - ), the full stop ( . ), the underscore ( _ ) and the colon ( : ). Underscores may be used to separate long names (e.g. <this_is_my_long_tag_name>).

6. A hyphen or a full stop is allowed anywhere except as the first character.

Page 80: XML (eXtensible Markup Language)

XML Terminology (cont…)

7. To separate the words in a long tag name we can use a hyphen or a full stop (e.g. <this.is.my.long.tag.name> or <this-is-my-long-tag-name>).

8. Use tag names that make sense. This isn't really a rule because it cannot be enforced by an XML processor.

• One of the features of XML is its readability.
• The following XML would be well-formed and perfectly usable, but it is far from readable.
<ab><cd>David</cd><_>Doe</_></ab>

• Other than enforcing these simple rules, XML lets us name our tags as we like.
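
The naming rules of XML 1.0 can be approximated with a regular expression. This is only a sketch: it assumes Python, the helper name is_ascii_xml_name is made up for this example, and it covers ASCII names only (the real rule also admits non-English letters and digits).

```python
import re

# ASCII-only simplification of the XML 1.0 name rule:
# first character: letter, underscore or colon;
# later characters: also digits, hyphen and full stop.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_ascii_xml_name(name: str) -> bool:
    """Return True if `name` is a legal XML name under the ASCII simplification."""
    return NAME_RE.fullmatch(name) is not None


for candidate in ["position", "this_is_my_long_tag_name", "_", "1st", "my tag", "-x"]:
    print(candidate, is_ascii_xml_name(candidate))
```

A name that passes this check is legal but not necessarily readable, which is why sensible naming remains a matter of style.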

Page 81: XML (eXtensible Markup Language)

Elements• In XML documents, data is stored in elements.

• Elements are identified using a start tag and an end tag.

• The name of the tag is defined by the author.

• We immediately can see the difference between HTML and XML, in the way that the tags are named.

• In HTML, we have a predefined set of tags to work with.

• In XML, we can create our own tag names so that it makes logical sense for what we are doing.

XML TerminologyCont.

Page 82: XML (eXtensible Markup Language)

XML Terminology (cont…)

Elements
• In our example document, we have several elements. For example, the following element called <position> contains a value of "Programmer".

• <position>Programmer</position>

• Note that the end tag uses a forward slash ("/") to denote the end of the element.

Page 83: XML (eXtensible Markup Language)

XML Terminology (cont…)

• Elements can be empty, meaning that they do not contain any data. For example, if we did not have a value for the <position> element it would look like this:
<position></position>

• In XML, it is not necessary to use two tags to represent an empty element. Instead, we can use an empty-element tag, which combines the start tag and the end tag in a single tag.

• An empty-element tag has a forward slash before its closing bracket, as shown:
<position/> is equivalent to: <position></position>
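
A quick way to confirm that the two spellings of an empty element are equivalent is to parse both. The sketch below assumes Python's standard xml.etree.ElementTree module, which is not part of the original slides:

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<position/>")
long_form = ET.fromstring("<position></position>")

# Both forms produce an element with the same name, no text and no children.
same_name = short_form.tag == long_form.tag
both_empty = (
    short_form.text is None
    and long_form.text is None
    and len(short_form) == 0
    and len(long_form) == 0
)

# Serialising either one gives identical output.
same_output = ET.tostring(short_form) == ET.tostring(long_form)

print(same_name, both_empty, same_output)
```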

Page 84: XML (eXtensible Markup Language)

Attributes
• Element tags can contain attributes, which give further information about the elements they delimit. For example:
<employee id="A1234">. . . (contents omitted for brevity)</employee>

• In the above example, the <employee> tag has an attribute called “id”.

• The “id” attribute contains a value of “A1234”.

• Attributes can only be specified in the start tag of an element.

XML TerminologyCont.

Page 85: XML (eXtensible Markup Language)

Attributes

XML Terminology (cont…)

• Duplicate attributes are not allowed in XML. For example, the following start tag is an error:
<employee id="A1234" id="abcd">

• The XML specification states that an attribute name may appear only once in the same start tag or empty-element tag.

• This is a well-formedness constraint, so a conforming XML processor must report the error rather than merge the two values or silently pick one of them.

• HTML viewers are often more forgiving about such mistakes; XML processors are not supposed to be.

• Using the version of Microsoft XML Notepad that is available at the time of this writing, an error occurs when it encounters the above declaration.
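
How a parser reacts to a repeated attribute can be tried out directly. This sketch assumes Python's standard xml.etree.ElementTree module (built on the expat parser), which is not the tool discussed in the slides; it treats a repeated attribute name as a well-formedness error:

```python
import xml.etree.ElementTree as ET

try:
    ET.fromstring('<employee id="A1234" id="abcd"/>')
    duplicate_error = None
except ET.ParseError as exc:
    # The parser stops at the second "id" and reports where it found it.
    duplicate_error = str(exc)

# The same element with a single id attribute parses without complaint.
single = ET.fromstring('<employee id="A1234"/>')

print(duplicate_error)
print(single.get("id"))
```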

Page 86: XML (eXtensible Markup Language)

XML Terminology (cont…)

Entities
• An entity has many applications in XML.
• The official definition of an entity is a storage object.
• The first XML entity that we have encountered so far is the XML document.
• The document as a whole is an entity. This entity is divided into elements.
• There are other types of entities in XML as well.
• One of the most powerful capabilities of XML is its ability to include external entities. In other words, we can reference (and thus include) another file inside an XML document.

• HTML does something similar. The <IMG> tag references an external graphics file. The graphics file is not embedded inside the HTML document; the document references it and the browser requests it from the server.
<IMG SRC="./images/somepicture.jpg">

Page 87: XML (eXtensible Markup Language)

XML Terminology (cont…)

Entities
• We can do the same thing in XML.
<picture.1 source="./images/somepicture.jpg"/>
• In this example, the graphics file is considered an unparsed entity, as it is not parsed by the XML processor.
• We can also include other XML documents. The result is that the XML document is included in the document that referenced it and they are presented by the XML processor as one.

Page 88: XML (eXtensible Markup Language)

Well formed XML documents

• Well formedness
– An XML document needs to be well formed.
– Well formed means that the document conforms to the syntax rules for XML.

• The Rules
– To be well formed a document needs to comply with the following rules:
– it contains a root element.
– all other elements are children of the root element.
– all elements are correctly paired.
– the element name in a start-tag and an end-tag are exactly the same.
– attribute names are used only once within the same element

Page 89: XML (eXtensible Markup Language)

Well-formed XML

• Every element must have both a start tag and an end tag, e.g. <name> ... </name>
– But empty elements can be abbreviated: <break />.
– XML tags are case sensitive
– XML tags may not begin with the letters xml, in any combination of cases
• Elements must be properly nested, e.g. not <b><i>bold and italic</b></i>
• Every XML document must have one and only one root element
• The values of attributes must be enclosed in single or double quotes, e.g. <time unit="days">
• Character data cannot contain < or &
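
The well-formedness rules can be exercised with any XML parser. The following sketch assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for this example:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = {
    "properly nested": "<p><b><i>bold and italic</i></b></p>",
    "badly nested": "<p><b><i>bold and italic</b></i></p>",
    "case mismatch": "<Message>text</message>",
    "unquoted attribute": "<time unit=days>3</time>",
    "two root elements": "<a/><b/>",
    "bare ampersand": "<t>Tom & Jerry</t>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Only the first sample is accepted; each of the others breaks exactly one of the rules listed above.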

Page 90: XML (eXtensible Markup Language)

Entities

• Five special characters must be written as entities:
&amp; for & (almost always necessary)
&lt; for < (almost always necessary)
&gt; for > (not usually necessary)
&quot; for " (necessary inside double quotes)
&apos; for ' (necessary inside single quotes)

• These entities can be used even in places where they are not absolutely required

• These are the only predefined entities in XML
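
Programs rarely write these entities by hand. As a sketch, assuming Python's standard xml.sax.saxutils module (not mentioned in the slides), escape() replaces &, < and > and accepts an optional table for the two quote characters:

```python
from xml.sax.saxutils import escape, unescape

text = "if salary < 1000 & bonus > 0"

# escape() handles the three characters that matter in element content.
escaped = escape(text)

# The quote characters are only replaced when asked for explicitly.
quote_entities = {'"': "&quot;", "'": "&apos;"}
escaped_quotes = escape("say \"hi\" & 'bye'", quote_entities)

# unescape() reverses the default replacements.
restored = unescape(escaped)

print(escaped)
print(escaped_quotes)
print(restored)
```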

Page 91: XML (eXtensible Markup Language)

Comments

• <!-- This is a comment in both HTML and XML -->
• Comments can be put almost anywhere in an XML document (but not inside a tag)
• Comments are useful for:
– Explaining the structure of an XML document
– Commenting out parts of the XML during development and testing
• Comments are not elements and do not have an end tag
• The blanks after <!-- and before --> are optional
• The character sequence -- cannot occur in the comment
• The closing bracket must be -->

Page 92: XML (eXtensible Markup Language)

Comments (cont…)

• Comments are not displayed by browsers, but can be seen by anyone who looks at the source code

• Yes, XML has comments, and it is always a good idea to use them.

• Getting into the good habit of commenting what we do will save lots of time and frustration in the long run.

• The syntax for comments is: <!--This text is a comment-->

• The start of a comment is represented by the <!-- characters, while the end of the comment is represented by the -->.

Page 93: XML (eXtensible Markup Language)

Comments (Contd…)

• The XML specification does not require that comments be passed to an application.

• Some XML processors may strip the comments out before the application sees them.

• It is important to comment your XML documents. Even though XML is designed to be readable, everything is relative.

• Even XML can look pretty complicated when you take advantage of all of its capabilities.
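
Whether an application ever sees a comment depends on the processor. As an illustration, assuming Python's standard library (not part of the slides), the default ElementTree builder drops comments while minidom keeps them as nodes:

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

SOURCE = "<note><!-- reminder for Tove --><to>Tove</to></note>"

# ElementTree's default builder discards comments while parsing.
et_children = [child.tag for child in ET.fromstring(SOURCE)]

# minidom keeps each comment as a Comment node in the tree.
dom_root = minidom.parseString(SOURCE).documentElement
dom_comments = [
    node.data for node in dom_root.childNodes if node.nodeType == node.COMMENT_NODE
]

print(et_children)
print(dom_comments)
```

An application that needs its comments preserved should therefore check what its parser does with them.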

Page 94: XML (eXtensible Markup Language)

Names in XML

• Names (as used for tags and attributes) must begin with a letter or underscore, and can consist of:
– Letters, both Roman (English) and foreign
– Digits, both Roman and foreign
– . (dot)
– - (hyphen)
– _ (underscore)
– : (colon) should be used only for namespaces
– Combining characters and extenders (not used in English)

Page 95: XML (eXtensible Markup Language)

Namespaces

• Recall that DTDs are used to define the tags that can be used in an XML document
• An XML document may combine tags from more than one vocabulary
• Namespaces are a way to specify which vocabulary a given tag belongs to
• XML, like Java, uses qualified names
– This helps to avoid collisions between names
– Java: myObject.myVariable
– XML: myPrefix:myTag
– Note that XML uses a colon (:) rather than a dot (.)

Page 96: XML (eXtensible Markup Language)

Namespaces and URIs

• A namespace is defined as a unique string
– To guarantee uniqueness, typically a URI (Uniform Resource Identifier) is used, because the author "owns" the domain
– It doesn't have to be a "real" URI; it just has to be a unique string
– Example: http://www.matuszek.org/ns

• There are two ways to use namespaces:
– Declare a default namespace
– Associate a prefix with a namespace, then use the prefix in the XML to refer to the namespace

Page 97: XML (eXtensible Markup Language)

Namespace syntax

• In any start tag you can use the reserved attribute name xmlns:
<book xmlns="http://www.matuszek.org/ns">
– This namespace will be used as the default for all elements up to the corresponding end tag
– You can override it with a specific prefix

• You can use almost this same form to declare a prefix:
<book xmlns:dave="http://www.matuszek.org/ns">
– Use this prefix on every tag and attribute you want to use from this namespace, including end tags; it is not a default prefix
<dave:chapter dave:number="1">To Begin</dave:chapter>

• You can use the prefix in the start tag in which it is defined:
<dave:book xmlns:dave="http://www.matuszek.org/ns">
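
To see what a parser makes of a prefix, here is a sketch assuming Python's standard xml.etree.ElementTree module (not part of the slides). The parser replaces each prefix with the namespace string itself, written in braces in front of the local name:

```python
import xml.etree.ElementTree as ET

NS = "http://www.matuszek.org/ns"

prefixed = ET.fromstring(
    f'<dave:book xmlns:dave="{NS}">'
    '<dave:chapter dave:number="1">To Begin</dave:chapter>'
    "</dave:book>"
)
defaulted = ET.fromstring(f'<book xmlns="{NS}"><chapter>To Begin</chapter></book>')

# The prefix is only shorthand: both documents yield the same qualified tag.
same_tag = prefixed.tag == defaulted.tag == f"{{{NS}}}book"

# Searches take a prefix-to-namespace mapping; the prefix used here is arbitrary.
chapter = prefixed.find("d:chapter", {"d": NS})
number = chapter.get(f"{{{NS}}}number")

print(prefixed.tag, chapter.text, number)
```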

Page 98: XML (eXtensible Markup Language)

Valid XML documents

• Valid
– To be of practical use, an XML document should be valid.
– To be valid an XML document needs to comply with the following rules:
• The document must be well formed.
• The document must comply with the rules as defined in a Document Type Definition (DTD)
– If a document is valid, it's clearly defined what the data in the document really means.
– It is not possible to use a tag that's not defined in the DTD.
– Companies that exchange XML documents can check them with the same DTD.
– Because a valid XML document is also well formed, there's no possibility for typos in the tags.

• Valid is about structure
– A valid XML document has a structure that's valid.
– There's no check for the content.

Page 99: XML (eXtensible Markup Language)

Valid XML documents (cont…)

• We can make up our own XML tags and attributes, but...
– ...any program that uses the XML must know what to expect!
• A DTD (Document Type Definition) defines what tags are legal and where they can occur in the XML
• An XML document does not require a DTD
• XML is well-structured if it follows the rules given earlier
• In addition, XML is valid if it declares a DTD and conforms to that DTD
• A DTD can be included in the XML, but is typically a separate document
• Errors in XML documents will stop XML programs
• Some alternatives to DTDs are XML Schemas and RELAX NG

Page 100: XML (eXtensible Markup Language)

XML: the DTD

• Defining the language
– To define our own XML-based language we use a DTD (Document Type Definition).
– A DTD contains the rules for a particular type of XML documents.
– Actually it's the DTD that defines the language.

• Elements
– A DTD describes elements.
– It uses the following syntax:
– The text <!ELEMENT, followed by the name of the element, followed by a description of the element.
– For instance: <!ELEMENT brand (#PCDATA)>
– This DTD description defines the XML tag <brand>.

Page 101: XML (eXtensible Markup Language)

XML: the DTD (cont…)

• Data
– The description (#PCDATA) stands for parsed character data.
– It's the text between the tags, which will be parsed (interpreted) by the program that reads the XML document.
– CDATA stands for character data. In a DTD it is used as an attribute type (in <!ATTLIST> declarations), not as element content.
– Text inside a CDATA section (<![CDATA[ ... ]]>) in a document is not parsed for markup.

• Sub elements
– An element that contains sub elements is described thus:
– <!ELEMENT car (brand, type)>
– <!ELEMENT brand (#PCDATA)>
– <!ELEMENT type (#PCDATA)>
– This means that the element car has two sub elements: brand and type.
– Each sub element can contain characters.

Page 102: XML (eXtensible Markup Language)

XML: the DTD (cont…)

• Number of sub elements
– If you use <!ELEMENT car (brand, type)>, the sub elements brand and type must each occur exactly once inside the element car, in that order.
– To change the number of possible occurrences the following indications can be used:
• + must occur at least one time but may occur more often
• * may occur more often but may also be omitted
• ? may occur once or not at all
– The indications are used behind the sub element name.
– For instance: <!ELEMENT animal (color+)>

• Making choices
– With the sign '|', we define a choice between two sub elements.
– We enter the sign between the names of the sub elements.
– <!ELEMENT animal (wingsize|legsize)>
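
The occurrence indicators and the choice sign behave much like the same symbols in regular expressions. The sketch below makes that analogy concrete; it is not a DTD validator. It assumes Python, and the helper children_match and the pattern constants are invented for this example:

```python
import re
import xml.etree.ElementTree as ET


def children_match(xml_text: str, pattern: str) -> bool:
    """Compare the sequence of child tag names of the root against a regex."""
    root = ET.fromstring(xml_text)
    sequence = "".join(f"<{child.tag}>" for child in root)
    return re.fullmatch(pattern, sequence) is not None


CAR = r"<brand><type>"                # like (brand, type)
ANIMAL_COLORS = r"(<color>)+"         # like (color+)
ANIMAL_SIZE = r"<wingsize>|<legsize>"  # like (wingsize|legsize)

print(children_match("<car><brand>volvo</brand><type>v40</type></car>", CAR))
print(children_match("<car><type>v40</type><brand>volvo</brand></car>", CAR))
print(children_match("<animal><color/><color/></animal>", ANIMAL_COLORS))
print(children_match("<animal/>", ANIMAL_COLORS))
print(children_match("<animal><legsize/></animal>", ANIMAL_SIZE))
```

A validating parser performs this kind of check for every element, using the content models declared in the DTD.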

Page 103: XML (eXtensible Markup Language)

XML: the DTD (cont…)

• Empty elements
– Empty elements get the description EMPTY.
– For instance <!ELEMENT separator EMPTY>
– It could define a separator line to be shown if the XML document appears in a browser.

• DTD: external
– A DTD can be an external document that's referred to.
– The external DTD file contains the element definitions.
– In the XML document we make clear that we'll use this DTD with the line:
<!DOCTYPE name of root-element SYSTEM "address">
– The address is a URL that points to the DTD.
– It should be typed after the line <?xml version="1.0"?>

Page 104: XML (eXtensible Markup Language)

XML: the DTD (cont…)

• DTD: internal
– A DTD can also be included in the XML document itself.
– After the line <?xml version="1.0"?> we must type <!DOCTYPE name of root-element [ followed by the element definitions.
– The DTD part is closed with ]>

Page 105: XML (eXtensible Markup Language)

XML Syntax Rules

All XML Elements Must Have a Closing Tag
• All elements must have a closing tag:
<p>This is a paragraph</p>
<p>This is another paragraph</p>

XML Tags are Case Sensitive
• XML elements are defined using XML tags.
• XML tags are case sensitive.
• With XML, the tag <Letter> is different from the tag <letter>.
• Opening and closing tags must be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>

Page 106: XML (eXtensible Markup Language)

XML Syntax Rules (cont…)

XML Elements Must be Properly Nested
• In XML, all elements must be properly nested within each other:
• <b><i>This text is bold and italic</i></b>

XML Documents Must Have a Root Element
• XML documents must contain one element that is the parent of all other elements.
• This element is called the root element.
• <root> <child> <subchild>.....</subchild> </child> </root>

XML Attribute Values Must be Quoted
• XML elements can have attributes in name/value pairs just like in HTML.
• In XML the attribute value must always be quoted.
• <note date="12/11/2007"> <to>Tove</to> <from>Jani</from> </note>

Page 107: XML (eXtensible Markup Language)

XML Syntax Rules (cont…)

Entity References
• Some characters have a special meaning in XML.
• If we place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.

• This will generate an XML error:
<message>if salary < 1000 then</message>

• To avoid this error, replace the "<" character with an entity reference:
<message>if salary &lt; 1000 then</message>

• There are 5 predefined entity references in XML:

&lt;    <   less than
&gt;    >   greater than
&amp;   &   ampersand
&apos;  '   apostrophe
&quot;  "   quotation mark

Comments in XML
• The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->

Page 108: XML (eXtensible Markup Language)

XML Syntax Rules (cont…)

With XML, White Space is Preserved
• HTML reduces multiple white space characters to a single white space:
HTML: Hello           my name is Tove
Output: Hello my name is Tove.

• With XML, the white space in your document is not truncated.

XML Stores New Line as LF
• In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF).
• The character pair bears some resemblance to the typewriter actions of setting a new line.
• In Unix applications, a new line is normally stored as a LF character.
• Older Macintosh applications used only a CR character to store a new line.
• XML stores a new line as a single LF character.
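
Both behaviours can be observed through a parser. The sketch assumes Python's standard xml.etree.ElementTree module (not part of the slides): runs of spaces come through unchanged, while a Windows-style CR+LF pair reaches the application as a single LF.

```python
import xml.etree.ElementTree as ET

# White space inside element content is handed to the application untouched.
spaced = ET.fromstring("<greeting>Hello           my name is Tove</greeting>")

# A CR+LF line ending in the source is normalised to LF by the parser.
windows_style = ET.fromstring("<text>line one\r\nline two</text>")

print(repr(spaced.text))
print(repr(windows_style.text))
```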

Page 109: XML (eXtensible Markup Language)

Presenting XML documents

• Showing XML documents
– XML is about defining data.
– With XML, we can define documents that are understood by computers.
– But to make these documents understandable to humans, we need to show them.

• CSS
– Cascading Style Sheets (CSS) offer possibilities to show XML.
– It works just like adding styles to HTML elements.

Page 110: XML (eXtensible Markup Language)

Presenting XML documents (cont…)

• XSL
– The preferred solution is using XSL (eXtensible Stylesheet Language).
– XSL can convert XML documents into HTML.
– It can be used client side but the best solution is to use XSL server side.
– We can convert our XML documents to HTML, thus making them visible to any browser.

Page 111: XML (eXtensible Markup Language)

Viewing XML

• XML is designed to be processed by computer programs, not to be displayed to humans

• Nevertheless, almost all current browsers can display XML documents
– They don't all display it the same way
– They may not display it at all if it has errors
– For best results, update your browsers to the newest available versions

• Remember: HTML is designed to be viewed, XML is designed to be used

Page 112: XML (eXtensible Markup Language)

Viewing XML Files

• Raw XML files can be viewed in all major browsers.
• Don't expect XML files to be displayed as HTML pages.

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

• The XML document will be displayed with color-coded root and child elements.
• A plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure.
• To view the raw XML source (without the + and - signs), select "View Page Source" or "View Source" from the browser menu.

Page 113: XML (eXtensible Markup Language)

Viewing XML Files (cont…)

• Other XML Examples
– An XML CD catalog: this is a CD collection, stored as XML data.
– An XML plant catalog: this is a plant catalog from a plant shop, stored as XML data.
– A Simple Food Menu: this is a breakfast food menu from a restaurant, stored as XML data.

• Why Does XML Display Like This?
– XML documents do not carry information about how to display the data.
– Since XML tags are "invented" by the author of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table.
– Without any information about how to display the data, most browsers will just display the XML document as it is.
– There are different solutions to the display problem, using CSS, XSLT and JavaScript.


Page 114: XML (eXtensible Markup Language)

Displaying XML with CSS

• With CSS (Cascading Style Sheets), we can add display information to an XML document.

• It is possible to use CSS to format an XML document.
• Below is an example of how to use a CSS style sheet to format an XML document:

• Take a look at this XML file: The CD catalog
• Then look at this style sheet: The CSS file
• Finally, view: The CD catalog formatted with the CSS file

• The line below links the XML file to the CSS file:
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>

• Formatting XML with CSS is not the most common method.
• W3C recommends using XSLT instead.


Page 115: XML (eXtensible Markup Language)

Displaying XML with XSLT

• With XSLT you can transform an XML document into HTML.
• XSLT is the recommended style sheet language of XML.
• XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS.
• One way to use XSLT is to transform XML into HTML before it is displayed by the browser as demonstrated in these examples: View the XML file, the XSLT style sheet, and View the result.

• The line below links the XML file to the XSLT file:
<?xml-stylesheet type="text/xsl" href="simple.xsl"?>


Page 116: XML (eXtensible Markup Language)

Displaying XML with XSLT

Transforming XML with XSLT on the Server
• In the example above, the XSLT transformation is done by the browser, when the browser reads the XML file.
• Different browsers may produce different results when transforming XML with XSLT. To reduce this problem the XSLT transformation can be done on the server.

• View the result.
• Note that the resulting output is exactly the same, whether the transformation is done by the web server or by the web browser.


Page 117: XML (eXtensible Markup Language)

XML Parser

• All modern browsers have a built-in XML parser that can be used to read and manipulate XML.

• The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.

Page 118: XML (eXtensible Markup Language)

XML Parser (cont…)

• There are some differences between Microsoft's XML parser and the parsers used in other browsers.

• The Microsoft parser supports loading of both XML files and XML strings (text), while other browsers use separate parsers.

• All parsers contain functions to traverse XML trees, access, insert, and delete nodes (elements) and their attributes.

• When we talk about parsing XML, we often use the term "Nodes" about XML elements.

Page 119: XML (eXtensible Markup Language)

XML Parser (cont…)

Loading XML with Microsoft's XML Parser
• Microsoft's XML parser is built into Internet Explorer 5 and higher.
• The following JavaScript fragment loads an XML document ("note.xml") into the parser:

var xmlDoc = new ActiveXObject("Microsoft.XMLDOM"); // create an empty document object
xmlDoc.async = false;                               // wait until loading has finished
xmlDoc.load("note.xml");                            // load the file into the parser

Example explained:
• The first line of the script above creates an empty Microsoft XML document object.
• The second line turns off asynchronous loading, to make sure that the parser will not continue execution of the script before the document is fully loaded.
• The third line tells the parser to load an XML document called "note.xml".

Page 120: XML (eXtensible Markup Language)

The XML DOM

• The XML DOM (XML Document Object Model) defines a standard way for accessing and manipulating XML documents.

• The DOM views XML documents as a tree-structure.
• All elements can be accessed through the DOM tree. Their content (text and attributes) can be modified or deleted, and new elements can be created.

• The elements, their text, and their attributes are all known as nodes.

• Example: we use the following DOM reference to get the text from the <to> element:

xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue

– xmlDoc - the XML document created by the parser.
– getElementsByTagName("to")[0] - the first <to> element
– childNodes[0] - the first child of the <to> element (the text node)
– nodeValue - the value of the node (the text itself)
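
The same DOM calls exist outside the browser. As a sketch, assuming Python's standard xml.dom.minidom module in place of the browser parser discussed in the slides, the DOM reference reads almost identically:

```python
from xml.dom import minidom

NOTE = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b"<note><to>Tove</to><from>Jani</from>"
    b"<heading>Reminder</heading>"
    b"<body>Don't forget me this weekend!</body></note>"
)

xml_doc = minidom.parseString(NOTE)

# First <to> element, then its first child (the text node), then the text itself.
recipient = xml_doc.getElementsByTagName("to")[0].childNodes[0].nodeValue

# Elements, text and attributes are all nodes; the root's children are elements here.
child_names = [node.nodeName for node in xml_doc.documentElement.childNodes]

print(recipient, child_names)
```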

Page 121: XML (eXtensible Markup Language)

XML to HTML

• In this example, we loop through an XML file (cd_catalog.xml), and display each CD element as an HTML table row:
• Try it yourself: Display XML data in an HTML table

• Example explained
– We check the browser, and load the XML using the correct parser
– We create an HTML table with <table border="1">
– We use getElementsByTagName() to get all XML CD nodes
– For each CD node, we display data from ARTIST and TITLE as table data.
– We end the table with </table>

• Access Across Domains
– For security reasons, modern browsers do not allow access across domains.
– Both the web page and the XML file it tries to load must be located on the same server.
– Otherwise the xmlDoc.load() method will generate the error "Access is denied".


Page 122: XML (eXtensible Markup Language)

XML Application

• The XML Example Document
– Look at the following XML document ("cd_catalog.xml"), that represents a CD catalog:

• Load the XML Document
– To load the XML document (cd_catalog.xml), we use the same code as we used in the XML Parser section.
– After the execution of this code, xmlDoc is an XML DOM object, accessible by JavaScript.

• Display XML Data as an HTML Table
– The following code displays an HTML table filled with data from the XML DOM object:
– For each CD element in the XML document, a table row is created. Each table row contains two table data cells with ARTIST and TITLE data from the current CD element.
– Try it yourself: See how the XML data is displayed inside an HTML table.
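
The table-building step can also be sketched outside the browser. The example below assumes Python's standard library rather than the JavaScript used in the slides; the function name catalog_to_html_table and the two sample CDs are hypothetical stand-ins for cd_catalog.xml, using only the CD, ARTIST and TITLE element names mentioned above.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data in the shape described for cd_catalog.xml.
CATALOG = """
<CATALOG>
  <CD><TITLE>Album One</TITLE><ARTIST>Artist A</ARTIST></CD>
  <CD><TITLE>Album Two</TITLE><ARTIST>Artist B &amp; Friends</ARTIST></CD>
</CATALOG>
"""


def catalog_to_html_table(xml_text: str) -> str:
    """Build one HTML table row per CD element, with ARTIST and TITLE cells."""
    root = ET.fromstring(xml_text)
    rows = []
    for cd in root.iter("CD"):
        artist = escape(cd.findtext("ARTIST", default=""))
        title = escape(cd.findtext("TITLE", default=""))
        rows.append(f"<tr><td>{artist}</td><td>{title}</td></tr>")
    return '<table border="1">' + "".join(rows) + "</table>"


table_html = catalog_to_html_table(CATALOG)
print(table_html)
```

The text is escaped again on the way out because the parser has already turned &amp; back into a plain ampersand.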


XML Application (Contd…)

• Display XML Data in any HTML Element
– XML data can be copied into any HTML element that can display text.
– The code is part of the <head> section of the HTML file. It gets the XML data from the first <CD> element and displays it in the HTML element with id="show".
– The body of the HTML document contains an onload event attribute that calls the display() function when the page has loaded. It also contains a <div id='show'> element to receive the XML data.
– With this example, you will only see data from the first CD element in the XML document.
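A display() function of the kind described on this slide could be sketched as follows. The signature is an assumption: here display() receives the XML DOM object and the target element as parameters so that it can be run outside a page, and childText is a hypothetical helper.

```javascript
// Hypothetical helper: text of the first <tagName> element inside node,
// or an empty string when it is missing.
function childText(node, tagName) {
  const match = node.getElementsByTagName(tagName)[0];
  const textNode = match ? match.childNodes[0] : undefined;
  return textNode ? textNode.nodeValue : "";
}

// Copies ARTIST and TITLE from the first <CD> element into the target.
// `target` is any element that can display text, such as <div id="show">.
function display(xmlDoc, target) {
  const firstCd = xmlDoc.getElementsByTagName("CD")[0];
  if (!firstCd) {
    target.textContent = "";
    return;
  }
  target.textContent =
    "Artist: " + childText(firstCd, "ARTIST") +
    ", Title: " + childText(firstCd, "TITLE");
}

// In a page, this could be wired up as:
//   <body onload="display(xmlDoc, document.getElementById('show'))">
```

Assigning to textContent rather than innerHTML means the catalog text is shown literally and needs no HTML escaping.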


XML Application (Contd…)

• Add a Navigation Script
– The next() function makes sure that nothing is displayed if you are already at the last CD element.
– The previous() function makes sure that nothing is displayed if you are already at the first CD element.
– The next() and previous() functions are called by clicking the next/previous buttons.

• All Together Now
– With a little creativity, we can easily develop this into a full application.
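The boundary checks in next() and previous() can be sketched as below. This is an illustrative sketch only: createNavigator and the show callback are hypothetical names, and show stands in for whatever function renders the CD at a given index.

```javascript
// Creates next()/previous() functions over `count` items, starting at index 0.
// `show` is a hypothetical callback that renders the item at the given index.
function createNavigator(count, show) {
  let index = 0;
  return {
    // Moves forward and renders, unless already at the last item.
    next() {
      if (index < count - 1) {
        index += 1;
        show(index);
      }
    },
    // Moves backward and renders, unless already at the first item.
    previous() {
      if (index > 0) {
        index -= 1;
        show(index);
      }
    },
    // Returns the current position.
    current() {
      return index;
    }
  };
}
```

In a page, the two functions would be attached to the onclick handlers of the next and previous buttons, with count taken from the number of CD elements in the document.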


Working with XmlReader & XmlWriter

• The System.Xml namespace provides the XmlReader and XmlWriter classes that enable us to parse and write XML data from streams or XML documents.

• These are abstract base classes that we can extend to create our own customized classes.

• In this session, we will learn about the functionalities that are provided by the XmlReader and XmlWriter classes.

• The XmlReader class allows us to access XML data from a stream or XML document.

• This class provides fast, non-cached, read-only, forward-only access to XML data.

• The XmlReader class is an abstract class and provides methods that are implemented by the derived classes to provide access to the elements and attributes of XML data.


Overview of XmlReader

• We use the XmlReader class to determine details such as the depth of a node in an XML document (Depth), whether the node has attributes (HasAttributes), the number of attributes in a node (AttributeCount), and the value of an attribute (Value).

• The XmlTextReader class is one of the derived classes of the XmlReader class and implements the methods defined by the XmlReader class.

• We use the XmlTextReader class to read XML data.

• The XmlTextReader class does not enable us to access or validate the document type definition (DTD) or schema information.

• The XmlValidatingReader class, another derived class of XmlReader, enables us to read XML data and also supports DTD and schema validation.


Overview of XmlReader (cont…)

• Reading XML Using XmlTextReader
– The XmlTextReader class is used when we require fast access to XML data but don't need support for DTD or schema validation.
– The XmlTextReader class should be used when we don't need to read the entire document into memory via the DOM.

• We can initialize an XmlTextReader object to read data from an XML document as shown in the following code.
– XmlTextReader Reader = new XmlTextReader("ReadXML.xml");


Reading an XML File

Visual C# code for reading an XML file and writing the data into the list box:

// Requires: using System; using System.Xml;
private void button1_Click(object sender, EventArgs e)
{
    // Create the reader on each click so the file is always read from the start.
    XmlTextReader reader = new XmlTextReader("ReadXML.xml");
    try
    {
        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    // Add the element name, then one entry per attribute.
                    lbNodes.Items.Add(reader.Name);
                    while (reader.MoveToNextAttribute())
                    {
                        lbNodes.Items.Add(reader.Name + "=" + reader.Value);
                    }
                    break;
                case XmlNodeType.Text:
                    // Add the text content of the current element.
                    lbNodes.Items.Add("=" + reader.Value);
                    break;
                case XmlNodeType.EndElement:
                    // End tags are not listed.
                    break;
            }
        }
    }
    finally
    {
        // Release the file even if reading fails.
        reader.Close();
    }
}


Overview of XmlWriter

• The XmlWriter class is an abstract class that enables us to create XML streams and write data to well-formed XML documents.

• XmlWriter is used to perform tasks such as writing multiple documents into one output stream and writing valid names and tokens into the stream.

• XmlWriter is also used for encoding binary data, writing text output, managing output, and flushing and closing the output stream.

• The XmlTextWriter class, which is a derived class of XmlWriter, provides properties and methods that we use to write XML data to a file, stream, console, or other types of output.


Writing an XML File

Visual C# code showing how to use XmlTextWriter to write XML to a file:

// Requires: using System; using System.Xml;
private void button2_Click(object sender, EventArgs e)
{
    XmlTextWriter textWriter = new XmlTextWriter(@"e:\Emp.xml", System.Text.Encoding.UTF8);
    textWriter.Formatting = Formatting.Indented;

    // Write the XML declaration (standalone="no"), the DOCTYPE, and a comment.
    textWriter.WriteStartDocument(false);
    textWriter.WriteDocType("Employees", null, null, null);
    textWriter.WriteComment("Example file for Star");

    // Write <Employees><Employee> ... </Employee></Employees>.
    textWriter.WriteStartElement("Employees");
    textWriter.WriteStartElement("Employee", null);
    textWriter.WriteElementString("FirstName", "Santosh");
    textWriter.WriteElementString("LastName", "Negi");
    textWriter.WriteElementString("DateOfBirth", "07-January-83");
    textWriter.WriteElementString("DateOfJoining", "17-September-02");
    textWriter.WriteEndElement(); // closes <Employee>
    textWriter.WriteEndElement(); // closes <Employees>

    // Write the XML to the file and close the writer.
    textWriter.Flush();
    textWriter.Close();
}


Extended document standards

• We can define our own XML tag sets, but here are some already available:

– XHTML : HTML redefined in XML
– SMIL : Synchronized Multimedia Integration Language
– MathML : Mathematical Markup Language
– SVG : Scalable Vector Graphics
– DrawML : Drawing MetaLanguage
– ICE : Information and Content Exchange
– ebXML : Electronic Business with XML
– cXML : Commerce XML
– CBL : Common Business Library


Vocabulary

• SGML : Standard Generalized Markup Language

• XML : Extensible Markup Language

• DTD : Document Type Definition

• Element : a start and end tag, along with their contents

• Attribute : a value given in the start tag of an element

• Entity : a representation of a particular character or string

• PI : a Processing Instruction, to possibly be used by a program that processes this XML

• Namespace : a unique string (usually a URI) that qualifies element and attribute names so that names from different vocabularies do not conflict

• well-formed XML: XML that follows the basic syntax rules

• valid XML : well-formed XML that conforms to a DTD