XML, NAMESPACE
Lecture 1
Advanced XML 1
Objectives
• Introduction to XML
– Outline the features of markup languages and list
their drawbacks
– Define and describe XML
– State the benefits and scope of XML
• Exploring XML
– Describe the structure of an XML document
– Explain the lifecycle of an XML document
Objectives
• Exploring XML (cont'd)
– State the functions of editors for XML and list
popular editors
– State the functions of parsers for XML and list
commonly used parsers
– State the functions of browsers for XML and list
commonly used browsers
Objectives
• Working with XML
– Explain the steps towards building an XML document
– Define what is meant by well-formed XML
• XML Syntax
– State and describe the use of comments and processing instructions in XML
– Classify character data that is written between tags
– Describe entities, DOCTYPE declarations and attributes
Objectives
• Describe namespaces
– Define XML Namespaces
– Working with Namespaces Syntax
• Problems posed by prefixes
• Placing attributes in a namespace
• Default Namespaces
• Override default namespaces
History of Markup
[Figure: documents recorded using paper and pen; typesetters formatting documents; tools used by typesetters to format a document]
Markup Language
• A markup language defines the rules that help to add
meaning to the content and structure of documents.
• Markup is classified as:
– Stylistic markup – determines the presentation of the
document
– Structural markup – defines the structure of the
document
– Semantic markup – describes the meaning of the content of the
document
SGML
• Generalized Markup Language (GML) is a
system for formatting documents.
• GML was refined and standardized as
Standard Generalized Markup Language
(SGML).
• SGML is the parent of many markup
languages, including HTML and XML.
Features of SGML
• It is a metalanguage for describing markup languages, which allows authors to create their own tags that relate to their content.
• It needs a separate file (a DTD) that contains the rules needed to interpret the language.
• An SGML application is a markup language derived from SGML.
Introduction to HTML
• HTML is a markup language.
• Using HTML tags and elements, we can:
– Control the appearance of the page and the content
– Publish online documents and retrieve online information using the links inserted in the HTML document
– Create online forms. These forms can be used to collect information about the user, conduct transactions, and so on
HTML
• HTML is the best-known markup language derived
from SGML.
• It was created to mark up technical papers so that
they could be transferred across different platforms
for the scientific community.
• It is now also used by non-scientific users who
are concerned about their documents' presentation.
Drawbacks of HTML
• Fixed tag set
• Presentation technology does not relate to the contents
• It is flat
• Clogging
• HTML is not international
• Data interchange is impossible
• Does not have a robust linking mechanism
• HTML is not reusable
HTML and XML Code Examples
HTML Code
<UL>
  <LI> TOM CRUISE
  <UL>
    <LI> CLIENT ID : 100
    <LI> COMPANY : XYZ Corp.
    <LI> Email : [email protected]
    <LI> Phone : 3336767
    <LI> Street Address : 25th St.
    <LI> City : Toronto
    <LI> State : Toronto
    <LI> Zip : 20056
  </UL>
</UL>
XML Code
<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE </PERSON_NAME>
    <ID> 100 </ID>
    <Company>XYZ Corp. </Company>
    <Email>[email protected]</Email>
    <Phone> 3336767</Phone>
    <Street> 25th St.</Street>
    <City>Toronto</City>
    <State>Toronto</State>
    <ZIP> 20056</ZIP>
  </CONTACT>
</Details>
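Because the XML version names every field, a program can pick values out by element name instead of by position in a list. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the email address is a made-up placeholder, since the address in the listing is not reproduced here.

```python
import xml.etree.ElementTree as ET

# The XML listing from the slide; the email value is a hypothetical placeholder.
xml_text = """
<Details>
  <CONTACT>
    <PERSON_NAME>TOM CRUISE </PERSON_NAME>
    <ID> 100 </ID>
    <Company>XYZ Corp. </Company>
    <Email>tom@example.com</Email>
    <Phone> 3336767</Phone>
    <Street> 25th St.</Street>
    <City>Toronto</City>
    <State>Toronto</State>
    <ZIP> 20056</ZIP>
  </CONTACT>
</Details>
"""

root = ET.fromstring(xml_text)
contact = root.find("CONTACT")

# Build a plain dictionary keyed by element name, trimming stray whitespace.
record = {child.tag: child.text.strip() for child in contact}

print(record["PERSON_NAME"])  # TOM CRUISE
print(record["ZIP"])          # 20056
```

The HTML list offers no comparable handle: a program would have to guess which list item holds the phone number from its label text.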
XML - 1
• XML stands for Extensible Markup Language.
• It was designed to overcome the drawbacks of HTML.
• It allows users to define their own set of tags, and
also makes it possible for others (people or programs) to understand them.
• It is more flexible than HTML.
• It inherits features of SGML and combines them with
features of HTML.
• It is a simplified subset of SGML.
XML - 2
• XML is a metalanguage: it is used to describe other markup languages.
• The data contained in an XML file can be displayed in different ways.
• It can also be offered to other applications for further processing.
• Style sheets help transform structured data into different HTML views. This enables data to be displayed on different browsers.
XML Architecture - 1
• XML supports three-tier architecture for handling and manipulating data.
• XML can be generated from existing databases using a scalable three-tier model.
• XML tags represent the logical structure of data that can be interpreted and used in various ways by different applications.
• The middle-tier is used to access multiple databases and translate data into XML.
XML Architecture -2
XML – A Universal data format
• HTML is a single markup language, but XML is a family of markup languages.
• Any type of data can be easily defined in XML.
• XML is popular because it supports a wide range of applications and is easy to use.
• XML has a structured data format, which allows it to store complex data
Benefits of XML
• The three-tier architecture has easier
scalability and better security.
• The benefits of XML are classified into the
following:
– Business benefits
– Technological benefits
Business Benefits
• Information sharing:
– Allows businesses to define data formats in XML
– Provides tools to read, write and transform data between
XML and other formats
• XML inside a single application:
– Powerful, flexible and extensible language
• Content Delivery:
– Supports different users and channels, like digital TV,
phone, web and multimedia kiosks
Business Benefits
• Other Benefits:
– Data Independence
– Easier to parse
– Reducing Server Load
– Easier to create
– Web Site Content
– Remote Procedure Calls
– e-Commerce
Technological Benefits
• Re-use of data
• Separation of data and presentation
• Extensibility
• Semantic information
XML Document Structure
• An XML document is composed of sets of “entities”
identified by unique names.
• All documents begin with a root or document entity.
• Entities are aliases that stand for longer or more complex content.
• Documents are logically composed of declarations,
elements, comments, character references, and
processing instructions.
Well-Formed and Valid Documents
• An XML document is considered well formed if it satisfies the minimum set of requirements defined in the XML 1.0 specification.
• These requirements ensure that the language constructs are used correctly.
• A valid XML document is a well-formed XML document that also conforms to the rules of a Document Type Definition (DTD).
• A DTD defines the rules that the markup in an XML document must follow.
XML Document Life Cycle
• Important components of the XML document life cycle:
– Editors
– Parsers
– Browsers
Editors
• The main functions that editors provide:
– Add opening and closing tags to the code
– Check the validity of the XML
– Verify the XML against a DTD/Schema
– Perform a series of transformations on a document
– Color the XML syntax
– Display line numbers
– Present the content and hide the code
– Auto-complete words
Editors
• Popular editors include:
– Oxygen
– XML Writer
– XML Spy
– XML Pro
– XML Mind
– XMetal
Parsers - 1
• Parsers help the computer interpret an XML
file.
[Figure: an XML document written in an editor (<?xml version="1.0"?> <nxn> </nxn>) is parsed by the parser, and the parsed document is viewed in the browser]
• There are two types of parsers:
– Non-validating parsers
– Validating parsers
Parsers - 2
[Figure: the XML file and other related files (such as a DTD file) go into the parser, which produces a data tree]
• Parsers load the XML file and other related files
to check whether the XML document is
well formed and valid.
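The idea of a parser turning text into a data tree can be sketched with Python's standard xml.etree.ElementTree module. This is only an illustration: the sample document is invented, and the parser behind this module is non-validating, so it checks well-formedness but does not check the document against a DTD.

```python
import xml.etree.ElementTree as ET

# A small document, assumed here only for illustration.
document = "<BOOK><TITLE>Advanced XML</TITLE><CHAPTER>Namespaces</CHAPTER></BOOK>"

def outline(element, depth=0):
    """Return one indented line per element in the parsed tree."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_root = ET.fromstring(document)  # raises ParseError if not well formed
tree_lines = outline(tree_root)
print("\n".join(tree_lines))
```

The printed outline shows BOOK at the top with TITLE and CHAPTER indented beneath it, which is the tree the application then works with.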
Parsers - 3
• Commonly used parsers are:
• Crimson
• Xerces
• Oracle XML Parser
• JAXP (Java API for XML Processing)
• MSXML
Browsers
• Commonly used web browsers are:
• Netscape
• Mozilla
• Internet Explorer
• Firefox
• Opera
Data vs. Markup
<NAME> Tom Cruise </NAME>
• Markup: the tags <NAME> and </NAME>
• Data: the text Tom Cruise
Creating an XML Document
• To create an XML document:
– State an XML declaration
– Create a root element
– Create the XML code
– Verify the document
Stating an XML Declaration
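• Syntax
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
• The 'encoding' and 'standalone' attributes are
optional; only the version number is
mandatory. When present, they must appear in the order
version, encoding, standalone
• 'standalone' – states whether the document relies on
external markup declarations (such as an external DTD)
• 'encoding' – specifies the character encoding
used by the author
• XML version 1.0 is the default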
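The XML 1.0 grammar fixes the order of the pseudo-attributes in the declaration as version, then encoding, then standalone. The sketch below, which assumes Python's standard xml.etree.ElementTree module as the parser, shows a declaration in that order being accepted and the same declaration with 'standalone' placed first being rejected.

```python
import xml.etree.ElementTree as ET

# Declaration with the pseudo-attributes in the order the grammar requires.
good = b'<?xml version="1.0" encoding="UTF-8" standalone="no"?><BOOK/>'

# Same declaration with 'standalone' placed before 'encoding'.
bad = b'<?xml version="1.0" standalone="no" encoding="UTF-8"?><BOOK/>'

def parses(data):
    """Return True if the parser accepts the data, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

print(parses(good))  # True
print(parses(bad))   # False
```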
Creating a Root Element
• There can only be one root element
• It describes the function of the document
• Every XML document must have a root
element
Example
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<BOOK>
</BOOK>
Creating the XML Code -1
• It is the process of creating our own elements and
attributes as required by our application.
• Elements are the basic units of XML content.
• Tags tell the user agent to do something to the content
encased between the start and end tag.
Parts of an element:
<TITLE> FPT University </TITLE>
– Opening tag: <TITLE>
– Content: FPT University
– Closing tag: </TITLE>
Creating the XML Code -2
• Rules governing elements:
– At least one element is required
– XML tags are case sensitive
– End the tags correctly
– Nest tags properly
– Use legal tag names
– Length of markup names
– Define valid attributes
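Several of these rules can be checked mechanically, because a parser refuses any document that breaks them. The following is a small sketch using Python's standard xml.etree.ElementTree module; the sample documents and the helper name is_well_formed are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One correct document, followed by documents that each break one rule.
samples = {
    "correct": "<BOOK><TITLE>XML</TITLE></BOOK>",
    "case mismatch": "<BOOK><Title>XML</TITLE></BOOK>",
    "missing end tag": "<BOOK><TITLE>XML</BOOK>",
    "bad nesting": "<BOOK><B><I>XML</B></I></BOOK>",
    "illegal name": "<BOOK><1TITLE>XML</1TITLE></BOOK>",
    "two roots": "<BOOK/><BOOK/>",
    "unquoted attribute": "<BOOK year=2024/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample is accepted; each of the others is rejected with a parse error.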
Verify the document
• The document should follow the XML rules;
otherwise it will not be read by the browser or
by any other XML reader
Comments
• A comment is information for the human reader
and is ignored by the processor.
• Syntax
<!-- Write the comment here -->
Example
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>
The example displays only the name
TOM CRUISE; the other names are inside
the comment and are ignored.
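The effect of the comment can be observed with a parser. This sketch uses Python's standard xml.etree.ElementTree module, whose default parser discards comments; the enclosing <CAST> root element is an assumption added so that the fragment forms a complete document.

```python
import xml.etree.ElementTree as ET

# The slide's example wrapped in a root element (the <CAST> name is an assumption).
text = """<CAST>
<!-- don't show these
<NAME>KATE WINSLET</NAME>
<NAME>NICOLE KIDMAN</NAME>
<NAME>ARNOLD</NAME>
-->
<NAME>TOM CRUISE</NAME>
</CAST>"""

root = ET.fromstring(text)

# Only elements outside the comment exist in the parsed tree.
names = [element.text for element in root.iter("NAME")]
print(names)  # ['TOM CRUISE']
```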
Processing Instruction
• A processing instruction is a piece of information
meant for the application using the XML document.
• The parser passes these instructions directly to the
application.
• The XML declaration uses the same <? ... ?> syntax, although the XML specification does not treat it as a processing instruction.
<?xml-stylesheet type="text/xsl"?>
– Name of application (target): xml-stylesheet
– Instruction information: type="text/xsl"
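A processing instruction splits into a target (the name of the application) and the remaining instruction data. The sketch below reads both parts with Python's standard xml.dom.minidom module. It uses the W3C target name xml-stylesheet, and the file name style.xsl is a hypothetical value chosen only for illustration.

```python
from xml.dom import minidom

# 'style.xsl' is a hypothetical stylesheet name used only for illustration.
text = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    "<BOOK/>"
)

doc = minidom.parseString(text)

# The XML declaration is not a node, so the first child is the instruction.
pi = doc.firstChild

print(pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE)  # True
print(pi.target)  # xml-stylesheet
print(pi.data)    # type="text/xsl" href="style.xsl"
```

The parser does not interpret the data part; it hands the string to the application unchanged.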
Character Data
• The text between the start and end tags is
defined as 'character data'.
• Character data may contain any legal (Unicode) character.
• Character data is classified into:
– PCDATA
– CDATA
PCDATA
• It stands for parsed character data.
• PCDATA is text that will be parsed by a Parser.
• Tags inside the text will be treated as markup and
entities will be expanded.
Predefined entities
Entity name    Character
&lt;           <
&gt;           >
&amp;          &
&quot;         "
&apos;         '
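The predefined entities work in both directions: a parser expands them when reading PCDATA, and a writer must produce them when the text itself contains markup characters. The following is a minimal sketch with Python's standard library; the sample strings are invented.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the five predefined entity references are expanded by the parser.
element = ET.fromstring("<NOTE>&lt;b&gt; &amp; &quot;x&quot; &apos;y&apos;</NOTE>")
print(element.text)  # <b> & "x" 'y'

# Writing: escape() replaces &, < and > so raw text can be placed in PCDATA.
raw = "profit < cost & loss > 0"
escaped = escape(raw)
print(escaped)  # profit &lt; cost &amp; loss &gt; 0

# Round trip: parsing the escaped text gives back the raw text.
round_trip = ET.fromstring(f"<NOTE>{escaped}</NOTE>").text
```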
CDATA
• It means character data.
• Text inside a CDATA section is not parsed for markup by the parser.
• CDATA sections make it convenient to include large blocks of text containing special characters.
• The character string ]]> is not allowed within a CDATA block, as it signals the end of the CDATA block.
Example
<SAMPLE>
<![CDATA[<DOCUMENT>
<NAME>TOM CRUISE</NAME>
<EMAIL>[email protected]</EMAIL>
</DOCUMENT>]]>
</SAMPLE>
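A quick way to see that a CDATA section is kept as plain text is to parse it and count the child elements. This sketch uses Python's standard xml.etree.ElementTree module; the email address is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET

# The email address is a hypothetical placeholder.
text = """<SAMPLE><![CDATA[<DOCUMENT>
<NAME>TOM CRUISE</NAME>
<EMAIL>tom@example.com</EMAIL>
</DOCUMENT>]]></SAMPLE>"""

sample = ET.fromstring(text)

print(len(sample))  # 0, because no child elements were created
print(sample.text)  # the markup-looking text, exactly as written
```

Everything between <![CDATA[ and ]]> arrives as the text of <SAMPLE>, including the angle brackets.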
Entities
• Entities are used to avoid typing long pieces of text
repeatedly within a document.
• There are two categories of entities:
– General entities
Syntax
<!ENTITY ADDRESS "text that is to be represented by an entity">
– Parameter entities
Syntax
<!ENTITY % ADDRESS "text that is to be represented by an entity">
Examples of Entities
An example of a general entity
<!ENTITY full_address "My Address 12 Tenth Ave. Suite 12 Paris, France">
• Entity reference
– Syntax
&ENTITY_NAME;
– Example
&full_address;
General entity references can also appear in attribute values:
< CLIENT = "&FPT;" PRODUCT = "&PRODUCT_ID;" QUANTITY = "15">
An example of a parameter entity
• Entity reference (allowed only inside the DTD)
– Syntax
%PARAMETER_ENTITY_NAME;
– Example
%ADDRESS;
The DOCTYPE declarations
• The <!DOCTYPE [..]> declaration follows the XML declaration in an XML document.
• Syntax
<?xml version="1.0"?>
<!DOCTYPE myDoc [
...declare the entities here....
]>
<myDoc>
...body of the document....
</myDoc>
Example
<!DOCTYPE CUSTOMERS [
<!ENTITY firstFloor "15 Downing St Floor 1">
<!ENTITY secondFloor "15 Downing St Floor 2">
<!ENTITY thirdFloor "15 Downing St Floor 3">
]>
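Once entities are declared in the DOCTYPE, references to them in the document body are replaced by the declared text when the document is parsed. The sketch below uses Python's standard xml.etree.ElementTree module; the <OFFICE> elements in the body are an assumption added to give the entities somewhere to be used.

```python
import xml.etree.ElementTree as ET

# The document body (<OFFICE> elements) is assumed for illustration.
text = """<?xml version="1.0"?>
<!DOCTYPE CUSTOMERS [
<!ENTITY firstFloor "15 Downing St Floor 1">
<!ENTITY secondFloor "15 Downing St Floor 2">
<!ENTITY thirdFloor "15 Downing St Floor 3">
]>
<CUSTOMERS>
<OFFICE>&firstFloor;</OFFICE>
<OFFICE>&thirdFloor;</OFFICE>
</CUSTOMERS>"""

root = ET.fromstring(text)

# Each reference has been expanded to the text given in its declaration.
addresses = [office.text for office in root.findall("OFFICE")]
print(addresses)
```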
Attributes
• An attribute gives information about an
element.
• Attributes are embedded in the element start
tag.
• An attribute consists of an attribute name and
attribute value.
Example
<TV count="8">SONY</TV>
<LAPTOP count="10">IBM</LAPTOP>
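Attribute values can be read by name once the document is parsed. In this sketch, which uses Python's standard xml.etree.ElementTree module, the <STOCK> root element is an assumption added so that both items sit in one document.

```python
import xml.etree.ElementTree as ET

# The <STOCK> root element is an assumption added to hold both items.
root = ET.fromstring(
    '<STOCK><TV count="8">SONY</TV><LAPTOP count="10">IBM</LAPTOP></STOCK>'
)

# Attribute values are always strings; convert them when a number is needed.
counts = {item.tag: int(item.get("count")) for item in root}
total = sum(counts.values())

print(counts)  # {'TV': 8, 'LAPTOP': 10}
print(total)   # 18
```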
• Two or more applications on the Internet may
use some of the same element names.
Namespaces help avoid the ambiguity that may
arise.
• Namespaces also make it possible to combine documents from different
sources and to identify which elements
or attributes come from which source.
• A namespace URI is only an identifier: the user agent is not
required to retrieve anything from it, and it does not by itself locate a DTD.
XML Namespaces - 1
• A URI (Uniform Resource Identifier) is used to identify namespaces in XML.
• URIs include Uniform Resource Names (URNs) and Uniform Resource Locators (URLs).
• A URL contains the reference for a document or an HTML page on the web.
• A URN is a persistent, location-independent name that identifies an Internet resource.
XML Namespaces - 2
• Namespaces are used to overcome the conflicts that arise when DTDs are reused and extended.
• Namespaces help standardize and uniquely brand elements and attributes.
• Namespaces employ a URI purely as a unique name; the URI does not have to point to the DTD against which the XML document is checked for validity.
• Namespaces ensure that element names do not conflict and clarify their origins.
Needs of a Namespace
Syntax for Namespace
• A prefix is associated with the URI that is used
as the namespace name.
• Syntax
xmlns:[prefix]="[URI of namespace]"
– xmlns is a reserved attribute name prefix
• Example
xmlns:ins="http://www.fpt.edu.vn"
– A prefix must be declared on the element where it is used or on one of its ancestors
– It is usually declared in the root element of the document
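What matters to a namespace-aware parser is the URI, not the prefix. The sketch below uses Python's standard xml.etree.ElementTree module, which reports each element name as {URI}local-name. The element names <course> and <name> are assumptions; the URI is the one from the example.

```python
import xml.etree.ElementTree as ET

# Element names are assumptions; the URI is the one from the slide.
text = """<ins:course xmlns:ins="http://www.fpt.edu.vn">
<ins:name>Advanced XML</ins:name>
</ins:course>"""

root = ET.fromstring(text)
print(root.tag)  # {http://www.fpt.edu.vn}course

# Searching uses the URI; any prefix mapped to the same URI works.
ns = {"x": "http://www.fpt.edu.vn"}
name = root.find("x:name", ns)
print(name.text)  # Advanced XML
```

The search succeeds even though the query uses the prefix x and the document uses ins, because both are bound to the same URI.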
Syntax for Namespace
• An unprefixed attribute is not placed in any namespace, not even the default one; it is simply interpreted in the context of its element. To place an attribute in a namespace, give it a prefix.
• We can also incorporate attributes from two domains:
<sample
  xmlns="http://www.fpt.edu.vn"
  xmlns:tea_batch="http://www.tea.org">
  <batch-list>
    <batch type="thirdbatch">Evening Batch</batch>
    <batch tea_batch:type="thirdbatch">Tea batch III</batch>
    <batch>Afternoon Batch</batch>
  </batch-list>
</sample>
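Parsing this sample shows how elements and attributes are treated differently. In the sketch below, written with Python's standard xml.etree.ElementTree module, every element picks up the default namespace, the unprefixed type attribute stays without a namespace, and the prefixed tea_batch:type attribute carries the http://www.tea.org URI.

```python
import xml.etree.ElementTree as ET

text = """<sample xmlns="http://www.fpt.edu.vn"
        xmlns:tea_batch="http://www.tea.org">
<batch-list>
<batch type="thirdbatch">Evening Batch</batch>
<batch tea_batch:type="thirdbatch">Tea batch III</batch>
<batch>Afternoon Batch</batch>
</batch-list>
</sample>"""

root = ET.fromstring(text)
ns = {"f": "http://www.fpt.edu.vn"}
batches = root.findall("f:batch-list/f:batch", ns)

# Attribute names as the parser reports them, one list per <batch> element.
attribute_names = [sorted(batch.attrib) for batch in batches]
print(attribute_names)
# [['type'], ['{http://www.tea.org}type'], []]
```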
Attributes and Namespaces
• The new XSL syntax makes use of namespaces to identify both its own tags and the formatting vocabulary tags.
• The xsl: prefix is bound to the http://www.w3.org/TR/WD-xsl namespace.
• The fo: prefix is bound to the http://www.w3.org/TR/WD-xsl/FO namespace.
• XSL is written in XML syntax and uses tags, elements, and attributes.
Namespace Application
<book
  xmlns:html="http://www.w3.org/TR/WD-xsl/FO">
  <index>
    <chapter>this is chapter 1</chapter>
    <html:br/>
    <chapter>this is chapter 1</chapter>
  </index>
</book>
Namespace Example
Default Namespace
Override Default Namespace
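A default namespace is declared with xmlns="..." (no prefix) and applies to the element that carries it and to all unprefixed descendants, until a descendant declares a different default. The following is only a sketch of that behaviour using Python's standard xml.etree.ElementTree module; the element names are assumptions, and the two URIs are the ones used in the earlier attribute example.

```python
import xml.etree.ElementTree as ET

# Element names are assumptions; the URIs come from the earlier sample.
text = """<catalog xmlns="http://www.fpt.edu.vn">
<course>Advanced XML</course>
<supplier xmlns="http://www.tea.org">
<name>Tea batch III</name>
</supplier>
</catalog>"""

root = ET.fromstring(text)

# iter() walks the tree in document order, starting with the root itself.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

<catalog> and <course> are reported in the first namespace, while <supplier> and its child <name> are reported in the second, because the inner declaration overrides the default for that subtree.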
Summary-1
• A markup language defines a set of rules that adds meaning to the content and structure of documents
• XML is extensible, which means that we can define our own set of tags, and make it possible for other parties (people or programs) to know and understand these tags. This makes XML much more flexible than HTML
• XML inherits features from SGML and includes the features of HTML. XML can be generated from existing databases using a scalable three-tier model. XML-based data does not contain information about how data should be displayed
• An XML document is composed of a set of “entities” identified by unique names
Summary-2
• A well-formed document is one that conforms to the basic rules of XML; a valid document is a well-formed document that conforms to the rules of a DTD (Document Type Definition)
• The parser helps the computer to interpret an XML file
• Steps involved in the building of an XML document are:
– Stating an XML declaration
– Creating a root element
– Creating the XML code
– Verifying the document
• Character data is classified into PCDATA and CDATA
Summary-3
• Entities are used to avoid typing long pieces of text repeatedly in a document. The two types of entities are:
– General entities
– Parameter entities
• The <!DOCTYPE […]> declaration follows the XML declaration in an XML document.
• An attribute gives information about an element