XML Documents. Chao-Hsien Chu, Ph.D., School of Information Sciences and Technology, The Pennsylvania State University. Topics: Elements, Attributes, Comments, PI, Document Type.


Post on 03-Jan-2016


Page 1: XML Documents

Chao-Hsien Chu, Ph.D.

School of Information Sciences and Technology, The Pennsylvania State University

Topics: Elements, Attributes, Comments, PI, Document Type

Page 2: Components of XML Systems

[Diagram labels]
XML Document (Contents)
XML DTD (Rule)
XML Parser (Processor): Well-Formed (Syntax), Validate (Structure)
XML Application

Page 3: How a Parser Interprets XML - Validate

[Flowchart]
XML Document -> Well formed?
  no: issue warning / stop processing
  yes: DTD?
    no: further processing (optional)
    yes: check against the Document Type Definition -> Valid?
      no: issue warning / stop processing
      yes: further processing (optional)

Page 4: XML Document Syntax

Processing Instructions (PI)

Document Type Declarations (optional)

Comments (optional)

Element Start and End Tags

Attributes

Entity References

Character Data Sections (CDATA)

Page 5: The Panoramic Perspective of XML

[Diagram: tree of an XML document. Top-level parts: Prolog, Document Type Declaration, Root Element. Other labels: Comments; Processing Instructions; Entity References; CDATA Sections; Elements (PCDATA); Attributes (CDATA, Entities, ID, ...); Document Type Definitions (Element Declaration, Attribute Declaration, Entity Declaration, Notation Declaration). A legend marks some parts as optional.]

Page 6: An Example of XML Document

<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE Address_Book SYSTEM "fclml.dtd">
<?xml-stylesheet type="text/xsl" href="mystyle.xsl" ?>
<Address_Book>
  <Contact>
    <Name>Alley Gator</Name>
    <ID>001</ID>
    <EMAIL>[email protected]</EMAIL>
    <Phone>(010)62345678</Phone>
    <Address>
      <Street>112 Main Street</Street>
      <City>Muddy Waters</City>
      <State>FL</State>
      <ZIP>55544</ZIP>
    </Address>
  </Contact>
</Address_Book>

Callouts: Processing Instruction, Document Type Declaration, Root Element, Elements

Page 7: Processing Instructions (PI)

A PI provides processing information to the application, such as the name and version of the processor. Strictly speaking, the <?xml ... ?> line is the XML declaration; it uses the same <? ... ?> syntax as a PI.

Syntax: <?Processor Attribute = "Value of Attribute" ?>

Examples:
<?xml version = "1.0" ?>
<?xml version="1.0" encoding="Big5" ?>
<?xml version = "1.0" standalone = "no"?>
<?rtf \page ?>

DTD File

Page 8: Document Type Declaration

A statement embedded in an XML document whose purpose is to point to the existence and location of a document type definition (DTD).

DTD is optional.

Syntax: <!DOCTYPE RootElement SYSTEM "xxx.dtd">

Example:
<?xml version = "1.0" standalone = "no"?>
<!DOCTYPE Address_Book SYSTEM "fclml.dtd">

Page 9: Comments

Comments are a place for reminders, simple documentation, or code commented out for debugging. End users do not see them.

<!-- This is a comment area-->

You can use any character inside the comment area except “--” itself

There is no limitation on the length of the comment area.

Comments may not come before the XML declaration. Comments may not be placed inside a tag.
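
The comment rules above can be tried out with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module stands in for the XML processor; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples, one per rule on the slide.
ok = '<?xml version="1.0"?><!-- a note for maintainers --><Book/>'
double_hyphen = '<Book><!-- price -- to be confirmed --></Book>'
before_declaration = '<!-- too early --><?xml version="1.0"?><Book/>'
inside_tag = '<Book <!-- not here -->></Book>'

for label, text in [("ok", ok), ("double hyphen", double_hyphen),
                    ("before declaration", before_declaration),
                    ("inside a tag", inside_tag)]:
    print(label, is_well_formed(text))
```

Only the first sample should be accepted; the other three each break one of the rules listed on this page.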

Page 10: Guidelines for Elements

Elements are the building blocks of XML documents.

Every document needs to have one and only one root element.

An element must start with a start tag and end with a corresponding end tag.

Element names are case sensitive. The start tag and the end tag must use identical case.

Spaces are not allowed between the forward slash and the element name. For example, </ Books> is illegal.
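
As a rough check of these element rules, the sketch below feeds a few hand-written strings to Python's standard xml.etree.ElementTree parser. The helper is_well_formed and the <Books> samples are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

cases = {
    "one root, matching tags": "<Books><Book>XML</Book></Books>",
    "two root elements": "<Books/><Books/>",
    "case mismatch": "<Books></books>",
    "space after the slash": "<Books></ Books>",
    "missing end tags": "<Books><Book>",
}

results = {label: is_well_formed(text) for label, text in cases.items()}
for label, accepted in results.items():
    print(f"{label}: {'accepted' if accepted else 'rejected'}")
```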

Page 11: Example of Element

<Item optional = "1"> ... </Item>

Element: everything from the start tag through the end tag
Start tag: <Item optional = "1">
End tag: </Item>
Tag name: Item
Attribute: optional = "1" (attribute name: optional; attribute value: 1)
Element contents: texts, elements

Page 12: Guidelines for Elements

Elements can be used to both contain information and define structure.

The structure of information is encoded by the nesting of tags.

Empty elements, which have no content, are used as placeholders or to signify that something exists, e.g., <BR />.

Page 13: Tree Diagram of Address Document

Address_Book (root element)
  Contact
    ID
    Name
    E-mail
    Phone
    Address
      Street
      City
      State
      Zip
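
The same tree can be recovered from the document text by a parser. This is a small sketch, assuming Python's standard xml.etree.ElementTree; the outline function is a hypothetical helper, and the EMAIL element is left out because its value is obscured in this transcript.

```python
import xml.etree.ElementTree as ET

ADDRESS_BOOK = """<Address_Book>
  <Contact>
    <Name>Alley Gator</Name>
    <ID>001</ID>
    <Phone>(010)62345678</Phone>
    <Address>
      <Street>112 Main Street</Street>
      <City>Muddy Waters</City>
      <State>FL</State>
      <ZIP>55544</ZIP>
    </Address>
  </Contact>
</Address_Book>"""

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(ADDRESS_BOOK)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The indentation depth of each printed name reflects how deeply its tags are nested in the source text.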

Page 14: Element Name

Element names must begin with a letter or an underscore (_). Subsequent characters may include letters, digits, underscore, hyphens, and periods.

Element names cannot begin with a number. Element names cannot include spaces.
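
The naming rule above can be written as a regular expression. The sketch below is an ASCII-only simplification of the rule as stated on this slide (the full XML specification also allows a colon and many non-ASCII letters); is_legal_name and the sample names are hypothetical.

```python
import re

# First character: letter or underscore.
# Later characters: letters, digits, underscores, hyphens, periods.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_legal_name(name):
    """Check a candidate element name against the simplified rule."""
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["Title", "_draft", "chapter-2.1", "2ndEdition", "first name", "price$"]:
    print(candidate, is_legal_name(candidate))
```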

Page 15: Instant Quiz

Which of the following are “legal” or “illegal” element names?

<Help> _______
<Book%7> _______
<Volume Control> _______
<Volume> _______
<_8ball> _______
<1heading> _______
<heading1> _______
<Mary Smith> _______
<section.paragraph> _______

Page 16: Attributes

Attributes are small descriptive bits of information used for describing elements.

Attributes are placed inside the start tag of an element, after the element name. Each attribute name is followed by an "=" sign and then the attribute's value.

The attribute's value must be enclosed in a pair of single or double quotes.
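
A parser exposes attributes as name/value pairs attached to the element. The following sketch assumes Python's standard xml.etree.ElementTree; the unit attribute and the text "Pencil" are made up to show that single and double quotes are both accepted.

```python
import xml.etree.ElementTree as ET

# One value in double quotes, one in single quotes.
item = ET.fromstring("<Item optional=\"1\" unit='each'>Pencil</Item>")

print(item.tag)              # the element name
print(item.attrib)           # all attributes as a dictionary
print(item.get("optional"))  # one attribute value, always a string
```

Note that the parser returns every attribute value as a string, so the "1" here is text until the application converts it.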

Page 17: Instant Quiz

1. <marble color="red"> _____

2. <marble color="red" size="big"> _____

3. <marble color="red" /> _____

4. <marble color=red> _____

5. <marble color> _____

Which of the following are legal attributes?

Page 18: CDATA Section

CDATA sections are used when you want all text to be interpreted as pure character data rather than as markup. This is useful if the text contains many <, >, & or " characters.

Example:
<Height><![CDATA[Faraz < Alex]]></Height>
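
To see what the parser makes of the CDATA example, here is a minimal sketch using Python's standard xml.etree.ElementTree. The escaped and bare variants are added for comparison and are not from the slides.

```python
import xml.etree.ElementTree as ET

cdata = ET.fromstring("<Height><![CDATA[Faraz < Alex]]></Height>")
escaped = ET.fromstring("<Height>Faraz &lt; Alex</Height>")

# Both forms deliver the same character data to the application.
print(cdata.text)
print(escaped.text)

# Without CDATA or escaping, the bare "<" is read as markup and rejected.
try:
    ET.fromstring("<Height>Faraz < Alex</Height>")
    bare_accepted = True
except ET.ParseError:
    bare_accepted = False
print(bare_accepted)
```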

Page 19: Entity References

Entity references are markup that is replaced with character data when the document is parsed.

XML predefines five entity references:
&amp;   &
&lt;    <
&gt;    >
&quot;  "
&apos;  '

Other entities are declared in the DTD; an entity can point to an external text file or an external picture.
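
The five predefined references can be produced and consumed programmatically. This sketch assumes Python's standard library: escape from xml.sax.saxutils handles &, < and > by default, and the extra dictionary asks it to replace both quote characters as well. The sample sentence and the <Note> element are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Tom & \"Jerry\" <friends> aren't enemies"

# Replace each special character with its predefined entity reference.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# Parsing turns the references back into the original characters.
note = ET.fromstring(f"<Note>{escaped}</Note>")
print(note.text)
```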

Page 20: Illustration of Entity Reference

[Diagram, before parsing: the XML document contains entity references that point to a text file.]

Page 21: Illustration of Entity Reference

[Diagram, after parsing: the entity references in the XML document have been replaced by the text contents.]
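
The before/after idea can be reproduced in a few lines. The diagrams show an entity that points to an external text file; the sketch below swaps in an internal entity, declared inside the document itself, so that it runs without extra files. It assumes Python's standard xml.etree.ElementTree, and the entity name signature and its text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Before parsing: the document holds a reference, not the text itself.
DOCUMENT = """<!DOCTYPE Letter [
  <!ENTITY signature "Best regards, the XML team">
]>
<Letter>Thank you. &signature;</Letter>"""

# After parsing: the reference has been replaced by the entity's text.
letter = ET.fromstring(DOCUMENT)
print(letter.text)
```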

Page 22: Well Formed Document

Here are some general guidelines:

Contains one and only one root element.
All elements must contain both start and end tags.
Tags are case sensitive.
No overlapping tags. Elements must nest inside each other properly.
Attribute values must be enclosed in quotes.
An empty element must end with "/>".
In text, the characters (<) and (&) must always be represented by entity references; (>) and (") may be escaped as well, and a quote must be escaped inside an attribute value delimited by the same quote.

Well formed XML documents are those documents that are syntactically correct.
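
A non-validating parser is enough to test for well-formedness. The sketch below assumes Python's standard xml.etree.ElementTree, which checks syntax only and does not validate against a DTD; first_syntax_error and the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET

def first_syntax_error(xml_text):
    """Return None for well-formed input, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None

samples = {
    "well formed": '<Item optional="1"><Note>1 &lt; 2</Note><BR /></Item>',
    "overlapping tags": "<b><i>text</b></i>",
    "missing end tag": "<Item><Note>text</Note>",
    "unquoted attribute": "<Item optional=1></Item>",
    "bare < in text": "<Note>1 < 2</Note>",
}

report = {label: first_syntax_error(text) for label, text in samples.items()}
for label, message in report.items():
    print(f"{label}: {message or 'ok'}")
```

The parser stops at the first problem it meets, which matches the "issue warning / stop processing" branch of the flowchart on Page 3.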

Page 23: Thank You

Any questions?