Transcript
Page 1: Feb 2001C.Watters1 Grammars, SGML, & XML Agreeing on the rules

Feb 2001 C.Watters 1

Grammars, SGML, & XML

Agreeing on the rules

Page 2:

Overview

What is a grammar
BNF notation
Regular expressions
Context free grammars (remember Chomsky)
SGML
HTML
XML (finally) and the DTD

Page 3:

What is a Grammar?

A grammar is a set of rules which can
generate a construct from a list of terminals
determine if a construct obeys the rules (i.e. is well formed)

Example - English
construct is a sentence
terminals are words

Page 4:

Simple Grammar

Sentence ::= Subject Verb
Subject ::= (Pnoun | (def noun))
Pnoun ::= (Joe | Mary)
Verb ::= (runs | sits)
def ::= (the | a | an)
noun ::= (boy | girl)
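The simple grammar is small enough to run by hand or by machine. The following is a minimal Python sketch, not part of the original slides: the dictionary layout and the names GRAMMAR, expand and is_well_formed are chosen here for illustration. It uses the rules both ways, to generate every sentence and to check whether a given sentence obeys the rules.

```python
import itertools

# Productions of the simple grammar. Each right-hand side is a list of
# alternatives, and each alternative is a sequence of symbols.
GRAMMAR = {
    "Sentence": [["Subject", "Verb"]],
    "Subject": [["Pnoun"], ["def", "noun"]],
    "Pnoun": [["Joe"], ["Mary"]],
    "Verb": [["runs"], ["sits"]],
    "def": [["the"], ["a"], ["an"]],
    "noun": [["boy"], ["girl"]],
}


def expand(symbol):
    """Yield every word sequence (as a tuple) derivable from symbol."""
    if symbol not in GRAMMAR:  # a terminal stands for itself
        yield (symbol,)
        return
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the pieces.
        parts = [list(expand(s)) for s in alternative]
        for combo in itertools.product(*parts):
            yield tuple(word for piece in combo for word in piece)


# The language is finite, so it can simply be enumerated.
SENTENCES = {" ".join(words) for words in expand("Sentence")}


def is_well_formed(sentence):
    """Recognise a sentence by membership in the generated set."""
    return sentence in SENTENCES
```

Enumerating the whole language only works because this grammar has no recursion; a grammar with recursive rules needs a real parser.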

Page 5:

Syntax not Semantics

Let’s expand the definition of the noun terminals
noun ::= (car | hat | TV)
No problem with syntax! But "the hat runs" is a little vague.

"They are flying planes."
No problem with syntax. Semantics not so clear.

Page 6:

What is a grammar for anyway?

1. Parse Input
Examines input and determines if it satisfies the rules of a given grammar
Joe Mary.
If(a<3){jump}
<html>my page</hjkl>

2. Generate output
Use the rules to generate well-formed entities
Joe runs.
If(a < 3){b=a}
<html>my page </html>
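The two markup examples can be checked mechanically. This is a small Python sketch (the function name is_well_formed_xml is made up here) that uses the standard library's XML parser to accept or reject input. It tests XML well-formedness only, which is an assumption about what "satisfies the rules" means for these two strings, not a check against the HTML grammar.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as XML, i.e. its tags nest and match."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for mismatched tags such as <html> ... </hjkl>.
        return False
```

Calling is_well_formed_xml("<html>my page </html>") accepts the input, while is_well_formed_xml("<html>my page</hjkl>") rejects it because the closing tag does not match the opening tag.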

Page 7:

BNF
Backus Naur (or Normal) Form

Notation for describing syntax of a language
John Backus and Peter Naur, 1960’s for ALGOL
Meta-symbols used
::= LHS is defined by RHS
| or
< > category name (defined somewhere else)
Example
<program> ::= begin <statements> end;

Page 8:

Useful extensions to BNF

Optional: [else <statements>]
Repetitive items: {<letter> | <digit>}
Recursion is allowed: <integer> ::= <digit> | <integer> <digit>
Brackets for grouping: <letter>(<digit> | <letter> <digit>)

Page 9:

Regular Expressions

Simple way to express languages or strings
eg. One ‘a’ followed by (one ‘n’ followed by one ‘d’) or by one ‘t’

a((n d) | t)

note: may use + instead of |
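The expression a((n d) | t) can be tried out directly. This is a short Python sketch, with the names PATTERN and matches chosen for illustration; Python's re module writes concatenation without spaces and uses | for selection.

```python
import re

# a((n d) | t): 'a' followed by either "nd" or "t".
PATTERN = re.compile(r"a(nd|t)")


def matches(word):
    """True only if the whole word belongs to the language."""
    return PATTERN.fullmatch(word) is not None
```

The language has exactly two strings, "and" and "at"; anything else, such as "an" or "andt", is rejected.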

Page 10:

Regular Expressions include

Concatenation/sequence: A B
Selection: A|B
Kleene Closure (0 or more): A*
Positive Closure (1 or more): A+
Bounded Repetition (1 to i): A^i

eg. A(n*| t) = {A, An, Ann, …, At}
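Each of these operators has a counterpart in Python's re syntax. The sketch below is illustrative only: the dictionary OPERATORS, the helper accepts, and the bound i = 3 are assumptions made here, not part of the slides.

```python
import re

# One pattern per operator, in Python's notation.
OPERATORS = {
    "concatenation": r"AB",
    "selection": r"A|B",
    "kleene": r"A*",        # 0 or more
    "positive": r"A+",      # 1 or more
    "bounded": r"A{1,3}",   # 1 to i, with i = 3 picked for illustration
    "example": r"A(n*|t)",  # the language {A, An, Ann, ..., At}
}


def accepts(name, text):
    """True if the whole text is in the language of the named pattern."""
    return re.fullmatch(OPERATORS[name], text) is not None
```

For instance, accepts("kleene", "") holds because zero repetitions are allowed, while accepts("positive", "") does not.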

Page 11:

Context-Free Grammars

CFG is a set of recursive productions used to generate patterns of strings satisfying the construct of the language

SGML and its derivatives, HTML and XML, are context-free grammars!
CFGs are more powerful than Regular Expressions
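Nested markup is the standard illustration of the extra power: matching tags to arbitrary depth needs a stack, which a regular expression in the formal sense does not have. Here is a minimal Python sketch, assuming a toy tag syntax with letters-only names and no attributes; TAG and is_nested are names invented for this example.

```python
import re

# Tokeniser for the toy syntax: <name> opens, </name> closes.
TAG = re.compile(r"<(/?)([A-Za-z]+)>")


def is_nested(text):
    """Recognise  Element ::= <name> (text | Element)* </name>
    with an explicit stack, i.e. a pushdown recogniser."""
    stack = []
    for closing, name in TAG.findall(text):
        if not closing:
            stack.append(name)            # remember the open tag
        elif not stack or stack.pop() != name:
            return False                  # close without a matching open
    return not stack                      # every open tag was closed
```

The regular expression is used only to split the input into tags; the stack is what handles the recursion.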

Page 12:

SGML

Standard Generalized Markup Language
Developed by a committee!
Led by Charles Goldfarb, 1978-1986
a grammar to define the structure of documents
rules define the construct or structure
terminals are <tags> and strings

Page 13:

XML

DTDs
How to use it
Examples

Page 14:

DTD - grammar definition for a document type

Defines:
element types (structure)
attributes (terminals)
constraints on combinations of these

Page 15:

Element Type Declaration

<!DOCTYPE GarageSale [
<!ELEMENT GarageSale (Date, Place, Notes)>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT Place (#PCDATA)>
<!ELEMENT Notes (#PCDATA)>
]>

<GarageSale>
<Date>today</Date>
<Place>myhouse</Place>
<Notes>Rain or shine</Notes>
</GarageSale>

Page 16:

Sub-Elements

<!ELEMENT Figure (Graphic|Code)>
<!ELEMENT Figure (Caption, (Graphic|Code))>
<!ELEMENT Figure (Caption?, (Graphic|Code))>
? means optional

<!ELEMENT FTNOTE (P+)> 1 or more
<!ELEMENT FTNOTE (P*)> 0 or more

where Graphic etc. are also defined as ELEMENTS
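A content model uses the same operators as a regular expression (sequence, |, ?, +, *), so it can be checked with one. The Python sketch below is a rough illustration under stated assumptions: it handles element-only models (no #PCDATA or mixed content), and model_to_regex and children_match are hypothetical helpers, not part of any XML library.

```python
import re
import xml.etree.ElementTree as ET

# An element name as it appears inside a content model.
NAME = re.compile(r"[A-Za-z_][\w.-]*")


def model_to_regex(model):
    """Translate e.g. '(Caption?, (Graphic|Code))' into a regex that
    matches a list of child names, each name followed by one space."""
    compact = re.sub(r"\s+", "", model)
    with_names = NAME.sub(
        lambda m: "(?:" + re.escape(m.group(0)) + " )", compact)
    # A comma means sequence, which is plain concatenation in a regex.
    return with_names.replace(",", "")


def children_match(element, model):
    """True if the element's children satisfy the content model."""
    names = "".join(child.tag + " " for child in element)
    return re.fullmatch(model_to_regex(model), names) is not None
```

For example, a Figure holding only a Graphic satisfies (Caption?, (Graphic|Code)) but not (Caption, (Graphic|Code)). A validating parser does this kind of check for every element in the document.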

Page 17:

Attributes
Let you add extra information to elements

<!ELEMENT Place (#PCDATA)>
<!ATTLIST Place Address CDATA #IMPLIED Email CDATA #IMPLIED Phone CDATA #IMPLIED>
<Place Address="1234 Oak">

<!ATTLIST Shirt size (small|medium|large) #IMPLIED>
<Shirt size="small">

CDATA => character data

Page 18:

Validating Parser

A DTD (document type definition) defines the grammar of a type of document

memo
web page
book

Validating Parser uses a DTD to check if a given document satisfies the rules of that grammar

Page 19:

HTML & XML

HTML is an application of SGML with a shared DTD
HTMLDOC ::= (<html> HEAD BODY </html>)

XML is a subset of SGML with many DTD’s allowed

“XML is like HTML with the training wheels off” -Dan Connolly, leader of XML activity at W3C

Page 20:

XML
Uses tags to identify semantics of data
looks like HTML, but isn’t

<slide>
<title>Introduction</title>
<author>
<first>Carolyn</first>
<last>Watters</last>
</author>
<content>XML this and that</content>
</slide>

is license free, platform-independent and well-supported

Page 21:

HTML

Hypertext Markup Language

Presents documents via WWW browsers

Document layout and hyperlink specifications

Predefined set of tags (ie. Common DTD)

Page 22:

<HTML>
<TITLE>Statistics Canada</TITLE>
<BODY>
<H3>Welcome to Stats Canada</H3>
Statistics Canada ……. .
<p> We like numbers…..
<img src="mapleleaf.gif">
<ul>What we do
<li><a href="census.html">Census</a>
<li><a href="special.html">Special surveys</a>
<li><a href="online.html">Online data</a>
</ul>
</BODY>
</HTML>

Page 23:

HTML

HTML - Advantages

Simple - fixed set of tags

Portable - used with all browsers

Linking - within and to external documents

HTML - Disadvantages

Limited tag set

Can’t separate the definition from content

Can’t define structure of contents

Page 24:

XML

XML allows anyone to define a document structure separate from its display structure

Explicit Definition - DTD

Page 25:

Some Code

Schema

Entity: Passport Details
SubEntities: Last Name, First Name, Address

Entity: Address
SubEntities: Street, City, Town, State, Province, ……..

Page 26:

DTD

<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>

Page 27:

Internal DTD and Instance

<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,(city|town),(state|province),(ZIP|postal_code),country,contact_no?,email*)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT town (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT province (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT postal_code (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT contact_no (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>

<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>[email protected]</email>
</address>
</passport_details>
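A document that carries its DTD in an internal subset can be read with any XML parser. The Python sketch below uses the standard library's ElementTree on a shortened version of the passport example; the cut-down address model (street, city) is a simplification made here. ElementTree reads the DOCTYPE but does not validate against it, so this shows parsing only, not validity checking.

```python
import xml.etree.ElementTree as ET

# Shortened passport document with an internal DTD subset.
DOCUMENT = """<?xml version='1.0'?>
<!DOCTYPE passport_details [
<!ELEMENT passport_details (last_name,first_name+,address)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT address (street,city)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
]>
<passport_details>
  <last_name>Smith</last_name>
  <first_name>Jo</first_name>
  <first_name>Stephen</first_name>
  <address><street>1 Great Street</street><city>GreatCity</city></address>
</passport_details>
"""

root = ET.fromstring(DOCUMENT)

# first_name+ in the DTD allows one or more first names.
first_names = [e.text for e in root.findall("first_name")]
city = root.findtext("address/city")
```

Checking the instance against the declarations requires a validating parser, which the Python standard library does not provide.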

Page 28:

Shared DTD

XML Document specifies the DTD

<?xml version='1.0'?>
<!DOCTYPE passport_details SYSTEM "PassportExt.dtd">

<passport_details>
<last_name>Smith</last_name>
<first_name>Jo</first_name>
<first_name>Stephen</first_name>
<address>
<street>1 Great Street</street>
<city>GreatCity</city>
<state>GreatState</state>
<postal_code>1234</postal_code>
<country>GreatLand</country>
<email>[email protected]</email>
</address>
</passport_details>

Page 29:

Importance of XML

Coordinating Heterogeneous Databases

Separation of Structure / Content / Display

Document Validity Checking

Potential Use in Standards

Page 30:

Example

Boeing

Boeing places a DTD on its site

part purchasers use this DTD

Boeing can use multiple XSL stylesheets

Page 31:

Boeing (cont’d)

when a customer creates an order document, they can verify the validity of that document against the DTD.

this ensures they are transmitting only type-valid orders.

in turn, Boeing can ensure they are receiving only type-valid documents.

Page 32:

2. Using XML: DOM & SAX

DOM: Document Object Model

The DOM is a standard object application programming interface that gives developers programmatic control of XML document content, structure, formats, and more.

DOM defines a programmatic API for accessing XML documents.
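As an illustration of that programmatic access, here is a minimal Python sketch using xml.dom.minidom, the DOM implementation in the standard library. The document string and the e-mail address are made up for the example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<passport_details><last_name>Smith</last_name>"
    "<first_name>Jo</first_name></passport_details>"
)

# Read: navigate the tree through standard DOM calls.
last_name = doc.getElementsByTagName("last_name")[0]
surname = last_name.firstChild.data

# Modify: create a new element and attach it to the root.
email = doc.createElement("email")
email.appendChild(doc.createTextNode("jo@example.org"))  # made-up value
doc.documentElement.appendChild(email)

# The root now has three element children.
child_names = [node.tagName for node in doc.documentElement.childNodes]
```

The same method names (getElementsByTagName, createElement, appendChild) are defined by the DOM specification, so code of this shape carries over to DOM implementations in other languages.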

Page 33:

3. Using XML: presenting data

Need to convert XML tags into appropriate HTML tags for use in a browser!!

XML: <lastname>Smith</lastname>

HTML: <b>Smith</b>

Displayed as: Smith (in bold)
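The conversion can be sketched in a few lines of Python. This is an illustration only: the mapping TAG_MAP, the fallback to span, and the function to_html are assumptions made here, and real deployments would normally use a stylesheet language instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from XML element names to HTML element names.
TAG_MAP = {"lastname": "b"}


def to_html(xml_text):
    """Rename each XML element to its HTML counterpart and serialise."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Elements without a mapping fall back to a neutral <span>.
        element.tag = TAG_MAP.get(element.tag, "span")
    return ET.tostring(root, encoding="unicode", method="html")
```

With this mapping, to_html("<lastname>Smith</lastname>") yields <b>Smith</b>, which a browser shows in bold.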

Page 34:

Stylesheets are used to present XML:
The Cascading Stylesheet Specification (CSS)
The Extensible Style Language (XSL)

                           CSS    XSL
Can be used with HTML?     Yes    No
Can be used with XML?      Yes    Yes
Transformation language?   No     Yes
Syntax                     CSS    XML

Page 35:

CSS and XSL

CSS - Cascading Style Sheets
can predefine HTML display (font etc.)
these are shared and reused

XSL - XML Style Language
predefines display characteristics for XML entities
transforms into CSS for browsers to use

Page 36:

Cascading Style Sheets

CSS

last_name {
    font-family: verdana, arial;
    font-size: 15pt;
    font-weight: bold;
    display: block;
    margin-bottom: 5pt;
}

first_name {
    font-family: verdana, arial;
    font-size: 15pt;
    font-weight: bold;
    display: block;
    margin-bottom: 5pt;
}

street, city, town, state, province, ZIP, postal_code {
    font-family: verdana, arial;
    font-size: 12pt;
    font-weight: bold;
    color: green;
    display: block;
    margin-bottom: 20pt;
    margin-top: 40pt;
}

email {
    font-family: verdana, arial;
    font-size: 12pt;
    font-weight: bold;
    color: blue;
    display: block;
    margin-top: 5pt;
}

Page 37:

CSS

Most local definition has precedence
May be referred to (shared)

Page 38:

XSL (Style Language)

<?xml version='1.0'?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/TR/WD-xsl"
    xmlns="http://www.w3.org/TR/REC-html40"
    result-ns="">
<xsl:template><xsl:apply-templates/></xsl:template>
<xsl:template match="/">
<html>
<head><title><xsl:value-of select="/passport/last_name"/></title></head>
<body>
<H1><xsl:value-of select="/passport/last_name"/>, <xsl:value-of select="/passport/first_name"/></H1>
<H2>Address</H2>
<BLOCKQUOTE><xsl:apply-templates select="/passport/address"/></BLOCKQUOTE>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Page 39:

Understanding A Template

Most templates have the following form:

<xsl:template match="para">
<p><xsl:apply-templates/></p>
</xsl:template>

The whole <xsl:template> element is a template

The match pattern determines where this template applies

Literal result elements come from non-XSL namespace(s)

XSLT elements come from the XSL namespace
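The way a template rule fires can be imitated in Python. This is a loose analogy, not XSLT: the TEMPLATES table, apply_templates and process are names invented for this sketch. It assumes the usual default behaviour that elements without a matching template are simply descended into and that text is copied through.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# One rule, mirroring <xsl:template match="para">: wrap the result of
# processing the children in a literal <p> element.
TEMPLATES = {
    "para": lambda node: "<p>" + apply_templates(node) + "</p>",
}


def apply_templates(node):
    """Process the children of node in document order."""
    out = escape(node.text or "")
    for child in node:
        out += process(child)
        out += escape(child.tail or "")
    return out


def process(node):
    """Pick the template whose pattern matches the node's name;
    with no match, fall back to descending into the children."""
    template = TEMPLATES.get(node.tag, apply_templates)
    return template(node)
```

Processing <doc><para>Hello <em>world</em></para><para>Bye</para></doc> with process gives <p>Hello world</p><p>Bye</p>: the para rule supplies the literal <p> tags, and the elements without a rule contribute only their text.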

Page 40:

Options for displaying XML

[Diagram. Boxes: XML Document; CSS Stylesheet; XSL Stylesheet; XML-enabled Web Browser with XML Display Engine; XSL Transformation spec; XSL Transformation; HTML Document; Web Browser]

Page 41:

2. Using XML: How does a browser read XML?

XML parser: A tool for reading XML documents

Microsoft's Internet Explorer 4.0 was the first Web browser to implement XML

Netscape will support XML metadata in Communicator/Navigator 5.0 as a delivery component code-named Aurora.

Page 42:

XML Architecture

Desktop - Display
Multiple views created from the XML-based data
HTML view #1 (eg. Purchasing Agent)
HTML view #2 (eg. Consumer)

Middle-Tier - Data Delivery, Manipulation
XML exchanged over HTTP, manipulated via the DOM
XML delivered to other applications or objects for further processing
Web Server: DB Access, Integration, Business Rules (eg. Purchase order)

Storage - Data Integration
XML emitted or generated from multiple sources
Mainframe, Database

Page 43:

4. Case Study

An example of XML Tree structure

A simple example: Portfolio.xml Portfolio.xsl

http://msdn.microsoft.com/xml/samples/review/review-xsl.xml

Page 44:

Tree Structure of the example

[Tree diagram. Node labels: story, address, bookstore, menu, body, review, logo, name, phone, date, reviewer, person, summary, person, books, office supplies]

Page 45:

Page 46:

Is this for real?

In the major Web Browser products.
In Microsoft Office 2000.
In every major database tool by end of 2000.
In every HTML tool by end of 2000.

CommerceNet believes that XML may just be the “killer application” needed to open up the Worldwide Web for Electronic Commerce.

Page 47:

Summary

XML - Advantages
Platform and system independent
User-defined tags
Doesn’t require explicit DTD
Display format and content are separate

XML - Disadvantages
Requires a processing application
“Pickier” than HTML
Must be converted to HTML to view in browser

Page 48:

Resources

W3 Consortium: www.w3.org

kazillions of XML books in every bookstore!

Page 49:

6. Reference

Jon Bosak and Tim Bray, Scientific American, May 1999 [http://www.sciam.com/1999/0599issue/0599bosak.html]

Norman Walsh: What is XML? Oct. 3, 1998 [http://xml.com/xml/pub/98/10/guide1.html#AEN58]

Graphic Communications Association web site [http://www.gca.org/whats_xml/default.htm]

University College Cork [http://www.ucc.ie/xml/]

Microsoft MSDN online samples [http://msdn.microsoft.com/xml/samples/review/review-xsl.xml]

[http://www.oasis-open.org/cover/xsl.html]

Charles F. Goldfarb, Paul Prescod, The XML Handbook, 1998

