XML For Java Distributed Programming, Part I: Distributed programming paradigms; Intro XML, Schema, etc.

Upload: jesse-hancock

Post on 17-Dec-2015


Page 1

XML For Java Distributed Programming

Part I
• Distributed programming paradigms
• Intro XML, Schema, etc.

Page 2

"Traditional" programming models

Page 3

Distributed programming models: Classic Web-based

Easy to deploy but slow, not great user experience

[Diagram labels: html browser, http, Web Server, dynamically generated html (plus optionally JavaScript to jazz up html), database]

Many programming models:
• JSP
• ASP
• Servlets
• PHP
• CGI (python, perl, C)
• Cold Fusion

Lacks full support of apps server -- no transactions, rpc, etc.

Page 4

Distributed programming models: Typical Web-based

Better user experience. Heavier, less portable, requires socket programming to stream to server.

[Diagram labels: html + applet, http, Web Server, dynamically generated html, applet, socket, database]

Page 5

Direct Connections

Direct socket and rpc-style

[Diagram 1 labels: Application client, sockets, ports, App1, App2, App3]

[Diagram 2 labels: Application client, Remote Procedures, App1, App2, App3, NDS]

Examples: Java's rmi, CORBA, DCOM

Page 6

Application Servers

Page 7

RPC-style Web service

Page 8

General role of XML

Most modern languages have a method of representing structured data.

Typical flow of events in application:

Read data (file, db, socket)
Marshal objects
Manipulate in program
Unmarshal (file, db, socket)

• Many language-specific technologies to reduce these steps: RMI, object serialization in any language, CORBA (actually somewhat language neutral), MPI, etc.

• XML provides a very appealing alternative that hits the sweet spot for many applications
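
The read / manipulate / write-back flow above can be sketched with the XML APIs that ship with the JDK (DOM for the in-memory tree, a `Transformer` to turn the tree back into text). This is only an illustration under assumptions: the class name `XmlRoundTrip`, the method `incrementAge`, and the tiny student document are made up for the example and do not come from the slides.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: reads a student document, changes it, writes it back out.
final class XmlRoundTrip {

    /** Adds one to the first <age> element and returns the document as text. */
    static String incrementAge(String xml) {
        try {
            // 1. Read: XML text -> in-memory tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // 2. Manipulate in program.
            Node age = doc.getElementsByTagName("age").item(0);
            int next = Integer.parseInt(age.getTextContent().trim()) + 1;
            age.setTextContent(Integer.toString(next));

            // 3. Write: in-memory tree -> XML text (no XML declaration, for brevity).
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String in = "<Student><name>Andrew</name><age>39</age></Student>";
        System.out.println(incrementAge(in));
    }
}
```

The same three steps apply whether the text comes from a file, a database column, or a socket; only the reader and writer change.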

Page 9

Simple XML-based architecture

[Diagram labels: web browser, http, Web Server, XML, python CGI, "hand-rolled" XML, File system]

Page 10

What is XML

Extensible Markup Language (XML) is a general-purpose specification for creating custom markup languages. It is classified as an extensible language because it allows its users to define their own elements. Its primary purpose is to facilitate the sharing of structured data across different information systems, particularly via the Internet, and it is used both to encode documents and to serialize data. In the latter context, it is comparable with other text-based serialization languages such as JSON and YAML.

W3C recommendation, open standard

Page 11

XML in messaging

Most modern languages have a method of representing structured data.

Typical flow of events in application:

Read data (file, db, socket)
Marshal objects
Manipulate in program
Unmarshal (file, db, socket)

• Many language-specific technologies to reduce these steps: RMI, object serialization in any language, CORBA (actually somewhat language neutral), MPI, etc.

• XML provides a very appealing alternative that hits the sweet spot for many applications

Page 12

User-defined types in programming languages

One view of XML is as a text-based, programming-language-neutral way of representing structured information. Compare:

C:

    struct Student {
        char* name;
        char* ssn;
        int age;
        float gpa;
    };

Java:

    class Student {
        public String name;
        public String ssn;
        public int age;
        public float gpa;
    }

Fortran:

    type Student
        character(len=*) :: name
        character(len=*) :: ssn
        integer :: age
        real :: gpa
    end type Student

Page 13

Sample XML Schema

• In XML, a (common) datatype description is called an XML schema.
• DTD and Relax NG are other common alternatives
• Below uses schema just for illustration purposes
• Note that schema itself is written in XML

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               elementFormDefault="qualified"
               attributeFormDefault="unqualified">
      <xs:element name="student">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element name="ssn" type="xs:string"/>
            <xs:element name="age" type="xs:integer"/>
            <xs:element name="gpa" type="xs:decimal"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

[Callout on slide: "Ignore this for now"]

Page 14

Alternative schema

• In this example studentType is defined separately rather than anonymously

    <xs:schema>
      <xs:element name="student" type="studentType"/>
      <xs:complexType name="studentType">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="ssn" type="xs:string"/>
          <xs:element name="age" type="xs:integer"/>
          <xs:element name="gpa" type="xs:decimal"/>
        </xs:sequence>
      </xs:complexType>
    </xs:schema>

[Callout on slide: "new type defined separately"]

Page 15

Alternative: DTD

• Can also use a DTD (Document Type Definition), which is much simpler than a schema but also much less powerful (notice the lack of types)

    <!DOCTYPE Student [
      <!-- Each XML file is stored in a document whose name is the same as the root node -->
      <!ELEMENT Student (name,ssn,age,gpa)>
      <!-- Student has four child elements -->
      <!ELEMENT name (#PCDATA)>
      <!-- name is parsed character data -->
      <!ELEMENT ssn (#PCDATA)>
      <!ELEMENT age (#PCDATA)>
      <!ELEMENT gpa (#PCDATA)>
    ]>

Page 16

Another alternative: Relax NG

• Gaining in popularity
• Can be very simple to write and at the same time has many more features than DTD
• Still much less common than Schema

Page 17

Creating instances of types

In programming languages, we instantiate objects:

C:

    struct Student s1, s2;
    s1.name = "Andrew";
    s1.ssn = "123-45-6789";
    ...

Java:

    Student s1 = new Student();
    s1.name = "Andrew";
    s1.ssn = "123-45-6789";
    ...

Fortran:

    type(Student) :: s1
    s1%name = 'Andrew'
    ...

Page 18

Creating XML documents

XML is not a programming language!

In XML we make a Student "object" in an xml file (Student.xml):

    <Student>
      <name>Andrew</name>
      <ssn>123-45-6789</ssn>
      <age>39</age>
      <gpa>2.0</gpa>
    </Student>

Think of this as like a serialized object.

Page 19

XML and Schema

Note that there are two parts to what we did:
• Defining the "structure" layout
• Defining an "instance" of the structure

The first is done with an appropriate Schema or DTD.
The second is the XML part.
Both can go in the same file, or an XML file can refer to an external Schema or DTD (typical).
From this point on we use only Schema.

Exercise 1

Page 20

Question: What can we do with such a file?

Some answers:
• Write corresponding Schema to define its content
• Write XSL transformation to display
• Parse into a programming language
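
As an illustration of the last answer, the sketch below parses the Student document into a plain Java object using the DOM parser bundled with the JDK. The class names `StudentParser` and `Student` and the helper `text` are assumptions made for this example; the slides do not prescribe a particular parsing API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical mapping from the Student XML document to a Java object.
final class StudentParser {

    /** Plain Java counterpart of the Student element. */
    static final class Student {
        String name;
        String ssn;
        int age;
        float gpa;
    }

    static Student parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Student s = new Student();
            s.name = text(root, "name");
            s.ssn = text(root, "ssn");
            // XML carries only text; the conversion to numbers happens here.
            s.age = Integer.parseInt(text(root, "age"));
            s.gpa = Float.parseFloat(text(root, "gpa"));
            return s;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a usable Student document", e);
        }
    }

    // Trimmed text of the first descendant element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        Student s = parse("<Student><name>Andrew</name><ssn>123-45-6789</ssn>"
                + "<age>39</age><gpa>2.0</gpa></Student>");
        System.out.println(s.name + " is " + s.age); // prints "Andrew is 39"
    }
}
```

Note that nothing here checks the document against a schema; a missing or non-numeric field simply surfaces as an exception.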

Page 21

Exercise 1

Page 22

Exercise 1 Solution

    <?xml version="1.0" encoding="UTF-8"?>
    <cars>
      <car>
        <make>dodge</make>
        <model>ram</model>
        <color>red</color>
        <year>2004</year>
        <mileage>22000</mileage>
      </car>
      <car>
        <make>Ford</make>
        <model>Pinto</model>
        <color>white</color>
        <year>1980</year>
        <mileage>100000</mileage>
      </car>
    </cars>

Page 23

Some sample XML documents

Page 24

Order / Whitespace

Note that element order is important, but the whitespace used to lay out the markup generally is not. As far as most XML applications are concerned, the badly formatted document below is the same as a neatly indented one:

<Article ><Headline>Direct Marketer Offended by Term 'Junk Mail' </Headline><authors>

<author> Joe Garden</author><author> Tim Harrod</author>

</authors><abstract>Dan Spengler, CEO of the direct-mail-marketing firm Mailbox of

Savings, took umbrage Monday at the use of the term <it>junk mail</it></abstract><body type="url" > http://www.theonion.com/archive/3-11-01.html </body>

</Article>

Page 25

Molecule Example

XML is extremely useful for standardizing data sharing within specialized domains. Below is a part of the Chemical Markup Language describing a water molecule and its constituents.

    <?xml version="1.0" ?>
    <CML>
      <MOL TITLE="Water" >
        <ATOMS>
          <ARRAY BUILTIN="ELSYM" > H O H</ARRAY>
        </ATOMS>
        <BONDS>
          <ARRAY BUILTIN="ATID1" >1 2</ARRAY>
          <ARRAY BUILTIN="ATID2" >2 3</ARRAY>
          <ARRAY BUILTIN="ORDER" >1 1</ARRAY>
        </BONDS>
      </MOL>
    </CML>

Page 26

Rooms example

A typical example showing a few more XML features:

    <?xml version="1.0" ?>
    <rooms>
      <room name="Red">
        <capacity>10</capacity>
        <equipmentList>
          <equipment>Projector</equipment>
        </equipmentList>
      </room>
      <room name="Green">
        <capacity>5</capacity>
        <equipmentList />
        <features>
          <feature>No Roof</feature>
        </features>
      </room>
    </rooms>

Page 27

Suggestion

Try building each of those documents in an XML builder tool (XMLSpy, Oxygen, etc.) or at least an XML-aware editor.

Note: it is not required to create a schema to do this. Just create a new XML document and start building.

Page 28

Dissecting an XML Document

Page 29

Things that can appear in an XML document

• ELEMENTS: simple, complex, empty, or mixed content model; attributes.
• The XML declaration
• Processing instructions (PIs) <? …?>
  Most common is <?xml-stylesheet …?>
      <?xml-stylesheet type="text/css" href="mys.css"?>
• Comments <!-- comment text -->

Page 30

Parts of an XML document

[Annotated example; the slide labels the declaration, tags (begin tags and end tags), attributes, and attribute values]

    <?xml version="1.0" ?>
    <CML>
      <MOL TITLE="Water" >
        <ATOMS>
          <ARRAY BUILTIN="ELSYM" > H O H</ARRAY>
        </ATOMS>
        <BONDS>
          <ARRAY BUILTIN="ATID1" >1 2</ARRAY>
          <ARRAY BUILTIN="ATID2" >2 3</ARRAY>
          <ARRAY BUILTIN="ORDER" >1 1</ARRAY>
        </BONDS>
      </MOL>
    </CML>

An XML element is everything from (including) the element's start tag to (including) the element's end tag.

Page 31

XML and Trees

Tags give the structure of a document. They divide the document up into Elements, starting at the topmost element, the root element. The stuff inside an element is its content; content can include other elements along with 'character data'.

[Tree diagram]

    CML                  (root element)
      MOL
        ATOMS
          ARRAY: H O H   (character data)
        BONDS
          ARRAY: 1 2
          ARRAY: 2 3
          ARRAY: 1 1

Page 32

XML and Trees

    <?xml version="1.0" ?>
    <CML>
      <MOL TITLE="Water" >
        <ATOMS>
          <ARRAY BUILTIN="ELSYM" > H O H</ARRAY>
        </ATOMS>
        <BONDS>
          <ARRAY BUILTIN="ATID1" >1 2</ARRAY>
          <ARRAY BUILTIN="ATID2" >2 3</ARRAY>
          <ARRAY BUILTIN="ORDER" >1 1</ARRAY>
        </BONDS>
      </MOL>
    </CML>

[Tree diagram]

    CML                  (root element)
      MOL
        ATOMS
          ARRAY: H O H   (data sections)
        BONDS
          ARRAY: 1 2
          ARRAY: 2 3
          ARRAY: 1 1
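
The tree view can be reproduced in code. The following is a minimal sketch that parses a shortened version of the water-molecule document with the JDK's DOM parser and prints one line per element, indented by nesting depth. The class name `TreeWalk` and the method `outline` are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper that prints the element tree of an XML document.
final class TreeWalk {

    /** Returns one line per element, indented two spaces per level of nesting. */
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder sb = new StringBuilder();
            walk(doc.getDocumentElement(), 0, sb); // start at the root element
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    private static void walk(Node node, int depth, StringBuilder sb) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip character data, comments, and so on
        }
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(node.getNodeName()).append('\n');
        // Recurse into the children, in document order.
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, depth + 1, sb);
        }
    }

    public static void main(String[] args) {
        String cml = "<CML><MOL TITLE=\"Water\">"
                + "<ATOMS><ARRAY BUILTIN=\"ELSYM\">H O H</ARRAY></ATOMS>"
                + "<BONDS><ARRAY BUILTIN=\"ATID1\">1 2</ARRAY></BONDS>"
                + "</MOL></CML>";
        System.out.print(outline(cml));
    }
}
```

The recursion mirrors the diagram: each element is a node, its child elements are its subtrees, and the character data sits at the leaves.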

Page 33

XML and Trees

[Tree diagram for the rooms example]

    rooms
      room
        capacity: 10
        equipmentlist
          equipment: projector
      room
        capacity: 5
        equipmentlist
        features
          feature: No Roof

Page 34

More detail on elements

Page 35

Element relationships

    <book>
      <title>My First XML</title>
      <prod id="33-657" media="paper"></prod>
      <chapter>Introduction to XML
        <para>What is HTML</para>
        <para>What is XML</para>
      </chapter>
      <chapter>XML Syntax
        <para>Elements must have a closing tag</para>
        <para>Elements must be properly nested</para>
      </chapter>
    </book>

•Book is the root element.

•Title, prod, and chapter are child elements of book.

•Book is the parent element of title, prod, and chapter.

•Title, prod, and chapter are siblings (or sister elements) because they have the same parent.
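
These relationships map directly onto DOM navigation calls (`getParentNode`, `getFirstChild`, `getNextSibling`). The sketch below uses a shortened copy of the book document; the class name `Relatives` and its helper methods are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper illustrating parent, child, and sibling navigation.
final class Relatives {

    // Shortened book document, written without whitespace between elements.
    static final String BOOK = "<book><title>My First XML</title>"
            + "<prod id=\"33-657\" media=\"paper\"></prod>"
            + "<chapter>Introduction to XML<para>What is HTML</para></chapter>"
            + "<chapter>XML Syntax<para>Elements must be properly nested</para></chapter>"
            + "</book>";

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    /** Names of the direct child elements, in document order. */
    static List<String> childElementNames(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                names.add(c.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element book = root(BOOK);
        System.out.println(childElementNames(book)); // [title, prod, chapter, chapter]

        Node title = book.getElementsByTagName("title").item(0);
        System.out.println(title.getParentNode().getNodeName()); // book
        // With indented input the next sibling would be a whitespace text node instead.
        System.out.println(title.getNextSibling().getNodeName()); // prod
    }
}
```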

Page 36

Well formed XML

Page 37

Well-formed vs Valid

An XML document is said to be well-formed if it obeys basic semantic and syntactic constraints.

This is different from a valid XML document, which (as we will see in more depth) properly matches a schema.

Page 38

Rules for Well-Formed XML

An XML document is considered well-formed if it obeys the following rules:

• There must be one element that contains all others (root element)
• All tags must be balanced
      <BOOK>...</BOOK>
      <BOOK />
• Tags must be nested properly:
      <BOOK> <LINE> This is OK </LINE> </BOOK>
      <LINE> <BOOK> This is </LINE> definitely NOT </BOOK> OK
• Element names are case-sensitive, so
      <P>This is not ok, even though we do it all the time in HTML!</p>

Page 39

More Rules for Well-Formed XML

• The attributes in a tag must be in quotes
      <ITEM CATEGORY="Home and Garden" Name="hoe-matic t500">
• Comments are allowed
      <!-- They are done just as in HTML… -->
• Should begin with
      <?xml version='1.0' ?>
  (the declaration is optional in XML 1.0, but if present it must come first)
• Special characters must be escaped: the most common are < " ' > &
      <formula> x &lt; y+2x </formula>
      <cd title="&quot; mmusic">
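
A quick way to test these rules is to hand a document to a parser and see whether it reports a fatal error. The sketch below does that with the JDK's SAX parser; the class name `WellFormed` and the method `isWellFormed` are assumptions made for this example, and the check covers well-formedness only, not validity against a schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: true if the text parses as well-formed XML.
final class WellFormed {

    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler ignores all content events; a well-formedness
            // violation reaches it as a fatal error, which it rethrows.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("parser could not be run", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<BOOK><LINE>This is OK</LINE></BOOK>"));         // true
        System.out.println(isWellFormed("<LINE><BOOK>This is</LINE> NOT OK</BOOK>"));     // false
        System.out.println(isWellFormed("<P>mismatched case</p>"));                       // false
        System.out.println(isWellFormed("<formula> x &lt; y+2x </formula>"));             // true
        System.out.println(isWellFormed("<formula> x < y+2x </formula>"));                // false
    }
}
```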

Page 40

Naming Rules

Naming rules for XML elements:
• Names may contain letters, numbers, and other characters
• Names must not start with a number or punctuation character
• Names must not start with the letters xml (or XML or Xml ..)
• Names cannot contain spaces

Any name can be used, no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are typical.

Examples: <first_name>, <date_of_birth>, etc.

Page 41

XML Tools

XML can be created with any text editor.

Normally we use an XML-friendly editor, e.g.
• XMLSpy
• nXML emacs extensions
• MSXML on Windows
• Oxygen
• Etc.

To check and validate XML, use either these tools and/or xmllint on Unix systems.

Page 42

Another View

• XML-as-data is one way to introduce XML
• Another is as a markup language similar to html.
• One typically says that html has a fixed tag set, whereas XML allows the definition of arbitrary tags
• This analogy is particularly useful when the goal is to use XML for text presentation -- that is, when most of our data fields contain text
• Note that mixed element/text fields are permissible in XML

Page 43

Article example

    <Article>
      <Headline>Direct Marketer Offended by Term 'Junk Mail'</Headline>
      <authors>
        <author> Joe Garden</author>
        <author> Tim Harrod</author>
      </authors>
      <abstract>Dan Spengler, CEO of the direct-mail-marketing firm Mailbox of Savings, took umbrage Monday at the use of the term <it>junk mail</it>.</abstract>
      <body type="url"> http://www.theonion.com/archive/3-11-01.html </body>
    </Article>

Page 44

More uses of XML

There is more!

• A very popular use of XML is as a base syntax for programming languages (the elements become program control structures)
• XSLT, BPEL, ant, etc. are good examples
• XML is ubiquitous, and one must have a deep understanding of it to be efficient and productive
• Many other current and potential uses -- up to the creativity of the programmer

Page 45

XML Schema

There are many details to cover of the schema specification. It is extremely rich, flexible, and somewhat complex.

We will do this in detail next lecture.

Now we begin with a brief introduction.

Page 46

XML Schema

XML itself does not restrict what elements exist in a document.

In a given application, you want to fix a vocabulary -- what elements make sense, what their types are, etc.

Use a Schema to define an XML dialect: MusicXML, ChemXML, VoiceXML, ADXML, etc.

Restrict documents to those tags.

Schema can be used to validate a document -- i.e., to see if it obeys the rules of the dialect.
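
Validation can be tried out with the `javax.xml.validation` package that ships with the JDK. The sketch below validates small student documents against the student schema shown earlier; the class name `SchemaCheck`, the constant `STUDENT_XSD`, and the method `isValid` are assumptions made for this example. The root element is written in lower case (`student`) so that it matches the name declared in the schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: checks an XML document against an XML Schema.
final class SchemaCheck {

    // The student schema from the earlier slide, inlined as a string.
    static final String STUDENT_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='student'><xs:complexType><xs:sequence>"
          + "<xs:element name='name' type='xs:string'/>"
          + "<xs:element name='ssn' type='xs:string'/>"
          + "<xs:element name='age' type='xs:integer'/>"
          + "<xs:element name='gpa' type='xs:decimal'/>"
          + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** True if xml is valid against xsd; false if either one is rejected. */
    static boolean isValid(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // Without a custom error handler, the first validation error is thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("could not read input", e);
        }
    }

    public static void main(String[] args) {
        String good = "<student><name>Andrew</name><ssn>123-45-6789</ssn>"
                + "<age>39</age><gpa>2.0</gpa></student>";
        String badType = good.replace("<age>39</age>", "<age>thirty-nine</age>");
        String missing = good.replace("<gpa>2.0</gpa>", "");
        System.out.println(isValid(STUDENT_XSD, good));    // true
        System.out.println(isValid(STUDENT_XSD, badType)); // false: age is not an integer
        System.out.println(isValid(STUDENT_XSD, missing)); // false: gpa is required
    }
}
```

All three documents are well-formed; only the schema distinguishes the valid one from the other two.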

Page 47

Schema determine …

• What sort of elements can appear in the document.
• What elements MUST appear
• Which elements can appear as part of another element
• What attributes can appear or must appear
• What kind of values can/must be in an attribute.

Page 48

• We start with a sample XML document and reverse engineer a schema as a simple example

    <?xml version="1.0" encoding="UTF-8"?>
    <library>
      <book id="b0836217462" available="true">
        <isbn> 0836217462 </isbn>
        <title lang="en"> Being a Dog is a Full-Time Job </title>
        <author id="CMS">
          <name> Charles Schulz </name>
          <born> 1922-11-26 </born>
          <dead> 2000-02-12 </dead>
        </author>
        <character id="PP">
          <name> Peppermint Patty </name>
          <born> 1966-08-22 </born>
          <qualification> bold,brash, and tomboyish </qualification>
        </character>
        <character id="Snoopy">
          <name> Snoopy</name>
          <born>1950-10-04</born>
          <qualification>extroverted beagle</qualification>
        </character>
        <character id="Schroeder">
          <name>Schroeder</name>
          <born>1951-05-30</born>
          <qualification>brought classical music to the Peanuts Strip</qualification>
        </character>
        <character id="Lucy">
          <name>Lucy</name>
          <born>1952-03-03</born>
          <qualification>bossy, crabby, and selfish</qualification>
        </character>
      </book>
    </library>

First identify the elements: author, book, born, character, dead, isbn, library, name, qualification, title

Next categorize by content model:
• Empty: contains nothing
• Simple: only text nodes
• Complex: only sub-elements
• Mixed: text nodes + sub-elements

Note: content model is independent of comments, attributes, or processing instructions!
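
The four categories can be computed mechanically by looking at an element's children. The sketch below is one possible reading of the rules, run against a shortened copy of the earlier book document; the class name `ContentModel` and its methods are assumptions made for this example, and it treats whitespace-only text as layout rather than content, which is also an assumption.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical classifier for the empty / simple / complex / mixed categories.
final class ContentModel {

    static final String BOOK = "<book><title>My First XML</title>"
            + "<prod id=\"33-657\" media=\"paper\"></prod>"
            + "<chapter>Introduction to XML<para>What is HTML</para></chapter>"
            + "</book>";

    /** Classifies an element by its children only; attributes are ignored. */
    static String classify(Element e) {
        boolean hasElements = false;
        boolean hasText = false;
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.ELEMENT_NODE) {
                hasElements = true;
            } else if ((type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE)
                    && !c.getNodeValue().trim().isEmpty()) {
                hasText = true; // assumption: whitespace-only text does not count
            }
        }
        if (hasElements && hasText) return "mixed";
        if (hasElements) return "complex";
        if (hasText) return "simple";
        return "empty";
    }

    /** Classifies the first element with the given tag name in the document. */
    static String classifyFirst(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return classify((Element) doc.getElementsByTagName(tag).item(0));
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not classify " + tag, ex);
        }
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"book", "chapter", "para", "prod"}) {
            System.out.println(tag + ": " + classifyFirst(BOOK, tag));
        }
    }
}
```

Because attributes are ignored, `prod` comes out as empty even though it carries two attributes, which is exactly the point of the note above.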

Page 49

Content models

Simple content model: name, born, title, dead, isbn, qualification

Complex content model: library, character, book, author

Page 50

Content Types

We further distinguish between complex and simple content types:
• Simple Type: An element with only text nodes and no child elements or attributes
• Complex Type: All other cases

We also say (and require) that all attributes themselves have simple type.

Page 51

Content Types

Simple content type: name, born, dead, isbn, qualification

Complex content type: library, character, book, author, title

Page 52

Exercise 2 answer

In the previous <book> example:

• book has element content, because it contains other elements.
• chapter has mixed content, because it contains both text and other elements.
• para has simple content (or text content), because it contains only text.
• prod has empty content, because it contains no text or child elements (it carries only attributes).

Page 53

Building the schema

Schema are XML documents.

They must contain a schema root element as such:

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               targetNamespace="http://www.w3schools.com"
               xmlns="http://www.w3schools.com"
               elementFormDefault="qualified">
      ...
      ...
    </xs:schema>

We will discuss details in a bit -- note that the yellow part (highlighted on the slide) can be excluded for now.


Flat schema for library

Start by defining all of the simple types (including attributes):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="qualification" type="xs:string"/>
  <xs:element name="born" type="xs:date"/>
  <xs:element name="dead" type="xs:date"/>
  <xs:element name="isbn" type="xs:string"/>
  <xs:attribute name="id" type="xs:ID"/>
  <xs:attribute name="available" type="xs:boolean"/>
  <xs:attribute name="lang" type="xs:language"/>
  …/…
</xs:schema>


Complex types with simple content

Now to complex types with simple content:

<title lang="en"> Being a Dog is …</title>

<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute ref="lang"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

"The element named title has a complex type which is a simple content obtained by extending the predefined datatype xs:string by adding the attribute defined in this schema and having the name lang."
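To see what this declaration accepts and rejects, the title element can be validated with the standard javax.xml.validation API. This is a minimal sketch; the class name TitleValidation, the helper isValid, and the sample documents are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TitleValidation {

    // The lang attribute and title element declarations from the slide.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:attribute name='lang' type='xs:language'/>"
        + "<xs:element name='title'>"
        + "  <xs:complexType>"
        + "    <xs:simpleContent>"
        + "      <xs:extension base='xs:string'>"
        + "        <xs:attribute ref='lang'/>"
        + "      </xs:extension>"
        + "    </xs:simpleContent>"
        + "  </xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Returns true if the XML document is valid against the schema text. */
    static boolean isValid(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema or instance error
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Text content plus the declared attribute is accepted.
        System.out.println(isValid(XSD, "<title lang='en'>Being a Dog is ...</title>"));
        // The attribute is optional unless use="required" is specified.
        System.out.println(isValid(XSD, "<title>Being a Dog is ...</title>"));
        // Simple content forbids child elements.
        System.out.println(isValid(XSD, "<title lang='en'><b>Being</b> a Dog</title>"));
        // Attributes that are not declared are rejected.
        System.out.println(isValid(XSD, "<title color='red'>Being a Dog</title>"));
    }
}
```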


Complex Types

All other types are complex types with complex content. For example:

<xs:element name="library">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="book" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="author">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="born"/>
      <xs:element ref="dead" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute ref="id"/>
  </xs:complexType>
</xs:element>
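The effect of xs:sequence and minOccurs="0" in the author declaration can be checked from Java. The sketch below validates a few author elements against the declarations shown above; the class name AuthorValidation, the helper isValid, and the sample author data are hypothetical illustrations.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class AuthorValidation {

    // Global declarations needed by author, plus the author element itself.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='name' type='xs:string'/>"
        + "<xs:element name='born' type='xs:date'/>"
        + "<xs:element name='dead' type='xs:date'/>"
        + "<xs:attribute name='id' type='xs:ID'/>"
        + "<xs:element name='author'>"
        + "  <xs:complexType>"
        + "    <xs:sequence>"
        + "      <xs:element ref='name'/>"
        + "      <xs:element ref='born'/>"
        + "      <xs:element ref='dead' minOccurs='0'/>"
        + "    </xs:sequence>"
        + "    <xs:attribute ref='id'/>"
        + "  </xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Returns true if the XML document is valid against the schema above. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // dead is optional because of minOccurs="0".
        System.out.println(isValid(
            "<author id='a1'><name>A. Writer</name><born>1950-01-31</born></author>"));
        // All three children, in the declared order.
        System.out.println(isValid(
            "<author id='a1'><name>A. Writer</name><born>1950-01-31</born>"
            + "<dead>2010-06-15</dead></author>"));
        // xs:sequence fixes the order, so born before name is rejected.
        System.out.println(isValid(
            "<author id='a1'><born>1950-01-31</born><name>A. Writer</name></author>"));
    }
}
```

Elements without minOccurs or maxOccurs must appear exactly once, since both default to 1.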


<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="qualification" type="xs:string"/>
  <xs:element name="born" type="xs:date"/>
  <xs:element name="dead" type="xs:date"/>
  <xs:element name="isbn" type="xs:string"/>
  <xs:attribute name="id" type="xs:ID"/>
  <xs:attribute name="available" type="xs:boolean"/>
  <xs:attribute name="lang" type="xs:language"/>
  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute ref="lang"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="book" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="author">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="born"/>
        <xs:element ref="dead" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute ref="id"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="isbn"/>
        <xs:element ref="title"/>
        <xs:element ref="author" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="available"/>
      <xs:attribute ref="id"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="character">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="born"/>
        <xs:element ref="qualification"/>
      </xs:sequence>
      <xs:attribute ref="id"/>
    </xs:complexType>
  </xs:element>
</xs:schema>


<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="isbn" type="xs:string"/>
              <xs:element name="title">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:string">
                      <xs:attribute name="lang" type="xs:language"/>
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
              <xs:element name="author" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="name" type="xs:string"/>
                    <xs:element name="born" type="xs:date"/>
                    <xs:element name="dead" type="xs:date" minOccurs="0"/>
                  </xs:sequence>
                  <xs:attribute name="id" type="xs:ID"/>
                </xs:complexType>
              </xs:element>
              <xs:element name="character" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="name" type="xs:string"/>
                    <xs:element name="born" type="xs:date"/>
                    <xs:element name="qualification" type="xs:string"/>
                  </xs:sequence>
                  <xs:attribute name="id" type="xs:ID"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="id" type="xs:ID"/>
            <xs:attribute name="available" type="xs:boolean"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Same schema, but with everything defined locally!
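One practical difference between the two styles is worth checking in code: a globally declared element may serve as the root of an instance document, whereas a locally declared one may appear only inside its parent. The sketch below uses two deliberately reduced schemas (character and name only, a hypothetical cut-down of the library schemas above); the class name GlobalVsLocal and its constants are also hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class GlobalVsLocal {

    // Flat style: name is declared globally and referenced with ref.
    static final String FLAT =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='name' type='xs:string'/>"
        + "<xs:element name='character'>"
        + "  <xs:complexType><xs:sequence>"
        + "    <xs:element ref='name'/>"
        + "  </xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Nested style: name is declared locally inside character.
    static final String NESTED =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='character'>"
        + "  <xs:complexType><xs:sequence>"
        + "    <xs:element name='name' type='xs:string'/>"
        + "  </xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Returns true if the XML document is valid against the schema text. */
    static boolean isValid(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String character = "<character><name>Snoopy</name></character>";
        String nameOnly = "<name>Snoopy</name>";

        // Both styles accept the same character document.
        System.out.println(isValid(FLAT, character) + " " + isValid(NESTED, character));
        // Only the flat style lets name stand alone as a document root.
        System.out.println(isValid(FLAT, nameOnly) + " " + isValid(NESTED, nameOnly));
    }
}
```

So the two schemas describe the same library documents, but the flat one additionally permits any of its global elements as a document root.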