Extensible MarkUp Language

Upload: antony-pitts

Post on 01-Jan-2016


Page 1: Extensible MarkUp Language. AGENDA  OVERVIEW OF XML  DATA TYPE DEFINITION LANGUAGE  XML SCHEMA  XML PARSERS 1) DOM PARSER 2) SAX PARSER 3) JAXB PARSER

Extensible MarkUp Language


AGENDA

OVERVIEW OF XML

DOCUMENT TYPE DEFINITION LANGUAGE

XML SCHEMA

XML PARSERS
1) DOM PARSER
2) SAX PARSER
3) JAXB PARSER

EXTENSIBLE STYLESHEET TRANSFORMATIONS


OVERVIEW OF XML

What is Markup language?

Markup languages are designed for the processing, definition and presentation of text. The language specifies code for formatting, both the layout and style, within a text file.

The well-known markup languages are HTML and XML.

XML is a framework for defining markup languages.

Each language is targeted at its own application domain with its markup tags.

There is a common set of generic tools for processing XML documents


How is XML different from HTML?

Markup languages generally combine two distinct functions of representing text (a document): the ‘look’ and the ‘structure’.

HTML and XML have different sets of goals.

While HTML was designed to display data and hence focused on the ‘look’ of the data, XML was designed to describe and carry data and hence focuses on ‘what data is’.

HTML is about displaying data and XML is about describing data.

HTML and XML are complementary to each other.


XML FEATURES

XML can be used to create new languages. Ex: WML, X3D (the XML-based successor to VRML)

XML uses the concept of DTD (Document Type Definition) to describe data

XML with a DTD is self-descriptive

XML separates data from display formats

XML can be used as a format to exchange data

Data can be stored in either files or databases

JAVA=Portable Programs

XML=Portable Data


XML Syntax

XML Syntax consists of

XML Declaration

XML Elements

XML Attributes

XML Declaration

The first line of an XML document should always consist of an XML declaration defining the version of XML.

XML Element

XML is a markup language that is used to store data in a self-explanatory manner. Making the data "self-explanatory" comes about by containing information in elements. If a piece of text is a title then it will be contained within a "title" element.

XML Attributes

Attributes are used to specify additional information about the element. An attribute for an element appears within the opening tag. The syntax for including an attribute in an element is: <element attributeName="value">
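As a rough illustration of the three syntax pieces, the sketch below parses a one-element document with the JDK's built-in JAXP parser and reads back the element name, an attribute and the text content. The class name `XmlSyntaxDemo`, the helper `parseRoot` and the `lang` attribute are assumptions made for this example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XmlSyntaxDemo {
    // Hypothetical sample: an XML declaration, a "title" element and a "lang" attribute.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "<title lang=\"en\">XML for dummies</title>";

    // Parses the text and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element title = parseRoot(XML);
        System.out.println(title.getTagName());          // element name
        System.out.println(title.getAttribute("lang"));  // attribute value
        System.out.println(title.getTextContent());      // element content
    }
}
```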


SAMPLE APPLICATION

<?xml version="1.0" encoding="ISO-8859-1"?>
<book>
  <title>XML for dummies</title>
  <chapter>introduction to xml
    <para>Markup languages</para>
    <para>Features of XML</para>
  </chapter>
  <chapter>XML syntax
    <para>Elements must be enclosed in tags</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>


DOCUMENT TYPE DEFINITION LANGUAGE

A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes.

A DTD is associated with an XML document via a Document Type Declaration, which is a tag that appears near the start of the XML document. The declaration establishes that the document is an instance of the type defined by the referenced DTD.

The declarations in a DTD are divided into an internal subset and an external subset.


Internal DTD Declaration

If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:

<!DOCTYPE root-element [element-declarations]>

Example XML document with an internal DTD:

<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (bookID,title)>
  <!ELEMENT bookID (#PCDATA)>
  <!ELEMENT title (#PCDATA)>
]>
<book>
  <bookID>1243</bookID>
  <title>john</title>
</book>
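To see an internal DTD actually being enforced, here is a minimal sketch using the JDK's JAXP parser with validation switched on. The class name `DtdValidationDemo` and the helper `isValid` are made up for this example; the DTD is the one from the slide. Validation problems are reported as recoverable errors, so the sketch installs an error handler that rethrows them.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationDemo {
    // Internal DTD subset taken from the slide.
    static final String PROLOG =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE book [\n"
        + "  <!ELEMENT book (bookID,title)>\n"
        + "  <!ELEMENT bookID (#PCDATA)>\n"
        + "  <!ELEMENT title (#PCDATA)>\n"
        + "]>\n";

    // Returns true when the document parses and satisfies its DTD.
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validity errors as failures
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed or not valid
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = PROLOG + "<book><bookID>1243</bookID><title>john</title></book>";
        String bad = PROLOG + "<book><title>john</title></book>"; // bookID is missing
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```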


External DTD Declaration

If the DTD is declared in an external file, the XML document should reference it in a DOCTYPE declaration with the following syntax:

<!DOCTYPE root-element SYSTEM "filename">

Example XML document with an external DTD:

<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
  <bookID>Tove</bookID>
  <title>Jani</title>
</book>

And the file "book.dtd", which contains the DTD:

<!ELEMENT book (bookID,title)>
<!ELEMENT bookID (#PCDATA)>
<!ELEMENT title (#PCDATA)>


XML SCHEMA

An XML schema is a description of a type of XML document

An XML schema describes the structure of an XML document.

The XML Schema language is also referred to as XML Schema Definition (XSD).

An XML Schema:

defines elements that can appear in a document

defines attributes that can appear in a document

defines the order of child elements

defines the number of child elements

defines data types for elements and attributes
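The points above (elements, order, number and data types of children) can be sketched with the JDK's `javax.xml.validation` API. The schema below is an assumption written for this example: it describes a `book` with an integer `bookID` followed by a string `title`. The names `XsdValidationDemo` and `isValid` are likewise hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdValidationDemo {
    // Hypothetical schema: <book> holds an integer <bookID>, then a string <title>.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"book\">"
        + "<xs:complexType>"
        + "<xs:sequence>"
        + "<xs:element name=\"bookID\" type=\"xs:int\"/>"
        + "<xs:element name=\"title\" type=\"xs:string\"/>"
        + "</xs:sequence>"
        + "</xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true when the document conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no error handler set, the first validity error is thrown as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<book><bookID>1243</bookID><title>john</title></book>"));
        // "Tove" is not an integer, so the data type check fails.
        System.out.println(isValid("<book><bookID>Tove</bookID><title>Jani</title></book>"));
    }
}
```

Unlike the DTD, which only says that bookID contains character data, the schema can reject a non-numeric identifier.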


SCHEMA LOCATION

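In an instance document, the attribute xsi:schemaLocation provides hints about where to find the schema for a namespace. Its value is a list of pairs: a namespace name followed by the location of a schema document.

<purchaseReport
    xmlns="http://www.example.com/Report"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.example.com/Report http://www.example.com/Report.xsd"
    period="P3M" periodEnding="1999-12-31">
  <!-- etc -->
</purchaseReport>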


XML PARSERS

Parsing is breaking (a sentence) down into its component parts with an explanation of the form, function, and syntactical relationship of each part.

Compilers parse text to identify the program elements and check that it conforms to the correct syntax.

An XML parser is the piece of software that reads XML files and makes the information from those files available to applications and programming languages.

An XML parser reads an XML document, identifies all the XML tags and passes the data to the application.

All modern browsers have a built-in XML parser that can be used to read and manipulate XML.

The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.


XML PARSERS

DOM PARSERS
1) DOM Characteristics
2) DOM in Action
3) DOM Tree and Nodes
4) DOM Programming Procedures

SAX PARSERS
1) SAX Features
2) SAX Operational Model
3) SAX Programming Procedures
4) Benefits Of SAX

JAXB PARSERS
1) JAXB Design Goals
2) JAXB Binding Lifecycle
3) JAXB Runtime Operations
4) JAXB Programming


DOM PARSER

DOM is cross-platform and cross language

Uses OMG’s IDL to define interfaces

IDL to language binding

DOM CHARACTERISTICS

Access XML document as a tree structure

Composed of mostly element nodes and text nodes

Can “walk” the tree back and forth

Larger memory requirements

Fairly heavyweight to load and store

Use it for walking and modifying the tree.


DOM IN ACTION


DOM TREE AND NODES

XML document is represented as a tree

A tree is made of nodes

There are 12 different node types

Node Types

Document node
Document Fragment node
Element node
Attribute node
Text node
Comment node
Processing instruction node
Document type node
Entity node
Entity reference node
CDATA section node
Notation node


Example XML Document

<?xml version="1.0"?>
<people>
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
</people>

DOM Tree Example

XML Document node
  element node “people”
    element node “name”
      element node “first_name”
        text node “Alan”
      element node “last_name”
        text node “Turing”
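The tree above can be reproduced with a short recursive walk. This is only a sketch using the JDK's JAXP parser; `DomTreeDemo`, `describe` and `walk` are names invented for the example. The input is written without whitespace between tags, because a pretty-printed document would add extra whitespace-only text nodes to the tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeDemo {
    static final String XML =
        "<?xml version=\"1.0\"?><people><name><first_name>Alan</first_name>"
        + "<last_name>Turing</last_name></name></people>";

    // Parses the text and returns an indented, one-node-per-line description of the tree.
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends one line for this node, then recurses into its children.
    static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("document node");
                break;
            case Node.ELEMENT_NODE:
                out.append("element node \"").append(node.getNodeName()).append('"');
                break;
            case Node.TEXT_NODE:
                out.append("text node \"").append(node.getNodeValue()).append('"');
                break;
            default:
                out.append("other node ").append(node.getNodeName());
        }
        out.append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe(XML));
    }
}
```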


Interfaces for DOM

NodeList NamedNodeMap DOMImplementation

Node Interface

Primary data type in DOM

Represents a single node in a DOM tree

Every node is of the Node interface type

Methods in Node Interface

Useful Node interface methods

public short getNodeType()
public String getNodeName()
public String getNodeValue()
public NamedNodeMap getAttributes()
public NodeList getChildNodes()


NodeList Interface

Represents a collection of nodes

Return type of getChildNodes() method of Node interface

public interface NodeList {
    public Node item(int index);
    public int getLength();
}

NamedNodeMap Interface

Represents a collection of nodes, each of which can be identified by name

Return type of getAttributes() method of Node interface

Document Interface

Contains factory methods for creating other nodes(elements, text nodes)

Method to get root element node
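A small sketch may help tie these interfaces together: the Document factory methods build a tree, a NodeList gives positional access to children, and a NamedNodeMap gives access to attributes by name. The class `DomInterfacesDemo`, its helpers and the `id` attribute are assumptions for illustration; only the `org.w3c.dom` calls come from the standard API.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DomInterfacesDemo {
    // Builds <person id="p1"><name>Jeff</name><height>1.80</height></person> in memory.
    static Document buildPerson() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element person = doc.createElement("person"); // factory method on Document
            person.setAttribute("id", "p1");
            doc.appendChild(person);

            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("Jeff"));
            person.appendChild(name);

            Element height = doc.createElement("height");
            height.appendChild(doc.createTextNode("1.80"));
            person.appendChild(height);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // NodeList: positional access to the children of a node.
    static List<String> childNames(Element parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    // NamedNodeMap: access to attributes by name; returns null when the attribute is absent.
    static String attributeValue(Element element, String name) {
        NamedNodeMap attributes = element.getAttributes();
        Node attribute = attributes.getNamedItem(name);
        return attribute == null ? null : attribute.getNodeValue();
    }

    public static void main(String[] args) {
        Element root = buildPerson().getDocumentElement(); // method to get the root element
        System.out.println(childNames(root));
        System.out.println(attributeValue(root, "id"));
    }
}
```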


DocumentType Interface

public interface DocumentType extends Node {

    public String getName();
    public NamedNodeMap getEntities();
    public NamedNodeMap getNotations();
    public String getPublicId();
    public String getSystemId();
    public String getInternalSubset();

}

Code Example

case Node.PROCESSING_INSTRUCTION_NODE:
    // Print a processing instruction as <?target data?>
    System.out.println("<?" + node.getNodeName() + " " + node.getNodeValue() + "?>");
    break;


DOM Programming Procedures

Create a parser object

Set Features and Read Properties

Parse XML document and get

Document object

Perform operations

Traversing DOM

Manipulating DOM

Creating a new DOM

Writing out DOM
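The procedure above can be sketched end to end with the parser-independent JAXP API that ships with the JDK (rather than a vendor-specific parser class). The class `DomProcedureDemo`, the method `addHeight` and the sample input are assumptions for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomProcedureDemo {
    // Parses the XML, appends a <height> element to the root, and serializes the result.
    static String addHeight(String xml, String height) {
        try {
            // 1. Create a parser object
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // 2. Set features
            factory.setIgnoringComments(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // 3. Parse the XML document and get the Document object
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // 4. Perform operations: here, manipulate the tree
            Element root = doc.getDocumentElement();
            Element item = doc.createElement("height");
            item.appendChild(doc.createTextNode(height));
            root.appendChild(item);
            // 5. Write out the DOM with an identity transformation
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addHeight("<person><name>Jeff</name></person>", "1.80"));
    }
}
```

Writing out through a Transformer is one common choice; other serializers exist, and the slides do not say which one the author had in mind.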


CREATING A DOM OBJECT

import org.apache.xerces.parsers.DOMParser; // Xerces-specific parser class
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import java.io.IOException;

String xmlFile = "file:///xerces-1_3_0/data/personal.xml";
DOMParser parser = new DOMParser();
try {
    parser.parse(xmlFile);
} catch (SAXException se) {
    se.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
Document document = parser.getDocument();


Generating A New DOM

try {
    // Generate a new DOM tree (DocumentImpl is the Xerces implementation class)
    Document doc = new DocumentImpl();

    Element root = doc.createElement("person"); // Create root element
    doc.appendChild(root);                      // Attach root element to the document

    Element item = doc.createElement("name");   // Create element
    item.appendChild(doc.createTextNode("Jeff"));
    root.appendChild(item);                     // Attach element to root element

    item = doc.createElement("height");
    item.appendChild(doc.createTextNode("1.80"));
    root.appendChild(item);                     // Attach element to root element
} catch (Exception ex) {
    ex.printStackTrace();
}


SAX PARSER

Simple API for XML

Started as a community-driven project

SAX Features

Event-driven: You provide event handlers

Fast and lightweight: Document does not have to be entirely in memory

Sequential read access only

One-time access

Does not support modification of document


SAX Operational Model

XML DOCUMENT --(Input)--> PARSER --(Events)--> PROVIDED HANDLER


SAX Programming Procedures


SAX Event Handlers


SAX Parser Example

import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

XMLReader parser = null;
try {
    // Create XML (non-validating) parser
    parser = XMLReaderFactory.createXMLReader();

    // Create event handler
    myContentHandler handler = new myContentHandler();
    parser.setContentHandler(handler);

    // Call parsing method
    parser.parse(args[0]);
} catch (SAXException ex) {
    System.err.println(ex.getMessage());
} catch (Exception ex) {
    System.err.println(ex.getMessage());
}


SAX Event Handler

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// DefaultHandler implements ContentHandler and supplies empty versions of
// the callbacks that are not overridden here.
class myContentHandler extends DefaultHandler {
    // ContentHandler methods
    @Override
    public void startDocument() {
        System.out.println("XML Document START");
    }
    @Override
    public void endDocument() {
        System.out.println("XML Document END");
    }
    @Override
    public void startElement(String namespace, String name, String qName, Attributes atts) {
        System.out.println("<" + qName + ">");
    }
    @Override
    public void endElement(String namespace, String name, String qName) {
        System.out.println("</" + qName + ">");
    }
    @Override
    public void characters(char[] chars, int start, int length) {
        System.out.println(new String(chars, start, length));
    }
}


Benefits of SAX

It is very simple

It is very fast

Useful when custom data structures are needed to model the XML document

Can parse files of any size without impacting memory usage
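The point about custom data structures can be sketched as follows: instead of building a full tree, a SAX handler fills exactly the structure the application needs, here a map from element name to number of occurrences. `SaxCountDemo` and `countElements` are hypothetical names; the parser is the JDK's built-in SAX implementation obtained through `SAXParserFactory`.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountDemo {
    // Counts how often each element name occurs, without keeping the document in memory.
    static Map<String, Integer> countElements(String xml) {
        Map<String, Integer> counts = new TreeMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                counts.merge(qName, 1, Integer::sum); // one event per opening tag
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String xml = "<book><title>XML for dummies</title>"
            + "<chapter><para>Markup languages</para><para>Features of XML</para></chapter>"
            + "<chapter><para>Elements must be enclosed in tags</para></chapter></book>";
        System.out.println(countElements(xml)); // {book=1, chapter=2, para=3, title=1}
    }
}
```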

Drawbacks of SAX

SAX provides read-only access

No random access to documents

Searching of documents is not easy


JAXB PARSER

Provides an efficient and standard way of mapping between XML and Java code

Programmers no longer have to create the application's Java objects themselves

Programmers do not have to deal with XML structure; instead they deal with meaningful business data

JAXB Design Goals

Easy to use : Don't have to deal with complexities of SAX and DOM

Customizable : Allows keeping pace with schema evolution

Portable: JAXB components can be replaced without having to make significant changes to the rest of the source code


How to Use JAXB

Develop or obtain XML schema

Generate the Java source files

Develop JAXB client application

Compile the Java source code

With the generated classes and the binding framework, write Java applications that: 1) Build object trees representing XML data


JAXB Binding Lifecycle


JAXB Runtime Operations

Provide the following functionality for schema derived classes

Unmarshal

Process (access or modify)

Marshal

Validation

A factory generates Unmarshaller, Marshaller and Validator instances for JAXB technology based applications

Pass content tree as parameter to Marshaller and Validator instances


JAXB PROGRAMMING


EXTENSIBLE STYLESHEET TRANSFORMATION (XSLT)

Extensible Stylesheet Language (XSL) is a language for expressing stylesheets

XSL is made of two parts:

XSL Transformation (XSLT)

XSL Formatting Objects (XSL-FO)

Viewpoints of XML

Presentation Oriented Publishing (POP): Useful for Browsers and Editors

Message Oriented Middleware (MOM): Useful for Machine-to-Machine data exchange. E.g.: Business-to-Business communication


XSLT is useful in:

Transforming data into a viewable format in a browser (POP)

Transforming business data between content models (MOM)


XSLT Stylesheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">Folks in Brandeis XML class</xsl:template>
</xsl:stylesheet>

RESULT

<?xml version="1.0" encoding="UTF-8"?>

Folks in Brandeis XML class
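One way to run this stylesheet from Java is the JDK's `javax.xml.transform` API, sketched below. The input is assumed to be the people document from the DOM slides; the class `XsltDemo` and the method `transform` are names chosen for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltDemo {
    // The stylesheet from the slide.
    static final String STYLESHEET =
        "<?xml version=\"1.0\"?>"
        + "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"people\">Folks in Brandeis XML class</xsl:template>"
        + "</xsl:stylesheet>";

    // Assumed input: the people document used earlier in the deck.
    static final String INPUT =
        "<?xml version=\"1.0\"?><people><name><first_name>Alan</first_name>"
        + "<last_name>Turing</last_name></name></people>";

    // Applies the stylesheet to the XML text and returns the serialized result.
    static String transform(String stylesheet, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, INPUT));
    }
}
```

The template replaces the whole people element with fixed text, which is why none of the names from the input appear in the result.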


XSLT stylesheet language

template

value-of

apply-templates

for-each

if

when, choose, otherwise

sort

filtering
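Several of the elements listed above can be combined in one small stylesheet. The sketch below uses template, for-each, sort, value-of and if to print last names in alphabetical order, separated by commas. The stylesheet, the second person in the input (Grace Hopper) and the names `XsltElementsDemo` and `lastNames` are all assumptions made for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltElementsDemo {
    // Hypothetical stylesheet: sorted, comma-separated last names as plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/people\">"
        + "<xsl:for-each select=\"name\">"
        + "<xsl:sort select=\"last_name\"/>"
        + "<xsl:value-of select=\"last_name\"/>"
        + "<xsl:if test=\"position() != last()\">, </xsl:if>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical input with two people.
    static final String INPUT =
        "<people>"
        + "<name><first_name>Alan</first_name><last_name>Turing</last_name></name>"
        + "<name><first_name>Grace</first_name><last_name>Hopper</last_name></name>"
        + "</people>";

    // Runs the stylesheet above against the given XML text.
    static String lastNames(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lastNames(INPUT)); // Hopper, Turing
    }
}
```

Filtering could be added in the same style with a predicate in the select expression, for example `name[last_name != 'Turing']`.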


THANK YOU