Extensible MarkUp Language
AGENDA
OVERVIEW OF XML
DOCUMENT TYPE DEFINITION LANGUAGE
XML SCHEMA
XML PARSERS 1) DOM PARSER 2) SAX PARSER 3) JAXB PARSER
EXTENSIBLE STYLESHEET TRANSFORMATIONS
OVERVIEW OF XML
What is Markup language?
Markup languages are designed for the processing, definition
and presentation of text. The language specifies code for
formatting, both the layout and style, within a text file.
The well-known markup languages are HTML and XML.
XML is a framework for defining markup languages
Each language is targeted at its own application domain with its markup tags.
There is a common set of generic tools for processing XML documents
How is XML different from HTML?
Markup languages generally combine two distinct functions of representing text (document): the ‘look’ and the ‘structure’.
HTML and XML have different sets of goals.
While HTML was designed to display data and hence focused on the ‘look’ of the data, XML was designed to describe and carry data and hence focuses on ‘what data is’.
HTML is about displaying data and XML is about describing data.
HTML and XML are complementary to each other.
XML FEATURES
XML can be used to create new languages. Ex: WML, X3D (the XML-based successor to VRML)
XML uses the concept of DTD (Document Type Definition) to describe data
XML with DTD is self descriptive
XML separates data from display formats
XML can be used as a format to exchange data
Data can be stored in either files or databases
JAVA=Portable Programs
XML=Portable Data
XML Syntax
XML Syntax consists of
XML Declaration
XML Elements
XML Attributes
XML Declaration
The first line of an XML document should always consist of an XML declaration defining the version of XML.
XML Element
XML is a markup language that is used to store data in a self-explanatory manner. Making the data "self-explanatory" comes about by containing information in elements. If a piece of text is a title then it will be contained within a "title" element.
XML Attributes
Attributes are used to specify additional information about the element. An attribute for an element appears within the opening tag. The syntax for including an attribute in an element is: <element attributeName="value">
SAMPLE APPLICATION
<?xml version="1.0" encoding="ISO-8859-1"?>
<book>
  <title>XML for dummies</title>
  <chapter>introduction to xml
    <para>Markup languages</para>
    <para>Features of XML</para>
  </chapter>
  <chapter>XML syntax
    <para>Elements must be enclosed in tags</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
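The two rules stated in the sample (elements enclosed in tags, elements properly nested) can be checked mechanically. The following is a minimal sketch, assuming the JAXP parser bundled with the JDK; the class name WellFormedCheck and the helper isWellFormed are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    /** Returns true if the text parses as well-formed XML (hypothetical helper). */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String nested = "<book><title>XML for dummies</title></book>";
        String crossed = "<book><title>XML for dummies</book></title>"; // improperly nested
        System.out.println(isWellFormed(nested));
        System.out.println(isWellFormed(crossed));
    }
}
```

A document that passes this check is only well-formed; whether it follows a particular structure is a separate question answered by a DTD or schema.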
DOCUMENT TYPE DEFINITION LANGUAGE
A Document Type Definition (DTD) defines the legal building blocks of an XML
document. It defines the document structure with a list of legal elements and attributes.
A DTD is associated with an XML document via a Document Type Declaration,
which is a tag that appears near the start of the XML document. The declaration
establishes that the document is an instance of the type defined by the referenced DTD.
The declarations in a DTD are divided into an internal subset and an external subset.
Internal DTD Declaration
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
Example XML document with an internal DTD:
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (bookID,title)>
  <!ELEMENT bookID (#PCDATA)>
  <!ELEMENT title (#PCDATA)>
]>
<book>
  <bookID>1243</bookID>
  <title>john</title>
</book>
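A validating parser can check a document against its internal DTD. The sketch below assumes the JAXP parser bundled with the JDK with DTD validation switched on; the class name DtdValidation and the helper validate are hypothetical, and the DTD is the one from the example above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidation {
    // Internal DTD subset taken from the example document
    static final String DTD =
            "<!DOCTYPE book ["
            + "<!ELEMENT book (bookID,title)>"
            + "<!ELEMENT bookID (#PCDATA)>"
            + "<!ELEMENT title (#PCDATA)>"
            + "]>";

    /** Returns the errors reported while parsing; an empty list means valid. */
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors arrive here
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            errors.add(e.getMessage()); // fatal (well-formedness) error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String valid = DTD + "<book><bookID>1243</bookID><title>john</title></book>";
        String missingId = DTD + "<book><title>john</title></book>";
        System.out.println(validate(valid).isEmpty());
        System.out.println(validate(missingId));
    }
}
```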
External DTD Declaration
If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element SYSTEM "filename">
Example XML document with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
  <bookID>Tove</bookID>
  <title>Jani</title>
</book>
And the file "book.dtd" which contains the DTD:
<!ELEMENT book (bookID,title)>
<!ELEMENT bookID (#PCDATA)>
<!ELEMENT title (#PCDATA)>
XML SCHEMA
An XML schema is a description of a type of XML document
An XML schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition (XSD).
An XML Schema:
defines elements that can appear in a document
defines attributes that can appear in a document
defines the order of child elements
defines the number of child elements
defines data types for elements and attributes
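To make the list above concrete, here is a minimal sketch that validates a document against a schema using the javax.xml.validation API in the JDK. The schema itself is an assumption written for illustration (a book with an integer bookID followed by a title); the class name XsdValidation and the helper isValid are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class XsdValidation {
    // Hypothetical schema: fixes the order, the number and the data types of the children
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='book'><xs:complexType><xs:sequence>"
            + "<xs:element name='bookID' type='xs:int'/>"
            + "<xs:element name='title' type='xs:string'/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    /** Returns true if the document conforms to XSD (hypothetical helper). */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, the validator throws on the first error
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<book><bookID>1243</bookID><title>john</title></book>"));
        // "Tove" is not an integer, so the data type check fails
        System.out.println(isValid("<book><bookID>Tove</bookID><title>Jani</title></book>"));
    }
}
```

Unlike the DTD, which can only say that bookID holds character data, the schema here also constrains its data type.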
SCHEMA LOCATION
In an instance document, the attribute xsi:schemaLocation pairs a namespace name with the location of a schema document for that namespace:
<purchaseReport
    xmlns="http://www.example.com/Report"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.example.com/Report http://www.example.com/Report.xsd"
    period="P3M" periodEnding="1999-12-31">
  <!-- etc -->
</purchaseReport>
XML PARSERS
Parsing is breaking something (such as a sentence) down into its component parts with an explanation of the form, function, and syntactical relationship of each part.
Compilers parse text to identify the program elements and check that it conforms to the correct syntax.
An XML parser is the piece of software that reads XML files and makes the information from those files available to applications and programming languages.
An XML parser reads an XML document, identifies all the XML tags and passes the data to the application.
All modern browsers have a built-in XML parser that can be used to read and manipulate XML.
The parser reads XML into memory and converts it into an XML DOM object that can be accessed with JavaScript.
XML PARSERS
DOM PARSERS 1) DOM Characteristics 2) DOM in Action 3) DOM Tree and Nodes 4) DOM Programming Procedures
SAX PARSERS 1) SAX Features 2) SAX Operational Model 3) SAX Programming Procedures 4) Benefits Of SAX
JAXB PARSERS 1) JAXB Design Goals 2) JAXB Binding Lifecycle 3) JAXB Runtime Operations 4) JAXB Programming
DOM PARSER
DOM is cross-platform and cross language
Uses OMG’s IDL to define interfaces
IDL to language binding
DOM CHARACTERISTICS
Access XML document as a tree structure
Composed of mostly element nodes and text nodes
Can “walk” the tree back and forth
Larger memory requirements
Fairly heavyweight to load and store
Use it for walking and modifying the tree.
DOM IN ACTION
DOM TREE AND NODES
XML document is represented as a tree
A tree is made of nodes
There are 12 different node types
Node Types
Document node
Document Fragment node
Element node
Attribute node
Text node
Comment node
Processing instruction node
Document type node
Entity node
Entity reference node
CDATA section node
Notation node
Example XML Document
<?xml version="1.0"?>
<people>
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
</people>
DOM Tree Example
XML Document node
  element node “people”
    element node “name”
      element node “first_name”
        text node “Alan”
      element node “last_name”
        text node “Turing”
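The tree above can be reproduced by walking the parsed document and printing one line per node. This is a minimal sketch, assuming the JAXP DocumentBuilder from the JDK; the class name DomTreeDump and its methods are hypothetical. The input is written without line breaks or indentation, because a DOM parser would otherwise report that whitespace as additional text nodes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomTreeDump {
    /** Parses the XML text and returns an indented listing of its nodes. */
    static String dump(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Depth-first walk: describe this node, then recurse into its children. */
    private static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("document node");
                break;
            case Node.ELEMENT_NODE:
                out.append("element node \"").append(node.getNodeName()).append('"');
                break;
            case Node.TEXT_NODE:
                out.append("text node \"").append(node.getNodeValue()).append('"');
                break;
            default:
                out.append("other node ").append(node.getNodeName());
        }
        out.append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(dump("<people><name><first_name>Alan</first_name>"
                + "<last_name>Turing</last_name></name></people>"));
    }
}
```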
Interfaces for DOM
NodeList NamedNodeMap DOMImplementation
Node Interface
Primary data type in DOM
Represents a single node in a DOM tree
Every node is Node interface type
Methods in Node Interface
Useful Node interface methods
public short getNodeType()
public String getNodeName()
public String getNodeValue()
public NamedNodeMap getAttributes()
public NodeList getChildNodes()
NodeList Interface
Represents a collection of nodes
Return type of getChildNodes() method of Node interface
public interface NodeList {
    public Node item(int index);
    public int getLength();
}
NamedNodeMap Interface
Represents a collection of nodes, each of which can be identified by name
Return type of getAttributes() method of Node interface
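As a rough illustration of NamedNodeMap, the sketch below reads the attributes of a root element both by index and by name. It assumes the JAXP parser from the JDK; the class name AttributeListing, its methods, and the id and lang attributes in the sample input are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class AttributeListing {
    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Collects every attribute of the root element by iterating over the map by index. */
    static Map<String, String> attributes(String xml) {
        NamedNodeMap attrs = parse(xml).getDocumentElement().getAttributes();
        Map<String, String> result = new TreeMap<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    /** Looks up a single attribute of the root element by name; null if absent. */
    static String attribute(String xml, String name) {
        Node attr = parse(xml).getDocumentElement().getAttributes().getNamedItem(name);
        return attr == null ? null : attr.getNodeValue();
    }

    public static void main(String[] args) {
        String xml = "<book id='1243' lang='en'><title>john</title></book>";
        System.out.println(attributes(xml));
        System.out.println(attribute(xml, "id"));
    }
}
```

The order of items in a NamedNodeMap is not defined by the DOM, which is why the sketch copies them into a sorted map before printing.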
Document Interface
Contains factory methods for creating other nodes (elements, text nodes)
Method to get root element node
DocumentType Interface
public interface DocumentType extends Node {
    public String getName();
    public NamedNodeMap getEntities();
    public NamedNodeMap getNotations();
    public String getPublicId();
    public String getSystemId();
    public String getInternalSubset();
}
Code Example
case Node.PROCESSING_INSTRUCTION_NODE:
    // Print the processing instruction as <?target data?>
    System.out.println("<?" + node.getNodeName() + " " + node.getNodeValue() + "?>");
    break;
DOM Programming Procedures
Create a parser object
Set Features and Read Properties
Parse XML document and get Document object
Perform operations
Traversing DOM
Manipulating DOM
Creating a new DOM
Writing out DOM
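The procedure above can be sketched end to end with the JAXP classes shipped in the JDK, used here in place of the Xerces-specific DOMParser that appears in the slides. The class name DomLifecycle and the helper addHeight are hypothetical; the person, name and height elements mirror the sample data used elsewhere in this deck.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomLifecycle {
    /** Parses the XML, appends a height element to the root, and serializes the result. */
    static String addHeight(String xml, String height) {
        try {
            // 1. Create a parser object and set its features
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();

            // 2. Parse the XML document and get the Document object
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // 3. Manipulate the DOM: attach a new element to the root element
            Element item = doc.createElement("height");
            item.appendChild(doc.createTextNode(height));
            doc.getDocumentElement().appendChild(item);

            // 4. Write out the DOM with an identity transformation
            Transformer writer = TransformerFactory.newInstance().newTransformer();
            writer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            writer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException
                | IOException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addHeight("<person><name>Jeff</name></person>", "1.80"));
    }
}
```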
CREATING A DOM OBJECT
import org.apache.xerces.parsers.DOMParser;   // Xerces parser class used below
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import java.io.IOException;

String xmlFile = "file:///xerces-1_3_0/data/personal.xml";
DOMParser parser = new DOMParser();
try {
    parser.parse(xmlFile);
} catch (SAXException se) {
    se.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
Document document = parser.getDocument();
Generating A New DOM
try {
    // Generate a new DOM tree (DocumentImpl is org.apache.xerces.dom.DocumentImpl)
    Document doc = new DocumentImpl();
    Element root = doc.createElement("person");     // Create root element
    Element item = doc.createElement("name");       // Create element
    item.appendChild(doc.createTextNode("Jeff"));
    root.appendChild(item);                         // Attach element to root element
    item = doc.createElement("height");
    item.appendChild(doc.createTextNode("1.80"));
    root.appendChild(item);                         // Attach second element to root element
    doc.appendChild(root);                          // Attach root element to the document
} catch (Exception ex) {
    ex.printStackTrace();
}
SAX PARSER
Simple API for XML
Started as a community-driven project
SAX Features
Event-driven: You provide event handlers
Fast and lightweight: Document does not have to be entirely in memory
Sequential read access only
One-time access
Does not support modification of document
SAX Operational Model
[Diagram: XML DOCUMENT --(Input)--> PARSER --(Events)--> PROVIDED HANDLER]
SAX Programming Procedures
SAX Event Handlers
SAX Parser Example
XMLReader parser = null;
try {
    // Create XML (non-validating) parser
    parser = XMLReaderFactory.createXMLReader();
    // Create event handler
    myContentHandler handler = new myContentHandler();
    parser.setContentHandler(handler);
    // Call parsing method
    parser.parse(args[0]);
} catch (SAXException ex) {
    System.err.println(ex.getMessage());
} catch (Exception ex) {
    System.err.println(ex.getMessage());
}
SAX Event Handler
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// DefaultHandler implements ContentHandler and supplies empty versions of
// the callbacks that are not overridden here
class myContentHandler extends DefaultHandler {
    // ContentHandler methods
    public void startDocument() {
        System.out.println("XML Document START");
    }
    public void endDocument() {
        System.out.println("XML Document END");
    }
    public void startElement(String namespace, String name, String qName, Attributes atts) {
        System.out.println("<" + qName + ">");
    }
    public void endElement(String namespace, String name, String qName) {
        System.out.println("</" + qName + ">");
    }
    public void characters(char[] chars, int start, int length) {
        System.out.println(new String(chars, start, length));
    }
}
Benefits of SAX
It is very simple
It is very fast
Useful when custom data structures are needed to model the XML document
Can parse very large files, because the parser does not hold the whole document in memory
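The point about custom data structures can be illustrated with a handler that fills a plain Java list from SAX events instead of building a DOM tree. This is a minimal sketch, assuming the JAXP SAXParser from the JDK; the class name SaxElementCollector and the helper elementNames are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxElementCollector {
    /** Returns the element names in the order their start tags are reported. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        // The handler keeps only what the application needs: here, the names
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(
                "<people><name><first_name>Alan</first_name>"
                + "<last_name>Turing</last_name></name></people>"));
    }
}
```

Memory use here depends on what the handler chooses to keep, not on the size of the document.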
Drawbacks of SAX
SAX provides read-only access
No random access to documents
Searching of documents is not easy
JAXB PARSER
Provides an efficient and standard way of mapping between XML and Java code
Programmers no longer have to write the application's Java objects themselves
Programmers do not have to deal with XML structure and can instead deal with meaningful business data
JAXB Design Goals
Easy to use : Don't have to deal with complexities of SAX and DOM
Customizable : Allows keeping pace with schema evolution
Portable: JAXB components can be replaced without having to make significant changes to the rest of the source code
How to Use JAXB
Develop or obtain XML schema
Generate the Java source files
Develop JAXB client application
Compile the Java source codes
Use the generated classes and the binding framework to write Java applications that: 1) Build object trees representing XML data
JAXB Binding Lifecycle
JAXB Runtime Operations
Provide the following functionality for schema derived classes
Unmarshal
Process (access or modify)
Marshal
Validation
A factory generates Unmarshaller, Marshaller and Validator instances for JAXB technology based applications
Pass content tree as parameter to Marshaller and Validator instances
JAXB PROGRAMMING
EXTENSIBLE STYLESHEET TRANSFORMATION (XSLT)
Extensible Stylesheet Language (XSL) is a language for expressing stylesheets
XSL is made of two parts:
XSL Transformation (XSLT)
XSL Formatting Objects (XSL-FO)
Viewpoints of XML
Presentation Oriented Publishing (POP): Useful for Browsers and Editors
Message Oriented Middleware (MOM): Useful for Machine-to-Machine data exchange. E.g.: Business-to-Business communication
XSLT is useful in:
Transforming data into a viewable format in a browser (POP)
Transforming business data between content models (MOM)
XSLT Stylesheet
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">Folks in Brandeis XML class</xsl:template>
</xsl:stylesheet>
RESULT
<?xml version="1.0" encoding="UTF-8"?>
Folks in Brandeis XML class
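One way to produce this result from Java is the javax.xml.transform API in the JDK. The sketch below applies the stylesheet shown above to a small people document; the class name XsltDemo and the helper transform are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltDemo {
    // The stylesheet from the slide, written as a Java string
    static final String STYLESHEET =
            "<?xml version='1.0'?>"
            + "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='people'>Folks in Brandeis XML class</xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML text and returns the serialized result. */
    static String transform(String stylesheet, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people><name><first_name>Alan</first_name>"
                + "<last_name>Turing</last_name></name></people>";
        System.out.println(transform(STYLESHEET, xml));
    }
}
```

The names inside the people element do not appear in the output because the template replaces the whole element with fixed text and never applies templates to its children.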
XSLT stylesheet language
template
value-of
apply-templates
for-each
if
when, choose, otherwise
sort
filtering
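A few of the elements listed above (template, for-each, sort, value-of, if) can be seen working together in the sketch below, which lists last names in sorted order. It assumes the javax.xml.transform API in the JDK; the class name XsltElementsDemo, the stylesheet, and the second person in the sample data are assumptions added for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltElementsDemo {
    // Hypothetical stylesheet: plain-text list of last names, sorted alphabetically
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/people'>"
            + "<xsl:for-each select='name'>"
            + "<xsl:sort select='last_name'/>"
            + "<xsl:value-of select='last_name'/>"
            // separator after every name except the last one
            + "<xsl:if test='position() != last()'>, </xsl:if>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Runs STYLESHEET over the XML text and returns the text output. */
    static String lastNames(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people>"
                + "<name><first_name>Alan</first_name><last_name>Turing</last_name></name>"
                + "<name><first_name>Grace</first_name><last_name>Hopper</last_name></name>"
                + "</people>";
        System.out.println(lastNames(xml));
    }
}
```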
THANK YOU