Java and XML

Platform independence meets language independence!

CC432 / Short Course 507

Lecturer: Simon Lucas

University of Essex

Spring 2002

Main Topics

• Introduction

• Reading and Writing XML

• SAX

• DOM and JDOM

• Serializing Objects to XML

• XMLC

• Concluding remarks

Introduction

• Java is a platform independent language – runs anywhere where we have a JVM

• And is well-connected – powerful java.net library

• Yet – many people persist in using other languages – C/C++, VB etc!

Why Java and XML?

• The common format that allows applications written in any language to communicate is XML

• Therefore, very important to make Java read and write XML

• Can also design object models in Java – and translate them into XML

• Leverage powerful design tools such as Together for this purpose

Reading and Writing XML

• To gain an insight into what this involves, we’ll work through a simplified model of XML

• Our simplified model is as follows:

– A tree of elements

• Each element either has:

– Text,

• OR

– A set of Child Elements

Element.java

• The Element class defines the object model for this kind of document

• It also includes some String constants that dictate what characters will be used to delimit the elements

• These are chosen to be standard XML characters

• Currently, no checking that node text does not contain these special characters!!!
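
One way to close the gap noted in the last bullet is to substitute the special characters before the text is stored or written. The helper below is a minimal sketch, not part of the course's Element.java: the class name XmlEscapeSketch and the method escape() are hypothetical, and only the three characters &, < and > are handled.

```java
// Hypothetical helper, not part of the course's Element.java.
final class XmlEscapeSketch {

    /** Replaces the characters that would otherwise be read as markup. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a &lt; b &amp;&amp; b &gt; c
        System.out.println(escape("a < b && b > c"));
    }
}
```

A reader for this format would need the reverse substitution when it turns the character data back into node text.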

Element.java - I

package xml.serial;

import java.util.*;
import java.io.*;

public class Element {
    static String TAG_OPEN = "<";
    static String TAG_CLOSE = ">";
    static String END_TAG_OPEN = "</";
    static int TAB = 2;
    static int INIT_INDENT = 0;
    static char SPACE = ' ';

    protected Vector children;
    protected StringBuffer text;
    protected String name;

    public Element( String name ) {
        this.name = name;
        children = null;
        text = null;
    }

Element.java II

final public Vector getChildren() {

return children;

}

final public String getText() {

// text stays null until setText() has been called

return text == null ? null : text.toString();

}

final public String getName() {

return name;

}

Element.java III

final public void setText(String text) throws Exception {
    // should substitute for any nasty characters
    // e.g. at least < and >
    if ( children == null ) {
        this.text = new StringBuffer( text );
    }
    else {
        throw new Exception(
            "Cannot add text to a node that already has child elements");
    }
}

Element.java IV

final public void addChild(Element child) throws Exception {
    if ( text == null ) {
        // create the child list lazily, on the first child
        if ( children == null ) {
            children = new Vector();
        }
        children.addElement( child );
    }
    else {
        throw new Exception(
            "Cannot add elements to a node that already has text");
    }
}

Reading and Writing Elements

• Given this simple Element class

• We can now write code to serialize a tree of these elements to an XML doc

• And to de-serialize such a document back to the tree of Elements in memory

• Hence, we get to write a simple parser for this subset of XML!

• ElementTest creates an element-only document and writes it to a file

ElementTest.java

package xml.serial;

import java.io.*;

public class ElementTest {
    public static void main(String[] args) throws Exception {
        Element el = new Element("object");
        PrintWriter pw = new PrintWriter( System.out );
        // el.write( pw );
        Element value = new Element( "value" );
        value.setText( "Hello" );
        el.addChild( value );
        el.write( pw );
        pw.println( "And now the static version..." );
        ElementWriter.write( el , pw );
        pw.flush();
    }
}

Running ElementTest

>java xml.serial.ElementTest

<object>

<value>

Hello

</value>

</object>
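
The write() method and the ElementWriter class used by ElementTest are not shown on the slides. The sketch below shows one way such a writer could work, as a recursive walk that emits an open tag, then either the text or the children, then the close tag. The names ElementWriterSketch, Node and toXml() are hypothetical, and Node is only a cut-down stand-in for the course's Element class.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the course's own ElementWriter is not shown.
final class ElementWriterSketch {

    /** Cut-down stand-in for Element: a name plus either text or children. */
    static final class Node {
        final String name;
        String text;                                   // null when the node has children
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static final int TAB = 2;                          // same indent step as Element.TAB

    /** Returns a string of the given number of spaces. */
    static String pad(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(' ');
        return sb.toString();
    }

    /** Writes one node and, recursively, everything below it. */
    static void write(Node el, PrintWriter pw, int indent) {
        pw.println(pad(indent) + "<" + el.name + ">");
        if (el.text != null) {
            pw.println(pad(indent + TAB) + el.text);
        }
        for (Node child : el.children) {
            write(child, pw, indent + TAB);
        }
        pw.println(pad(indent) + "</" + el.name + ">");
    }

    /** Convenience wrapper that serializes a tree to a String. */
    static String toXml(Node root) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        write(root, pw, 0);
        pw.flush();
        return sw.toString();
    }

    public static void main(String[] args) {
        Node root = new Node("object");
        Node value = new Node("value");
        value.text = "Hello";
        root.children.add(value);
        System.out.print(toXml(root));
    }
}
```

The matching reader does the reverse: on an open tag it creates a node, on character data it sets the text, and on a close tag it returns to the parent.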

SAX

Event-based XML processing

SAX – Main Features

• Serial processing of an XML document

• Register an event handler

• The SAX parser then reads the XML document from start to end

• Calls the methods of the event handler in response to various parts of the document

Example Events

• startDocument()

• startElement()

• characters()

• endElement()

• endDocument()

• + many others!

SAX-based program pattern

• Define a class that implements the ContentHandler interface

• Easiest way is to extend DefaultHandler

• DefaultHandler provides NO-OP implementations of all the methods in the ContentHandler interface

• Override whichever methods you need to for your application

Using your Custom ContentHandler

• Import the necessary packages

• Create a new SAXParser

• Get an XMLReader from the Parser

• Set the ContentHandler for the XMLReader to be your own Customized ContentHandler

• Set up an ErrorHandler for the XMLReader – this is a class to handle any parsing errors

• Call the XMLReader to parse an XML Document
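
The steps above can be sketched with the JAXP classes that ship with the JDK. This is an illustration rather than the course's own code: the class SaxSetupSketch, the handler ElementCounter and the method countElements() are hypothetical names, and the document is parsed from a String so that the example needs no file.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of the set-up steps, using JAXP.
final class SaxSetupSketch {

    /** Counts startElement() calls; DefaultHandler also implements ErrorHandler. */
    static final class ElementCounter extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements++;
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;                                   // treat recoverable errors as fatal
        }
    }

    static int countElements(String xml) throws Exception {
        // Create a SAXParser and get an XMLReader from it
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Register the content handler and the error handler
        ElementCounter handler = new ElementCounter();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);

        // Parse; the reader calls back into the handler as it goes
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.elements;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<greetings><greeting>hello</greeting>"
                   + "<greeting>hola!</greeting></greetings>";
        System.out.println(countElements(xml));        // prints 3
    }
}
```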

Counting Node Types

• This program is the Hello World of SAX

• At the start of the document we create a Hashtable to count the occurrences of each type of element

• We override startElement() to update the count in the Hashtable with each element name we see

• Override endDocument() to print a summary

SAXTest Program Structure

• SAXTest uses CountNodes

• CountNodes extends DefaultHandler

[Class diagram: SAXTest, DefaultHandler, CountNodes]

SAXTest

package courses.xml;

import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class SAXTest extends DefaultHandler {
    static String parserClass = "org.apache.xerces.parsers.SAXParser";

    public static void main(String[] args) throws Exception {
        XMLReader reader = XMLReaderFactory.createXMLReader( parserClass );
        reader.setContentHandler( new CountNodes() );
        reader.setErrorHandler( new SimpleErrorHandler(System.err) );
        reader.parse( args[0] );
    }
}

CountNodes

• We shall override the following:

– startDocument()

– startElement()

– endDocument()

CountNodes - declaration

package courses.xml;

import org.xml.sax.*;

import org.xml.sax.helpers.*;

import java.util.*;

public class CountNodes extends DefaultHandler

{

private Hashtable tags;

// …

CountNodes: startDocument()

• Create a new hashtable for each new document

public void startDocument() throws SAXException

{

tags = new Hashtable();

}

CountNodes: startElement()

public void startElement(String namespaceURI, String localName,
                         String rawName, Attributes atts)
        throws SAXException
{
    String key = localName;
    Object value = tags.get(key);
    if (value == null) {
        // Add a new entry
        tags.put(key, new Integer(1));
    }
    else {
        // Get the current count and increment it
        int count = ((Integer)value).intValue();
        count++;
        tags.put(key, new Integer(count));
    }
}

CountNodes: endDocument()

• Summarise the Hashtable contents

public void endDocument() throws SAXException {

Enumeration e = tags.keys();

while (e.hasMoreElements()) {

String tag = (String)e.nextElement();

int count =

((Integer) tags.get(tag)).intValue();

System.out.println(

"Tag <" + tag + "> occurs " +

count + " times");

}

}

Running SAXTest: Hello.xml

<?xml version="1.0" ?>
<greetings>
  <greeting lang="english"> hello </greeting>
  <greeing> bonjour </greeing>
  <greeting> hola! </greeting>
</greetings>

Output

>java courses.xml.SAXTest courses\xml\hello.xml

Tag <greeing> occurs 1 times

Tag <greetings> occurs 1 times

Tag <greeting> occurs 2 times

Notes on CountNodes

• Note the parameters to startElement()

• We get direct access to that element only – that is its:

– Namespace

– Attributes

– Element Name (local name)

– Raw Name (qualified name: namespace prefix + local name)

• We must work for any access beyond this!
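
To illustrate the extra work mentioned in the last bullet: a handler that wants to know where an element sits in the tree has to keep that context itself, typically on a stack that is pushed in startElement() and popped in endElement(). The sketch below records the path to every element; PathTrackerSketch and pathsOf() are hypothetical names and are not taken from the course code.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: SAX gives no parent pointer, so the handler keeps its own stack.
final class PathTrackerSketch extends DefaultHandler {

    private final Deque<String> open = new ArrayDeque<>();   // currently open elements
    final List<String> paths = new ArrayList<>();             // one entry per element seen

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        open.addLast(qName);
        paths.add("/" + String.join("/", open));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.removeLast();
    }

    static List<String> pathsOf(String xml) throws Exception {
        PathTrackerSketch handler = new PathTrackerSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.paths;
    }

    public static void main(String[] args) throws Exception {
        // prints [/a, /a/b, /a/b/c, /a/d]
        System.out.println(pathsOf("<a><b><c/></b><d/></a>"));
    }
}
```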

SAX Exercise

• By overriding:

– startElement()

– endElement()

– startDocument()

– endDocument()

• provide a ContentHandler that prints out how many times a greeting element was the child of another greeting element

SAX Filter Pipelines

• In the Count Nodes example, the XMLReader read from an XML document source

• Also possible to read from the output of a ContentHandler

• In this way can plug together modular filters to achieve complex effects
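
In the standard SAX API such a filter can be written by extending org.xml.sax.helpers.XMLFilterImpl, which sits between a parent XMLReader and the next ContentHandler and passes every event through unless a method is overridden. The sketch below is an assumption about how a pipeline stage might look, not code from the course; FilterPipelineSketch, UpperCaseNames and NameCollector are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch of a one-stage SAX filter pipeline.
final class FilterPipelineSketch {

    /** Passes all events through, but upper-cases element names on the way. */
    static final class UpperCaseNames extends XMLFilterImpl {
        UpperCaseNames(XMLReader parent) { super(parent); }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) throws SAXException {
            super.startElement(uri, localName.toUpperCase(Locale.ROOT),
                               qName.toUpperCase(Locale.ROOT), atts);
        }

        @Override
        public void endElement(String uri, String localName,
                               String qName) throws SAXException {
            super.endElement(uri, localName.toUpperCase(Locale.ROOT),
                             qName.toUpperCase(Locale.ROOT));
        }
    }

    /** Final stage: records the element names it is handed. */
    static final class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    static List<String> run(String xml) throws Exception {
        XMLReader source = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        XMLFilterImpl filter = new UpperCaseNames(source);   // source -> filter
        NameCollector sink = new NameCollector();
        filter.setContentHandler(sink);                       // filter -> sink
        filter.parse(new InputSource(new StringReader(xml)));
        return sink.names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<a><b/></a>"));               // prints [A, B]
    }
}
```

Because a filter is itself an XMLReader, further filters can be stacked by passing one filter as the parent of the next.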

DOM and JDOM

Document Object Model

and

Java Document Object Model

DOM

• A language-independent object model of XML documents

• Memory-based

• The entire document is parsed – read in to memory

• This allows direct access to any part of the document

• But limits the size of document that can be handled
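
As a small illustration of the direct access that an in-memory model allows, the sketch below parses a document with the W3C DOM classes bundled with the JDK and jumps straight to the n-th element with a given tag. The names DomAccessSketch and textOfNth() are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch of random access with the W3C DOM.
final class DomAccessSketch {

    /** Returns the trimmed text of the index-th element with the given tag name. */
    static String textOfNth(String xml, String tag, int index) throws Exception {
        // The whole document is read into memory here
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // After that, any part of the tree can be reached directly
        NodeList hits = doc.getElementsByTagName(tag);
        return hits.item(index).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<greetings><greeting> hello </greeting>"
                   + "<greeting> hola! </greeting></greetings>";
        System.out.println(textOfNth(xml, "greeting", 1));   // prints hola!
    }
}
```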

JDOM

• Because DOM is a language-independent spec., there are features that seem awkward from a Java perspective

• JDOM is a Java-based system, developed by Brett McLaughlin and Jason Hunter

• It aims to offer most of the features of DOM, but make them easier for Java programmers to exploit

Hello JDOM World

• We’ll look at a program that

– creates a document

– adds a few elements to it

– writes it to an output stream

package xml.jdom;

import org.jdom.Element;
import org.jdom.Document;
import org.jdom.output.XMLOutputter;

public class HelloWorld {
    public static void main(String[] args) throws Exception {
        Element root = new Element("Greeting");
        root.setText("Hello world!");
        Element child = new Element("Gday");
        child.setText("The kid <bold> is \"cool </bold>");
        child.addAttribute( "color" , "red" );
        root.addContent( child );

        Document doc = new Document(root);

        XMLOutputter output = new XMLOutputter( " " , true );
        output.output( doc, new java.io.PrintWriter( System.out ) );

        String text = root.getText();
    }
}

Reading XML into JDOM

package xml.jdom;

import org.jdom.Document;
import org.jdom.DocType;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;

public class InputTest {
    public static void main(String[] args) throws Exception {
        String filename1 = "xml/slides/slides.xml";
        SAXBuilder builder = new SAXBuilder();
        System.out.println("Building...");
        Document doc = builder.build( filename1 );
        System.out.println( doc );
    }
}

Processing XML with JDOM

• Now we have the document tree in memory

• Processing is typically much simpler than with SAX

• Though for simple programs, this is not always so

• Let’s begin by considering how to write the Count Nodes program with JDOM

Some API

• Commonly used functions:

– getChildren() – gets all the child elements

– getContent() – gets all the content of a node – PIs, Entities, Child elements etc

– addContent() – adds any kind of content to a node

– addChild()

– get/setText() – deals with the text of a node

– getParent() – does what you expect!

Count Nodes in JDOM

• Strategy:

– Create a hashtable

– Read in the document

– Walk the tree, keeping count in the hashtable

– We walk the tree by recursively visiting all the children of a node

CountNodes - Structure

– CountTest.java reads in the XML doc as a JDOM Document

– Creates an instance of CountNodes

– Calls the walkTree method of CountNodes on the document root element

– CountNodes defines a constructor and three methods

• Constructor – initialises the Hashtable

• walkTree – recursively walks the document

• count – updates entries in the Hashtable

• printSummary – prints the count for each element name

– Compare this with the SAX implementation

CountTest.java

package xml.jdom;

import org.jdom.*;
import org.jdom.input.SAXBuilder;

public class CountTest {
    public static void main(String[] args) throws Exception {
        String filename1 = "courses/xml/hello.xml";
        SAXBuilder builder = new SAXBuilder();
        Document doc = builder.build( filename1 );

        CountNodes counter = new CountNodes();
        counter.walkTree( doc.getRootElement() );
        counter.printSummary( System.out );
    }
}

CountNodes.java

package xml.jdom;

import java.util.*;
import java.io.*;
import org.jdom.*;

public class CountNodes {
    Hashtable h;

    public CountNodes() {
        h = new Hashtable();
    }
    // … continued

CountNodes – walkTree()

public void walkTree(Element el) {

count( el.getName() );

List children = el.getChildren();

for (Iterator i = children.iterator(); i.hasNext() ; ) {

walkTree( (Element) i.next() );

}

}

CountNodes – count()

public void count(String key) {

Object value = h.get(key);

if (value == null) {

// Add a new entry

h.put(key, new Integer(1));

}

else {

// Get the current count and increment it

int count = ((Integer) value).intValue();

count++;

h.put(key, new Integer(count));

}

}

CountNodes – printSummary()

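public void printSummary(PrintStream out) {
    // Assumed listing: the transcript repeats count() on this slide, so this
    // body is reconstructed by analogy with endDocument() in the SAX version
    Enumeration e = h.keys();
    while (e.hasMoreElements()) {
        String tag = (String) e.nextElement();
        int count = ((Integer) h.get(tag)).intValue();
        out.println("Tag <" + tag + "> occurs " + count + " times");
    }
}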

JDOM Exercise

• Write a JDOM program to print out how many times a greeting element was the child of another greeting element

• (e.g. given a doc like Hello.xml – see above)

• (same task that we previously attempted with SAX)

JDOM Exercise Hints

• Consider the following methods:

– getParent()

– getName()

– getChildren()

Serializing Objects to XML

Homebrew version

JSX

Serialization to XML

• First we’ll consider a home-made version

• This will be a bit simplistic – but will work on a restricted range of object classes

• BUT: the Java code to do this will be easy to understand and to analyse

Home-made Serializer

• Aim: serialize a Java Object to an XML document

• Use the Java Reflection API to navigate an Object Graph

• For each object in the graph

– Create XML elements/attributes to describe it

• Write the XML elements to a stream
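
The outline above can be sketched in a few lines of reflection code. This is not the course's serializer: ReflectSerializerSketch, Greeting and toXml() are hypothetical names, every field is written as a child element named after the field, and the sketch assumes that fields are either simple values (numbers, booleans, characters, Strings) or objects of user-defined classes. It does no escaping and does not handle arrays or cyclic references.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical sketch of a reflection-based serializer for simple objects.
final class ReflectSerializerSketch {

    /** Example class to serialize. */
    static final class Greeting {
        String message = "Hello";
        double x = 1.5;
        Greeting child;                        // null unless set
    }

    /** True for values that are written as text rather than walked further. */
    static boolean isLeaf(Object v) {
        return v instanceof Number || v instanceof Boolean
            || v instanceof Character || v instanceof String;
    }

    /** Writes one value as an element called tag, recursing into its fields. */
    static void write(String tag, Object value, StringBuilder sb)
            throws IllegalAccessException {
        if (value == null) {
            sb.append('<').append(tag).append("/>");
            return;
        }
        sb.append('<').append(tag).append('>');
        if (isLeaf(value)) {
            sb.append(value);                  // no escaping in this sketch
        } else {
            for (Field f : value.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                f.setAccessible(true);         // reach non-public fields
                write(f.getName(), f.get(value), sb);
            }
        }
        sb.append("</").append(tag).append('>');
    }

    static String toXml(Object root) throws IllegalAccessException {
        StringBuilder sb = new StringBuilder();
        write(root.getClass().getSimpleName(), root, sb);
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Greeting g = new Greeting();
        g.child = new Greeting();
        g.child.message = "Inner";
        System.out.println(toXml(g));
    }
}
```

A fuller version would keep a table of objects already written so that shared or cyclic references can be emitted as aliases.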

Issues

• Object attributes will be mapped as elements

• What about primitive attributes?

– Can either use elements

– Or attributes

– Attributes lead to shorter documents and are easier to read

• Shadowed attributes – must access these and name them properly

• Arrays – full or sparse?
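
On the shadowed-attribute issue: getDeclaredFields() only returns the fields declared by one class, so a serializer has to walk up the superclass chain, and it needs some naming scheme to keep a shadowed field apart from the field that hides it. The sketch below qualifies each field with its declaring class; the names ShadowedFieldsSketch, Base, Derived and describe() are hypothetical, and the naming scheme is only one possible choice.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: listing shadowed fields without a name clash.
final class ShadowedFieldsSketch {

    static class Base { int id = 1; }
    static class Derived extends Base { int id = 2; }      // shadows Base.id

    /** Lists every instance field as DeclaringClass.name=value. */
    static List<String> describe(Object obj) throws IllegalAccessException {
        List<String> out = new ArrayList<>();
        // Walk from the object's class up to (but not including) Object
        for (Class<?> c = obj.getClass(); c != null && c != Object.class;
                c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                f.setAccessible(true);
                out.add(c.getSimpleName() + "." + f.getName() + "=" + f.get(obj));
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // prints [Derived.id=2, Base.id=1]
        System.out.println(describe(new Derived()));
    }
}
```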

De-serializing from XML

• What if the class details have changed?

• What if the class is not on the classpath?

• Fatal error, or graceful degradation with warnings?
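
As one possible answer to the last question, a de-serializer can collect warnings instead of stopping at the first mismatch. The sketch below assumes the XML has already been reduced to name/value pairs and copies them into the fields of a target object, skipping any name the current class no longer declares. LenientRestoreSketch, Greeting and restore() are hypothetical names, and only String values are handled.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of graceful degradation when the class has changed.
final class LenientRestoreSketch {

    static class Greeting { String message; }

    /** Sets each named field on target; returns warnings rather than failing. */
    static List<String> restore(Object target, Map<String, String> values) {
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<String, String> e : values.entrySet()) {
            try {
                Field f = target.getClass().getDeclaredField(e.getKey());
                f.setAccessible(true);
                f.set(target, e.getValue());
            } catch (NoSuchFieldException ex) {
                // The class no longer has (or never had) this field
                warnings.add("No field named '" + e.getKey() + "'; value skipped");
            } catch (IllegalAccessException | IllegalArgumentException ex) {
                // Field exists but cannot take this value, e.g. its type changed
                warnings.add("Could not set '" + e.getKey() + "': " + ex.getMessage());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("message", "hi");
        values.put("colour", "red");           // not a field of Greeting
        Greeting g = new Greeting();
        System.out.println(restore(g, values));
        System.out.println(g.message);         // prints hi
    }
}
```

The strict alternative is to let the first exception propagate, which is simpler but loses the rest of the document.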

JSX – Main Features

http://www.csse.monash.edu.au/~bren/JSX/

• Developed by Brendan Macmillan at Monash University, Melbourne

• Free for non-commercial use, charge for commercial use

• Has evolved rapidly from an early prototype with many limitations

• To the current version that works well and handles most cases

• To use, just add jsx.jar to your classpath

MyClass

• Simple class, with Object, double, String and byte[] fields

package xml.serial;

/** Simple class to play with serialization to XML */

public class MyClass {
    MyClass child;
    String message;
    double x;
    byte[] a;
}

JSX – Test Program

package xml.serial;

import JSX.*;
import java.io.*;

public class SimpleJSXTest {
    public static void main(String[] args) throws Exception {
        MyClass mc = new MyClass();
        mc.a = new byte[]{0 , 1 , 2 , 3};
        mc.child = new MyClass();
        mc.child.message = "Middle one!";
        mc.child.child = mc;
        ObjOut out = new ObjOut( System.out );
        out.writeObject( mc );
        out.flush();
    }
}

JSX – Example Output

>javac xml\serial\SimpleJSXTest.java

>java xml.serial.SimpleJSXTest

<?jsx version="1"?>

<xml.serial.MyClass x="0.0">

<xml.serial.MyClass obj-name="child"

message="Middle one!"

x="0.0">

<alias-ref obj-name="child" alias="0"/>

<null obj-name="a"/>

</xml.serial.MyClass>

<binary-data obj-name="a" valueOf="00 01 02 03"/>

</xml.serial.MyClass>

Java, XML and Relational Databases

Creating Virtual XML Documents from ResultSets

ResultSet -> XML

• For more details see Chapter 17 of Professional Java XML

• Basic idea is this:

• Iterate over a result set

– Either use this as source of SAX events

– OR build an in-memory document model of it (DOM / JDOM)

SAX Version

• Wrapper around a ResultSet that implements the XMLReader interface

• In response to the parse() method, iterates over the ResultSet

• For each Row in the result set, generate a sequence of startElement(), characters() and endElement() events

• Can use the ResultSetMetaData to name the elements – depending on the mapping convention used – which depends on final purpose
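
The event sequence described above can be sketched without a database by using a list of maps as a stand-in for the ResultSet, with the map keys playing the role of the column names that ResultSetMetaData would supply. Everything here is an assumption for illustration: the names RowEventsSketch, emit() and StartNames are hypothetical, the element names table and row are one possible mapping convention, and a full version would put the loop inside the parse() method of an XMLReader implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: turning rows into SAX events.
final class RowEventsSketch {

    /** Fires the events for a "virtual" document: table, then row, then one element per column. */
    static void emit(List<Map<String, String>> rows, ContentHandler h)
            throws SAXException {
        Attributes none = new AttributesImpl();          // no attributes in this mapping
        h.startDocument();
        h.startElement("", "table", "table", none);
        for (Map<String, String> row : rows) {
            h.startElement("", "row", "row", none);
            for (Map.Entry<String, String> col : row.entrySet()) {
                String name = col.getKey();               // column name becomes element name
                char[] chars = col.getValue().toCharArray();
                h.startElement("", name, name, none);
                h.characters(chars, 0, chars.length);
                h.endElement("", name, name);
            }
            h.endElement("", "row", "row");
        }
        h.endElement("", "table", "table");
        h.endDocument();
    }

    /** Demo handler that records the names passed to startElement(). */
    static final class StartNames extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    public static void main(String[] args) throws SAXException {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("city", "London");
        StartNames handler = new StartNames();
        emit(Collections.singletonList(row), handler);
        System.out.println(handler.names);                // prints [table, row, name, city]
    }
}
```

Any ContentHandler written for real documents, such as a node counter, can be passed to emit() unchanged.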

JDOM Version

• Many ways of doing this – here’s one

• Start with the table root element

• For each row in the result set

– Add a new <row> element

– For each field in the row

• Add a new <field> element to the row element

• OR: could build from the SAX version

• Usual tradeoffs apply between SAX and DOM
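
The row/field loop above can be sketched as follows. To keep the example runnable on a plain JDK it uses the W3C DOM classes rather than JDOM, and a list of maps rather than a real ResultSet; both substitutions are assumptions made for illustration, as are the names RowsToDomSketch and build() and the name attribute on each field element.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: building an in-memory document from rows (W3C DOM, not JDOM).
final class RowsToDomSketch {

    static Document build(List<Map<String, String>> rows) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element table = doc.createElement("table");       // the table root element
        doc.appendChild(table);
        for (Map<String, String> row : rows) {
            Element rowEl = doc.createElement("row");      // one <row> per row
            table.appendChild(rowEl);
            for (Map.Entry<String, String> col : row.entrySet()) {
                Element field = doc.createElement("field"); // one <field> per column
                field.setAttribute("name", col.getKey());
                field.setTextContent(col.getValue());
                rowEl.appendChild(field);
            }
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, String>> rows = new ArrayList<>();
        Map<String, String> row = new LinkedHashMap<>();
        row.put("name", "Ada");
        row.put("city", "London");
        rows.add(row);
        Document doc = build(rows);
        System.out.println(doc.getElementsByTagName("field").getLength());  // prints 2
    }
}
```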

XMLC

Auto-generation of classes from XML Document Types (Check this!)

XMLC

• XMLC creates Java classes from HTML or XML documents

• See tutorial at

– http://staff.plugged.net.au/dwood/xmlc/

• These notes were derived from the above tutorial

• The Java classes faithfully model the document

XMLC II

• By modifying the instance variables of a class, we can insert dynamic content into the document

• Argued to be more efficient than some dynamic generation methods

A Claim for XMLC:

“The best single advantage of XMLC is the ability to completely separate HTML templates (the pages that an artist creates) from Java code (the controlling logic that programmers create). XMLC allows artists to generate and edit HTML from design tools that support the HTML 4.0 standard”

• Homework: Read the tutorial and Discuss!!!

Concluding Remarks - I

• Rapidly evolving technology

• Can Model objects in Java and convert them to/from XML

• Can write home-made solutions using reflection

• Or use the very good JSX package

Data Modelling

• Can model data using

– Relational modelling

– Object modelling

– XML schemas / DTDs

• BUT: try to stick to once and once only

• Model in the chosen way, and use tools to map between the different representations

SAX and DOM

• Looked at SAX and JDOM for processing XML in Java

• SAX more suitable for massive documents (but where would these come from?)

• DOM + JDOM easier to work with

Concluding Remarks - II

• Java and XML are natural partners

• Used with XSLT, they can be used to create well-designed web applications with:

– Separation of content from presentation

– Adherence to Once and Once Only principle
