XML for Scientific Computing
Several case studies for XML data in scientific computing
Overview
We will present case studies of the following systems:
• XSIL: Extensible Scientific Interchange Language
• XDMF: Extensible Data Model and Format
• Discipline Specific XML: ChemicalML
• Gateway Application Descriptors (plus Castor)
XML by itself is just markup, like HTML without a browser. Each of the above uses a related set of software to manipulate the XML data.
We present several examples of XML to give you an overview.
We conclude with some remarks about standards for science applications.
Overview of Case Studies
XSIL and XDMF are examples of representing (meta)data for scientific computing.
• Concentrate on data structures, data I/O.
• Meaning of data not described.
ChemicalML marks up domain specific data.
• Meaningfully describes data content.
Gateway application data describes science codes themselves.
All possess a data object model.
• Object oriented data descriptions guide the markup tag definitions.
XSIL
XML tags for generic scientific data markup, with related Java software.
XSIL
Developed in support of several projects led by CACR.
• Example: LIGO, Digital Sky
• Roy Williams, Caltech.
See http://www.cacr.caltech.edu/SDA/xsil/ for more information and free software.
XSIL was developed for the astronomical and gravitational wave communities.
But provides general purpose tags.
Also comes with software for building Java applications that manipulate and display XSIL documents.
XSIL Tags
XSIL defines a small number of tags:
• XSIL: base container for the object model.
• Comment
• Param: an arbitrary name/value pair
• Time: describes time, plus format
• Table: data in columns and rows
• Array: table data with specific size
• URL
• Stream: for handling data
We’ll now go over some of these in detail.
The XSIL Tag I
XSIL documents map to a document object model with associated handling code.
The root tag for XSIL is <XSIL>:
<XSIL Name="Example" Type="Examples.MyExample">
  …
</XSIL>
Type points to the Java code that should process this file.
• It’s a class defined in a file called MyExample.java in the package Examples.
The XSIL Tag II
XSIL tags can be nested if different parts of the XSIL document need to be handled by different codes.
<XSIL Name="Example" Type="Examples.MyExample">
  …
  <XSIL Name="Subsection" Type="Examples.Subsection">
    …
  </XSIL>
</XSIL>
XSIL tags thus are the base container in a generic object hierarchy.
• MyExample object “has a” Subsection object
More On Object Containers
Consider an Electromagnetics example:
• A target is represented as a grid for finite difference integration of Maxwell’s equations.
• The base input file contains one or more materials.
• Each material has specific EM properties.
If translated to XSIL, it could look like this:
<XSIL Name="EMRoot" Type="CEA.Root">
  <!-- Some general parameters -->
  <XSIL Name="EMMaterial" Type="CEA.Material">
    <!-- Some info describing the material. -->
  </XSIL>
</XSIL>
Parameters
Each XSIL tag can contain one or more parameters.
Params are arbitrary name/value pairs.
Params optionally have units.
<XSIL …>
  <Param Name="Color">Red</Param>
  <Param Name="Weight" Unit="kg">3.14</Param>
</XSIL>
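As a rough illustration of what reading Params involves, here is a minimal sketch that uses only the JDK's DOM parser rather than the XSIL Java library. The class name ParamDemo, the method describeParams, and the sample document are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParamDemo {
    // Hypothetical sample document modeled on the Param markup in the slide.
    static final String SAMPLE =
        "<XSIL Name=\"Example\">"
      + "<Param Name=\"Color\">Red</Param>"
      + "<Param Name=\"Weight\" Unit=\"kg\">3.14</Param>"
      + "</XSIL>";

    // Returns one "name=value" string per Param, with " [unit]" appended when a Unit is given.
    static String[] describeParams(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        NodeList params = doc.getElementsByTagName("Param");
        String[] out = new String[params.getLength()];
        for (int i = 0; i < params.getLength(); i++) {
            Element p = (Element) params.item(i);
            String unit = p.getAttribute("Unit"); // empty string when the attribute is absent
            out[i] = p.getAttribute("Name") + "=" + p.getTextContent().trim()
                   + (unit.isEmpty() ? "" : " [" + unit + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describeParams(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

The optional Unit attribute is handled by checking for an empty string, which is what DOM returns for a missing attribute.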
Tables
Params associate one value per name.
Tables support multiple values.
• A Table row can have any number of values.
Each table contains column definitions followed by an arbitrary number of entries.
Tables get data from streams (discussed later).
Example Table
<XSIL …>
  …
  <Table>
    <Column Name="Color" Type="string"/>
    <Column Name="Weight" Type="float" Unit="kg"/>
    <Column Name="Length" Type="float" Unit="meter"/>
    <Stream Type="Local" Delimiter=",">
      "Red",100.2,0.2
      "Green",21.7,1.2
    </Stream>
  </Table>
</XSIL>
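To make the link between Column definitions and Stream rows concrete, the following sketch converts each delimited row into typed cells. It is not the XSIL library's Table class; the names TableStreamDemo, parseRow and parseStream are hypothetical, and the naive split assumes the delimiter never occurs inside a quoted string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TableStreamDemo {
    // Column types in declaration order, taken from the Column tags of the example.
    static final String[] TYPES = {"string", "float", "float"};

    // Converts one delimited line into typed cells (String for "string", Float for "float").
    static Object[] parseRow(String line, String delimiter, String[] types) {
        String[] cells = line.trim().split(Pattern.quote(delimiter));
        if (cells.length != types.length) {
            throw new IllegalArgumentException("expected " + types.length + " cells: " + line);
        }
        Object[] row = new Object[cells.length];
        for (int i = 0; i < cells.length; i++) {
            String cell = cells[i].trim();
            if (types[i].equals("float")) {
                row[i] = Float.parseFloat(cell);
            } else {
                // Strip the surrounding double quotes used for string cells.
                row[i] = cell.replaceAll("^\"|\"$", "");
            }
        }
        return row;
    }

    // Parses every non-blank line of the stream text into a row.
    static List<Object[]> parseStream(String text, String delimiter, String[] types) {
        List<Object[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (!line.trim().isEmpty()) {
                rows.add(parseRow(line, delimiter, types));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String stream = "\"Red\",100.2,0.2\n\"Green\",21.7,1.2\n";
        for (Object[] row : parseStream(stream, ",", TYPES)) {
            System.out.println(row[0] + " weighs " + row[1] + " kg");
        }
    }
}
```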
XSIL Arrays
XSIL arrays are similar to Fortran and C arrays.
For mixed type data, use Tables.
If all data is the same type (integers, floats), use Arrays.
<Array Type="int">
  <Dim Name="x-dim">2</Dim>
  <Dim Name="y-dim">2</Dim>
  <Stream Type="Local" Delimiter=",">
    137,42
    8,13
  </Stream>
</Array>
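The sketch below shows one way the flat stream of an Array could be reshaped using its two Dim values. The slide does not say which index varies fastest, so the row-major (C-like) ordering used here is an assumption, as are the names ArrayStreamDemo and reshape.

```java
class ArrayStreamDemo {
    // Fills a rows x cols array from comma- or whitespace-separated text.
    // Assumption: values are stored row by row, with the second index varying fastest.
    static int[][] reshape(String text, int rows, int cols) {
        String[] tokens = text.trim().split("[,\\s]+");
        if (tokens.length != rows * cols) {
            throw new IllegalArgumentException(
                "expected " + (rows * cols) + " values, found " + tokens.length);
        }
        int[][] a = new int[rows][cols];
        for (int k = 0; k < tokens.length; k++) {
            a[k / cols][k % cols] = Integer.parseInt(tokens[k]);
        }
        return a;
    }

    public static void main(String[] args) {
        int[][] a = reshape("137,42\n8,13", 2, 2);
        System.out.println(a[1][0]);
    }
}
```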
XSIL Streams
XSIL Streams can be used to load data.
Data sources can be
• In the file itself (as shown in previous examples).
• From files on disk
• From URLs (http://, ftp://, and file:// supported)
Loading data from disk
<Stream Type="Remote" Encoding="Littleendian">
  /home/user1/data/datafile.dat
</Stream>
Loading data from URLs
<Stream Type="Remote">
  http://my.server.edu/XSILdata/datafile.dat
</Stream>
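The Encoding attribute in the disk example names a byte order, which matters as soon as binary data moves between platforms. As a small illustration using only java.nio (not the XSIL stream classes), the same four bytes decode to different integers depending on the order chosen. The class name EndianDemo and the sample bytes are made up for this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Interprets the first four bytes of raw as a 32-bit int in the given byte order.
    static int readInt(byte[] raw, ByteOrder order) {
        return ByteBuffer.wrap(raw).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, 0x00, 0x00}; // hypothetical sample bytes
        System.out.println(readInt(raw, ByteOrder.LITTLE_ENDIAN)); // prints 1
        System.out.println(readInt(raw, ByteOrder.BIG_ENDIAN));    // prints 16777216
    }
}
```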
Ex: Use XSIL to describe input data
<XSIL Name="InputData" Type="Examples.InDataHandler">
  <XSIL Name="Target 1" Type="Examples.Target">
    <Param Name="Target">Scud</Param>
    <Param Name="dx">0.1</Param>
    <Array>
      <Dim Name="X-Dimension">100</Dim>
      <Dim Name="Y-Dimension">100</Dim>
      <Stream Type="Remote">
        /home/mpierce/data/mydata.dat
      </Stream>
    </Array>
  </XSIL>
  <XSIL Name="Target 2" Type="Examples.Target">
    <!-- Another target -->
  </XSIL>
</XSIL>
Table and Array Types
Table and Array data can be (size in bits)
• boolean (1)
• byte (8)
• short (16)
• int (32)
• long (64)
• float (32)
• double (64)
• floatComplex (64)
• doubleComplex (128)
• string (arbitrary length)
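The bit widths listed above are enough to estimate how much binary data a stream should hold. The sketch below is only an illustration: the names XsilTypeSizes and payloadBytes are hypothetical, and the assumption that values are densely packed (including booleans at one bit each) is not stated in the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class XsilTypeSizes {
    // Bit widths copied from the type list; string is omitted because its length is arbitrary.
    static final Map<String, Integer> BITS = new LinkedHashMap<>();
    static {
        BITS.put("boolean", 1);
        BITS.put("byte", 8);
        BITS.put("short", 16);
        BITS.put("int", 32);
        BITS.put("long", 64);
        BITS.put("float", 32);
        BITS.put("double", 64);
        BITS.put("floatComplex", 64);
        BITS.put("doubleComplex", 128);
    }

    // Size in bytes of count densely packed elements, rounded up to a whole byte.
    static long payloadBytes(String type, long count) {
        Integer bits = BITS.get(type);
        if (bits == null) {
            throw new IllegalArgumentException("no fixed size for type: " + type);
        }
        return (bits * count + 7) / 8;
    }

    public static void main(String[] args) {
        // A 100 x 100 array; the element type double is chosen here for illustration.
        System.out.println(payloadBytes("double", 100L * 100L)); // prints 80000
    }
}
```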
Using XSIL
The previous example just marks up data.
XSIL also comes with Java bindings that
• Read the file and parse it.
• Extract parameter values, units, etc.
• Read in and manipulate tables, arrays
Central ideas:
• Each XSIL tag corresponds to a Java class
• XSIL’s Type points to your custom driver code that uses the XSIL classes.
XSIL Coding Example
Consider the following small XSIL example:
<XSIL Type="Examples.MyExample">
  <Param Name="x0">12.0</Param>
  <Param Name="dx">0.1</Param>
</XSIL>
XSIL Java Code Example
package extensions.Examples;

import org.escience.XSIL;
// Note: the slide does not show an import for Param.

public class MyExample {
    String x0, dx;
    XSIL root;

    // The XSIL constructor parses the named file.
    public MyExample(String xsilFileName) {
        root = new XSIL(xsilFileName);
    }

    // Walk the children of the root and pick out the two parameters by name.
    public void construct() {
        for (int i = 0; i < root.getChildCount(); i++) {
            XSIL x = root.getChild(i);
            if (x instanceof Param) {
                Param p = (Param) x;
                if (p.getName().equals("x0")) x0 = p.getText();
                if (p.getName().equals("dx")) dx = p.getText();
            }
        }
    }
}
Code Notes
All classes (Param, Table, etc.) extend the XSIL class.
Pass the path of the XSIL file to the root XSIL object through its constructor.
• XSIL handles all parsing
XSIL class defines getChildCount(), getChild() methods.
Param class defines getName() and getText() methods.
XSIL Summary
Defines a small set of general purpose tags for scientific data.
Data itself is not directly marked up.
• Read in through streams
XSIL software maps Java classes to XSIL tags.
• Convenient for working with XSIL docs.
• DOM classes are much more cumbersome to use.
XDMF
A data model geared toward finite element codes, with associated software in C++, Java, and Tcl
ICE XDMF
ICE (Interdisciplinary Computing Environment) is a comprehensive project at ARL MSRC that attempts to provide a common software platform for DoD scientific codes.
• Jerry Clarke, lead developer
XDMF (Extensible Data Model and Format) provides a common data format for several different codes.
• Primary focus: finite element codes for fluid dynamics and structural mechanics.
• XDMF and related software provide the backbone for loosely coupling applications and visualization.
XDMF Design
XDMF divides data into “light” and “heavy” types.
Light data, or metadata, is formatted in XML and will be described in more depth.
Heavy data is in HDF5 and not presented here.
XDMF Basic Concepts
XDMF basic tags are <DataStructure> and <DataTransform>.
<DataStructure> defines the actual data.
<DataTransform> defines the area of interest (AOI) in the data.
• AOI defined by coordinates, a function, or a hyperslab.
<DataTransform> contains one or more <DataStructure> elements.
• The transform defines how the data structure will be filtered.
Simple Data Structure
The example below is for 655 XYZ values in the indicated HDF5 file.
<DataStructure Name="Some XYZ Data"
  Type="Float"
  Dimensions="655 3">
  MyData.h5:/MyXYZdata
</DataStructure>
Simple character data can also be included directly in the XML document.
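The element content MyData.h5:/MyXYZdata combines an HDF5 file name with a dataset path inside that file. A minimal sketch of splitting such a reference follows. The class HeavyDataRef is hypothetical and not part of the XDMF API, and it assumes the file name itself contains no ":/" sequence.

```java
class HeavyDataRef {
    final String file;     // HDF5 file name, e.g. MyData.h5
    final String dataset;  // dataset path inside the file, e.g. /MyXYZdata

    HeavyDataRef(String file, String dataset) {
        this.file = file;
        this.dataset = dataset;
    }

    // Splits "file:/dataset" at the first ":/" and keeps the leading slash on the dataset path.
    static HeavyDataRef parse(String text) {
        String ref = text.trim();
        int colon = ref.indexOf(":/");
        if (colon <= 0) {
            throw new IllegalArgumentException("not a file:/dataset reference: " + ref);
        }
        return new HeavyDataRef(ref.substring(0, colon), ref.substring(colon + 1));
    }

    public static void main(String[] args) {
        HeavyDataRef ref = parse("  MyData.h5:/MyXYZdata  ");
        System.out.println(ref.file + " -> " + ref.dataset);
    }
}
```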
Data Structure for Mesh Connections and Pressures
<DataStructure
  Name="Connections"
  Type="Int"
  Precision="8"
  Dimensions="100 8">
  MyData.h5:/MyConns
</DataStructure>
<DataStructure
  Name="Pressure"
  Type="Float"
  Precision="8"
  Dimensions="100">
  MyData.h5:/MyPressure
</DataStructure>
Data Structure Attribute Summary
<DataStructure
  Name="Any name"                        Some meaningful name to the owner
  Rank="NumberOfDimensions"              Redundant information
  Dimensions="Kdim Jdim Idim"            The slowest varying dimension is listed first
  Type="Char | Float | Int | Compound"   Default is Float
  Precision="BytesPerElement"            Default is 4
  Format="XML | HDF"                     Default is XML
>
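Since Dimensions lists the extent of each dimension and Precision gives bytes per element, the size of the heavy data follows by multiplication. The sketch below works through that arithmetic for the examples in this deck. The names DataStructureSize, elementCount and byteSize are hypothetical helpers, not part of the XDMF API.

```java
class DataStructureSize {
    // Multiplies the whitespace-separated extents of a Dimensions attribute.
    static long elementCount(String dimensions) {
        long n = 1;
        for (String d : dimensions.trim().split("\\s+")) {
            n *= Long.parseLong(d);
        }
        return n;
    }

    // Total bytes for the data structure, where precision is bytes per element.
    static long byteSize(String dimensions, int precision) {
        return elementCount(dimensions) * precision;
    }

    public static void main(String[] args) {
        System.out.println(byteSize("655 3", 4)); // XYZ data, default precision: prints 7860
        System.out.println(byteSize("100 8", 8)); // Connections: prints 6400
        System.out.println(byteSize("100", 8));   // Pressure: prints 800
    }
}
```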
XDMF Array Types
XDMF array entries can have these types:
• Integer
• Float
• Char
All are 4 bytes by default, can be increased to 8 bytes.
DataTransform
DataTransform defines a way for the raw data to be filtered.
• Gives a certain Area of Interest in data set.
Possible transforms:
• Coordinate: Select a particular area
• Function: Define simple algorithm for selecting area
• Hyperslab: Define start, stride, and count for each dimension of an array.
Hyperslab Transform Example
The following markup instructs the processing code to apply a hyperslab transform to a 4-D array.
The first data structure defines the hyperslab:
• 0 0 0 0 are the starting points for each dim
• 2 2 2 1 are the strides for each dim
• 25 50 75 3 are the counts for each dim
The second data structure gives the raw data, a 100x200x300x3 array in the noted HDF5 file.
The transform will produce a 25x50x75x3 region that includes every other plane of the original data in the original data region [0,0,0,0]-[50,100,150,2].
Hyperslab Transform Example
<DataTransform
  Dimensions="25 50 75 3"
  Type="HyperSlab">
  <DataStructure
    Dimensions="3 4"
    Format="XML">
    0 0 0 0
    2 2 2 1
    25 50 75 3
  </DataStructure>
  <DataStructure
    Name="Points"
    Dimensions="100 200 300 3"
    Format="HDF">
    MyData.h5:/XYZ
  </DataStructure>
</DataTransform>
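To see what the start, stride and count rows select, the sketch below computes the selected indices along each dimension and checks that they stay inside the 100x200x300x3 source array. It only illustrates the arithmetic and is not the XDMF processing code. The names HyperslabDemo, selected and fits are hypothetical.

```java
class HyperslabDemo {
    // Indices selected along one dimension: start, start + stride, ..., count entries in total.
    static int[] selected(int start, int stride, int count) {
        int[] idx = new int[count];
        for (int i = 0; i < count; i++) {
            idx[i] = start + i * stride;
        }
        return idx;
    }

    // True when every selected index lies inside the source extent of its dimension.
    static boolean fits(int[] start, int[] stride, int[] count, int[] extent) {
        for (int d = 0; d < extent.length; d++) {
            int last = start[d] + (count[d] - 1) * stride[d];
            if (start[d] < 0 || last >= extent[d]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] start = {0, 0, 0, 0};
        int[] stride = {2, 2, 2, 1};
        int[] count = {25, 50, 75, 3};
        int[] extent = {100, 200, 300, 3};
        for (int d = 0; d < extent.length; d++) {
            int[] idx = selected(start[d], stride[d], count[d]);
            System.out.println("dim " + d + ": last selected index " + idx[idx.length - 1]);
        }
        System.out.println("fits: " + fits(start, stride, count, extent));
    }
}
```

With these values the last selected indices are 48, 98, 148 and 2, so the selection samples every other plane in the first three dimensions and keeps all three components in the last.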
Data Organization
DataStructures and DataTransform constitute XDMF’s data representation.
XDMF Domain tags are used as arbitrary containers.
Domains contain grids, grids contain topologies, geometries and attributes, as well as data structures.
Attributes include scalars, vectors, tensors.
An XDMF Example
<Domain Name="Example #1">
  <Grid Name="My Hex Grid with Pressure">
    <Topology Type="Hexahedron"
      Dimensions="100"
      Order="7 6 5 4 3 2 1 0">
      <DataStructure
        Name="Connections"
        Type="Int"
        Precision="8"
        Dimensions="100 8">
        MyData.h5:/MyConns
      </DataStructure>
    </Topology>
    <Geometry Type="XYZ">
      <DataStructure Name="XYZ Data"
        Type="Float"
        Dimensions="655 3">
        MyData.h5:/MyXYZdata
      </DataStructure>
    </Geometry>
    <Attribute Type="Scalar" Center="Cell">
      <DataStructure
        Name="Pressure"
        Type="Float"
        Precision="8"
        Dimensions="100">
        MyData.h5:/MyPressure
      </DataStructure>
    </Attribute>
  </Grid>
</Domain>
Review of Example
Recall XDMF is primarily for structured and unstructured finite element grids.
• Input data includes grid connectivity info, grid geometry, and pressure values
The Domain contains a Grid.
The Grid is defined by Topology, Geometry, and Attributes.
Topology, Attributes, and Geometry contain data sources and structure info.
XDMF API
Like XSIL, XDMF treats the XML markup as a set of instructions to be processed by actual programs.
XDMF defines an API of document processing engines.
• Core is in C++
• ICE also provides Java and Tcl APIs through wrappers around core.
See http://www.arl.hpc.mil/ice/Examples/CodeIntegration/DemoIceRt.cxx for code example.
XDMF Summary
Provides a few general purpose tags.
Again, data is not directly marked up.
• Stored in HDF5
XDMF handled programmatically with APIs in C++, Java, Tcl.
More information:
• http://www.arl.hpc.mil/ice/
Comparison of XSIL and XDMF
XSIL
• Larger tag set
• Java API
• Can read data that is in document, on disk, from URL
• Questionable performance and memory efficiency for very large data sets.
• Free and open source
XDMF
• Uses HDF5 for large data sets.
• C++, Java, Tcl APIs.
• Defines both data structures and transform instructions.
• Supports arrays, but not mixed data types (such as XSIL Tables).
• Integrated with ICE
Chemical Markup Language
A domain specific XML markup language.
CML Introduction
XSIL and XDMF use XML to describe code input files and give simple processing instructions.
Tags describe data structure, not content.
We now examine a domain specific example, the Chemical Markup Language.
Other domain markup languages:
• Mathematics Markup Language (MathML)
• Geography Markup Language (GML)
XML for Chemistry
Goal: provide a common chemical data format that is an open, universal standard.
• Data representation is platform independent
• Support structured searches of data banks.
• Provide a common format for software (particularly visualization).
• Support multidisciplinary data formats (biology, math) through XML namespaces.
• Provide a data object hierarchy suitable for object oriented programming.
CML Structure
Chemistry lends itself to object container structure
• Atoms have protons, neutrons, electrons
• Molecules have atoms
• Complex molecules and compounds are composed of molecules, molecular pieces (benzene rings, for example)
CML defines these as data objects with property fields.
A Simple Example: Glycine
<molecule convention="MDLMol" id="glycine" title="GLYCINE">
  <date day="22" month="11" year="1995"></date>
  <atomArray>
    <atom id="a1">
      <string builtin="elementType">C</string>
      <float builtin="x2">0.6424</float>
      <float builtin="y2">0.4781</float>
    </atom>
    …
  </atomArray>
  <bondArray>
    <bond id="b1">
      <string builtin="atomRef">a1</string>
      <string builtin="atomRef">a2</string>
      <string builtin="order">1</string>
    </bond>
    …
  </bondArray>
</molecule>
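Because CML tags carry chemical meaning, a program can ask for atoms and bonds directly. The sketch below reads the one atom and one bond shown in the glycine excerpt with the JDK's DOM parser. It is not Jumbo or any CML library; the class CmlDemo and its methods are hypothetical, and the sample keeps only the elements visible on the slide.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CmlDemo {
    // Only the atom and bond shown on the slide; the elided entries are left out.
    static final String SAMPLE =
        "<molecule convention=\"MDLMol\" id=\"glycine\" title=\"GLYCINE\">"
      + "<atomArray><atom id=\"a1\">"
      + "<string builtin=\"elementType\">C</string>"
      + "<float builtin=\"x2\">0.6424</float>"
      + "<float builtin=\"y2\">0.4781</float>"
      + "</atom></atomArray>"
      + "<bondArray><bond id=\"b1\">"
      + "<string builtin=\"atomRef\">a1</string>"
      + "<string builtin=\"atomRef\">a2</string>"
      + "<string builtin=\"order\">1</string>"
      + "</bond></bondArray></molecule>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Text of every direct child of parent whose builtin attribute equals name.
    static List<String> builtins(Element parent, String name) {
        List<String> out = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node n = kids.item(i);
            if (n instanceof Element && name.equals(((Element) n).getAttribute("builtin"))) {
                out.add(n.getTextContent().trim());
            }
        }
        return out;
    }

    // One "id:element" entry per atom, for example "a1:C".
    static List<String> atoms(String xml) {
        List<String> out = new ArrayList<>();
        NodeList list = parse(xml).getElementsByTagName("atom");
        for (int i = 0; i < list.getLength(); i++) {
            Element atom = (Element) list.item(i);
            out.add(atom.getAttribute("id") + ":" + builtins(atom, "elementType").get(0));
        }
        return out;
    }

    // One "from-to" entry per bond, built from its atomRef children.
    static List<String> bonds(String xml) {
        List<String> out = new ArrayList<>();
        NodeList list = parse(xml).getElementsByTagName("bond");
        for (int i = 0; i < list.getLength(); i++) {
            out.add(String.join("-", builtins((Element) list.item(i), "atomRef")));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(atoms(SAMPLE));
        System.out.println(bonds(SAMPLE));
    }
}
```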
CML Example Software
Previous Slide
Browser tool, Jumbo-3.0
• User can display dozens of CML’d molecules.
• Molecules can be rotated in display.
• Display is rendered in SVG (Adobe plugin).
• Molecule displayed is cholesterol. They also have glycine in database, but not as exciting to look at.
Gateway Application Descriptors
Describing scientific applications themselves with XML and mapping to Java with Castor.
Gateway Application Descriptors
Gateway is a computational web portal for securely submitting and monitoring jobs, transferring files, and archiving information.
Gateway describes scientific applications and host computers with XML metadata.
This is used to provide general purpose tools that can be used to build portals for specific applications.
Application Descriptors
Gateway describes scientific applications and host machines in XML.
This is used to generate the HTML forms that collect the information needed to create batch queuing scripts and submit jobs.
The general object container scheme is
• Portals contain applications
• Applications contain hosts
• Each also has a set of descriptive parameters.
Example: ANSYS on Grids
<Application>
  <ApplicationName>ANSYS</ApplicationName>
  <Version>5.0</Version>
  <Parameter Name="IOStyle">
    <Value>StandardIO</Value>
  </Parameter>
  <Parameter Name="NumberOfInFiles">
    <Value>1</Value>
  </Parameter>
  <Host>
    <HostName>grids.ucs.indiana.edu</HostName>
    <HostIP>156.56.103.5</HostIP>
    <RemoteCopy>rcp</RemoteCopy>
    <RemoteExec>rsh</RemoteExec>
    <WorkDir>/tmp</WorkDir>
    <QueueType>CSH</QueueType>
    <QsubPath>/usr/bin/csh</QsubPath>
    <ExecPath>echo</ExecPath>
  </Host>
</Application>
Java Data Object Bindings
As with other examples, the descriptor does not do anything by itself.
Must provide language bindings to make it useful in programs.
We used Castor (http://castor.exolab.org) to generate classes for us.
Castor for Data Object Creation
Direct mapping between Application tag and Java object, for example.
Each object has necessary getter and setter methods for manipulating data.
After making classes from the XML schema (once), load an XML file into the program to create particular data object instances (unmarshalling).
When the program is done, modified data objects can be marshalled back into XML file format.
We still have to write the Java code for specific uses, utility classes….
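To show the unmarshal, modify, marshal cycle in miniature, here is a hand-written stand-in for the kind of class a binding tool generates. It deliberately uses only the JDK DOM parser and does not call the Castor API. The class HostDescriptor, its two fields, and its method names are assumptions for this sketch, and marshal() does no XML escaping.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class HostDescriptor {
    private String hostName;
    private String workDir;

    String getHostName() { return hostName; }
    void setHostName(String hostName) { this.hostName = hostName; }
    String getWorkDir() { return workDir; }
    void setWorkDir(String workDir) { this.workDir = workDir; }

    // Builds an object from a Host fragment (unmarshalling).
    static HostDescriptor unmarshal(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        HostDescriptor h = new HostDescriptor();
        h.hostName = text(doc, "HostName");
        h.workDir = text(doc, "WorkDir");
        return h;
    }

    // Trimmed text of the first element with the given tag name.
    private static String text(Document doc, String tag) {
        return doc.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    // Writes the object back out as XML (marshalling); values are not escaped in this sketch.
    String marshal() {
        return "<Host><HostName>" + hostName + "</HostName>"
             + "<WorkDir>" + workDir + "</WorkDir></Host>";
    }

    public static void main(String[] args) {
        HostDescriptor h = unmarshal(
            "<Host><HostName>grids.ucs.indiana.edu</HostName><WorkDir>/tmp</WorkDir></Host>");
        h.setWorkDir("/scratch"); // hypothetical new working directory
        System.out.println(h.marshal());
    }
}
```

A generated binding would cover every element of the descriptor and handle escaping; the point here is only the round trip between XML text and an object with getters and setters.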
Other markup languages and some comparison
Various shortcomings of programming and markup languages
XML Schema
XML Schema defines many built-in types
• binary, boolean, byte, decimal, double, float, int, long, short, string
• And many more
Does not define standards for
• Arrays
• Complex (real+imaginary) numbers
SOAP
Known as XML Remote Procedure Call protocol.
• RPC is only one part of SOAP
Also defines encoding rules for data exchange.
SOAP inherits all XML Schema Built-in Types (see previous slide).
Defines additional compound types
• Struct: arbitrary collection of types (say, strings and floats) similar to XSIL table entry.
• Array: can contain primitive and compound types
  An array can be built out of arrays.
HDF5 and XML
Types include
• Integers
  2-64 bit, signed or unsigned, big or little endian
• Floats (32, 64 bit, BE or LE)
• Strings
• Arrays
Arbitrary compound types
See http://hdf.ncsa.uiuc.edu/HDF5/XML/
Compatibility and Missing Features
No standard XML definitions for arrays and “compound types” like XSIL tables.
• We have several defs: SOAP, XSIL, XDMF, XML-HDF5
Lack of built-in support for complex (real + imaginary) types
• XML, XML-HDF5, XDMF can easily define complex but not in standard way.
• Java does not have built-in complex type, either
More Missing Features
Varying support for integers, floats with different sizes.
• C/C++ does not guarantee consistent bit size.
Binary data must specify Big Endian/Little Endian encoding for cross platform compatibility.
• XML-HDF5, XSIL, XDMF all do this
• XML does not
XSIL does not have signed/unsigned