Lattice 2004 Chris Maynard 1

QCDml Tutorial

How to mark up your configurations

Contents

FAQs on using XML schema
Defining QCDml
– Namespaces and validation
Example XML IDs
– Ensemble and config
• Ensemble: actions, algorithms and management metadata
• Config: what goes where
Babble about BinX
Metadata catalogue demo

FAQs about XML schema

What is XML schema?
– Collection of rules for XML documents
– An XML schema is itself an XML document
Why do we need an XML schema?
– Computers can read and understand XML instance documents (XML IDs)
– <length>16</length>
– Meaning of length is context dependent
Do I need to learn XML schema?
– No. Schema makes it easier to produce XML

QCDml1.0

Metadata split into two schemata
– Ensemble XML <markovChain/>
– Config XML <gaugeConfiguration/>
• N.B. use lowerCamelConvention
ILDG website for XML schema files
– http://www.lqcd.org/ildg
– Go to Metadata and follow links
– Version 1.0 online and ready to use

Namespaces

Example XML ID for UKQCD data
XML Namespace defined by W3.org as "a collection of names identified by a URI reference"

First namespace

URI defines namespace for QCDml
This is the default namespace
All elements of QCDml belong to this namespace
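
A minimal sketch of what a default namespace means in practice, using Python's standard xml.etree.ElementTree. Only the root element name <markovChain/> comes from the slides. The namespace URI and the child element are placeholders invented for illustration, not the actual QCDml values.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for illustration only.
QCDML_NS = "http://example.org/QCDml/ensemble"

# xmlns="..." on the root sets the default namespace: every unprefixed
# element below it belongs to that namespace.
doc = f"""
<markovChain xmlns="{QCDML_NS}">
  <exampleChild>hypothetical content</exampleChild>
</markovChain>
"""

root = ET.fromstring(doc)

# ElementTree reports namespace-qualified names as {uri}localname.
print(root.tag)

# A lookup has to use the qualified name, even though the document
# itself never writes a prefix.
child = root.find(f"{{{QCDML_NS}}}exampleChild")
print(child.text)
```

Searching for the bare name "exampleChild" would find nothing here, which is a common stumbling block when reading documents that use a default namespace.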

Second namespace

Namespace of XML schema itself
Prefix <xsi:> for elements of XML schema
XML ID is valid against W3C XML schema

SchemaLocation

The namespace of the schema
The file which contains the schema
URI namespace can be the URL of the schema instance (not compulsory)
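
A sketch of how the two parts of a schemaLocation attribute can be read back, assuming Python's standard library. The xsi namespace URI is the one fixed by the W3C. The QCDml namespace URI and the schema file URL below are placeholders, not the real ILDG values.

```python
import xml.etree.ElementTree as ET

# The XML Schema instance namespace is fixed by the W3C.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Placeholder namespace URI and schema file URL, assumed for illustration.
doc = """
<markovChain xmlns="http://example.org/QCDml/ensemble"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://example.org/QCDml/ensemble http://example.org/schema/ensemble.xsd"/>
"""

root = ET.fromstring(doc)

# xsi:schemaLocation holds whitespace-separated pairs:
# a namespace URI, then the location of the schema file for it.
tokens = root.get(f"{{{XSI_NS}}}schemaLocation").split()
schema_locations = dict(zip(tokens[0::2], tokens[1::2]))

for namespace, location in schema_locations.items():
    print(namespace, "->", location)
```

The first token names the namespace and the second says where a validator may fetch the schema file. The two can coincide but do not have to.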

Logical filename

Unique URI for a file in a namespace

Uniquely identifies this ensemble in ILDG namespace

Validation

Verify XML ID is valid against a schema
– Schema aware applications can use XML ID
Can write XML in vi, emacs etc.
CMM uses XMLSpy for schema and ID manipulation
– built in validator, create XML ID from schema
http://www.w3.org/XML/Schema
– Many different tools

QCDml Ensemble <action/>

Split into quark and gluon sections

UML representation of schema

Ensemble XML - actions

Inheritance tree - check for your action in schema

Which elements?

Schema defines required elements
UKQCD NP clover

UKQCD Ensemble example

Glossary: not computer readable

How cSW was determined

References etc

NumberOfFlavours

Number of degenerate flavours for which these coupling values apply

MILC 2+1 staggered Ensemble

<couplings/> is array valued

Non-degenerate flavours shown with different couplings

Mass 0.02

Mass 0.05
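
A sketch of reading an array-valued <couplings/> for a 2+1 flavour ensemble. The slide's XML is not reproduced in this transcript, so the layout below is hypothetical: the <quark/> wrapper, the nesting and the <mass/> element name are assumptions. Only <couplings/>, the flavour count and the two mass values come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: wrapper, nesting and the <mass/> element name are
# assumed for illustration. Two degenerate light flavours at mass 0.02
# plus one heavier flavour at mass 0.05 make up the "2+1".
doc = """
<quark>
  <couplings>
    <numberOfFlavours>2</numberOfFlavours>
    <mass>0.02</mass>
  </couplings>
  <couplings>
    <numberOfFlavours>1</numberOfFlavours>
    <mass>0.05</mass>
  </couplings>
</quark>
"""

root = ET.fromstring(doc)

# One (flavour count, mass) entry per <couplings/> block.
flavours = [
    (int(c.findtext("numberOfFlavours")), float(c.findtext("mass")))
    for c in root.findall("couplings")
]

# Total number of dynamical flavours in the ensemble.
total = sum(n for n, _ in flavours)
print(flavours, total)
```

Each repeated <couplings/> block carries its own flavour count, so non-degenerate flavours need one block per distinct mass.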

Management

Metadata created when Ensemble registered with ILDG
Middleware, yet to be created, will do this

Algorithm

Algorithmic metadata split between ensemble and algorithm
Most metadata is unconstrained parameter <name/> <value/> pairs
Relevant information can be found
– Glossary document for references etc.
Hierarchical structure for algorithms is
– difficult to create
– difficult to make extensible

Algorithm: Example

Glossary for detailed information

Unconstrained parameter <name/> <value/> pairs
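
A sketch of how an application might collect unconstrained <name/> <value/> pairs. The <parameters/> wrapper and the parameter names and values are invented for illustration. Only the <name/> <value/> pairing is taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: wrapper element, parameter names and values
# are assumptions made for this example.
doc = """
<algorithm>
  <parameters>
    <name>stepSize</name>
    <value>0.01</value>
  </parameters>
  <parameters>
    <name>solver</name>
    <value>exampleSolver</value>
  </parameters>
</algorithm>
"""

root = ET.fromstring(doc)

# The values are unconstrained, so keep them as strings and let the
# consumer decide how to interpret each one.
params = {
    p.findtext("name"): p.findtext("value")
    for p in root.findall("parameters")
}
print(params)
```

Because the schema does not constrain these pairs, a reader still needs the glossary to know what each name means.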

Config XML

Machine and code details

In principle these could be different for configurations in the same ensemble

Config Management

Checksum for config binary
Zeroth <revision/> is the generation of the data, as this occurs before submission to ILDG
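
A sketch of computing a checksum for a configuration binary. The slides do not say which checksum algorithm QCDml prescribes, so the CRC-32 from Python's zlib module is used here purely as a stand-in. A different CRC variant would give a different value for the same file.

```python
import os
import tempfile
import zlib

def crc32_of_file(path, chunk_size=1 << 20):
    """Return the zlib CRC-32 of a file, read in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Passing the running value continues the same checksum.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

# Demonstration on a small temporary file standing in for a config binary.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"123456789")
    tmp_path = tmp.name

checksum = crc32_of_file(tmp_path)
os.remove(tmp_path)
print(f"{checksum:08x}")
```

Reading in chunks matters for real configurations, which can be far larger than available memory would comfortably hold in one read.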

Precision

Precision (double or float) in which the calculation was done

markovStep

Logical File name of the ensemble in the ILDG namespace

dataLFN

Logical File name of the configuration in the ILDG namespace

The Markov chain

Where the configuration is in the trajectory of the Markov chain

avePlaquette

Very useful metadata, can be used to check data transformations are correct
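
A sketch of the kind of check this enables: recompute the average plaquette after a data transformation and compare it with the recorded value. The normalisation used here (real part of the trace divided by 3, averaged over all sites and all six planes) is an assumption, since the slides do not define it. The tiny lattice and the unit-link configuration are chosen only to keep the example short.

```python
import itertools

N = 3                  # 3x3 colour matrices
DIMS = (2, 2, 2, 2)    # tiny periodic lattice, assumed for illustration

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(N)] for i in range(N)]

def shift(site, mu):
    """Neighbouring site one step in direction mu, with periodic wrap."""
    s = list(site)
    s[mu] = (s[mu] + 1) % DIMS[mu]
    return tuple(s)

def average_plaquette(links):
    """Mean of Re Tr(U_mu(x) U_nu(x+mu) U_mu(x+nu)^+ U_nu(x)^+) / N."""
    total, count = 0.0, 0
    for site in itertools.product(*(range(d) for d in DIMS)):
        for mu, nu in itertools.combinations(range(4), 2):
            p = matmul(links[site, mu], links[shift(site, mu), nu])
            p = matmul(p, dagger(links[shift(site, nu), mu]))
            p = matmul(p, dagger(links[site, nu]))
            total += sum(p[i][i] for i in range(N)).real / N
            count += 1
    return total / count

# Unit links on every site and direction: every plaquette is the identity.
identity = [[complex(i == j) for j in range(N)] for i in range(N)]
links = {
    (site, mu): identity
    for site in itertools.product(*(range(d) for d in DIMS))
    for mu in range(4)
}

plaq = average_plaquette(links)

# Compare against the value recorded in the metadata (here assumed to be 1).
recorded = 1.0
matches = abs(plaq - recorded) < 1e-10
print(plaq, matches)
```

A mismatch after reordering or converting a file is a strong hint that the transformation scrambled the data.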

Config: UKQCD example

Application codes can write this info directly as QCDml
Or a tool can convert the IO to QCDml

BinX

XML markup for binary data
Library for manipulating marked up data
Production codes do not use BinX library
– But easy to mark up data format in BinX style
– ILDG middleware can use BinX for data manipulations
– http://www.edikt.org/binx
BinX under discussion by Middleware + Metadata WG for file format.

Gauge config BinX

Small

Written once per ensemble

write code on top of BinX library

Change array order

2x3 3x3

average plaquette

ILDG BinX based gauge config manipulator?
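
A sketch of one manipulation such a tool might offer, reading "2x3 3x3" as conversion between storing two rows of each link matrix and storing the full 3x3 matrix. That reading is an interpretation of the slide. The code uses the standard unitarity reconstruction in plain Python and does not use the BinX library.

```python
def third_row(r1, r2):
    """Third row of a special unitary 3x3 matrix from its first two rows.

    For such a matrix the third row is the complex conjugate of the
    cross product of the first two rows.
    """
    cross = (
        r1[1] * r2[2] - r1[2] * r2[1],
        r1[2] * r2[0] - r1[0] * r2[2],
        r1[0] * r2[1] - r1[1] * r2[0],
    )
    return tuple(z.conjugate() for z in cross)

def expand_2x3(two_rows):
    """Expand a 2x3 stored link into the full 3x3 matrix."""
    r1, r2 = (tuple(complex(z) for z in row) for row in two_rows)
    return [r1, r2, third_row(r1, r2)]

# diag(i, i, -1) is special unitary: its determinant is i * i * (-1) = 1.
full = expand_2x3([(1j, 0, 0), (0, 1j, 0)])
print(full[2])
```

Going the other way, from 3x3 to 2x3, is simply dropping the third row.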

Correlator data

Compact. No standard shape to correlators

BinX will read in any shape

Array stripper

BinX + BJ’s XPath reader

Code reads this XML

Produces single slice array in text/XML

From any size/shape array

Schema for correlator channels

ILDG middleware extract channel from any correlator

Correlator dictionary

Possible QCDml extension
Correlator AP code knows channel details
– IO AP write dictionary
• Channel n is the zero-momentum pion
User requests pion
– Stripper reads dictionary to find pion
– Pulls channel n from correlator
Very easy to read other people's data!

$C = C(t, p, n_1, n_2, \ldots, n_m)$
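
A sketch of the dictionary lookup described above. The channel labels, the array layout and the numbers are all made up for illustration. No such dictionary format is defined in the slides.

```python
# Hypothetical dictionary written by the IO code: channel label -> index n.
channel_dictionary = {"pion_p0": 0, "rho_p0": 1}

# Made-up correlator data, indexed as correlator[channel][timeslice].
correlator = [
    [1.00, 0.50, 0.25, 0.125],
    [0.80, 0.20, 0.05, 0.0125],
]

def strip_channel(data, dictionary, label):
    """Look the label up in the dictionary and pull that channel out."""
    return data[dictionary[label]]

# The user asks for the pion by name and never needs to know that it
# happens to be stored as channel 0.
pion = strip_channel(correlator, channel_dictionary, "pion_p0")
print(pion)
```

The point of the design is that the knowledge of which channel is which travels with the data instead of living in one collaboration's analysis code.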

Metadata demonstration

UKQCD metadata catalogue
– Browser is based on OGSA-DAI
– Open source
• You can get it at www.forge.nesc.ac.uk
Browser reads the schema
– Build XPath query graphically
– Result handler
• Display XML and GET data
• Render web page of results?
• Create XML IDs?
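
A sketch of the sort of XPath query such a browser could build, run here with the XPath subset in Python's xml.etree.ElementTree and not with the actual catalogue software. The element names <gaugeConfiguration/>, <markovStep/>, <dataLFN/> and <avePlaquette/> appear in the slides. The <catalogue/> wrapper, the nesting and all values are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue content: wrapper, nesting and values are
# invented for illustration.
catalogue = ET.fromstring("""
<catalogue>
  <gaugeConfiguration>
    <markovStep>
      <dataLFN>lfn://example/config-0010</dataLFN>
      <avePlaquette>0.55</avePlaquette>
    </markovStep>
  </gaugeConfiguration>
  <gaugeConfiguration>
    <markovStep>
      <dataLFN>lfn://example/config-0020</dataLFN>
      <avePlaquette>0.56</avePlaquette>
    </markovStep>
  </gaugeConfiguration>
</catalogue>
""")

# Every dataLFN that sits under a markovStep anywhere in the document.
lfns = [e.text for e in catalogue.findall(".//markovStep/dataLFN")]
print(lfns)

# A predicate narrows the selection to steps with a given child value.
match = catalogue.findtext(".//markovStep[avePlaquette='0.56']/dataLFN")
print(match)
```

The result of such a query is a list of logical file names, which is what a result handler would then use to fetch the data.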

ILDG metadata

ILDG proposal:
– All collaborations publish metadata
Example method
– UKQCD metadata catalogue access is not authenticated
– Anyone can read it
ILDG aggregation of metadata catalogues
– Mark up data in QCDml
– No extra effort required.