XML in the Biomedical Field

Post on 12-Jul-2015


Presented by: Eng. Juman Ghazi

Director: Dr. Eng. Rasha Masood

What is XML?

XML stands for eXtensible Markup Language.

XML is a markup language much like HTML.

XML was designed to describe data and to focus on what the data is.

eXtensible Markup Language

Helps information systems share structured data.

A metalanguage that gives meaning to data so that other applications can use it.

Application and platform independent.

Allows various types of data.

Extensible to accommodate new tags and processing methods.

Allows user-defined tags.

Advantages of using XML

Simpler version of Standard Generalized Markup Language (SGML).

Easy to understand and read.

Supported by a large number of platforms.

Used across open standards.

Components of an XML Document

1. Elements: <hello>

2. Attributes: <item id="33905">

3. Entities: &lt; (<)

4. Advanced Components

1. CDATA Sections

2. Processing Instructions
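The components listed above can be seen together in one small document. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample document, the <order> and <note> element names and the app-hint processing instruction are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document combining the component types from the slide.
doc = """<?xml version="1.0"?>
<?app-hint render="compact"?>
<order>
  <item id="33905">price &lt; 10</item>
  <note><![CDATA[raw text with <angle> brackets & ampersands]]></note>
</order>"""

root = ET.fromstring(doc)
item = root.find("item")
note = root.find("note")

print(root.tag)        # element name of the root
print(item.get("id"))  # attribute value
print(item.text)       # the entity reference &lt; is expanded to "<"
print(note.text)       # CDATA content is returned verbatim, markup characters included
```

By default this parser silently skips the processing instruction, so only elements, attributes and text are visible in the resulting tree.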

Example in HTML

<html>

<head>

<title>Menu</title>

</head>

<body>

<h1>Soup</h1>

<h4>4.99</h4>

</body>

</html>

HTML in web browser

Example in XML

<?xml version="1.0" ?>

<menu>

<item>

<itemname>soup</itemname>

<cost>4.99</cost>

</item>

</menu>

XML in web browser
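Because the XML version names each piece of data, a program can pick out the item name and cost directly, which the HTML headings do not allow. A minimal sketch in Python follows; the read_menu helper is a hypothetical name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

menu_xml = """<?xml version="1.0" ?>
<menu>
  <item>
    <itemname>soup</itemname>
    <cost>4.99</cost>
  </item>
</menu>"""

def read_menu(xml_text):
    """Return (name, cost) pairs for every <item> directly under <menu>."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("itemname"), float(item.findtext("cost")))
            for item in root.findall("item")]

print(read_menu(menu_xml))
```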

Declaration:

First line in document.

Provides information to the parser.

Recommended but optional.

Contains up to three name-value pairs:

Version (required whenever the declaration is present).

Encoding (optional; defaults to UTF-8).

Standalone (optional; rarely used).
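The effect of the declaration on the parser can be sketched as follows, assuming Python's standard xml.etree.ElementTree module. The <name> element and its content are made up for the example: one document declares ISO-8859-1, the other has no declaration and is therefore read as UTF-8.

```python
import xml.etree.ElementTree as ET

# Bytes in ISO-8859-1 (0xE9 is "é"); the declaration tells the parser how to decode them.
with_decl = (b'<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>'
             b'<name>Jos\xe9</name>')

# No declaration at all: the parser assumes UTF-8.
without_decl = "<name>José</name>".encode("utf-8")

a = ET.fromstring(with_decl)
b = ET.fromstring(without_decl)
print(a.text == b.text)  # both documents carry the same text once decoded
```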

Tags:

Text between < and >.

Have a start tag and an end tag.

Tags and data are stored together.

Data is self-descriptive and easy to understand.

[Figure: tree diagram of a root element with child elements and text nodes]

Elements:

Basic building blocks of XML file.

Text between a start tag and an end tag is considered the value of the element.

Documents contain exactly one root element.

Can contain nested elements.

Attributes:

Provide additional information about the elements.

Name-value pairs:

- Single or double quotes enclose values.

- Attribute names are unique within the same element.
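Both attribute rules can be checked with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module and reuses the <item id="33905"> example from the components slide; the duplicate_ok flag is a name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Single and double quotes are interchangeable delimiters for attribute values.
single = ET.fromstring("<item id='33905'/>")
double = ET.fromstring('<item id="33905"/>')
print(single.attrib == double.attrib)

# Repeating an attribute name inside one element is not well-formed.
try:
    ET.fromstring('<item id="1" id="2"/>')
    duplicate_ok = True
except ET.ParseError as err:
    duplicate_ok = False
    print("rejected:", err)
```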

Comments:

Can appear almost anywhere in a document, but not inside a tag or before the XML declaration.

- Start with <!--

- End with -->

- Contents inside a comment are not parsed.
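A short sketch of how a parser treats comments, assuming Python's standard xml.etree.ElementTree module with its default settings (which discard comments). The parses helper and the sample menu are hypothetical. Markup placed inside a comment is ignored, and a comment is closed only by the sequence -->.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True when the text is well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The commented-out <item> is not parsed, so only one child element remains.
doc = "<menu><!-- <item>hidden</item> --><item>soup</item></menu>"
visible = [child.text for child in ET.fromstring(doc)]
print(visible)

print(parses("<menu><!-- note --></menu>"))   # properly closed comment
print(parses("<menu><!-- note --!></menu>"))  # "--!>" does not close a comment
```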

More in XML:

1. Schemas

2. Parsers

3. Editors

4. Standards

1. Schemas:

Describe the structure and content of an XML document.

Define a shared vocabulary for applications.

Can be expressed using XML schema languages such as:

-Document Type Definition (DTD).

-XML Schema (W3C).

Industry standards and data exchange:

2. Parsers:

Read and process the content of an XML document.

Include push and pull parsers:

-Pull parsers: the application requests each event from the parser when it is ready for it.

-Push parsers: the parser drives processing and sends events to the application through callbacks.

Free XML parsers are available, including tools from IBM.
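The difference between the two parser styles can be sketched with Python's standard library, used here only as one freely available example: xml.sax for the push style and xml.etree.ElementTree.XMLPullParser for the pull style. The TagCollector class and the push_tags and pull_tags functions are hypothetical names for this illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

DOC = "<menu><item><itemname>soup</itemname><cost>4.99</cost></item></menu>"

class TagCollector(xml.sax.ContentHandler):
    """Push style: the parser calls startElement whenever it sees a start tag."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)

def push_tags(text):
    handler = TagCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)  # parser drives the callbacks
    return handler.tags

def pull_tags(text):
    """Pull style: the application feeds data and asks for events when it wants them."""
    parser = ET.XMLPullParser(events=("start",))
    parser.feed(text)
    parser.close()  # signal end of input; queued events can still be read
    return [elem.tag for _event, elem in parser.read_events()]

print(push_tags(DOC))
print(pull_tags(DOC))
```

Both functions report the same start tags in document order; what differs is who controls the loop, the parser or the application.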

3. Editors:

Text and graphical editors facilitate the editing of XML code.

Benefits of using editors:

-Reduce coding effort.

-Provide to perform tasks.

4. Standards:

Various types of standards:

-Core standards form the basis of what is expressed in an XML document.

-Processing standards relate to XML processing by developers.

-Key vocabularies (applications).

XML standards influencers include the W3C, ISO and OASIS.

XML Rules:

1. Must Have a Closing Tag.

In HTML, some elements do not have to have a closing tag:

<p>This is a paragraph<p>This is another paragraph

In XML, it is illegal to omit the closing tag.

<p>This is a paragraph</p><p>This is another paragraph</p>

2. XML Tags are Case Sensitive.

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.

<Message>This is incorrect</message>

<message>This is correct</message>

"Opening and closing tags" are often referred to as "start and end tags". Use whichever you prefer; they mean exactly the same thing.

XML Rules:

3. Elements Must be Properly Nested:

In HTML, you might see improperly nested elements:

<b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other:

<b><i>This text is bold and italic</i></b>

4. XML Documents Must Have a Root Element:

XML documents must contain one element that is the parent of all other elements. This element is called the root element.

<root>

<child>

<subchild>.....</subchild>

</child>

</root>
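Rules 1 to 4 are all well-formedness rules, so any conforming parser will reject a document that breaks them. A minimal sketch with Python's standard xml.etree.ElementTree module follows; the is_well_formed helper and the labels in the checks dictionary are hypothetical names, and the snippets are shortened versions of the slide examples.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

checks = {
    "missing closing tag": "<p>This is a paragraph",
    "case mismatch": "<Message>This is incorrect</message>",
    "bad nesting": "<b><i>This text is bold and italic</b></i>",
    "two root elements": "<a/><b/>",
    "correct": "<root><child><subchild>.....</subchild></child></root>",
}

results = {label: is_well_formed(text) for label, text in checks.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```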

XML Rules:

XML Attribute Values Must be Quoted:

XML elements can have attributes in name/value pairs.

Wrong:

<note date=12/11/2007>

<to>Tove</to>

<from>Jani</from>

</note>

Right:

<note date="12/11/2007">

<to>Tove</to>

<from>Jani</from>

</note>
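When XML is generated by a program, the quoting can be delegated to a library routine. The sketch below uses quoteattr from Python's standard xml.sax.saxutils module, which adds the quotes and escapes the value; the note_tag helper is a hypothetical name for this example.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET

def note_tag(date):
    """Build the <note> example with a correctly quoted date attribute."""
    return "<note date=%s><to>Tove</to><from>Jani</from></note>" % quoteattr(date)

good = note_tag("12/11/2007")
print(good)
print(ET.fromstring(good).get("date"))

# The unquoted form from the "Wrong" example is rejected by the parser.
try:
    ET.fromstring("<note date=12/11/2007><to>Tove</to><from>Jani</from></note>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)
```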

XML Rules:

5. Entity References

Some characters have a special meaning in XML.

A character like "<" inside an XML element will generate an error, because the parser interprets it as the start of a new element.

This will generate an XML error:

<message>if salary < 1000 then</message>

To avoid the error, replace the "<" character with an entity reference:

<message>if salary &lt; 1000 then</message>

Characters that have a special meaning in XML:

Entity reference   Character   Meaning
&lt;               <           less than
&gt;               >           greater than
&amp;              &           ampersand
&apos;             '           apostrophe
&quot;             "           quotation mark
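Replacing special characters by hand is error-prone, so programs usually escape text before embedding it. A minimal sketch with the escape and unescape functions from Python's standard xml.sax.saxutils module follows; the extra dictionary is an assumption added here because escape only handles &, < and > unless told otherwise.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw = "if salary < 1000 then"
safe = escape(raw)  # "&", "<" and ">" are replaced by default
print(safe)

# The escaped text parses cleanly and the parser restores the original characters.
message = ET.fromstring("<message>%s</message>" % safe)
print(message.text)

# Quotes are only replaced when requested through the optional entities mapping.
extra = {"'": "&apos;", '"': "&quot;"}
print(escape("a'b\"c", extra))
print(unescape(safe))
```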

XML Elements are Extensible

XML elements can be extended to carry more information.

<note>
  <to>Tove</to>
  <from>Jani</from>
  <body>Don't forget me this weekend!</body>
</note>

Added some extra information to it:

<note>
  <date>2008-01-10</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Should an application that reads only <to>, <from> and <body> break or crash?

No. One of the beauties of XML is that it can be extended without breaking applications.
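The claim can be checked with a small program that looks elements up by name. This is a sketch using Python's standard xml.etree.ElementTree module; the summarize function and the V1 and V2 names are hypothetical, and the approach only holds for applications that ignore elements they do not know.

```python
import xml.etree.ElementTree as ET

V1 = ("<note><to>Tove</to><from>Jani</from>"
      "<body>Don't forget me this weekend!</body></note>")

V2 = ("<note><date>2008-01-10</date><to>Tove</to><from>Jani</from>"
      "<heading>Reminder</heading>"
      "<body>Don't forget me this weekend!</body></note>")

def summarize(xml_text):
    """Extract only the three elements the original application knows about."""
    note = ET.fromstring(xml_text)
    return {name: note.findtext(name) for name in ("to", "from", "body")}

print(summarize(V1))
print(summarize(V1) == summarize(V2))  # the added <date> and <heading> are ignored
```

An application that instead relied on element positions (for example "the first child is <to>") would break, which is why lookups by name are the safer design.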

Examples:

1- Book store

<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
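As a sketch of how such a document might be queried, the following uses the limited XPath support in Python's standard xml.etree.ElementTree module to select books by their category attribute and Decimal to total the prices. The titles_in and total_price helpers are hypothetical names for this example.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

BOOKSTORE = """<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title><author>J K. Rowling</author>
    <year>2005</year><price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title><author>Erik T. Ray</author>
    <year>2003</year><price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

def titles_in(category):
    """Titles of all books whose category attribute matches."""
    return [b.findtext("title") for b in root.findall("book[@category='%s']" % category)]

def total_price():
    """Sum of all <price> values, using Decimal to avoid float rounding."""
    return sum((Decimal(b.findtext("price")) for b in root.findall("book")), Decimal("0"))

print(titles_in("WEB"))
print(total_price())
```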

Why XML for Bioinformatics?

Biology is a complex discipline.

Wide variety of data resources and repositories.

Biological data is represented in multiple formats (FASTA, AGP, GFF, ...).

No standard protocol:

1- to interrogate biological data stores.

2- for genomics, proteomics and cheminformatics.

3- to exchange biological data.

Difficulties in using and exchanging data.

XML in Bioinformatics

1- BSML (Visual Genomics).

2- BioML (ProteoMetrics).

3- CML (chemical information: atomic, crystallographic information, structures, ...).

4- Gene Ontology Consortium.

The Bioinformatic Sequence Markup Language (BSML)

-The DTD is aimed at representing DNA, RNA and protein sequences and their graphic properties.

-The structure of the information was found to be similar to the one used in the databases.

(http://www.ebi.ac.uk/embl.html)

(http://www.visualgenomics.com/products/index.html)

(http://www.ncbi.nlm.nih.gov; http://www.ddbj.nig.ac.jp)

Gene Ontology Consortium

Controlled description for:

1- Molecular function.

2- Biological processes.

3- Cellular locations of gene products.

The BIOpolymer Markup Language (BioML)

- Takes a different approach from BSML.

- BioML goal (Fenyo, 1999): “BioML was designed to mimic the hierarchical structure of a living organism.”

- Data integration, e.g. nucleotide and protein sequences.
