EMBOSS over a Grid

1st EELA Grid School, December 4th, 2006

Eduardo MURRIETA LEON

Romualdo ZAYAS-LAGUNAS

Pierre-Alain BRANGER

Jérôme VERLEYEN

Roberto RODRIGUEZ

César BONAVIDES

Alfredo HERNANDEZ

Introduction

Index

• Bioinformatics

• EMBOSS

• Objectives

What is Bioinformatics?

• State of the art

- Analysis of gene expression

- Need for prediction of protein structure

- Sequence analysis

- A huge amount of knowledge to store

• Bioinformatics as a solution

- Helps analyze life-science data

- Used in many domains (e.g. the Human Genome Project)

Types of Tools

• Searching (knowledge extraction)

- BLAST (nucleotides, proteins)

• Alignment

- Clustal

• Phylogeny

- PHYLIP

Databases

• Maintained by various organizations

- NCBI: United States

- EMBL: Europe

- DDBJ: Japan

Overview

• The European Molecular Biology Open Software Suite

- From EMBnet

• A software package:

- A set of sequence analysis programs

- A toolkit for creating robust bioinformatics applications or workflows

- Database searching

- Identification of motifs

- Presentation tools for publication

Technical Characteristics

• Software requirements

- Linux distribution

- gcc compiler and graphics libraries

• Hardware requirements

- 100 to 400 MB of free disk space

- 512 MB of RAM

• Execution requirements

- Input data size: from 20 KB to 100 MB

- Output size: from 20 KB to 1 MB

EMBOSS Architecture

• Main parts

- ACD Files

- Programs (API)

- Inputs / Outputs (sequences, databases)

ACD Files

• ACD Files

- AJAX Command Definition files

- Stored in $EMBOSS_DIR/acd

- Example:

application: intconv [
  documentation: "Convert ints to ajints"
  groups: "Test"
]

section: input [
  information: "Input section"
  type: "page"
]

infile: infile [
  parameter: "Y"
  knowntype: "integer long data"
  information: "Standard format information"
]

endsection: input
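To make the layout of the ACD example concrete, here is a minimal Python sketch that splits it into its "type: name [ attribute: "value" ... ]" blocks. It is a toy reader for this one example, not the parser EMBOSS itself uses: the function name parse_acd is hypothetical, and the sketch assumes every attribute value is double-quoted and contains no "]" character.

```python
import re

# The ACD example from the slide, kept verbatim.
ACD_TEXT = '''
application: intconv [
  documentation: "Convert ints to ajints"
  groups: "Test"
]
section: input [
  information: "Input section"
  type: "page"
]
infile: infile [
  parameter: "Y"
  knowntype: "integer long data"
  information: "Standard format information"
]
endsection: input
'''

# One 'type: name [ ... ]' block; the bracket body holds the attributes.
BLOCK_RE = re.compile(r'(\w+):\s*(\w+)\s*\[(.*?)\]', re.DOTALL)
# One 'attribute: "value"' pair (assumption: values are always quoted).
ATTR_RE = re.compile(r'(\w+):\s*"([^"]*)"')


def parse_acd(text):
    """Return (type, name, attributes) tuples in file order.

    Lines without brackets, such as 'endsection: input', are skipped.
    """
    blocks = []
    for kind, name, body in BLOCK_RE.findall(text):
        blocks.append((kind, name, dict(ATTR_RE.findall(body))))
    return blocks


if __name__ == "__main__":
    for kind, name, attrs in parse_acd(ACD_TEXT):
        print(kind, name, attrs)
```

Reading the result top to bottom gives the application declaration first, then the input section and the single input file it contains, which is the same order in which EMBOSS would present the parameters to the user.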

Programs

• Programs

- Binaries written in C and stored in $EMBOSS_DIR/bin

- Built on two libraries

AJAX (the EMBOSS core library: general-purpose data handling, sequence and file I/O, ACD processing)

NUCLEUS (specific to molecular sequence analysis)

Input/Output

• Sequences

- A succession of letters representing the structure of a real or hypothetical DNA molecule or protein

- ASCII text extracted from large databases

• EMBOSS can read various database formats

- EMBL, FASTA, GenBank, Swiss-Prot …

- Access by gene ID, by description keywords …
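Access by database and entry ID, as described above, is written in EMBOSS as a Uniform Sequence Address (USA) of the form "database:entry", optionally prefixed with "format::". The Python sketch below only builds such an address and the matching seqret command line; it does not run anything. The helper names are hypothetical, the database name "swissprot" is an assumption (database names are defined per site in the EMBOSS configuration), and HPT_HUMAN is used as an illustrative Swiss-Prot entry name.

```python
import shlex


def usa(database, entry, fmt=None):
    """Build a Uniform Sequence Address such as 'swissprot:HPT_HUMAN'."""
    address = f"{database}:{entry}"
    return f"{fmt}::{address}" if fmt else address


def seqret_command(source, out_file, out_format="fasta"):
    """Return the argv list for a seqret call.

    The output format is given as a 'format::file' USA, and -auto
    switches off interactive prompting.
    """
    return [
        "seqret",
        "-sequence", source,
        "-outseq", f"{out_format}::{out_file}",
        "-auto",
    ]


if __name__ == "__main__":
    argv = seqret_command(usa("swissprot", "HPT_HUMAN"), "hpt_human.fasta")
    print(shlex.join(argv))
```

Keeping the command as an argv list rather than a single string avoids shell-quoting problems when the same command is later handed to a job wrapper.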

GUIs for EMBOSS

• wEMBOSS, Jemboss …

Use of EMBOSS

• Study of the haptoglobin protein in different species

- Extraction from the Swiss-Prot database (“seqret”)

- 10 mammalian species (human, rat, mouse, rabbit, …)

- Alignment (“emma”)

- Calculation of the phylogenetic tree
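The first two steps of this study can be sketched as a small Python driver that chains seqret and emma. This is an illustration under assumptions, not the authors' script: the database name "swissprot", the wildcard entry pattern "HPT_*" and the file prefix "hpt" are hypothetical, emma needs ClustalW installed, and the tree-building step is left out because the slides do not name the program used for it.

```python
import shlex
import subprocess


def build_pipeline(usa, prefix):
    """Return the command lines for extraction and alignment, in order."""
    fasta = f"{prefix}.fasta"
    return [
        # Step 1: extract the sequences addressed by the USA into one FASTA file.
        ["seqret", "-sequence", usa, "-outseq", f"fasta::{fasta}", "-auto"],
        # Step 2: multiple alignment; emma also writes a ClustalW guide tree.
        ["emma", "-sequence", fasta, "-outseq", f"{prefix}.aln",
         "-dendoutfile", f"{prefix}.dnd", "-auto"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_pipeline(build_pipeline("swissprot:HPT_*", "hpt"))
```

The dry-run default makes it possible to inspect the exact command lines on a machine without EMBOSS before submitting them anywhere.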

• Example of a generated tree

[Figure: generated phylogenetic tree, not reproduced in this transcript]

Objectives

« Get EMBOSS running over a Grid »

- Execute EMBOSS jobs on a grid from the command line

- Retrieve job results

- Execute a complete sequence-analysis workflow / pipeline (as in "Use of EMBOSS")

Complementary functions

• Search of EMBOSS-indexed databases

• Wrapper applications for EMBOSS over a Grid

• Web interface and project manager for EMBOSS

• A BioGrid portal
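As an illustration of the first two objectives (command-line job execution and result retrieval), the Python sketch below generates a job description for one emma run. It assumes a gLite-style middleware that accepts JDL files, which the slides do not state. The wrapper script run_emma.sh and all file names are hypothetical, and the worker node is assumed to have EMBOSS installed.

```python
def jdl_list(items):
    """Format a Python list as a JDL list, e.g. {"a", "b"}."""
    return "{" + ", ".join(f'"{item}"' for item in items) + "}"


def make_jdl(executable, arguments, inputs, outputs):
    """Return the text of a minimal JDL job description.

    inputs are shipped to the worker node in the input sandbox;
    outputs come back in the output sandbox together with the
    job's standard output and standard error.
    """
    sandbox_out = ["std.out", "std.err"] + list(outputs)
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {jdl_list(inputs)};",
        f"OutputSandbox = {jdl_list(sandbox_out)};",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # run_emma.sh is a hypothetical wrapper that calls emma on its two arguments.
    print(make_jdl(
        executable="run_emma.sh",
        arguments="hpt.fasta hpt.aln",
        inputs=["run_emma.sh", "hpt.fasta"],
        outputs=["hpt.aln"],
    ))
```

Shipping a wrapper script in the input sandbox keeps the job description independent of where EMBOSS is installed on each site; the script can locate the binaries itself.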

Thank you very much!

Questions