1 p-grade portal tutorial mta sztaki pgportal@lpds.sztaki.hu gergely sipos sipos@sztaki.hu

Post on 04-Jan-2016


1

P-GRADE Portal tutorial

MTA SZTAKI
www.portal.p-grade.hu
pgportal@lpds.sztaki.hu

Gergely Sipos
sipos@sztaki.hu

2

Agenda

• Basics of P-GRADE Portal (~1 hour)
• Workflow hands-on sessions (~90 minutes)
• Developing an application-specific portal using P-GRADE (~30 minutes)
• Agenda:
  – http://indico.cern.ch/conferenceOtherViews.py?confId=58116

3

P-GRADE overview and introduction: workflows & parameter sweeps

(Basics)

4

Introduction of LPDS (Lab of Parallel and Distributed Systems)

• Research division of MTA SZTAKI since 1998
• Head: Prof. Peter Kacsuk
• 22 research fellows
• Founding member of
  – Central European Grid Consortium (2003)
  – Hungarian Grid Competence Center (2003)
• Participant or coordinator in many European and national Grid research, infrastructure, and educational projects (since 2000)
  – FP5: GridLab, DataGrid
  – FP6: EGEE I-II, SEE-GRID I-II, CoreGrid, ICEAGE, CancerGrid
  – FP7: EGEE III, SEE-GRID-SCI, EDGeS (coordinator), ETICS, S-CUBE
• Central European Grid Training Center in EGEE (since 2004)

www.lpds.sztaki.hu

5

Short history of P-GRADE Portal

• Parallel Grid Application and Development Environment
• Initial development started in the Hungarian SuperComputing Grid project in 2003
• It has been continuously developed since 2003
• Detailed information: http://www.portal.p-grade.hu/
• Open Source community development since January 2008: https://sourceforge.net/projects/pgportal/

6

Download of OSS P-GRADE portal

110 downloads within the first month

~697 total downloads until now

7

Main P-GRADE related projects

• EU SEE-GRID-1 (2004-2006)
  – Integration with LCG-2 and gLite
• EU SEE-GRID-2 (2006-2008)
  – Parameter sweep extension
• EU CoreGrid (2005-2008)
  – To solve grid interoperation for job submission
  – To solve grid interoperation for data handling: SRB, OGSA-DAI
• GGF GIN (2006)
  – Providing the GIN Resource Testing portal
• EGEE 2, 3 (2006-2010)
  – RESPECT program tool used for training and application development
• ICEAGE (2006-2008)
  – P-GRADE portal is used for training as the official portal of the GILDA training infrastructure
• EU EDGeS (2008-2009)
  – Transparent access to any EGEE and Desktop Grid systems
  – See Demo Booth 5: EDGeS – Desktop Grid Extension of the EGEE Infrastructure

9

Portal installations

10

Multi-Grid service portal

To be used today!

13

Motivations for developing P-GRADE portal

• P-GRADE portal should
  – Hide the complexity of the underlying grid middlewares
  – Provide a high-level graphical user interface that is easy to use for e-scientists
  – Support many different grid programming approaches:
    • Simple Scripts & Control (sequential and MPI job execution)
    • Scientific Application Plug-ins
    • Complex Workflows
    • Parameter sweep applications: both on job and workflow level
    • Interoperability: transparent access to grids based on different middleware technology (both computing and data resources)
  – Support several levels of parallelism

14

Layers in a Grid system

Basic Grid services: AA, job submission, info, …
Higher-level grid services (brokering, …)
Application toolkits, standards
Application

Grid middleware: Command line tools
P-GRADE Portal services: Graphical interface

16

What is a P-GRADE Portal workflow?

• a directed acyclic graph where
  – Nodes represent jobs (batch programs to be executed on a computing element)
  – Ports represent input/output files the jobs expect/produce
  – Arcs represent file transfer operations
• semantics of the workflow:
  – A job can be executed if all of its input files are available
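The execution rule above can be illustrated with a small sketch. It is not the portal's own scheduler; the function name and the job names A to D are assumptions made for illustration. Each arc is modelled as a (producer, consumer) pair, and a job becomes ready once every arc pointing at it has delivered its file.

```python
from collections import deque


def execution_order(jobs, arcs):
    """Return one valid execution order for a workflow graph.

    jobs: list of job names (the nodes)
    arcs: list of (producer, consumer) pairs; each arc stands for one
          file transfer from an output port to an input port
    Raises ValueError if the graph contains a cycle, i.e. is not a DAG.
    """
    pending_inputs = {job: 0 for job in jobs}
    consumers = {job: [] for job in jobs}
    for producer, consumer in arcs:
        pending_inputs[consumer] += 1
        consumers[producer].append(consumer)

    # Jobs without incoming arcs have all their input files from the start.
    ready = deque(job for job in jobs if pending_inputs[job] == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        # Finishing a job makes one more input file available to each consumer.
        for consumer in consumers[job]:
            pending_inputs[consumer] -= 1
            if pending_inputs[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(jobs):
        raise ValueError("workflow contains a cycle")
    return order


# Hypothetical diamond-shaped workflow: B and C both depend on A, D on both.
order = execution_order(
    ["A", "B", "C", "D"],
    [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")],
)
```

In this sketch B and C become ready at the same moment, which is where workflow-level (branch) parallelism comes from.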

17

Three levels of parallelism

– PS workflow level: Parameter study execution of the workflow
  Multiple instances of the same workflow can process different data files
– Workflow level: Parallel execution among workflow nodes (WF branch parallelism)
  Multiple jobs can run in parallel
– Job level: Parallel execution inside a workflow node (MPI job as workflow component)
  Each job can be a parallel program

18

25 times

Example: Computational Chemistry

Department of Chemistry, University of Perugia

SOLUTION OF SCHRODINGER EQUATION FOR TRIATOMIC SYSTEMS USING TIME-DEPENDENT (RWAVEPR) OR TIME INDEPENDENT (ABC) METHOD

A single execution can be between 5 hours and 10 hours

SEQUENTIAL FORTRAN 90

Many simulations at the same time

See at demo booth 11: EGEE Application Porting Support Group

19

Grid interoperation by P-GRADE

Accessing Globus, gLite and ARC based grids/VOs simultaneously

Diagram labels: P-GRADE GEMLCA Portal; GEMLCA Repository; P-GRADE portal

20

Typical user scenario: Job compilation phase

Certificate servers
Portal server
Grid services
Client

UPLOAD SOURCE(S)
COMPILE – EDIT
DOWNLOAD BINARY(IES)

21

Typical user scenario: Application development phase

Certificate servers
Portal server
Grid services
Client

START EDITOR
OPEN & EDIT WORKFLOW or PARAMETER STUDY
SAVE APPLICATION

22

Typical user scenario: Workflow execution phase

Certificate servers
Portal server
Grid services
Client

DOWNLOAD PROXY CERTIFICATES
TRANSFER FILES, SUBMIT JOBS
MONITOR JOBS
VISUALIZE JOBS and APPLICATION PROGRESS
DOWNLOAD (SMALL) RESULTS

23

P-GRADE Portal structural overview

Client
  Web browser
  Java Webstart workflow editor

P-GRADE Portal server
  User interface layer: Presents the user interface
  Internal layer – Java classes: Represents the internal concepts
  Grid layer – gLite and Globus command line tools: Interfacing with grid services

Grid
  EGEE and Globus Grid services (gLite WMS, LFC, …; Globus GRAM, GridFTP, …)

24

Interface layer

Client
  Web browser
  Java Webstart workflow editor

P-GRADE Portal server
  User interface layer
  Web server
  GridSphere Web portal framework
  GridSphere portlets
  P-GRADE portlets
  Workflow monitor: Java applet generator
  Workflow editor: Java webstart application

25

Interface layer functionalities

Client
  Web browser
  Java Webstart workflow editor

P-GRADE Portal server
  User interface layer
  Web server
  GridSphere Web portal framework
  GridSphere portlets
  P-GRADE portlets
  Workflow monitor: Java applet generator
  Workflow editor: Java webstart application

• Workflow portlet
  • Workflow manager, Storage, Upload
• Certificate portlet
  • Upload, download and other operations
• Settings portlet
  • Grid settings, Quota settings
• File management
  • Manage files in the grid
• Compiler portlet
  • Compile jobs on portal server
• Login
• Welcome
• ...

26

P-GRADE vs. Non-P-GRADE portlets

P-GRADE Portal portlets
GridSphere 2.x
Grid Portal framework


32

Portlets/functionalities of P-GRADE portal

• Settings (portlet)
• Certificate and proxy management (portlet)
• Information system visualization (portlet)
• Graphical workflow editing
• Workflow manager (portlet)
• LFC (EGEE) file management (portlet)
• Compilation support (portlet)
• Fault-tolerance support

33

Settings Portlet

• Portal administrator can
  – connect the portal to several grids
  – register the basic resources of the connected grids

34

Settings Portlet

User can customize the connected grids by adding and removing resources

35

Certificate and proxy management Portlet

• User can upload their certificates for various grids to the MyProxy server
• User can download proxies and allocate them to grids
• User can use as many proxies simultaneously as there are grids connected to the portal
• As a result, parallel branches of a workflow can be executed simultaneously in several grids

SEE-GRID access
HUNGRID access

36

Solving Grid interoperation by P-GRADE Portal

Diagram labels: P-GRADE-Portal; EGEE Grid; UK NGS; London; Rome; Athens

Different jobs can be executed in parallel in different grids

37

Interoperation vs. Interoperability

Interoperation:
– short-term solution that defines what needs to be done to achieve interoperation between current production grids using existing technologies

Interoperability:
– native ability of Grids and Grid middleware to interact directly via common open standards

As defined by the GIN (Grid Interoperation Now) CG (Community Group) of the OGF (Open Grid Forum)

Diagram labels: Interoperation (P-GRADE Portal, Grid 1, Grid 2, Grid 3); Interoperability (Grid 1, Grid 2, Grid 3)

38

Information system Portlet

39

Graphical workflow editing

• The aim is to define a DAG of batch jobs:
  1. Drag & drop components: jobs and ports
  2. Define their properties
  3. Connect ports by channels (no cycles, no loops, no conditions)
  4. The editor automatically generates the JDL file

40

Workflow Editor: Properties of a job

Properties of a job:
• Binary executable
• Type of executable
• Number of required processors
• Command line parameters
• The resource to be used for the execution:
  • Grid/VO
  • (Computing element)
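To give a feel for what a generated job description might contain, here is a minimal sketch that renders a few job properties as gLite JDL text. The attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are standard JDL; the helper names, the std.out/std.err file names and the selection of attributes are assumptions, and the JDL that the portal really emits is not shown in these slides.

```python
def jdl_value(value):
    """Format a Python value as a JDL literal (string, integer or list)."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(item) for item in value) + "}"
    if isinstance(value, int):
        return str(value)
    return '"' + str(value) + '"'


def render_jdl(executable, arguments, input_files, output_files):
    """Render a simple sequential job as JDL text (illustrative only)."""
    attributes = [
        ("Executable", executable),
        ("Arguments", arguments),
        ("StdOutput", "std.out"),  # assumed file names for stdout/stderr
        ("StdError", "std.err"),
        # The binary and the local input files travel in the input sandbox.
        ("InputSandbox", [executable] + list(input_files)),
        ("OutputSandbox", ["std.out", "std.err"] + list(output_files)),
    ]
    return "\n".join(
        f"{name} = {jdl_value(value)};" for name, value in attributes
    )


# Hypothetical job: a binary called "multiply" with two inputs and one output.
jdl_text = render_jdl("multiply", "M V", ["INPUT1", "INPUT2"], ["OUTPUT"])
print(jdl_text)
```

For broker jobs the portal leaves resource selection to the broker, so a sketch like this deliberately carries no computing element name.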

41

Workflow Editor: Defining broker jobs

Select a Grid with broker! (*_BROKER)

Ignore the resource field!

If default JDL is not sufficient use the built-in JDL editor!

42

Workflow Editor: Defining input-output files

File properties
• Type:
  – input: the job reads
  – output: the job generates
• File type:
  – local: comes from my desktop
  – remote: comes from an SE
• File: location of the file
• Internal file name: the executable reads the file under this name – fopen(“file.in”, …)
• File storage type (output files only):
  – Permanent: final result
  – Volatile: only data channel

43

How to refer to an I/O file?

Input file
  Local file
  • Client side location: c:\experiments\11-04.dat
  Remote file
  • LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04.dat
  • GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/11-04.dat

Output file
  Local file
  • Client side location: result.dat
  Remote file
  • LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04_-_result.dat
  • GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/result.dat
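The three ways of referring to a file can be told apart by their prefix. The sketch below is only an illustration of that distinction; the function name and the returned labels are assumptions, and the portal's own handling of file references is not described in these slides.

```python
def classify_file_reference(reference):
    """Classify an I/O file reference as it would be typed in the editor."""
    if reference.startswith("lfn:"):
        # Logical file name, resolved through the LFC file catalog (EGEE VOs).
        return "remote (LFC logical file name)"
    if reference.startswith("gsiftp://"):
        # Direct GridFTP address, as used in Globus grids.
        return "remote (GridFTP address)"
    # Anything else is treated as a path on the user's own machine.
    return "local (client side location)"


examples = [
    r"c:\experiments\11-04.dat",
    "lfn:/grid/gilda/sipos/11-04.dat",
    "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
]
kinds = [classify_file_reference(example) for example in examples]
```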

44

Local vs. remote files

Portal server
Grid services
Computing elements
Storage elements

Diagram labels:
LOCAL INPUT FILES & EXECUTABLES
LOCAL OUTPUT FILES
REMOTE INPUT FILES
REMOTE OUTPUT FILES
Only the permanent files!

Your binary can access data services directly too
• GridFTP API
• GFAL API
• lfc-*, lcg-* commands

45

Workflow manager

• Lists available workflows
• Enables
  – Submitting
  – Aborting
  – Deleting
  existing workflows
• Shows status, logs and results of workflow executions
• Orchestrates job executions inside a workflow

46

LFC (EGEE) file management

47

Compilation support

48

Fault-tolerant Grid applications

• Utilizing
  – Condor DAGMan’s rescue mechanism
  – EGEE job resubmission mechanism of WMS
• If the EGEE broker leaves a job stuck in a CE’s queue, the portal automatically
  – kills the job on this site and
  – resubmits the job to the broker, excluding this site
• As a result
  – the portal guarantees the correct submission of a job as long as at least one matching resource exists
  – job submission is reliable even in an unreliable grid
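The kill-and-resubmit behaviour can be sketched as a loop that keeps a list of excluded sites. This is a simplified model, not the portal's code: the function names, the site names and the "first remaining candidate" stand-in for the broker's choice are all assumptions.

```python
def submit_with_resubmission(matching_sites, job_completes_on):
    """Resubmit a job until it completes, excluding sites where it got stuck.

    matching_sites: sites that satisfy the job's requirements
    job_completes_on: callable(site) -> bool, False meaning the job got
                      stuck in that site's queue and had to be killed
    Returns (site where the job completed, list of excluded sites).
    """
    excluded = []
    while True:
        candidates = [s for s in matching_sites if s not in excluded]
        if not candidates:
            # The guarantee only holds while a matching resource remains.
            raise RuntimeError("no matching resource left")
        site = candidates[0]  # stand-in for the broker's selection
        if job_completes_on(site):
            return site, excluded
        # Job stuck: kill it there and prohibit this site on the next attempt.
        excluded.append(site)


# Hypothetical grid in which only the third computing element works.
site, excluded = submit_with_resubmission(
    ["ce1", "ce2", "ce3"],
    lambda s: s == "ce3",
)
```

The loop ends either with a successful run or when every matching site has been excluded, which mirrors the guarantee stated on the slide.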

49

Lessons learnt

• P-GRADE portal provides
  – Easy-to-use but powerful workflow system (graphical editor, wf manager, etc.)
  – Three levels of parallelism
    • MPI job level
    • Workflow branch level
    • Parameter sweep at workflow level
  – Multi-grid/multi-VO access mechanism for various grids (LCG-2, gLite and GT2)
    • Simultaneous access
    • Transparent access
    • Migrating a workflow from one grid to another requires no modification in the workflow

50

About the Practices

51

Practice 1: Solve matrix multiplication on HUNGRID (one job workflow)

Job executable:
• C code, compiled on GILDA UI
• Expects command line parameters: M V
• Knows nothing about the grid

Job input/output files:
• Program reads matrices from two files called INPUT1 and INPUT2
• Program writes result matrix into file called OUTPUT

Local execution on a PC: ./multiply M V

Task: Execute the program on EGEE, transfer input and output files in Sandboxes from the client

Binary executable

INPUT1
3 3
2 1 3
1 1 1
3 3 3

INPUT2
3 3
5 2 7
6 7 9
3 8 2

OUTPUT
3 3
25 35 29
14 17 18
42 51 54
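The numbers on this slide can be checked with a few lines of code. The sketch below is a Python stand-in, not the C program used in the practice; it assumes that the first two numbers of each file are the row and column counts, which is inferred from the leading "3 3" in the files shown.

```python
def parse_matrix(text):
    """Parse 'rows cols' followed by rows*cols integers (assumed format)."""
    numbers = [int(token) for token in text.split()]
    rows, cols = numbers[0], numbers[1]
    values = numbers[2:]
    if len(values) != rows * cols:
        raise ValueError("matrix size does not match its header")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def multiply(a, b):
    """Plain row-by-column matrix product of two lists of lists."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions differ")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


INPUT1 = "3 3\n2 1 3\n1 1 1\n3 3 3\n"
INPUT2 = "3 3\n5 2 7\n6 7 9\n3 8 2\n"

result = multiply(parse_matrix(INPUT1), parse_matrix(INPUT2))
print(result)  # [[25, 35, 29], [14, 17, 18], [42, 51, 54]]
```

The product matches the OUTPUT matrix on the slide.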

52

Practice 2: Save the multiplication OUTPUT on a Storage Element and register it in the File Catalog

• Modify output file type from “Local” to “Remote”
• Specify a logical file name as target location: lfn:/grid/hungrid/userXX

Binary executable

INPUT1
3 3
2 1 3
1 1 1
3 3 3

INPUT2
3 3
5 2 7
6 7 9
3 8 2

OUTPUT
3 3
25 35 29
14 17 18
42 51 54

gLite Storage Element
Storage Element is selected automatically by gLite middleware

gLite File Catalog
lfn:/grid/hungrid/userXX/…
Logical file name is defined by you

Browse result file using the File Manager Portlet

53

Practice 3: Combine jobs to build a MatrixOperations workflow

• AB[*, 0]T * AB[*, 1]

Matrix A
2 1 3
1 1 1
3 3 3

Matrix B
5 2 7
6 7 9
3 8 2

A * B
25 35 29
14 17 18
42 51 54

A * B [ *, 0 ]
25
14
42

A * B [ *, 1 ]
35
17
51

A * B [ *, 0 ]T
25 14 42

( A * B [ *, 0 ]T ) * ( A * B [ *, 1 ] )
3255
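The chain of operations in this workflow can be reproduced step by step. The sketch below runs everything in one process, whereas in the practice each step is a separate job connected by files; the helper names are assumptions for illustration.

```python
def multiply(a, b):
    """Plain row-by-column matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def column(matrix, index):
    """Extract one column as an n x 1 matrix."""
    return [[row[index]] for row in matrix]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(row) for row in zip(*matrix)]


A = [[2, 1, 3], [1, 1, 1], [3, 3, 3]]
B = [[5, 2, 7], [6, 7, 9], [3, 8, 2]]

ab = multiply(A, B)                # first job: A * B
col0 = column(ab, 0)               # A * B [*, 0]
col1 = column(ab, 1)               # A * B [*, 1]
row0 = transpose(col0)             # A * B [*, 0]T
final = multiply(row0, col1)       # (A * B [*, 0]T) * (A * B [*, 1])
print(final)  # [[3255]]
```

The two column extractions do not depend on each other, so in the workflow they can run as parallel branches.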

54

Practice 4: Matrix multiplication PS (parameter study workflow with 5 parameters)

Input matrices
1 2 3
4 5 6
7 8 9

1 2 3
4 5 6
7 8 12

1 2 3
4 5 6
7 8 15

. . .

Matrix2
2 1 3
1 1 1
3 3 3

Multiplication job

. . .

55

Practice 4: Matrix multiplication PS (parameter study workflow with 5 parameters)

Auto generator

Template
1 2 3
4 5 6
7 8 Y

9 <= Y <= 21, step 3

Input files stored on SEs and registered in LFC catalog
1 2 3
4 5 6
7 8 9

1 2 3
4 5 6
7 8 12

1 2 3
4 5 6
7 8 15

. . .

Matrix2
2 1 3
1 1 1
3 3 3

Multiplication job

Output files stored on SEs and registered in LFC catalog

. . .

Collector

Compressed output files
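The generator step of the parameter study can be sketched as follows. The template and the range 9 <= Y <= 21 with step 3 come from the slide; the function names, the in-memory representation and the choice of Matrix2 as the right-hand operand are assumptions, and in the practice the generated matrices are files stored on SEs rather than Python lists.

```python
def multiply(a, b):
    """Plain row-by-column matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def generate_inputs(template, start, stop, step):
    """Create one input matrix per parameter value, replacing the 'Y' cell."""
    return [
        [[y if cell == "Y" else cell for cell in row] for row in template]
        for y in range(start, stop + 1, step)  # stop is inclusive
    ]


TEMPLATE = [[1, 2, 3], [4, 5, 6], [7, 8, "Y"]]
MATRIX2 = [[2, 1, 3], [1, 1, 1], [3, 3, 3]]

inputs = generate_inputs(TEMPLATE, 9, 21, 3)   # Y = 9, 12, 15, 18, 21
# One multiplication job per generated input (assumed operand order).
outputs = [multiply(matrix, MATRIX2) for matrix in inputs]
print(len(outputs))  # 5
```

Each of the five multiplications is independent of the others, which is what lets the portal run them as separate workflow instances before the collector gathers the outputs.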

56

Thank you!

www.portal.p-grade.hu
pgportal@lpds.sztaki.hu

Learn once, use everywhere
Develop once, execute anywhere
