Agenda
• Basics of P-GRADE Portal (~1 hour)
• Workflow hands-on (~90 minutes)
• Developing an Application Specific Portal using P-GRADE (~30 minutes)
• Agenda: http://indico.cern.ch/conferenceOtherViews.py?confId=58116
P-GRADE overview and introduction: workflows & parameter sweeps (Basics)
Introduction of LPDS (Laboratory of Parallel and Distributed Systems)
• Research division of MTA SZTAKI since 1998
• Head: Prof. Peter Kacsuk
• 22 research fellows
• Founding member of
– Central European Grid Consortium (2003)
– Hungarian Grid Competence Center (2003)
• Participant or coordinator in many European and national Grid research, infrastructure, and educational projects (since 2000)
– FP5: GridLab, DataGrid
– FP6: EGEE I-II, SEE-GRID I-II, CoreGrid, ICEAGE, CancerGrid
– FP7: EGEE III, SEE-GRID-SCI, EDGeS (coordinator), ETICS, S-CUBE
• Central European Grid Training Center in EGEE (since 2004)
www.lpds.sztaki.hu
Short History of P-GRADE portal
• Parallel Grid Application and Development Environment
• Initial development started in the Hungarian SuperComputing Grid project in 2003
• It has been continuously developed since 2003
• Detailed information: http://www.portal.p-grade.hu/
• Open Source community development since January 2008: https://sourceforge.net/projects/pgportal/
Download of OSS P-GRADE portal
110 downloads within the first month
~697 total downloads to date
Main P-GRADE related projects
• EU SEE-GRID-1 (2004-2006)
– Integration with LCG-2 and gLite
• EU SEE-GRID-2 (2006-2008)
– Parameter sweep extension
• EU CoreGrid (2005-2008)
– To solve grid interoperation for job submission
– To solve grid interoperation for data handling: SRB, OGSA-DAI
• GGF GIN (2006)
– Providing the GIN Resource Testing portal
• EGEE 2, 3 (2006-2010)
– Respect program tool used for training and application development
• ICEAGE (2006-2008)
– P-GRADE portal is used for training as the official portal of the GILDA training infrastructure
• EU EDGeS (2008-2009)
– Transparent access to any EGEE and Desktop Grid systems
– See Demo Booth 5: EDGeS – Desktop Grid Extension of the EGEE Infrastructure
Portal installations
Multi-Grid service portal (to be used today!)
Motivations for developing P-GRADE portal
• P-GRADE portal should
– Hide the complexity of the underlying grid middleware
– Provide a high-level graphical user interface that is easy to use for e-scientists
– Support many different grid programming approaches:
  • Simple Scripts & Control (sequential and MPI job execution)
  • Scientific Application Plug-ins
  • Complex Workflows
  • Parameter sweep applications: both on job and workflow level
  • Interoperability: transparent access to grids based on different middleware technology (both computing and data resources)
– Support several levels of parallelism
Layers in a Grid system
[Figure: layer diagram with the following labels]
• Basic Grid services: AA, job submission, info, …
• Higher-level grid services (brokering, …)
• Application toolkits, standards
• Application
• Grid middleware: command line tools
• P-GRADE Portal services: graphical interface
What is a P-GRADE Portal workflow?
• A directed acyclic graph where
– Nodes represent jobs (batch programs to be executed on a computing element)
– Ports represent input/output files the jobs expect/produce
– Arcs represent file transfer operations
• Semantics of the workflow:
– A job can be executed if all of its input files are available
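The execution rule above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the portal's actual scheduler: the function name `execution_waves`, the job names and the file names are all hypothetical. Each job is described by the files it reads and the files it produces, and jobs whose inputs are all available are grouped into one "wave" that could run in parallel.

```python
def execution_waves(jobs, initial_files):
    """Group jobs into waves; a job joins a wave once all its input files exist.

    jobs maps a (hypothetical) job name to a pair (input_files, output_files).
    """
    available = set(initial_files)
    pending = dict(jobs)
    waves = []
    while pending:
        # A job can be executed if all of its input files are available.
        ready = sorted(
            name
            for name, (inputs, _outputs) in pending.items()
            if set(inputs) <= available
        )
        if not ready:
            raise ValueError("no runnable job: missing input file or cycle")
        for name in ready:
            _inputs, outputs = pending.pop(name)
            available.update(outputs)  # output files become available
        waves.append(ready)
    return waves


# Hypothetical four-job workflow with two parallel branches.
example_jobs = {
    "multiply": (["A", "B"], ["AB"]),
    "col0": (["AB"], ["c0"]),
    "col1": (["AB"], ["c1"]),
    "dot": (["c0", "c1"], ["result"]),
}
print(execution_waves(example_jobs, ["A", "B"]))
```

Jobs in the same wave have no file dependency on each other, which corresponds to the workflow branch parallelism discussed in the slides.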
Three Levels of parallelism
– PS workflow level: Parameter study execution of the workflow. Multiple instances of the same workflow can process different data files
– Workflow level: Parallel execution among workflow nodes (WF branch parallelism). Multiple jobs can run in parallel
– Job level: Parallel execution inside a workflow node (MPI job as workflow component). Each job can be a parallel program
Example: Computational Chemistry
Department of Chemistry, University of Perugia
• Solution of the Schrödinger equation for triatomic systems using a time-dependent (RWAVEPR) or time-independent (ABC) method
• Sequential Fortran 90
• A single execution can take between 5 and 10 hours
• Many simulations at the same time
• Figure label: 25 times
• See demo booth 11: EGEE Application Porting Support Group
Grid interoperation by P-GRADE
Accessing Globus, gLite and ARC based grids/VOs simultaneously
[Figure labels: P-GRADE GEMLCA Portal; GEMLCA Repository; P-GRADE portal]
Typical user scenario: Job compilation phase
[Figure: Client, Portal server, Certificate servers, Grid services. Actions: UPLOAD SOURCE(S), COMPILE – EDIT, DOWNLOAD BINARY(IES)]
Typical user scenario: Application development phase
[Figure: Client, Portal server, Certificate servers, Grid services. Actions: START EDITOR, OPEN & EDIT WORKFLOW or PARAMETER STUDY, SAVE APPLICATION]
Typical user scenario: Workflow execution phase
[Figure: Client, Portal server, Certificate servers, Grid services. Actions: DOWNLOAD PROXY CERTIFICATES; TRANSFER FILES, SUBMIT JOBS; MONITOR JOBS; VISUALIZE JOBS and APPLICATION PROGRESS; DOWNLOAD (SMALL) RESULTS]
P-GRADE Portal structural overview
• Client: Web browser, Java Webstart workflow editor
• P-GRADE Portal server:
– User interface layer: presents the user interface
– Internal layer (Java classes): represents the internal concepts
– Grid layer (gLite and Globus command line tools): interfaces with grid services
• Grid: EGEE and Globus Grid services (gLite WMS, LFC, …; Globus GRAM, GridFTP, …)
Interface layer
• Client: Web browser, Java Webstart workflow editor
• P-GRADE Portal server, user interface layer:
– Web server
– Gridsphere Web portal framework
– Gridsphere portlets
– P-GRADE portlets
– Workflow monitor: Java applet generator
– Workflow editor: Java Webstart application
Interface layer functionalities
• Workflow portlet: Workflow manager, Storage, Upload
• Certificate portlet: Upload, download and other operations
• Settings portlet: Grid settings, Quota settings
• File management: Manage files in the grid
• Compiler portlet: Compile jobs on portal server
• Login
• Welcome
• ...
P-GRADE vs. Non-P-GRADE portlets
[Figure labels: P-GRADE Portal portlets; GridSphere 2.x Grid Portal framework]
Portlets/functionalities of P-GRADE portal
• Settings (portlet)
• Certificate and proxy management (portlet)
• Information system visualization (portlet)
• Graphical workflow editing
• Workflow manager (portlet)
• LFC (EGEE) file management (portlet)
• Compilation support (portlet)
• Fault-tolerance support
Settings Portlet
• Portal administrator can
– connect the portal to several grids
– register the basic resources of the connected grids
• User can customize the connected grids by adding and removing resources
Certificate and proxy management Portlet
• Users can upload their certificates for various grids to the MyProxy server
• Users can download proxies and assign them to grids
• Users can simultaneously use as many proxies as there are grids connected to the portal
• As a result, parallel branches of a workflow can be executed simultaneously in several grids
[Figure labels: SEE-GRID access, HUNGRID access]
Solving Grid interoperation by P-GRADE Portal
[Figure labels: P-GRADE Portal; EGEE Grid; UK NGS; London, Rome, Athens]
Different jobs can be executed in parallel in different grids
Interoperation vs. Interoperability
Interoperation:
– short-term solution that defines what needs to be done to achieve interoperation between current production grids using existing technologies
Interoperability:
– native ability of Grids and Grid middleware to interact directly via common open standards
As defined by the GIN (Grid Interoperation Now) CG (Community Group) of the OGF (Open Grid Forum)
[Figure: two panels. Interoperation: Grid 1, Grid 2, Grid 3 and the P-GRADE Portal. Interoperability: Grid 1, Grid 2, Grid 3]
Information system Portlet
Graphical workflow editing
• The aim is to define a DAG of batch jobs:
1. Drag & drop components: jobs and ports
2. Define their properties
3. Connect ports by channels (no cycles, no loops, no conditions)
4. The editor automatically generates the JDL file
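To give a feel for what a generated job description might contain, the following sketch builds a minimal JDL text from job properties. It is an illustration only and does not reproduce the portal's actual generator or its output: the helper names `jdl_list` and `make_jdl` are hypothetical, and only commonly used gLite JDL attributes (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are assumed.

```python
def jdl_list(items):
    """Render a list of strings as a JDL list literal, e.g. {"a", "b"}."""
    return "{" + ", ".join('"' + item + '"' for item in items) + "}"


def make_jdl(executable, arguments, input_files, output_files):
    """Build a minimal JDL text for one batch job (illustrative only)."""
    input_sandbox = jdl_list([executable] + list(input_files))
    output_sandbox = jdl_list(["std.out", "std.err"] + list(output_files))
    lines = [
        'Executable = "' + executable + '";',
        'Arguments = "' + arguments + '";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = " + input_sandbox + ";",
        "OutputSandbox = " + output_sandbox + ";",
    ]
    return "\n".join(lines)


# Hypothetical job: the matrix multiplication program used in the practices.
print(make_jdl("multiply", "M V", ["INPUT1", "INPUT2"], ["OUTPUT"]))
```

The file names std.out and std.err are arbitrary choices made for this sketch.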
Workflow Editor: Properties of a job
Properties of a job:
• Binary executable
• Type of executable
• Number of required processors
• Command line parameters
• The resource to be used for the execution:
– Grid/VO
– (Computing element)
Workflow Editor: Defining broker jobs
• Select a Grid with a broker (*_BROKER)!
• Ignore the resource field!
• If the default JDL is not sufficient, use the built-in JDL editor!
Workflow Editor: Defining input-output files
File properties
• Type:
– input: the job reads the file
– output: the job generates the file
• File type:
– local: comes from my desktop
– remote: comes from an SE
• File: location of the file
• Internal file name: the executable reads the file under this name, e.g. fopen("file.in", …)
• File storage type (output files only):
– Permanent: final result
– Volatile: only data channel
How to refer to an I/O file?
Input file
• Local file
– Client side location: c:\experiments\11-04.dat
• Remote file
– LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04.dat
– GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/11-04.dat
Output file
• Local file
– Client side location: result.dat
• Remote file
– LFC logical file name (LFC file catalog is required – EGEE VOs): lfn:/grid/gilda/sipos/11-04_-_result.dat
– GridFTP address (in Globus Grids): gsiftp://somengshost.ac.uk/mydir/result.dat
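The three reference styles can be told apart by their prefix. The sketch below is an assumption about how one might classify such references, not the portal's own logic; the function name `classify_file_reference` is hypothetical, and anything without an lfn: or gsiftp:// prefix is simply treated as a client side location.

```python
def classify_file_reference(reference):
    """Return (file type, reference kind) for an I/O file reference."""
    if reference.startswith("lfn:"):
        return ("remote", "LFC logical file name")
    if reference.startswith("gsiftp://"):
        return ("remote", "GridFTP address")
    # Assumption: everything else is a path on the user's own machine.
    return ("local", "client side location")


for ref in [
    "c:\\experiments\\11-04.dat",
    "lfn:/grid/gilda/sipos/11-04.dat",
    "gsiftp://somengshost.ac.uk/mydir/11-04.dat",
]:
    print(ref, "->", classify_file_reference(ref))
```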
Local vs. remote files
[Figure: file flows between the client, Portal server, Grid services, Computing elements and Storage elements]
• Arrow labels: LOCAL INPUT FILES & EXECUTABLES; LOCAL OUTPUT FILES (only the permanent files!); REMOTE INPUT FILES; REMOTE OUTPUT FILES
• Your binary can access data services directly too:
– GridFTP API
– GFAL API
– lfc-*, lcg-* commands
Workflow manager
• Lists available workflows
• Enables submitting, aborting and deleting existing workflows
• Shows status, logs and results of workflow executions
• Orchestrates job executions inside a workflow
LFC (EGEE) file management
Compilation support
Fault-tolerant Grid applications
• Utilizing
– Condor DAGMan's rescue mechanism
– EGEE job resubmission mechanism of WMS
• If the EGEE broker leaves a job stuck in a CE's queue, the portal automatically
– kills the job on this site and
– resubmits the job to the broker, excluding this site
• As a result
– the portal guarantees the correct submission of a job as long as there exists at least one matching resource
– job submission is reliable even in an unreliable grid
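The kill-and-resubmit behaviour described above can be sketched as a loop that keeps a list of prohibited sites. This is a simplified model, not the portal's implementation: `submit_until_running` and `is_stuck` are hypothetical names, the broker's choice is replaced by "first non-prohibited site", and detecting a stuck job is reduced to a callback.

```python
def submit_until_running(matching_sites, is_stuck):
    """Resubmit a job, prohibiting each site where it got stuck.

    Returns the site that accepted the job and the list of prohibited sites.
    Raises RuntimeError when no matching resource is left.
    """
    prohibited = []
    while True:
        candidates = [s for s in matching_sites if s not in prohibited]
        if not candidates:
            raise RuntimeError("no matching resource left")
        site = candidates[0]  # stand-in for the broker's decision
        if not is_stuck(site):
            return site, prohibited
        # Job is stuck in this site's queue: kill it and exclude the site.
        prohibited.append(site)


# Hypothetical scenario: the first two computing elements never start the job.
stuck_sites = {"ce1", "ce2"}
print(submit_until_running(["ce1", "ce2", "ce3"], lambda s: s in stuck_sites))
```

As long as at least one matching site is not stuck, the loop terminates with a successful submission, which mirrors the guarantee stated on the slide.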
Lessons learnt
• P-GRADE portal provides
– Easy-to-use but powerful workflow system (graphical editor, workflow manager, etc.)
– Three levels of parallelism
  • MPI job level
  • Workflow branch level
  • Parameter sweep at workflow level
– Multi-grid/multi-VO access mechanism for various grids (LCG-2, gLite and GT2)
  • Simultaneous access
  • Transparent access
  • Migrating a workflow from one grid to another requires no modification in the workflow
About the Practices
Practice 1: Solve matrix multiplication on HUNGRID (one-job workflow)
Job executable:
• C code, compiled on GILDA UI
• Expects command line parameters: M V
• Knows nothing about the grid
Job input/output files:
• Program reads matrices from two files called INPUT1 and INPUT2
• Program writes the result matrix into a file called OUTPUT
Local execution on a PC: ./multiply M V
Task: Execute the program on EGEE, transfer input and output files in Sandboxes from the client
[Figure: the binary executable reads INPUT1 and INPUT2 and writes OUTPUT]
INPUT1:
3 3
2 1 3
1 1 1
3 3 3
INPUT2:
3 3
5 2 7
6 7 9
3 8 2
OUTPUT:
3 3
25 35 29
14 17 18
42 51 54
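The sample files can be checked with a few lines of code. The practice itself uses a compiled C program; the Python sketch below is only a stand-in to verify the numbers. It assumes that the first line of each file holds the row and column counts, and the function names are hypothetical.

```python
def parse_matrix(text):
    """Parse 'rows cols' on the first line followed by the matrix rows."""
    lines = [line.split() for line in text.strip().splitlines()]
    rows, cols = (int(value) for value in lines[0])
    matrix = [[int(value) for value in row] for row in lines[1:]]
    if len(matrix) != rows or any(len(row) != cols for row in matrix):
        raise ValueError("matrix body does not match the declared size")
    return matrix


def multiply(a, b):
    """Plain matrix product of two lists of lists."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


INPUT1 = "3 3\n2 1 3\n1 1 1\n3 3 3\n"
INPUT2 = "3 3\n5 2 7\n6 7 9\n3 8 2\n"

result = multiply(parse_matrix(INPUT1), parse_matrix(INPUT2))
print(result)  # [[25, 35, 29], [14, 17, 18], [42, 51, 54]]
```

The printed rows match the OUTPUT file shown on the slide.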
Practice 2: Save the multiplication OUTPUT on a Storage Element and register it in the File Catalog
• Modify output file type from “Local” to “Remote”
• Specify a logical file name as target location: lfn:/grid/hungrid/userXX
• Browse the result file using the File Manager Portlet
[Figure: the binary executable reads INPUT1 and INPUT2 (same matrices as in Practice 1) and writes OUTPUT to a gLite Storage Element; the file is registered in the gLite File Catalog as lfn:/grid/hungrid/userXX/…]
• Storage Element is selected automatically by gLite middleware
• Logical file name is defined by you
Practice 3: Combine jobs to build a MatrixOperations workflow
• Goal: compute (A * B)[*, 0]^T * (A * B)[*, 1]
[Figure: workflow graph with inputs Matrix A and Matrix B and the intermediate results A * B, (A * B)[*, 0], (A * B)[*, 1], (A * B)[*, 0]^T and ((A * B)[*, 0]^T) * ((A * B)[*, 1])]
Matrix A:
2 1 3
1 1 1
3 3 3
Matrix B:
5 2 7
6 7 9
3 8 2
A * B:
25 35 29
14 17 18
42 51 54
(A * B)[*, 0]:
25
14
42
(A * B)[*, 1]:
35
17
51
(A * B)[*, 0]^T:
25 14 42
Result:
3255
Practice 4Practice 4Matrix multiplication PSMatrix multiplication PS
parameter study parameter study workflowworkflow with 5 parameters with 5 parameters
Multiplication job
1 2 34 5 6
7 8 12
2 1 31 1 13 3 3
Matrix2
1 2 34 5 67 8 9
1 2 34 5 6
7 8 15 . . .
. . .
. . .
Practice 4: Matrix multiplication PS (parameter study) workflow with 5 parameters (continued)
[Figure: Auto generator → Multiplication job → Collector]
• Auto generator: creates the input matrices from the template below, with 9 <= Y <= 21, step 3
1 2 3
4 5 6
7 8 Y
• Generated matrices have last element 9, 12, 15, . . .
• Input files are stored on SEs and registered in the LFC catalog
• Matrix2:
2 1 3
1 1 1
3 3 3
• Output files are stored on SEs and registered in the LFC catalog
• Collector: produces compressed output files
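The generator and multiplication steps of this parameter study can be mimicked locally. The sketch below is an assumption about the data flow, not the portal's auto generator: the function names are hypothetical, the placeholder is the string "Y", and each generated matrix is taken as the left operand of the product with Matrix2 (the slides do not state the operand order).

```python
def parameter_values(start, stop, step):
    """Inclusive parameter range, e.g. 9 <= Y <= 21 with step 3."""
    return list(range(start, stop + 1, step))


def generate_inputs(template, values):
    """Create one matrix per parameter value by replacing the 'Y' cell."""
    return [
        [[y if cell == "Y" else cell for cell in row] for row in template]
        for y in values
    ]


def multiply(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


TEMPLATE = [[1, 2, 3], [4, 5, 6], [7, 8, "Y"]]
MATRIX2 = [[2, 1, 3], [1, 1, 1], [3, 3, 3]]

values = parameter_values(9, 21, 3)
inputs = generate_inputs(TEMPLATE, values)
# One multiplication job per generated input; the jobs are independent.
outputs = [multiply(matrix, MATRIX2) for matrix in inputs]
print(values)        # [9, 12, 15, 18, 21]
print(len(outputs))  # 5
```

The five values of Y yield the five workflow instances mentioned in the slide title.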