2-1.1 Job Submission © 2010 B. Wilkinson/Clayton Ferner. Spring 2010 Grid computing course. Modification date: Jan 18, 2010


Job Submission


Types of jobs to be submitted to a Grid

• Programs written in C, C++, … that need to be compiled.

• Java programs that need a Java Virtual Machine

• Pre-compiled application packages


Submitting a job that needs to be compiled

Fig. 2.1

Java programs

• Java compiler (javac) creates class file (bytecode) that is interpreted by a Java Virtual Machine (java).

• It is the Java Virtual Machine that is the executing program and the class file is an input file.

• Other class files usually need to be loaded too. They are found on the path specified by the CLASSPATH variable, so this variable must be set up properly.


Submitting a Java job

Fig. 2.2

• Java programs offer more portability because class file could be sent to any remote computer having a Java Virtual Machine installed.

• However, speed of execution may be less than executing fully compiled binaries.

• Some studies have shown Java programs to run at 70% of the speed of equivalent C programs.

• Many internal components of Grid middleware software such as Globus actually use a mixture of Java and C. Java commonly used to create Web service components.


Types of Applications

Since a Grid is a collection of computers, a user might wish to use these computers collectively to solve problems.

Two ways:

• Parallel programs -- Break problem down into tasks that need to be done to solve problem and submit individual tasks to different computers to work on them simultaneously.

• Parameter sweep problems -- Run same job on different computers at same time but with different input parameters.

Particularly attractive for Grid computing platforms because no dependences between each sweep (usually).
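A parameter sweep can be sketched as generating one submission per parameter combination. The sketch below is illustrative only: the function name sweep_commands, the program name prog1 and the parameter values are hypothetical, and it only builds argument lists in the globusrun-ws -submit -c form used later in these slides rather than submitting anything.

```python
from itertools import product


def sweep_commands(program, parameter_values):
    """Return one globusrun-ws argument list per parameter combination.

    parameter_values is a list of lists: one inner list of candidate
    values for each positional argument of the program.
    """
    commands = []
    for combo in product(*parameter_values):
        args = [str(value) for value in combo]
        # -c is placed last because everything after it is the job itself.
        commands.append(["globusrun-ws", "-submit", "-c", program, *args])
    return commands


if __name__ == "__main__":
    # Hypothetical sweep: two values of one parameter, three of another.
    for cmd in sweep_commands("prog1", [[0.1, 0.2], [10, 20, 30]]):
        print(" ".join(cmd))
```

Because each generated command is independent of the others, the submissions could be sent to different machines in any order.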


Grid Resource Allocation Management (GRAM)

Principal job submission component of Globus


[Figure: Globus Open Source Grid Software component diagram (I. Foster), with GRAM indicated. Component categories: Security, Data Management, Execution Management, Information Services, Common Runtime, split into Web Services components and non-WS components.]

Job submission components

Fig. 2.3


Running simple jobs across a Grid computing environment

Fig. 2.4

Specifying the job

Two basic ways a job might be specified:

• Directly by name of executable with required input arguments

or

• By a job description file (more powerful)


Directly

For very simple jobs, one can submit a single job using -c option, e.g.,

globusrun-ws -submit -c prog1 arg1 arg2

which executes program prog1 with arguments arg1 and arg2 on local host.

-c option actually causes globusrun-ws to generate a job description with the named program and arguments that follow.

-c option must be the last globusrun-ws option (why?).

Example

globusrun-ws -submit -c /bin/echo hello

Globus job monitoring output is written to the command line and will indicate when the job completes.

However, output from the echo program (hello) is not displayed and is lost, as is any standard output, unless further specified (see later).


Job Description File

Gives details such as:

• Job description
  - Name of executable
  - Number of instances
  - Arguments
  - Input files
  - Output files
  - Directories
  - Environment variables, paths, ...

• Resource requirements
  - Processor
    - Number, cores, ...
    - Type
    - Speed, ...
  - Memory

Used to match job with resources


Job Description Languages

Several languages invented.

• Globus-specific:
  – Globus 1 and 2 used their Resource Specification Language, RSL (version 1)
  – Globus 3 used an XML version called RSL-2
  – Globus 4 uses a variation of RSL-2 in a JDD (Job Description Document)

• Job Submission Description Language (JSDL)
  – A recent industry-wide standard (2005)


Resource Specification Language, RSL version 1

• A meta-language describing job and its required execution.

Provides specification for:

• Job description - directory, executable, arguments, environment
• Resource requirements - machine type, number of nodes, memory, etc.

RSL Version 1 example

Conjunction (AND): &

• To create 3-5 instances of myProg, each on a machine with at least 64 Mbytes memory available to me for 1 hour:

& (executable=myProg)

(count>=3)(count<=5)(memory>=64)

(max_time=60)

Other specifications possible including OR and multiple job requests.
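The conjunction form above is just a list of parenthesized relations prefixed by &. As a minimal sketch, assuming each relation is given as an (attribute, operator, value) tuple, the request string could be assembled as follows. The helper name rsl_conjunction is hypothetical and is not part of Globus.

```python
def rsl_conjunction(relations):
    """Build an RSL version 1 conjunction from (attribute, operator, value) tuples."""
    body = "".join(f"({attr}{op}{value})" for attr, op, value in relations)
    return "&" + body


if __name__ == "__main__":
    # The same request as on the slide: 3-5 instances, 64 Mbytes, 60 minutes.
    request = rsl_conjunction([
        ("executable", "=", "myProg"),
        ("count", ">=", 3),
        ("count", "<=", 5),
        ("memory", ">=", 64),
        ("max_time", "=", 60),
    ])
    print(request)
```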


XML Job Description languages

• With the adoption of XML in the early 2000s, job description languages began to be changed to XML.

• XML – eXtensible Markup Language


Using XML

• Much more elegant and flexible, and in keeping with Web services.

• Can use XML parsers.

• Allows more powerful mechanisms with job schedulers.

• Resource scheduler/broker applies specification to local resources.

Resource Specification Language, RSL version 2

• XML version of RSL 1

• Can specify everything from executable, paths, arguments, input/output, error file, number of processes, max/min execution time, max/min memory, job type, etc.

• Introduced and used in Globus version 3 (GT3)

GT 3 RSL-2 Example

Specifying Executable

(executable=/bin/echo)

<gram:executable>

<rsl:path>

<rsl:stringElement value="/bin/echo"/>

</rsl:path>

</gram:executable>


RSL and GT 3.2 RSL-2 comparison for echo program

&((executable=/bin/echo)

(directory="/bin")

(arguments="Hello World")

(stdin=/dev/null)

(stdout="stdout")

(stderr="stderr")

(count=1)

)

<?xml version="1.0" encoding="UTF-8"?>
<rsl:rsl xmlns:rsl="http://www.globus.org/namespaces/2003/04/rsl"
         xmlns:gram="http://www.globus.org/namespaces/2003/04/rsl/gram"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="
           http://www.globus.org/namespaces/2003/04/rsl c:/ogsa-3.0/schema/base/gram/rsl.xsd
           http://www.globus.org/namespaces/2003/04/rsl/gram c:/ogsa-3.0/schema/base/gram/gram_rsl.xsd">
  <gram:job>
    <gram:executable>
      <rsl:path>
        <rsl:stringElement value="/bin/echo"/>
      </rsl:path>
    </gram:executable>
    <gram:directory>
      <rsl:path>
        <rsl:stringElement value="/bin"/>
      </rsl:path>
    </gram:directory>
    <gram:arguments>
      <rsl:string>
        <rsl:stringElement value="Hello World"/>
      </rsl:string>
    </gram:arguments>
    <gram:stdin>
      <rsl:path>
        <rsl:stringElement value="/dev/null"/>
      </rsl:path>
    </gram:stdin>
    <gram:stdout>
      <rsl:pathArray>
        <rsl:path>
          <rsl:substitutionRef name="HOME"/>
          <rsl:stringElement value="/stdout"/>
        </rsl:path>
      </rsl:pathArray>
    </gram:stdout>
    <gram:stderr>
      <rsl:pathArray>
        <rsl:path>
          <rsl:substitutionRef name="HOME"/>
          <rsl:stringElement value="/stderr"/>
        </rsl:path>
      </rsl:pathArray>
    </gram:stderr>
    <gram:count>
      <rsl:integer value="1"/>
    </gram:count>
    <gram:jobType>
      <gram:enumeration>
        <gram:enumerationValue>
          <gram:multiple/>
        </gram:enumerationValue>
      </gram:enumeration>
    </gram:jobType>
    <gram:gramMyJobType>
      <gram:enumeration>
        <gram:enumerationValue>
          <gram:collective/>
        </gram:enumerationValue>
      </gram:enumeration>
    </gram:gramMyJobType>
    <gram:dryRun>
      <rsl:boolean value="false"/>
    </gram:dryRun>
    <gram:saveState>
      <rsl:boolean value="true"/>
    </gram:saveState>
    <gram:twoPhase>
      <rsl:integer value="600"/>
    </gram:twoPhase>
  </gram:job>
</rsl:rsl>


Job Description Document (JDD)

• RSL-2 simplified, renamed, and called JDD in more recent Globus 4 (GT4) documents

• Not completely interchangeable.

GT 4 JDD Example

Specifying Executable

executable=/bin/echo

<executable>/bin/echo</executable>


GT 4.0 JDD for echo program

<?xml version="1.0" encoding="UTF-8"?>
<job>
  <executable>/bin/echo</executable>
  <directory>${GLOBUS_USER_HOME}</directory>
  <argument>Hello</argument>
  <argument>World</argument>
  <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
  <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
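Since a JDD like the one above is plain XML, it can also be generated programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_jdd is hypothetical, and only the element names that appear on the slide (job, executable, directory, argument, stdout, stderr) are used.

```python
import xml.etree.ElementTree as ET


def build_jdd(executable, arguments=(), directory=None, stdout=None, stderr=None):
    """Return a JDD-style job description as an XML string."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    if directory is not None:
        ET.SubElement(job, "directory").text = directory
    # One <argument> element per argument, in order.
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    if stdout is not None:
        ET.SubElement(job, "stdout").text = stdout
    if stderr is not None:
        ET.SubElement(job, "stderr").text = stderr
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_jdd(
        "/bin/echo",
        arguments=["Hello", "World"],
        directory="${GLOBUS_USER_HOME}",
        stdout="${GLOBUS_USER_HOME}/stdout",
        stderr="${GLOBUS_USER_HOME}/stderr",
    ))
```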


Job Submission Description Language (JSDL)

• A standard introduced by the GGF (Global Grid Forum) in 2005 and beginning to be widely adopted.

• Apart from specifying the job, can specify required resources for the job.

• We will use RSL-2/JDD in assignment 2 (Globus 4.0 commands are designed for RSL-2/JDD)


Submitting a job


GT4 job submission command globusrun-ws

• Submit and monitor GRAM jobs

• Written in C, for faster startup and execution than earlier Java version

• Supports multiple and single job submission

• Handles credential management

• Streaming of job stdout/err during execution


Simple job submission

• Step 1: Create proxy with the grid-proxy-init command.

• Step 2: Issue globusrun-ws with parameters to specify job.


Some globusrun-ws flags (options) for job submission

Running GT 4 job using XML job description file

• Command:

globusrun-ws -submit -f prog.xml

where prog.xml specifies job in JDD.

-submit causes job to be submitted

Submitted to localhost (machine that is executing command) as no contact resource specified.

Submitted immediately using “fork”

With named executable: -c option

Example: Submit program echo with argument hello to default localhost.

globusrun-ws -submit -c /bin/echo hello

-c Causes globusrun-ws to generate job description with named program and arguments.

-c option, if used, must be last option.

Only useful for very simple single jobs.
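One way to see why -c has to come last is to model how a command line might be split once -c is found. The sketch below is a hypothetical illustration, not the actual globusrun-ws parser: everything after -c is taken as the program and its arguments, so any tool option placed after -c would be handed to the program instead.

```python
def split_at_c(argv):
    """Split an option list into (tool_options, program, program_args).

    Everything after the first -c is treated as the job: the program
    name followed by its arguments.
    """
    if "-c" not in argv:
        return list(argv), None, []
    index = argv.index("-c")
    rest = argv[index + 1:]
    if not rest:
        raise ValueError("-c must be followed by a program name")
    return list(argv[:index]), rest[0], list(rest[1:])


if __name__ == "__main__":
    # -s before -c: treated as a tool option.
    print(split_at_c(["-submit", "-s", "-c", "/bin/echo", "hello"]))
    # -s after -c: becomes an argument of /bin/echo.
    print(split_at_c(["-submit", "-c", "/bin/echo", "hello", "-s"]))
```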


Output modes

-submit Submits (or resubmits) a job in one of three output modes:

batch, interactive, or interactive-streaming.

Default (without additional flags to specify) is interactive.

Interactive mode

Example

Submit program echo with argument hello to default localhost.

% globusrun-ws -submit -c /bin/echo hello

Output:

Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2005 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.

The output shows the job ID and the several states the job goes through.
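The interactive output has a regular line format, so the job ID and the sequence of states can be picked out by a script. The sketch below is illustrative only and assumes the output looks exactly like the sample on this slide. The function name parse_monitor_output is hypothetical.

```python
SAMPLE_OUTPUT = """\
Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2005 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
"""

JOB_ID_PREFIX = "Job ID:"
STATE_PREFIX = "Current job state:"


def parse_monitor_output(text):
    """Return (job_id, states) extracted from interactive-mode output."""
    job_id = None
    states = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith(JOB_ID_PREFIX):
            job_id = line[len(JOB_ID_PREFIX):].strip()
        elif line.startswith(STATE_PREFIX):
            states.append(line[len(STATE_PREFIX):].strip())
    return job_id, states


if __name__ == "__main__":
    print(parse_monitor_output(SAMPLE_OUTPUT))
```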

Streaming

Refers to sending contents of a stream of data from one location to another location as it is generated.

Often associated with Linux standard output and standard error streams, stdout and stderr.

For a program that creates output on remote machine, need:

• Files to hold output and error messages, or
• Re-direct output and error messages to user console.

Interactive-streaming mode: -s option

Provides for capturing program output and error messages and re-directing them to user’s console (output of globusrun-ws) or to specified files.

Interactive-streaming mode Re-direction to user console

-s option

Example

globusrun-ws -submit -s -c /bin/echo hello

Output (hello) redirected to (globusrun-ws) stdout.

Error messages redirected to (globusrun-ws) stderr.

Interactive-streaming mode: re-direction to files

-s option with -so and -se options

-s for streaming output

plus:

-so to specify output file
-se to specify error file

Example

globusrun-ws -submit -s -so outfile -se errorfile -c /bin/echo hello

where outfile is the name of the file holding output, errorfile is the name of the file holding error messages, and hello is the argument for echo.

Specify streaming to files using job description file

Example (JDD)

<job>
  <executable>/bin/echo</executable>
  <argument>Hello</argument>
  <stdout>jobOut</stdout>
  <stderr>jobErr</stderr>
</job>


Batch submission

A long-standing Computer Science term from the early days of computing, when jobs were submitted to a system in a group (a batch) and waited their turn to be executed sometime in the future.

Originally appeared when programs were submitted on punched cards to a shared system, perhaps to be run overnight.

Batch submission really part of a scheduling approach.

Batch submission: -b option

In globusrun-ws, batch referred to as an output mode because of way output generated.

Once job submitted, control returned to command line, and one will need to query system to find out status of job.

For example, suppose we ran the job:

globusrun-ws -submit -c /bin/sleep 100

in interactive mode. The command would return when the program (sleep for 100 seconds in this case) completes.

We would get normal globusrun-ws output, such as:

Submitting job...Done.

Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f

Termination time: 07/20/2005 17:44 GMT

Current job state: Active

Current job state: CleanUp

Current job state: Done

Destroying job...Done.

except that each line would appear only as the job moves to its next state.

Alternatively, could execute sleep in batch output mode (-b option):

globusrun-ws -submit -b -c /bin/sleep 100

Output would immediately appear of the form:

Submitting job...Done.

Job ID: uuid:f9544174-60c5-11d9-97e3-0002a5ad41e5

Termination time: 01/08/2005 16:05 GMT

Displays ManagedJob EPR as job ID (more on this later).

Control returned to command line.

Program may not have finished. In this case it will not for 100 seconds.


Now one has to query state of job to find out when it completes.

Need job ID (ManagedJob EPR)

Convenient to have that put in a file using -o option when submitting job, e.g.

globusrun-ws -submit -b -o jobEPR -c /bin/sleep 100

where jobEPR holds the job ID (ManagedJob EPR).

To watch status of submitted job

“Attach” interactive monitoring with -monitor option.

Job ID (ManagedJob EPR) provided with -j option, e.g.:

globusrun-ws -monitor -j jobEPR

where jobEPR holds ManagedJob EPR.

Then can see stages job goes through with interactive output immediately:

Current job state: Active

Current job state: CleanUp

Current job state: Done

Requesting original job description...Done.

Destroying job...Done

although job itself is still a batch output job.

Some other options

-status Reports the current state of the job and exits

-kill Requests immediate cancellation of job and exits.

-term time Set an absolute termination time, or a time relative to successful job creation
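With a batch job, checking for completion amounts to repeatedly asking for the job state until it reaches Done. The polling loop below is a sketch under stated assumptions: get_state is a hypothetical callable supplied by the caller (for instance, something that runs globusrun-ws -status with the saved job EPR and extracts the state, which is not shown here), and only the Done state seen in the sample output is treated as final. Other terminal states would need their own handling.

```python
import time


def wait_for_done(get_state, poll_seconds=5.0, max_polls=100, sleep=time.sleep):
    """Poll get_state() until it returns "Done".

    Returns the list of distinct consecutive states observed.
    Raises TimeoutError if "Done" is not seen within max_polls polls.
    """
    seen = []
    for _ in range(max_polls):
        state = get_state()
        # Record a state only when it differs from the previous one.
        if not seen or seen[-1] != state:
            seen.append(state)
        if state == "Done":
            return seen
        sleep(poll_seconds)
    raise TimeoutError("job did not reach Done within the polling limit")


if __name__ == "__main__":
    # Simulated job: a stand-in for real status queries.
    simulated = iter(["Active", "Active", "CleanUp", "Done"])
    print(wait_for_done(lambda: next(simulated), poll_seconds=0.0))
```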


Specifying where job is submitted

Request to run job processed by “factory” service called ManagedJobFactoryService.

Default URL:

https://localhost:8443/wsrf/services/ManagedJobFactoryService

To specify where job is submitted

-F Specifies “contact” for the job submission.

globusrun-ws -submit -F http://localhost:8440 -f prog1.xml

Job submitted to localhost

Globus container that hosts services running on port 8440

Factory service still located at

wsrf/services/ManagedJobFactoryService
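The idea that a partial contact is completed from the default URL can be sketched with Python's standard urllib.parse module. This is a hypothetical illustration of the defaulting idea, not the logic globusrun-ws actually uses. The function name factory_url and the choice to default the scheme to https are assumptions.

```python
from urllib.parse import urlsplit

# Defaults taken from the default URL on the earlier slide.
DEFAULT_SCHEME = "https"
DEFAULT_HOST = "localhost"
DEFAULT_PORT = 8443
DEFAULT_PATH = "/wsrf/services/ManagedJobFactoryService"


def factory_url(contact=""):
    """Fill in any missing parts of a contact string from the defaults."""
    if "://" not in contact:
        # Assumption: a bare host[:port] contact uses the default scheme.
        contact = f"{DEFAULT_SCHEME}://{contact}"
    parts = urlsplit(contact)
    host = parts.hostname or DEFAULT_HOST
    port = parts.port or DEFAULT_PORT
    path = parts.path if parts.path not in ("", "/") else DEFAULT_PATH
    return f"{parts.scheme}://{host}:{port}{path}"


if __name__ == "__main__":
    print(factory_url())
    print(factory_url("http://localhost:8440"))
```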


Selecting a different host

Example

globusrun-ws -submit -F https://140.221.65.193:4444/wsrf/services/ManagedJobFactoryService -f prog1.xml

Transferring Files

Job submission command, for example:

globusrun-ws -submit -F http://coit-grid01.uncc.edu -c prog1

requires prog1 to exist on the remote machine in the default directory ( ${GLOBUS_USER_HOME} ).

Up to user to ensure executable is in place.

GridFTP

A Globus component that provides for:

• Large data transfers
• Secure transfers
• Fast transfers
  – Parallel transfers -- employing multiple virtual channels sharing a single physical network connection
  – Striping -- employing multiple physical channels using multiple hardware interfaces.
• Reliable transfers
• Third party transfers.

[Figure: Globus Open Source Grid Software component diagram (I. Foster), with GridFTP indicated.]

Third party transfers

Transferring a file from one remote location to another remote location controlled by a party at another location (the third party).

Already seen third party transfers in Grid portal at file management portlet.

There, user can initiate a transfer between two locations from portal running on a third system.


GridFTP third party transfers

Fig. 2.5

ReliableFileTransfer (RFT) service

GridFTP is not a Web/Grid service.

ReliableFileTransfer (RFT) service provides service interface and additional features for reliable file transfers (retry capabilities etc.).

RFT uses GridFTP servers to effect actual transfer.

[Figure: Globus Open Source Grid Software component diagram (I. Foster), with RFT indicated.]

Globus file transfer command: globus-url-copy

Example

globus-url-copy gsiftp://www.coit-grid02.uncc.edu/~abw/hello file:///home/abw/

The first URL is the source URL and the second is the destination URL.

Copies file hello from coit-grid02.uncc.edu to the local machine using GridFTP.

User needs valid security credentials (a certificate and proxy).

Question

Why three /’s in file URL, i.e. file:/// ?

Answer

The general form of a file URL is file://host/path. If host is omitted, it is assumed to be localhost, which leaves three /’s, i.e. file:///.
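The host/path split can be checked with Python's standard urllib.parse module. In this small sketch, the helper name host_and_path is hypothetical. An empty host component corresponds to the omitted host in file:///.

```python
from urllib.parse import urlsplit


def host_and_path(url):
    """Return the (host, path) components of a URL."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


if __name__ == "__main__":
    # Host given explicitly.
    print(host_and_path("file://host/path"))
    # Host omitted: the host component is the empty string.
    print(host_and_path("file:///home/abw/"))
```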


File Staging

Moving complete files to where they are needed.

Usually associated with input and output files.

Input files need to be moved to where the program is located.

Output files generated need to be moved back to user, or used as input to other programs.

Note: this is different from input and output streaming, which moves a series of data items as a stream as they are generated.


File staging

Fig. 2.6

Staging example in JDD (stage in)

<job>
  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://coit-grid05.uncc.edu:2811/prog1Out</sourceUrl>
      <destinationUrl>file:///prog1Out</destinationUrl>
    </transfer>
  </fileStageIn>
</job>

NOTICE THAT THE CONCEPTS OF LOCAL AND REMOTE ARE REVERSED!

Staging example in JDD (stage out)

<job>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///prog1Out</sourceUrl>
      <destinationUrl>gsiftp://coit-grid05.uncc.edu:2811/prog1Out</destinationUrl>
    </transfer>
  </fileStageOut>
</job>

NOTICE THAT THE CONCEPTS OF LOCAL AND REMOTE ARE REVERSED!
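Stage-in and stage-out share the same transfer structure, differing only in the enclosing element and in which URL is the source. A minimal sketch of generating both with Python's standard xml.etree.ElementTree module follows. The helper name add_transfer is hypothetical, and the URLs are the ones used in the staging examples.

```python
import xml.etree.ElementTree as ET

REMOTE_URL = "gsiftp://coit-grid05.uncc.edu:2811/prog1Out"
JOB_SIDE_URL = "file:///prog1Out"


def add_transfer(job, stage_tag, source_url, destination_url):
    """Append <stage_tag><transfer>...</transfer></stage_tag> to a job element."""
    stage = ET.SubElement(job, stage_tag)
    transfer = ET.SubElement(stage, "transfer")
    ET.SubElement(transfer, "sourceUrl").text = source_url
    ET.SubElement(transfer, "destinationUrl").text = destination_url
    return stage


def build_staging_job():
    """Return a <job> element with one stage-in and one stage-out transfer."""
    job = ET.Element("job")
    # Stage in: the file:/// URL is the destination (where the job runs).
    add_transfer(job, "fileStageIn", REMOTE_URL, JOB_SIDE_URL)
    # Stage out: the file:/// URL is the source.
    add_transfer(job, "fileStageOut", JOB_SIDE_URL, REMOTE_URL)
    return job


if __name__ == "__main__":
    print(ET.tostring(build_staging_job(), encoding="unicode"))
```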

File Cleanup example in JDD

<job>
  <fileCleanUp>
    <deletion>
      <file>path of file to delete</file>
    </deletion>
  </fileCleanUp>
</job>

Sources of GT 4 information

http://www.globus.org/toolkit/docs

Questions (multiple choice)

When one issues the GT4.0 command:

globusrun-ws -submit -F localhost:8440 -s -so hello1 -c /bin/echo hello

what is hello?

(a) A Java class

(b) An XML file containing the description of the job to be run

(c) The executable to run in Globus

(d) The argument for the program that will be executed

When one issues the GT4.0 command:

globusrun-ws -submit -F localhost:8440 -s -so hello1 -c /bin/echo hello

is the order of the flags important, and if so why?

(a) Not important

(b) Important: -c must be last as it uses the remaining arguments

(c) Important: -s must be before -so

(d) Important: -F must be first

When one issues the GT4.0 command:

globusrun-ws -submit -F localhost:8440 -s -so hello1 -c /bin/echo hello

what is localhost?

(a) The server logged into running globusrun-ws.

(b) The computer you are using to log into the server

(c) None of the other answers.

When one specifies streaming with the -s option such as:

globusrun-ws -submit -F <hostname> -s -so hello1 -c /bin/echo hello

Where does the output go (on which machine)?

(a) The server logged into running globusrun-ws.

(b) The computer you are using to log into the server

(c) The remote machine <hostname>

(d) None of the other answers.

When one submits the command:

globusrun-ws -submit -F <hostname> -f test1.xml

Where does the output go (on which machine)?

Assume test1.xml specifies the stdout and stderr.

(a) The server logged into running globusrun-ws.

(b) The computer you are using to log into the server

(c) The remote machine <hostname>

(d) None of the other answers.

Why is the concept of local and remote reversed when doing file staging in a JDD?

(a) Because the file transfer is 3rd party

(b) Because the job is being submitted on the remote machine

(c) GridFTP will be executed on a different machine than the job

(d) They aren’t reversed


What does the tag <count> specify in an RSL-2/JDD file?

(a) The number of different jobs submitted.

(b) The number of computers to use.

(c) The number of identical jobs to submit.

(d) The number of arguments.


Questions