
APPLICATION EXECUTION USING HYBRID RESOURCES

A SIParCS Internship Project Report

Mentor: Matthew Woitaszek

Zhifeng Yun

Center for Computation & Technology

Louisiana State University

August 6, 2010


Abstract

Multicluster systems have become one of the major execution environments for large-scale compute-intensive applications. However, it is challenging to achieve automatic load balancing of jobs across these participating autonomous systems. This report addresses the workload allocation problem for the Daymet application, which consists of a large number of sequential jobs, in a multicluster system by using the LONI DA-TC execution model. A cloud computing toolkit is deployed onto NCAR clusters, and Cloud resources are investigated and integrated into the DA-TC model for Daymet execution. Experiments show that this model can work seamlessly with multiple clusters and hybrid Grid/Cloud resources for Daymet execution to achieve reduced turnaround time.

1 Introduction

Clusters are now a popular computing platform for scientific compute-intensive applications, and institutions and governments around the world have invested heavily in cluster systems. However, individual clusters still have rather limited capacity and cannot meet the growing demands of many large-scale scientific and engineering applications. This motivates approaches that share the workload of an individual application across multiple existing networked clusters, so as to achieve a substantial increase in computational capability and more efficient resource utilization without further investment. However, as a particular case of grid computing, multiclusters face the same challenges as any common grid environment, including security, performance prediction, reliability, and meta-scheduling.

The structure of a multicluster environment is shown in Figure 1. There is a super-scheduler over multiple clusters, which assigns application tasks onto participating clusters for execution. The assigned tasks are submitted to the local scheduling systems on participating clusters. A local scheduling system arranges the job executions under its own scheduling policies.

It is challenging to efficiently manage application execution across the multicluster system, due to the nature of the participating clusters and the network connection [6]. Participating clusters are geographically distributed, heterogeneous, and self-administered. The network connection provided by the Internet has security vulnerabilities and unpredictable performance. The completion of a submitted application depends on the accomplishment of all the application tasks assigned to different clusters. The slowest cluster is the bottleneck of application execution. Application execution also has to take into account the risk of any system failure in the participating clusters.

Figure 1: The multicluster architecture.

In order to investigate the possibility of using multiple clusters to execute large-scale loosely coupled applications, we implemented the LONI DA-TC (Dynamic Assignment with Task Containers) execution model [9] on several PBS clusters at NCAR. The Daymet application is installed on these systems and used to demonstrate the functionality of the system on NCAR workloads. Experiments show that this execution model can be used seamlessly on NCAR clusters to achieve reduced turnaround time for the Daymet application.

Recently, the use of Cloud computing technology for application execution has become increasingly popular. By using virtualization and on-demand virtual machines in the Cloud environment, users can get their jobs executing immediately, without any waiting time. In addition, it gives the user the ability to configure the virtual machines beforehand with the software and hardware support necessary for application execution, so that the virtual machines can be used for job execution without any further software installation and configuration after booting. In this report, we present approaches to integrating Cloud resources into our DA-TC model to take advantage of this new technology for application execution.

The remainder of this report is organized as follows. The DA-TC model is described in Section 2. Section 3 introduces Cloud resources and how they can be integrated into the DA-TC model. Section 4 describes how the Daymet application can be implemented with the DA-TC model. The summary and future directions are discussed in Section 5.

2 DA-TC Application Execution Model

The DA-TC execution model is based on the Dynamic Assignment with Task Container concept [9]. It is designed to improve application execution in a multicluster Grid environment, and has been successfully used for large-scale applications such as collaborative mechanical design [17], sawing optimization [15], and water nucleation simulation [16].

Using the DA-TC model, an application is assumed to consist of a number of tasks, where the parallelizable tasks are executed on remote clusters.

The DA-TC model introduces the task container (TC) concept. A task container waits for resources on a participating cluster, and after being allocated resources, it provides a lightweight hosting environment for task execution. A TC is viewed as a normal job by a local resource scheduling system. It is submitted into a queue and waits for resource allocation. The local scheduler allocates resources to a TC under its own scheduling policies. Resources are released after a TC execution ends. From a task execution perspective, a TC is a host environment. It provides a standardized method to manage the lifecycle of task execution on any participating cluster. Each task is associated with task metadata. A TC retrieves task execution requirements from the metadata and takes actions to perform a task, including stage-in, invocation, task termination, task execution monitoring, stage-out, etc. A TC is a lightweight environment. It can be easily deployed, and can launch any existing "legacy" task executable on participating clusters.

The DA-TC execution model adopts a dynamic task assignment strategy and employs an application execution agent (AEA). The AEA plays the super-scheduler role in the developed model. It dynamically assigns a task to a task container whose status is ready. It also allows a user to interact with the application at run time and steer execution progress.

There are three phases for the AEA to perform an application: execution preparation, dynamic task assignment, and execution termination. The actions in execution preparation include: 1) Obtain static and dynamic information of all the participating clusters; 2) Decide how many TCs will be assigned to each participating cluster; 3) Deploy TC executables on each cluster; 4) Initialize the TC (and task) status tables; 5) Submit all the TCs to the participating clusters; 6) Wait for TC status to be ready.

The second phase is dynamic task assignment. Once the AEA learns that a TC is ready, it selects an appropriate task (or more) according to workflow management and task metadata, updates the status of the TC and the task to stage-in, and then sends the task metadata to the TC. The TC takes care of the actual execution of the task, such as stage-in, invocation, and stage-out. Task metadata includes task dependences, executable location, data location, resource requirements, etc. During this phase, the AEA maintains the status of all TCs and tasks at run time.
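To make the metadata concrete, the sketch below shows, in Python, the kind of per-task record the above description implies. The field names and example values are illustrative only; they are not the actual DA-TC metadata schema, which is not reproduced in this report.

from dataclasses import dataclass, field

@dataclass
class TaskMetadata:
    # Illustrative task record; field names are assumptions, not the DA-TC schema.
    task_id: str
    executable: str                                    # executable location
    input_files: list = field(default_factory=list)    # data to stage in
    output_files: list = field(default_factory=list)   # results to stage out
    depends_on: list = field(default_factory=list)     # task dependences
    requirements: dict = field(default_factory=dict)   # resource requirements
    status: str = "waiting"                            # waiting / stage-in / running / done / failed

# Example: one Daymet tile as a sequential task ("run_tile.pl" is a placeholder
# name for the per-tile Perl driver script mentioned in Section 4).
tile = TaskMetadata(task_id="tile_0001",
                    executable="run_tile.pl",
                    input_files=["tiles/tile_0001.tar.gz"],
                    requirements={"cpus": 1})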

After all the tasks have been accomplished, the AEA takes actions to terminate the running TCs and, if applicable, withdraw the waiting TCs. It is possible that even if all the tasks of an application are finished, there are still some TCs waiting in the queues of the slow clusters. This requires that the AEA interacts with local scheduler systems on the participating clusters to delete the queuing TCs.

The progress of application execution is achieved by the interactions between the AEA and the TCs. Figure 2 shows the interaction diagram between the AEA and the TCs. To carry out an application execution, the first thing for the AEA to do is to submit TCs to participating clusters. The number of TCs for application execution depends on user configuration. The submitted TCs are placed as normal jobs in the scheduling queues of participating clusters, waiting for resource allocation by local resource management systems. One participating cluster may host multiple task containers, according to different load balancing strategies adopted by the AEA. After a TC obtains the required computing resources from a local scheduling system, it communicates with the AEA for task assignment. First, the TC sends the AEA a message to say that it is ready to run a task. Second, the AEA updates the TC status table and then one or more tasks are selected, based on application workflow management strategies. Third, task stage-in, execution, and stage-out are performed, and the status tables associated with tasks and TCs on the AEA are updated. After a task is completed successfully, the AEA and the TC are ready for the execution of the next task.
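Since the DA-TC source code is not included in this report, the following Python sketch only illustrates the ready/assign/execute/stage-out cycle described above, together with the resubmission-on-failure behavior discussed later in this section. All names are invented for illustration; the real AEA also performs workflow-aware task selection and TC withdrawal, which are omitted here.

import queue

def aea_dispatch(tasks, events, send_metadata):
    # Sketch of the AEA dynamic-assignment phase.
    #   tasks:         list of task-metadata records (e.g. one per Daymet tile)
    #   events:        queue.Queue of TC messages, e.g. {"tc": "c1-tc3", "type": "ready"}
    #   send_metadata: callable(tc_id, task) that ships task metadata to a TC,
    #                  which then performs stage-in, invocation, and stage-out
    pending = list(tasks)      # centrally queued workload
    running = {}               # tc_id -> task currently hosted by that TC
    completed = []
    while pending or running:
        msg = events.get()     # block until some TC reports its status
        tc = msg["tc"]
        if msg["type"] == "ready" and pending:
            task = pending.pop(0)               # the real AEA also checks dependences
            running[tc] = task
            send_metadata(tc, task)
        elif msg["type"] == "done":
            completed.append(running.pop(tc))   # this TC will report "ready" again
        elif msg["type"] == "failed":
            pending.append(running.pop(tc))     # resubmit on the same or another TC
    # Idle or still-queued TCs are withdrawn in the termination phase (not shown).
    return completed

if __name__ == "__main__":
    evts = queue.Queue()
    evts.put({"tc": "cluster1-tc1", "type": "ready"})
    evts.put({"tc": "cluster1-tc1", "type": "done"})
    print(aea_dispatch([{"task_id": "tile_0001"}], evts, lambda tc, t: None))

Because fast clusters report "ready" more often, they naturally pull more tasks, which is the dynamic load balancing behavior described in the following paragraphs.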

Figure 2: The interaction diagram between the AEA and a TC. "R" denotes running and "Q" queuing. "Other" denotes jobs submitted by other users.

This dynamic task assignment strategy and the task container technology can substantially improve the QoS of application execution in a multicluster environment. A TC is used to apply for and hold resources for task execution, which provides quick execution for tasks that are dynamically assigned by the AEA. With this dynamic load balancing method, the fast clusters will be assigned more tasks. The execution bottleneck caused by the slow clusters is eliminated. Any participating cluster, no matter how slow it is, becomes a beneficial factor, not a bottleneck, in speeding up application execution. The overall waiting time of tasks is greatly shortened and resource utilization is enhanced. The reliability of application execution can also be improved. A task will not be assigned to a participating cluster if a valid TC status cannot be provided to the AEA due to network disconnection, system maintenance, or system failure. Task completion status is monitored by the AEA at run time. If a task execution error happens, the AEA can intelligently make decisions, e.g., resubmission on the same or a different TC, to try to fix the problem.

The DA-TC model also provides capabilities to interact with a user. User interaction includes status monitoring and execution steering. There are three kinds of status an application user is interested in: application progress, task container status, and task status. Checking the application progress, a user knows how many tasks have been completed, how many tasks are running, and how many tasks are still waiting. The AEA interacts with local resource management systems to obtain task container status: queuing or running. A user is allowed to know which tasks have been assigned on a running TC and retrieve the status of one particular task. In addition, notifications for application start and end can be provided. Execution steering allows a user to change the order of the queuing tasks, delete task(s), add new task(s), modify datasets, pause execution, etc.

3 Cloud Computing

3.1 Introduction

Recently, the use of cloud computing through virtualization and on-demand virtual machines (VMs) has become increasingly popular. Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid [3]. It basically gives users access to compute/data resources that they do not own. Cloud services can provide dynamic provisioning of services and resource pools in a coordinated fashion. Figure 3 shows the basic cloud services. The cloud can provision the storage, CPU, and network for each individual virtual machine under a service level agreement (SLA). Users can use the cloud client to connect to the cloud services in order to boot virtual machines and virtual clusters on demand. Once these virtual systems are booted, they can be used immediately for computing without any waiting period. The location of the resources is usually irrelevant; however, it may matter from a performance perspective if the computation needs a large chunk of data, which introduces network latency when staging the data in and out. Web interfaces are usually provided so that users can work anywhere as long as there is an Internet connection.

There are many commercial Clouds provided by different companies, such as Amazon, Microsoft, and Google. Science Cloud software is also available, such as Nimbus from Argonne National Laboratory and the University of Chicago, and Eucalyptus from the University of California at Santa Barbara. Both can bring up Cloud computing services on clusters so that clients can lease the remote resources by deploying VMs onto these clusters. They are also compatible with the Amazon Web Services [2] and support both KVM [8] and Xen [14] virtualization.

These infrastructure-as-a-service Clouds have different advantages compared to traditional Grid systems: users are provided with greater flexibility and have the ability to customize their virtual machine environment. Compared with Grid systems, where the resources are always shared by a large number of users, each virtual machine or virtual cluster booted in the cloud is dedicated to one user. Jobs submitted to Grid systems are typically handled by the batch scheduler, whereas in the cloud environment, jobs can be assigned directly to a virtual machine or virtual cluster by the user. In the cloud, physical resources are provisioned to the virtual machine instead of to the workload. In addition, Grid systems lack autonomy and SLA guarantees.

Figure 3: The Cloud services.

However, how this Cloud computing technology can benefit different application scenarios is uncertain. This also raises the need to integrate traditional Grids and Clouds. Developing and running applications in such a hybrid and dynamic computational infrastructure presents new and significant challenges [7]. The execution system must be able to support hybrid execution models and coordinate and manage execution in an efficient and scalable manner. How to determine and provision the appropriate mix of Grid/Cloud resources, as well as dynamically schedule them across the hybrid execution environment to fulfill different performance objectives, are the key issues in this case [10].

3.2 Cloud Service Toolkit – Nimbus

In order to take advantage of Cloud technology for application execution, we deployed the Nimbus science cloud toolkit onto our clusters. The Nimbus toolkit consists of the following components:

· Infrastructure-as-a-Service (IaaS). The Nimbus toolkit turns physical clusters into "Infrastructure-as-a-Service" (IaaS) cloud computing platforms. It gives administrators the choice to initiate and terminate the IaaS when needed.

· Cumulus storage service. Cumulus is a storage cloud implementation compatible with the Amazon Web Services S3 REST API. It provides secure management of cloud disk space, giving each user a "repository" view of the VM images they own and the images they can launch. Cumulus replaces the Globus GridFTP-based [1] upload and download of VM images. It is integrated with the Nimbus installation, but can also be installed on its own to manage a storage cloud.

· Cloud client. An easy-to-use end-user tool which provides users the ability to transfer images, check currently stored images, and launch, query, and terminate the VMs belonging to that user. Other functionality, such as initiating the Grid proxy and querying security setups, is also available to end users.

· Workspace service. The workspace service is composed of a WS front-end and a VM-based resource manager deployed on a site. It supports two front-ends: one based on the Web Service Resource Framework (WSRF) [5], and one based on Amazon's EC2 WSDL. This service is in charge of the hardware implementation and virtualization for the VMs. It allows a remote client to deploy and manage flexibly defined groups of VMs, and it dynamically provisions resources and an environment for each VM. It also publishes the information of each workspace so that the user can use the Cloud client to easily get all the information of the VMs belonging to him, such as IP address, Cloud name, and time duration. The users can also use this information to log in directly to these VMs and perform tasks as on physical resources.

· Workspace control tools, which are used to start, stop, and pause VMs; implement VM image reconstruction and management; connect the VMs to the network; and deliver contextualization information.

· Workspace pilot, which extends existing local resource managers (LRMs) such as Torque [13] or SGE [11] to deploy virtual machines, allowing resource providers to use virtualization without significantly altering the site configuration.

· Context broker, which allows a client to deploy a "one-click" functioning virtual cluster, as opposed to a set of "unconnected" virtual machines, as well as to "personalize" VMs.

3.3 Cloud Resources Integration

Our DA-TC execution model has proven to be efficient for large-scale loosely coupled applications. It can achieve dynamic load balancing and reduced turnaround time for these applications. However, integrating Cloud resources together with traditional Grid resources to collaboratively perform job execution is a challenge. In order to support these hybrid execution resources, we have two different approaches.

The first approach is to treat each virtual machine as a task container. It is quite straightforward. The basic idea behind our DA-TC model is to decouple resource allocation from resource binding. In the DA-TC model, the resources are bound to each individual task container instead of each task, so that the queuing time of each task can be reduced. In Cloud computing, the resources are bound to each virtual machine instead of each task, which is coherent with our DA-TC model. Once the virtual machine is booted, the AEA can directly assign jobs to execute on this virtual machine.
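As a rough illustration of this first approach (this is not actual DA-TC code), a freshly booted VM can simply be entered into the AEA's task container status table as a TC that is already "ready", since no local batch queue is involved; the IP address and lease end time reported by the cloud client stand in for the queue information of a physical cluster. All names and fields below are hypothetical.

from dataclasses import dataclass

@dataclass
class TaskContainer:
    # Minimal TC descriptor as the AEA might track it (illustrative only).
    tc_id: str
    host: str            # compute node or VM IP address hosting the TC
    status: str          # "queued", "ready", or "busy"
    expires: str = ""    # VM-backed TCs additionally carry a lease end time

def register_booted_vm(tc_table, vm_info):
    # Treat a booted VM as a TC that is immediately ready: there is no queuing state.
    tc = TaskContainer(tc_id="vm-" + vm_info["name"],
                       host=vm_info["ip"],
                       status="ready",
                       expires=vm_info.get("end_time", ""))
    tc_table[tc.tc_id] = tc
    return tc

# vm_info would be filled from the information the Nimbus cloud client returns after boot.
tcs = {}
register_booted_vm(tcs, {"name": "daymet-vm1", "ip": "10.0.0.21",
                         "end_time": "2010-08-06T18:00"})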

Figure 4 shows how to launch the virtual machines. Different users can log in to the Cloud client simultaneously. Users can use the client to query the status of the VMs, check the images already stored at the server side, and transfer in the images they want to boot using the Cumulus storage service. Once a user issues the command to launch a VM, the client contacts the Nimbus Cloud services on the server side. Users can define the name and duration of the workspace. They can also use a customized XML configuration file to define the memory size and CPU number for each VM. Multiple VMs can be launched at the same time by using the same image. The virtual machine monitor (VMM, also called hypervisor) on the server node will provision the resources for the VMs based on the request. Once booted, the information of each VM, such as the beginning and ending time of the VM, the IP address, and the name of the workspace, is returned for the user's further use.

Figure 4: Launch virtual machines through Nimbus.

Although the VMs are easy to boot, and these VMs can be treated identically to TCs in our DA-TC model, there are still some differences between them. Due to the isolation of each VM on the server node, each VM needs its own copy of the binary executables, which causes more overhead when using multiple VMs. The user also needs to configure the image so that the AEA will automatically notice that the VMs are running once they are booted. The scheduling of VMs is also different from the scheduling of TCs.

Another approach to integrating the Cloud resources into the DA-TC model is to use virtual clusters. The user can use a virtual cluster as one of the multiple resources to execute the tasks. This approach fits our DA-TC paradigm, and the virtual cluster is easy to boot, though it also needs some configuration before use. The big advantage of using a virtual cluster is that, once booted, it is a dedicated resource, so there is no queuing time for the TCs anymore.

Figure 5 shows how to launch a virtual cluster through Nimbus. The first step is to prepare the image. Some configuration is needed for the image so that, once booted, users can submit jobs through a batch scheduler, such as PBS, to the PBS queue for execution. The users also need to install and configure the context agent on the image before it is booted. Other configuration, such as SSH and the name service, is also needed for using the virtual cluster. Once the image is ready, the users can customize the XML configuration file for launching the virtual cluster. Users can define the number of compute nodes, the image for booting each node, and scheduler information in this file. Once configured, the user can use the Cloud client to launch the virtual cluster. After booting, the context agent in each VM contacts the context broker on the server node to define all the head node and compute node information. The IP address, hostname, and status of each head node and compute node are returned to the user so that the user can log in directly to the head node to submit jobs based on this information. Note that the head node and compute nodes can span different sites.

Figure 5: Launch a virtual cluster through Nimbus.

Figure 6: Integrating hybrid resources for job execution in the DA-TC model.

We deployed and integrated the virtual machines and virtual clusters into our DA-TC model based on this second approach. Figure 6 shows how the DA-TC model integrates a virtual cluster and physical clusters for job execution. Once the virtual cluster is ready, the AEA submits the TCs to it. These TCs are allocated resources immediately. The AEA then assigns the centrally queued workload to the TCs dynamically based on their availability.

In our experiments, the virtual cluster was successfully integrated into our DA-TC model and works identically to physical clusters for job execution. Users can configure the images beforehand to suit the application's software environment and hardware allocation requirements, so that the virtual cluster can be used immediately for job execution once booted. Launching the virtual cluster is easy and fast. Compared with the long waiting time wasted in the queue, the virtual cluster provides a fast and secure way for application execution.

4 Case Study: Daymet

4.1 Introduction

Daymet is a component of the terrestrial ecosystem modeling system of the science application. It is a collection of algorithms and computer software designed to interpolate and extrapolate from daily meteorological observations to produce gridded estimates of daily weather parameters over large regions [12]. A Daymet run requires input data such as digital elevation data and observations of maximum temperature, minimum temperature, and precipitation from ground-based meteorological stations. There are approximately 6000 stations in the U.S. National Weather Service Co-op network and the Natural Resources Conservation Service SNOTEL network (automated stations in mountainous terrain) [12]. The output of Daymet is further processed, and the generated data can be analyzed by text analysis or visualization tools. Figure 7 shows a visualization of the 18-year mean precipitation of the United States from the output of Daymet.

4.2 Challenges

Figure 8 lists the Daymet execution steps. In Daymet, the grid data are subdivided into sections, which are called tiles. The execution of each tile is coordinated by a single-threaded Perl script. From the figure we can see that the execution of each tile is a sequential job; however, the execution time of each tile varies. Using Daymet to accommodate simulations of small areas works well, but it soon becomes an overwhelming job for the scientists to achieve a high resolution over a large area, which may involve thousands of tasks. Management of these many tasks requires tedious attention to detail, including periodically monitoring running simulations, transferring data, correctly scripting configuration files for each model, and detecting failed simulations and handling the failures as appropriate [4]. Besides this, the total execution time will be extremely long if each task needs to be queued and executed individually.

Figure 7: Mean value of precipitation from 18 years of data.

Figure 8: Daymet execution (per tile: modify input files, run filter, generate stats, fill missing values, interpolation, prediction).

Due to the nature of these tiles, they can be individually scheduled on different computing resources across multiple administrative boundaries in order to achieve "task-level parallelism" and shorten the execution time. However, supporting multiple resources requires excessive resource-specific knowledge and software development experience from the developers. Different resources are administered independently and may have varying performance and characteristics. Meta-scheduling becomes a burden for the application and gateway developer, since static scheduling cannot give good performance due to the heterogeneity of the computational performance of each resource and the variable length of each tile. Any system failure in participating resources will affect execution. The slowest system is the bottleneck for the application execution, since all the outputs are needed before moving on to the next step. We need a system that can not only leverage users' workload but also achieve reduced turnaround time and better execution reliability.

4.3 System Implementation

Although all these challenges make the execution difficult, we can use the DA-TC model to remove these difficulties and make the execution efficient. In order to implement the DA-TC model for Daymet, several steps are involved in system preparation.

1. Executable Preparation. Because the Daymet executables are not compatible across different Linux/Unix operating systems, the user needs to compile the source files and provide different versions of the executables for different systems. Based on the system characteristics of the participating clusters, the DA-TC model stages in the compatible executables to the different clusters.

2. TC Scheduling. The DA-TC model performs static scheduling for the task container submission. The users need to specify the resources they intend to use for the Daymet execution in the configuration file. The DA-TC information service can provide all the system information for these resources, such as the LRMS, available CPUs, CPU speed, system architecture, memory size, and running and queuing jobs. The users can also use the scheduling algorithms provided to generate the static scheduling for the TCs. The scheduling algorithms include Queue Length mode, Weighted Workload Allocation mode, and Least Normalized Load mode. The users can also specify the scheduling numbers directly by writing them into the configuration file.

3. Data Preparation. Since the tiles to execute are organized in a discontinuous naming pattern, and also to allow the users to specify the tiles they want to skip during the execution, we use a file to specify the names of the tiles which will not be included in the Daymet execution. The users also need to give the beginning tile name for the DA-TC execution sequence (see the sketch after this list). Each tile has a gzipped file of the Daymet input data and a Perl script coordinating the execution of that tile.
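The sketch below illustrates, in Python, how such a tile execution sequence could be assembled from a skip-list file and a beginning tile name. It is an assumption-laden illustration only: the real DA-TC reads the beginning tile from a Perl module (see Appendix A), the file and variable names here are hypothetical, and tile names are simply sorted lexicographically.

def build_tile_sequence(all_tiles, skip_file, first_tile):
    # Produce the ordered list of tiles to run, honouring the skip list
    # and the requested starting tile (illustrative, not the DA-TC code).
    with open(skip_file) as f:
        skipped = {line.strip() for line in f if line.strip()}
    ordered = sorted(t for t in all_tiles if t not in skipped)
    if first_tile in ordered:                 # start from the requested tile
        ordered = ordered[ordered.index(first_tile):]
    return ordered

# Hypothetical usage; real tile names follow a discontinuous naming pattern:
# tiles = build_tile_sequence(os.listdir("tiles"), "setup/skip_tiles.txt", "11204")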

After the system preparation, the DA-TC model can be used to execute these thousands of Daymet tiles fully automatically. The whole procedure of the Daymet execution is:


1. Binary executables and Grids data staging in. Since each tile execution needs the executables and the Grids data, these files are transferred to each participating cluster as the first step. If these files are already there, the AEA skips this step.

2. Task container submission. The AEA will submit the TCs to each cluster based on the scheduling results or users' specification. The TCs will be treated as normal jobs waiting in the queue for resource allocation.

3. Input data and script staging in. Once a TC is allocated resources, the AEA will assign the next available tile to this TC. The corresponding input data file and the Perl script in that tile directory will be staged in to this TC's location.

4. Tile execution. The TC will take charge of the execution of this tile based on the Perl script.

5. Results staging out. After the execution, all the output data are generated. The outputs are gzipped and transferred back to the AEA machine under the corresponding tile directory.

6. Terminating. After staging out the data, the TC is available for the next tile. If at this time there are no further tiles to execute, the AEA will terminate this TC. A sketch of the per-tile cycle on the TC side follows.
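The per-tile cycle on the TC side (steps 3 to 5 above) can be sketched in Python as follows. This is an assumption-based illustration only: the actual work is driven by the per-tile Perl script, the archive layout and the script and directory names used here are placeholders, and error handling is omitted.

import os
import subprocess
import tarfile

def run_tile(tile_name, workdir):
    # Stage-in: unpack the gzipped input data that the AEA staged to this TC.
    archive = os.path.join(workdir, tile_name + ".tar.gz")
    tiledir = os.path.join(workdir, tile_name)
    os.makedirs(tiledir, exist_ok=True)
    with tarfile.open(archive) as tar:
        tar.extractall(tiledir)
    # Execution: the Perl driver coordinates filter, stats, fill, interpolation, prediction.
    subprocess.run(["perl", "run_tile.pl"], cwd=tiledir, check=True)
    # Stage-out: gzip the outputs so the AEA can copy them back under the tile directory.
    out = os.path.join(workdir, tile_name + "_out.tar.gz")
    with tarfile.open(out, "w:gz") as tar:
        tar.add(os.path.join(tiledir, "output"), arcname=tile_name)
    return out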

4.4 Results

The DA-TC model has been successfully implemented on NCAR's clusters to perform the Daymet execution. By using "Frost" as the AEA and integrating two NCAR evaluation clusters and one external cluster, we were able to finish all 800 tiles within three hours using only eleven TCs. This is a huge reduction in the execution time for the Daymet application. By using a larger number of TCs, the total execution time can be further reduced. Through the DA-TC model, the scientists can be insulated from tedious configuration details, thereby increasing their productivity.

5 Summary and Future Work

In this summer internship, we successfully implemented the DA-TC model on the NCAR clusters to perform the Daymet application execution. We also deployed the Nimbus cloud service on these clusters. Users can easily turn the clusters into Cloud service platforms by initiating the Nimbus service. Users can also launch virtual machines and virtual clusters by using the Cloud client. We have investigated and integrated these Cloud resources into the DA-TC model for Daymet execution.

We plan to extend this work by studying the scalability of the configured images onto different operating systems. We also want to investigate scheduling and load balancing algorithms for virtual machine deployment. These will in turn provide better integration of the Cloud resources into the DA-TC model and easily extend its capabilities.


Acknowledgements

I acknowledge the help of Matthew Woitaszek and Michael Oberg in explaining and operating the system administration, networking, and virtual machine setup. I also thank Ben Mayer for organizing the inputs and scripts for the Daymet tiles. This work was supported by NCAR's NSF research grant for the SIParCS program.

References

[1] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, L. Liming, and S. Tuecke. GridFTP: Protocol extensions to FTP for the Grid. GWD-R (Recommendation), 2001.

[2] Amazon Elastic Compute Cloud. http://aws.amazon.com/ec2.

[3] Cloud computing. http://en.wikipedia.org/wiki/Cloud_computing.

[4] Jason Cope, Craig Hartsough, Sean McCreary, Peter Thornton, Henry M. Tufo, Nathan Wilhelmi, and Matthew Woitaszek. Experiences from simulating the global carbon cycle in a grid computing environment. In The Fourteenth Global Grid Forum (GGF 14), Chicago, 2005.

[5] K. Czajkowski, D. Ferguson, I. Foster, J. Frey, S. Graham, I. Sedukhin, D. Snelling, S. Tuecke, and W. Vambenepe. The WS-Resource Framework. http://www.globus.org/wsrf/specs/ws-wsrf.pdf, 2004.

[6] Ian Foster, Carl Kesselman, and Steven Tuecke. The anatomy of the Grid: Enabling scalable virtual organizations. International Journal of Supercomputer Applications, 15(3), 2001.

[7] H. Kim, Y. El Khamra, S. Jha, and M. Parashar. An autonomic approach to integrated HPC grid and cloud usage. In IEEE eScience 2009, Oxford, 2009.

[8] KVM. http://www.linux-kvm.org/page/Main_Page.

[9] Zhou Lei, Zhifeng Yun, Gabrielle Allen, Xin Li, Nian-Feng Tzeng, and Christopher D. White. Improving application execution in multicluster grids. In CSE, pages 163–170, 2008.

[10] Andre Luckow, Lukasz Lacinski, and Shantenu Jha. SAGA BigJob: An extensible and interoperable pilot-job abstraction for distributed applications and systems. In Cluster Computing and the Grid, IEEE International Symposium on, pages 135–144, 2010.

[11] SGE. http://gridengine.sunsource.net.

[12] Peter Thornton. Daymet user's guide, 2005.

[13] Torque. http://www.clusterresources.com/pages/products/torque-resource-manager.php.

[14] Xen. http://www.xen.org.

[15] Zhifeng Yun, Sun Joseph Chang, Zhou Lei, Gabrielle Allen, and Ashwin Bommathanahalli. Grid-enabled sawing optimization: From scanning images to cutting solution. In Proceedings of the 15th ACM Mardi Gras Conference, MG '08, pages 16:1–16:8, 2008.

[16] Zhifeng Yun, Samuel J. Keasler, Maoyuan Xie, Zhou Lei, and Gabrielle Allen. An innovative simulation approach for water mediated attraction based on grid computing. In IMSCCS, pages 204–211, 2007.

[17] Zhifeng Yun, Maoyuan Xie, Fuguo Zhou, Gabrielle Allen, Tevfik Kosar, and Zhou Lei. Collaborating mechanical design phases across a grid. In Proceedings of the 2008 11th IEEE International Conference on Computational Science and Engineering Workshops, pages 65–70, 2008.


A Using DA-TC to execute Daymet

As described in Section 4.3, before using DA-TC to execute the tile sequence, some initial setup is needed to make it work. The home directory of the DA-TC is on the "Frost" machine, and "Frost" is also used as the AEA in the DA-TC execution. Throughout this guide, we will refer to this DA-TC home directory as $PEL = /ptmp/zyun/Pel_NCAR.

1. SSH setup. The DA-TC model used at NCAR to execute the Daymet application is built on SSH and the LRMS. The first step is to copy the SSH public key into authorized_keys for each pair of the AEA and a participating cluster. The hosts are also needed in known_hosts so that logging in from the AEA to any participating cluster, and from a participating cluster to the AEA, will not prompt for anything during SSH. There is also a script in $PEL/setup that can help with the initial SSH setup.

2. Executable setup. The users need to provide a compatible version of the executables for each participating cluster. All the binary executables are stored in $PEL/daymet/App_bin/bin.

3. TC scheduling. The users can define the number of TCs for each participating cluster manually by editing the file $PEL/daymet/.sched_result. Another way to get the scheduling results is through the information services: the users just need to list the resource names in the file $PEL/info/resources, and by running the scripts query_info.pl and schedule.sh, the information services will generate the results for the TC scheduling.

4. Data preparation. The users need to copy the Grids data to the location $PEL/daymet/App_bin/grids. The DA-TC also allows the users to define the tile names that they do not want to execute in the file $PEL/setup/Missed_filename. All the names in this file will be skipped during the execution. The users also need to edit the file $PEL/lib/MCI_LIB_task.pm to give the beginning tile name for the execution. The locations for staging in the input data and Perl script and staging out the results are defined in the file $PEL/GEMS_TC/tc.pl.

5. Running DA-TC. After all these setups, the users can begin to use the DA-TC model to execute the Daymet tiles. In the $PEL directory, execute

$ ./gems.pl 0 . ./daymet app 800

The number "0" defines the execution pattern, which is task farming in this case. "." gives the DA-TC tool location. "./daymet" is the location of the application. "app" is the application name that appears in the PBS queue status. And "800" gives the number of tiles to be executed in this run.


B Nimbus Deployment

The Nimbus version deployed onto the NCAR clusters is 2.5. The Nimbus service is installed in the directory $NIMBUS_HOME = /home/zyun/nimbus on DCS0203. The Nimbus Client is installed in the directory $NIMBUS_CLIENT = /home/zyun/nimbus-cloud-client on DCS0102. The Nimbus workspace control is installed in $NIMBUS_VMM = /opt/nimbus on DCS0203 as well. The host name is dcs0203 and the CA name is DCS0203NimbusCA for the X.509 certificate.

B.1 Dependencies

Deployment of Nimbus relies on some software packages. It needs Sun Java (1.5+), Python (2.5+ but not 3.x), GCC, Apache Ant (1.6+), and Libvirt (0.7+). Note that Libvirt needs manual installation. When configuring the source code, make sure to add Xen support by using the "--with-xen" flag.

To start Libvirt:

$ sudo /opt/libvirt-0.8.2/sbin/libvirtd --daemon

To stop Libvirt:

$ ps aux | grep libvirt

and then kill the process. Libvirt permissions also need to be set up by editing the file /opt/libvirt-0.8.2/etc/libvirt/libvirtd.conf. After all these setups, the administrators can proceed to the page http://www.nimbusproject.org/docs/2.5/admin/z2c/service-setup.html to install the new Nimbus service and Client, and configure the users and credentials if wanted.

B.2 DHCP

Edit the network configuration in /etc/network/interfaces on the DCS0203 machine to allow a bridged network. By default, the name of the bridge is "xenbr0", defined in the file $NIMBUS_VMM/etc/workspace-control/networks.conf.

Edit the files in $NIMBUS_HOME/services/etc/nimbus/workspace-service/network-pools/. These files represent the networks which can be provided to the virtual machines when they boot. After configuration, restart the service with

$ cd $NIMBUS_HOME/bin
$ nimbusctl services restart

This will generate an updated version of the DHCP entries in $NIMBUS_HOME/services/var/nimbus/. Copy the contents of the file dhcpd.entries to /etc/dhcpd.conf-MACs-nimbus on the machine "ma0123en" and edit /etc/dhcpd.conf on that machine to include this new file.

After these configurations, the administrators can proceed to the page http://www.nimbusproject.org/docs/2.5/admin/z2c/vmm-setup.html and follow the manual to install the VMM, test VM creation, configure SSH, etc.


C Using Nimbus Cloud Services

C.1 Configure the images

Due to the unportability of the images across different operating systems, we need to configure the image before we use it to launch the virtual machines. Users can edit the image by mounting it first:

$ mount -o loop image mnt

There are several files to pay attention to. The first one is $MNT/etc/inittab. Make sure hvc0 is listed in inittab in the image, like '1:2345:respawn:/sbin/getty 38400 hvc0'. Otherwise there is no login prompt when using "xm console". If the file $MNT/etc/securetty exists, make sure to add "hvc0" to this file to allow root to log in at the Xen console prompt. The second one is $MNT/etc/hosts. Make sure to add the DCS information here. The third one is $MNT/etc/ssh/sshd_config. Make sure it permits root login.

The Client service copies the SSH public key to the authorized_keys of the root directory of the virtual machine once it is booted, so that users can log in as root without a password. Users can also edit the image beforehand to allow different usernames for use in that virtual machine by editing the files $MNT/etc/passwd, $MNT/etc/shadow, and $MNT/etc/group.

C.2 Launch VMs

In order to launch VMs through the Cloud client, the first thing to do is to query the grid-proxy information. Users can initiate the proxy by using

$ cd $NIMBUS_CLIENT
$ ./bin/grid-proxy-init.sh

Users can also use the following command to get all the security information:

$ ./bin/cloud-client.sh --security

After the proxy is granted, users need to transfer the image to the service node through Cumulus. All the transferred images are stored in $NIMBUS_HOME/cumulus/posixdata/R/. Users can then launch the VMs, query the status of the VMs, and terminate them through the client. Execute "./bin/cloud-client.sh -h" for help.

Here is the reference page: http://www.nimbusproject.org/docs/2.5/clouds/cloudquickstart.html. Notice that each VM started from the Cloud client can specify its memory request in $NIMBUS_CLIENT/conf/cloud.properties. However, all the memory requests of the VMs started by this client cannot exceed the number specified in $NIMBUS_HOME/services/etc/nimbus/workspace-service/vmm-pools/testpool on the service node. Once this is changed, the Nimbus service needs to be restarted for the change to apply.

The total time of all VMs that can be started from the client is limited by $NIMBUS_HOME/services/etc/nimbus/workspace-service/group-authz/group01.properties.


C.3 Launch virtual cluster

In order to launch the "one-click clusters", the users need to prepare the images before launching. The users need to install the context agent into the default location (/opt/nimbus/ctx) so that, once booted, the context agent can contact the context broker on the service node to get the information of the head node and compute nodes. Users also need to configure PBS in order to make it work appropriately.

Once the image is ready, the user needs to upload the image to the service node through the Cumulus service. Then the user needs to edit the file $NIMBUS_HOME/services/etc/nimbus/workspace-service/metadata.conf on the service node in order to configure the IP address and port number for the context agent to contact the context broker.

The last step is to edit $NIMBUS_CLIENT/samples/base-cluster.xml to include the DN in the gridmap field. Users can also define the images to boot for the head node and compute nodes, and how many compute nodes to boot.

After all these configurations, users can launch the one-click cluster by

$ cd $NIMBUS_CLIENT
$ ./bin/cloud-client.sh --run --hours 1 --cluster samples/base-cluster.xml

It takes a while to wait for the launch updates and context broker updates. In our case, where we boot one head node and two compute nodes with 6 GB each, it takes 8 minutes to complete.

C.4 Nimbus web service

In order to use the Web Service of the Nimbus toolkit, the first thing to do is to enable it. The administrators can edit the file $NIMBUS_HOME/nimbus-setup.conf and change the value of web.enabled to True. Once changed, $NIMBUS_HOME/bin/nimbus-configure needs to be run to propagate the change. When enabled, the web application listens by default on port 1443. This and other configuration options are located in the $NIMBUS_HOME/web/nimbusweb.conf file. Changes to this file require a restart of the service.

Since the current service node (DCS0203) is not on a shared network, we need to use an SSH tunnel. The web interface can be viewed with the following steps:

In one terminal, execute

$ ssh -l zyun -L 7777:dcs0203:22 frost.ucar.edu cat -

Open X11, and execute

$ ssh -X -Y -p 7777 localhost

This allows you to log in to DCS0203. Launch iceweasel, and in the browser enter the URL "https://localhost:1443/nimbus/"; it will proceed to the Nimbus Home webpage. All the HTML files are located at $NIMBUS_HOME/web/src/python/nimbusweb/portal/templates/ on the service node.