S. Omatu et al. (Eds.): Distributed Computing and Artificial Intelligence, AISC 151, pp. 347–353. springerlink.com © Springer-Verlag Berlin Heidelberg 2012

A Scientific Computing Environment for Accessing Grid Computing Systems Using Cloud Services

Mariano Raboso, José A. de la Varga, Myriam Codes, Jesús Alonso, Lara del Val, María I. Jiménez, Alberto Izquierdo, and Juan J. Villacorta*

Abstract. This paper shows how virtualization techniques can be introduced into a grid computing infrastructure to provide a transparent and homogeneous scientific computing environment. Today's trends in grid computing propose a shared model in which different organizations make use of a heterogeneous grid, frequently a cluster of clusters (CoC) of computing and network resources. This paper shows how such a grid computing model can be virtualized, yielding a simple and homogeneous interface that can be offered to clients. The proposed system, called a virtual grid, uses virtualization support and is built by integrating standard grid and cloud computing technologies. Furthermore, a Scientific Computing Environment (SCE) has been developed to provide uniform access to the virtual grid.

Keywords: grid computing, cloud computing, virtualization, scientific computing environment, message passing interface.

1 Introduction

The increasing demand for computing resources in scientific research has been a strong motivation for the community to develop a wide variety of high performance computing (HPC) infrastructures.

*Mariano Raboso ⋅ José A. de la Varga ⋅ Myriam Codes ⋅ Jesús Alonso Facultad de Informática, Universidad Pontificia de Salamanca. Compañía 5, 37002 Salamanca, Spain e-mail: [email protected]

Lara del Val ⋅ María I. Jiménez ⋅ Alberto Izquierdo ⋅ Juan J. Villacorta Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemática, Universidad de Valladolid, E.T.S.I. Telecomunicación, Paseo Belén 15, 47011 Valladolid, Spain

Huge supercomputer resources are not always available to small research groups, which are usually limited by restrictive budgets and whose workloads are not always well suited to such systems. Initiatives such as the European Grid Infrastructure (EGI) [1] aim to develop a sustainable grid infrastructure for all European researchers.

A grid computing system is a set of parallel and distributed computing resources working together towards a single goal, providing high aggregate computational power provided that parallelization and concurrency issues are properly addressed. A cloud computing system is an approach to computing based on the on-demand, efficient use of aggregated resources that are self-managed and consumed as a service.

We have developed an infrastructure combining grid and cloud services that provides transparent virtual grids to clients. Although the grid is shared, clients use their own virtual grids with physical computing nodes through common services provided by the SCE (Scientific Computing Environment). Classic grid environments provide shared resources such as CPU time and queues for batch processes, whereas virtual grids, since they give clients full access to their own infrastructure, also allow users to perform their own administration tasks.

Other solutions [2][3][4][5] use customized or proprietary software to implement sharing, creating their own interfaces and middleware, or use cloud technology as a part of the HPC system, which affects overall system performance. Our solution uses standard, open cloud technology compatible with Amazon EC2 that allows not only a private cloud model but also a public or even a hybrid one. Cluster performance is not influenced by the cloud technology: once the virtual grid is configured and assigned, clients get direct and full access to the computing resources.

Furthermore, virtual cluster assignment can take cluster location into account, so computing assignments can be made depending on the site. This is very useful to avoid network delays or low-bandwidth link issues when several organizations share resources over a network. We have designed the cloud infrastructure to integrate resources at the Computer Science Faculty (Pontifical University of Salamanca) and the Array Processing Group (GPA) at the University of Valladolid. Both universities are connected through the RedIRIS network.

Integrating grid and cloud technologies offers interesting advantages, such as efficiency and security improvements. Furthermore, providing a common interface through an SCE makes the system more accessible and hides from clients the technical details of customizing their own grid. The main advantages are the following:

• Uniformity. Virtualizing the grid implies that every client sees only their own grid, so side effects between clients can be minimized or even eliminated. Users gain access through a common interface provided by the SCE using an authentication service. Cloud services are in charge of running the virtual machines that provide the master node of every virtual grid, but do not take part in computation jobs.

• Customizable grids. It is possible to offer customized grids by deploying specific images for the virtual master machines. The cloud infrastructure and the configuration server provide the images, which can be customized with user-defined parameters such as CPU resources, memory, number of nodes or time limit.

• Efficient assignment of the local grid infrastructure, allowing users to run jobs from different sites while avoiding bottlenecks caused by bandwidth or latency issues. As grid nodes can be located on different sites connected through WAN links, the configuration server can select only local nodes to deploy the grid.

• Flexible accounting. It is possible to extend the general accounting plans of the cloud computing system to offer different levels of quality of service. This feature is provided by the cloud service.

• Improved LAN security. Independent grids isolate user jobs running in the grid. VLAN management is implemented on the cloud system, so different grids can be assigned to different VLANs.

• User-level security. The Eucalyptus cloud technology uses credentials and certificates to log into the system and manage the virtual machines.

The next sections describe how the virtualization process can be applied to the grid. The virtualization techniques used are described in detail in Section 2. Section 3 describes the SCE developed to provide common access. Finally, some conclusions are drawn.

2 Virtualizing the HPC System

In order to batch their jobs, HPC users must usually connect to a master node. Some effort has been made to offer a standard and uniform interface so that hardware and software component details are transparent to users [6], providing middleware that enables users to solve complex scientific problems while using simple interfaces from their Scientific Computing Environments (SCE) [7].

Another approach to sharing the grid computing system can be taken. The main goal is to manage local virtual grids that serve client requests. Each virtual grid contains a master node (a virtual machine) and several computing nodes. The system runs a new virtual machine using a previously configured master server image. This master machine is responsible for running jobs in the new grid and collecting its statistics.

Fig. 1 Virtual grid system architecture and VM creation

Once the new master is running, clients manage their own grid, with the computing resources (nodes) assigned, through the master and the SCE. The infrastructure developed is called a virtual grid. Figure 1 shows the virtual machine creation process.

When clients want to gain access to the (virtual) grid, they first make a request to the cloud system through the SCE, a web frontend that integrates the cloud computing and configuration services. The cloud service then runs an instance of a master virtual machine, which downloads its configuration and self-configures the grid with the assigned subnet, nodes and policies. As a result, clients get their own grids, each with a master node and a set of assigned computing nodes.

The configuration is dynamically loaded by the virtual machine from the configuration server and includes the node assignment, job policy algorithms and, if configured, a time limit. This configuration is shown to the client through the SCE.
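By way of illustration only (the paper does not give a data model), these parameters could be represented as in the following C sketch, in which all type and field names are hypothetical.

/* grid_config.h - hypothetical sketch (not the authors' data model) of the
 * parameters the paper says the master VM loads from the configuration
 * server: node assignment, job policy algorithms and an optional time limit. */
#ifndef GRID_CONFIG_H
#define GRID_CONFIG_H

#include <time.h>

/* Illustrative job policy choices; the paper does not enumerate them. */
enum job_policy { POLICY_FIFO, POLICY_FAIR_SHARE, POLICY_PRIORITY };

struct grid_config {
    char            subnet[32];      /* subnet assigned to this virtual grid   */
    char          **node_hostnames;  /* compute nodes assigned by the server   */
    int             node_count;
    int             slots_per_node;  /* MPI slots (roughly, cores) per node    */
    enum job_policy policy;          /* job scheduling policy for the grid     */
    time_t          time_limit;      /* seconds; 0 if no time limit configured */
};

#endif /* GRID_CONFIG_H */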

Figure 2 shows a sequence diagram describing relationships between clients, the SCE and the configuration server.

Fig. 2 Sequence diagram showing Virtual Grid creation and assignment.

The grid is based on a powerful technology for grid implementation called MPI (Message Passing Interface). The Open MPI middleware [8], which is used worldwide, turns a cluster of associated machines into a grid. Each grid has a master node controlling program execution. This special node is implemented here by a virtual machine using the cloud technology. The cloud services are provided by integrating Eucalyptus [9] technology and Ubuntu Server. The rest of the nodes are not virtualized and are simply assigned to the corresponding virtual grid.

Once the virtual grid is assigned to a client, it is accessible through an SSH session to the master node. Programs, usually written in C, C++ or Fortran, are then compiled and run on the middleware using MPI wrappers such as mpicc and mpirun.
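As a concrete illustration, the listing below sketches a minimal MPI program in C of the kind a client could batch on such a virtual grid; the node names and slot counts in the comments are assumptions, not taken from the paper.

/* hello_grid.c - minimal MPI sketch: each process reports its rank and host.
 *
 * Compile and run on the master node with the Open MPI wrappers mentioned
 * above (node names and slot counts below are hypothetical):
 *   mpicc hello_grid.c -o hello_grid
 *   mpirun -np 8 --hostfile nodes ./hello_grid
 * where the "nodes" hostfile lists the assigned compute nodes, e.g.
 *   node01 slots=4
 *   node02 slots=4
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                    /* start the MPI runtime        */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      /* id of this process           */
    MPI_Comm_size(MPI_COMM_WORLD, &size);      /* total processes (slots used) */
    MPI_Get_processor_name(host, &name_len);   /* compute node running us      */

    printf("process %d of %d running on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}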

The configuration server stores in a database all the details of the system: the cloud and grid computing nodes available, client session details and node assignments.

3 Scientific Computing Environment for Virtual Grids

Using a scientific computing environment makes it easier for scientific groups to use HPC systems. Complex hardware and software architectures are conveniently encapsulated, as researchers from a variety of science fields are not usually concerned with such details. Therefore, users gain access through a common and uniform interface.

The SCE hides the virtual grid implementation, providing a direct link to the machine that is responsible for batching the jobs. The grid can also be easily configured using a web-based frontend and monitored using a platform called Ganglia.

Configuration tasks through the SCE allow the users of each virtual grid to resize the infrastructure to match their requirements. Aggregated resources are measured in terms of the number of slots, a typical measure in MPI systems related to the number of processor cores. Figure 3 shows the configuration interface.

Fig. 3 SCE node configuration view for CLUSTER-UVA virtual grid.

When the client connects to the master machine, a new window with an SSH terminal appears, providing full access to the grid. The client can connect to every node, but usually only needs to batch jobs on the master node. File systems are configured and connected through NFS, and the Ganglia configuration files are already set by the configuration service when the master machine is started.

Performance is measured by integrating Ganglia [10], a distributed monitoring system that uses a large variety of sensors to acquire information and send it to the management station (see Figure 4).

Fig. 4 Ganglia integrated performance view for CLUSTER-UVA virtual grid.

The Ganglia monitoring system uses daemons running on every computing node that deliver statistics to the master node. A report can be downloaded from the master node, where Ganglia is also installed. Through the SCE, the client views the statistics by accessing the master machine via the HTTP protocol.
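Besides the web view, statistics can in principle be pulled programmatically: a stock Ganglia gmond daemon dumps its XML metric tree to any client that connects to its TCP port (8649 by default). The following C sketch, which is not part of the paper's SCE, illustrates this; the master hostname is an assumption.

/* ganglia_pull.c - minimal sketch: read the XML cluster report that a Ganglia
 * gmond daemon emits when a client connects to its TCP port. The host name
 * and the default port 8649 are assumptions, not taken from the paper.
 * Build: cc ganglia_pull.c -o ganglia_pull */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
    const char *host = (argc > 1) ? argv[1] : "master.virtualgrid.local"; /* hypothetical master */
    const char *port = (argc > 2) ? argv[2] : "8649";                     /* default gmond port  */

    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, port, &hints, &res) != 0) {
        fprintf(stderr, "cannot resolve %s\n", host);
        return 1;
    }

    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
        perror("connect");
        return 1;
    }
    freeaddrinfo(res);

    /* gmond sends its full XML metric tree and then closes the connection. */
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);
    return 0;
}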

4 Conclusions

Traditional high performance computing systems based on grid computing can be improved using cloud computing infrastructures. A new infrastructure for HPC called the virtual grid has been developed, virtualizing the master node responsible for running user jobs. This virtualization process improves security, flexibility and the efficient use of resources.

Tests have shown that the delay when the cloud virtual machines start is too high. This issue does not affect grid performance, as it only occurs once, when the virtual machine starts. The use of the VMware virtualization API to deploy the master machines is being considered in order to serve the virtual grids faster.

Performance of the whole system is determined by the cluster node capacity: computing node hardware, network and MPI. The virtualization service only introduces delays when preparing the virtual grid upon client requests, so it can be measured independently. The current hardware configuration introduces delays of three to five minutes, depending on the workload.

This research has been supported by projects 10MLA-IN-S08EI-1 (Pontifical University of Salamanca) and PON323B11-2 (Junta de Castilla y León).

References

[1] European Grid Infrastructure, http://www.egi.eu

[2] Walker, E., Gardner, J.P., Litvin, V., Turner, E.L.: Creating personal adaptive clusters for managing scientific jobs in a distributed computing environment. In: Challenges of Large Applications in Distributed Environments, pp. 95–103. IEEE (2006)

[3] Lin, L., Decker, K.M., Johnson, M.J., Domain, C., Souffez, Y.: ISCN: towards a distributed scientific computing environment. In: High Performance Computing on the Information Superhighway, HPC Asia 1997, April 28-May 2, pp. 157–162 (1997)

[4] Chine, K.: Scientific Computing Environments in the age of virtualization: toward a universal platform for the Cloud. In: 2009 IEEE International Workshop on Open-source Software for Scientific Computation (OSSC), pp. 44–48 (September 2009)

[5] Li, X., Palit, H., Foo, Y.S., Hung, T.: Building an HPC-as-a-Service Toolkit for User-Interactive HPC Services in the Cloud. In: 2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications (WAINA), March 22-25, pp. 369–374 (2011)

[6] Innovative Computing Laboratory, GridSolve: A system for Grid-enabling general-purpose scientific computing environments, http://icl.cs.utk.edu/netsolve/

[7] Zhihui, D., Lingjiang, W., Haili, X., Hong, W., Xuebin, C.: A Lightweight Grid Middleware Based on OPENSSH - SCE. In: Sixth International Conference on Grid and Cooperative Computing, August 16-18, pp. 16–18 (2007)

[8] Open MPI: Open Source High Performance Computing, http://www.open-mpi.org

[9] Eucalyptus system, http://www.eucalyptus.com/products/ubuntu_enterprise_cloud

[10] Massie, M., Chun, B., Culler, D.: The Ganglia Distributed Monitoring System: Design, Implementation, and Experience. Parallel Computing 30(7) (July 2004)