the acm cloud and autonomic computing conference (cac...
TRANSCRIPT
The ACM Cloud and Autonomic Computing Conference (CAC 2013)InterContinental Hotel, Miami, Florida, USA August 5-August 9, 2013
HTCaaS: Efficient and Simplified Large-Scale Scientific Computing over Supercomputers, Grids and Cloud
The material presented in this poster is based upon the work supported by KISTI(Korea Institute of Science and Technology Information) with project code [K-13-L01-C02-S06].
Acknowledgement
Problem Statement
Soonwook Hwang, Seoyoung Kim, Jik-Soo Kim, Seungwoo Rho, Sangwan Kim, Seokkyoo KimNational Institute of Supercomputing and Networking (NISN) at KISTI, Daejeon, Republic of Korea
HTC-as-a-Service
Application Support
Interfaces & Client ToolsExploiting Computing Resources for High-Throughput Computing
• A large number of independent, sequential jobs that can be scheduled on many different computing infrastructures
• Leveraging the complexity of integrating various computing resources
• Effective management and exploitation of all available computing resources
The Number of Jobs are increasing• Growing in the number of jobs drives HTC area into Many-Task
Computing• Needs efficient job submission, processing and load balancing
END-USER SUPPORTSAutodock; Pharmaceutical & New drug Discovery
• New drug discovery by simulating protein docking• A target protein of 3CL-pro of SARS with 1.1
Million chemical compounds for protein dockingMadgraph5; High Energy Physics(HEP)
• Compute the collider processes of high-energy particles• 5-million particle collisions resulting in tens of
millions of interactions for QCD simulationsN-body Calculations; Nuclear Physics
• Calculations for simulation of a dynamical system of particles, usually under the influence ofphysical forces such as gravity
3rd PARTY DEVELOPER SUPPORT(API)ezBioNet; Bio-technology
• A modeling & simulation system for analyzing biological reaction networks
• Support connection to computations on supercomputing
• Hiding heterogeneity and complexity of integrating various computing infrastructures from users throughout pluggable resource interfaces
• Allowing users to efficiently submit a large number of jobs at once (based on the Meta-job) and monitor them
• Exploiting intelligent scheduling algorithms to support multiple users submitting large numbers of tasks to distributed computing resources
• Successfully integrated with *PLSI Supercomputers in Korea, International Computational Grids, and Amazon EC2
System Architecture• Agent-based multi-level scheduling & streamlined job dispatching• User-level job scheduling & accountingwhich can dynamically
adapt to the current system loads• Jobs& In/Outputs are managed by Job manager and User data
manager • Agents are dispatched from Agent manager and process jobs
in Supercomputers, Grids and Clouds • Loosely-coupled Service-oriented architecture with WS-Interface
*PLSI : Partnership and Leadership for the Nationwide Super- computing Infrastructure
Web portalWeb portal CLICLI
GUIGUI
• Powerful Meta-jobdescriptions which allow users to easily specify a large amount of computations (e.g., parameter sweeps)
Current status & Future workEfforts to improve HTC-as-a-Service
• Applying job profiling based allocation and provisioning model on Clouds
• Carrying out performance studies using micro-benchmark and stress test on HTCaaS
N-body
GridsPrivate Private Clouds
• Supporting many client interfaces including a native WS-interface, Java API, and easy-to-use client tools (CLI/GUI/Web portal)