PRAGMA Institute on Implementation @ PRAGMA 19
Wilfred W. Li, Ph.D., UCSD, USA
Xiaohui Wei, Ph.D., JLU, PRC
Hosted by JLU, Changchun, Jilin, PRC, Sept 13, 2010
Avian Flu Grid: Transition from Grid to Cloud Computing
Scientific Driver and Use Cases
http://www.reactome.org/
http://www.wikipedia.org
http://library.thinkquest.org/05aug/01479/prevention1.html
Harris et al, PNAS, 2006
Relaxed Complex Scheme and Ensemble based Virtual Screening Contributed to HIV Integrase Inhibitor Development
“Exploration of the structural basis for this unexpected result … suggests an approach to the development of integrase inhibitors with unique resistance profiles.”
D. Hazuda et al., Proc. Natl. Acad. Sci. USA (Aug. 2004), refers to Schames, et al. (2004).
Discovery of unexpected binding site in HIV-1 Integrase using MD and AutoDock: Schames, … & McCammon, J. Med. Chem. (released on web, early 2004)
February, 2006 – Phase III Clinical Trials
February, 2007 – Name announced: Isentress (raltegravir)
October, 2007 – FDA “fast track” approval
New Class of HIV Drugs: Merck & Co.
MK-0518
Source: A. McCammon
Ensemble-based Virtual Screening with Relaxed Complex Scheme
NAMD2, Amber
NCI Diversity Set: 3.3 MB, 2000 compounds; required at each site
ZINC subset: 200,000 compounds, a few hundred MB
Multiple targets: HA, NA subtypes
Each target: 30~50 MD snapshots, 1~2 MB each
AutoDock4
Simulation Data: hundreds of GB
Docking Data: hundreds of MB
Total data to date: ~5 TB in long term storage. Each experiment costs about 1 Petaflops of computation, cumulatively.
Source: Amaro
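The figures on this slide imply a simple workload estimate. The sketch below assumes that ensemble-based screening docks every compound against every MD snapshot of a target; that one-run-per-pair rule and the function name `docking_runs` are illustrative assumptions, not something the slide states.

```python
def docking_runs(snapshots: int, compounds: int, targets: int = 1) -> int:
    """Count docking runs, assuming one run per (target, snapshot, compound)."""
    return targets * snapshots * compounds


# NCI Diversity Set (2000 compounds) against one target,
# using the slide's range of 30 to 50 MD snapshots per target.
nci_low = docking_runs(snapshots=30, compounds=2000)
nci_high = docking_runs(snapshots=50, compounds=2000)

# ZINC subset (200,000 compounds) at the upper end of the snapshot range.
zinc_high = docking_runs(snapshots=50, compounds=200_000)

# Snapshot storage per target: 30 to 50 snapshots at 1 to 2 MB each.
snapshot_mb_low = 30 * 1
snapshot_mb_high = 50 * 2

print(nci_low, nci_high, zinc_high, snapshot_mb_low, snapshot_mb_high)
```

Under these assumptions a single target already needs tens of thousands of docking runs with the small library, which is consistent with the slide's motivation for grid and cloud resources.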
Advances in Computing Infrastructure Enable Complex Simulations of Biomolecular Systems
Amaro & Li, CTMC, 2010
New Challenges
• Virtualization – What does it mean to us?
  – Rock’n Rolls, on demand virtual machines
• Production environment – Where is it? What form should it take?
  – GPU clusters, virtual machines, cloud services
• Most work is still done on local clusters, the desire to use the grid/Cloud is there
  – It’s happening, and quite exciting
• Collaboration – How to stay in touch better, PRIME, MURPA, research in general?
Transparent access of applications on Avian Flu Grid through middleware
CNIC Duckling Portal
Konkuk Glyco-M*Grid
NBCR CADD
2 – 4 March 2010, PRAGMA 18, San Diego
Other Examples of Continued Software Development at Member Institutions
– Drugscreener-G – KISTI, Korea
– Grid Enabled Virtual Screening Service – ASGC, Taiwan
– CADD Pipeline – NBCR, USA
– WISDOM project – CNRS, EU
– Glyco-M*Grid – Kookmin & Konkuk U, Korea
Virtual Screening with CSF
• Virtual screening web services with remote clusters including TeraGrid and PRAGMA Grid resources.
Virtual cluster at SDSC
AMAZON EC2
Integrating Visualization Workflows using Real-time bioMEdical data Streaming and visualization (RIMES)
Kevin Dong, CNIC
Lau, Haga and Date
ViewDock TDW
• OPAL as resource manager of CSF4
• CSF4 allocates service instances of OPAL for jobs
New OPAL-CSF4 Cloud model
PRAGMA 19 workshop, Changchun, Jilin, China, Sep.13-15, 2010.
Parallel job scheduling in CSF4
• Two phase resource allocation in parallel job plugin
  – Construct virtual clusters according to job requirements
  – Distribute real jobs to virtual clusters
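The two phases can be illustrated with a small Python sketch. This is not the CSF4 plugin code: the `Host` and `VirtualCluster` types, the function names, and the greedy "largest host first" policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free_cpus: int


@dataclass
class VirtualCluster:
    job_id: str
    slots: dict = field(default_factory=dict)  # host name -> CPUs reserved


def build_virtual_cluster(job_id, cpus_needed, hosts):
    """Phase 1: reserve CPUs across hosts until the job's requirement is met.

    Returns None (and releases any partial reservation) if the hosts
    cannot satisfy the request.
    """
    vc = VirtualCluster(job_id)
    remaining = cpus_needed
    for host in sorted(hosts, key=lambda h: h.free_cpus, reverse=True):
        if remaining == 0:
            break
        take = min(host.free_cpus, remaining)
        if take > 0:
            vc.slots[host.name] = take
            host.free_cpus -= take
            remaining -= take
    if remaining > 0:
        # Not enough capacity: roll back the partial reservation.
        for host in hosts:
            host.free_cpus += vc.slots.get(host.name, 0)
        return None
    return vc


def dispatch(vc):
    """Phase 2: map each process of the real job to a reserved CPU slot."""
    placement = []
    for host_name, cpus in vc.slots.items():
        placement.extend([host_name] * cpus)
    return placement


hosts = [Host("a", 4), Host("b", 2), Host("c", 8)]
vc1 = build_virtual_cluster("job1", 10, hosts)
placement1 = dispatch(vc1)
vc2 = build_virtual_cluster("job2", 5, hosts)  # only 4 CPUs remain
free_after = {h.name: h.free_cpus for h in hosts}
```

Separating reservation from dispatch means a job that cannot be fully placed never starts partially, which matters for tightly coupled parallel jobs.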
Social Networks and Collaborative Environment
Social Network Site   Number of Users       Features                           API Examples
Google                170 million (Gmail)   Google Integrated Suite of Tools   Google App Engine
LinkedIn              65 million            Professional                       Huddle/Zoho Office Online
Twitter               100 million           Short MMS/SMS                      TwitPic
Google Wave           100,000 X 7?          Upload any file                    Google Wave Robot
Facebook              400 million+          Social network                     Facebook Apps
Are these too big to fail?
Utility Computing finally?
Kepler Opal Web Services Actor
Web Form for Virtual Screening Service
Cloud Computing with Amazon EC2
AutoDock Workflow
A Virtual Screening Vision Workflow
A web service
Condor pool, SGE Cluster, PBS Cluster (each labeled Globus)
Application Services
Opal GUI, PMV/Vision, Kepler
Transparent Access Layer for Applications
Grid/Cloud Resources
Vision Workflow Snippet Using Opal
• Two Major Steps
1. Run PDB2PQR web service. This step is skipped if an appropriate PQR file exists on the local machine.
2. Run PrepareReceptor web service.
  – Output is URL to PDBQT
• PDB2PQR and PrepareReceptor are skipped if an appropriate PDBQT file exists on the local machine.
  – Output is PDBQT path on local machine.
Macro that runs PDB2PQR web service.
Macro that runs PrepareReceptor web service.
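The skip logic described above amounts to checking for local files before calling a remote service. A minimal Python sketch follows, assuming that the PQR and PDBQT files sit next to the PDB file with matching base names. The callables `run_pdb2pqr` and `run_prepare_receptor` are hypothetical stand-ins for the Opal web service calls, and the `example.org` URLs are placeholders.

```python
import tempfile
from pathlib import Path


def prepare_receptor(pdb_path, run_pdb2pqr, run_prepare_receptor):
    """Return a local PDBQT path if one exists, else a URL from the service."""
    pdb = Path(pdb_path)
    pqr = pdb.with_suffix(".pqr")
    pdbqt = pdb.with_suffix(".pdbqt")

    # Both services are skipped when a PDBQT file is already on disk.
    if pdbqt.exists():
        return str(pdbqt)

    # Step 1: PDB2PQR, skipped when a local PQR file exists.
    pqr_ref = str(pqr) if pqr.exists() else run_pdb2pqr(str(pdb))

    # Step 2: PrepareReceptor, whose output is a URL to the PDBQT file.
    return run_prepare_receptor(pqr_ref)


# Demo with fake services that only record which steps ran.
calls = []


def fake_pdb2pqr(pdb):
    calls.append("PDB2PQR")
    return "http://example.org/receptor.pqr"  # placeholder URL


def fake_prepare_receptor(pqr_ref):
    calls.append("PrepareReceptor")
    return "http://example.org/receptor.pdbqt"  # placeholder URL


with tempfile.TemporaryDirectory() as tmp:
    pdb_file = Path(tmp) / "receptor.pdb"
    pdb_file.write_text("")
    first = prepare_receptor(pdb_file, fake_pdb2pqr, fake_prepare_receptor)
    calls_after_first = list(calls)

    # Simulate a PDBQT file that was downloaded earlier.
    pdb_file.with_suffix(".pdbqt").write_text("")
    second = prepare_receptor(pdb_file, fake_pdb2pqr, fake_prepare_receptor)
    second_is_local = second == str(pdb_file.with_suffix(".pdbqt"))
    calls_after_second = list(calls)
```

On the first call both services run and a URL comes back; on the second call neither service runs and the local path is returned.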
Opal 2 for SaaS
Biomedical CLOUD
Resource Manager (edu.sdsc.nbcr.opal.manager.CSFJobManager)
Service Manager
Scheduling: Workflow Job
Array Job
AutoDock NAMD
OPAL2
CSF4
User Interface
Grid Sites
MetaScheduler
generate RSL files
Grid Resources
Input/Output Files: StageIn and StageOut
VM Replication Experiment
http://goc.pragma-grid.net/wiki/index.php/VC-replication-2
SDSC VM hosting server AIST VM hosting server
AFG VM (original)   AFG VM (copy)
• VM hosting server:
  • Rocks 5.3 Xen roll
• Avian Flu Grid VM
  • Rocks VM
  • Globus/SGE
  • AutoDock
• Replication updates
  • hostname and IP
  • Compute nodes
  • Network configurations
  • Globus configuration
  • SGE configuration
NBCR VM hosting server
AFG VM (copy)
VM replication
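The replication updates listed above can be treated as a checklist of site-specific settings that every copy must override. The Python sketch below is illustrative only: the key names, the dictionary layout, and all host names and addresses are hypothetical, and it does not reflect how Rocks, Globus, or SGE actually store their configuration.

```python
# Settings the slide lists as needing an update on each replicated VM.
SITE_SPECIFIC = ("hostname", "ip", "compute_nodes", "network", "globus", "sge")


def replicate_settings(original: dict, site_overrides: dict) -> dict:
    """Copy a VM description, refusing to proceed if any site-specific
    setting has not been overridden for the destination site."""
    missing = [key for key in SITE_SPECIFIC if key not in site_overrides]
    if missing:
        raise ValueError(f"missing site-specific settings: {missing}")
    copy = dict(original)
    copy.update({key: site_overrides[key] for key in SITE_SPECIFIC})
    return copy


original_vm = {
    "hostname": "afg.site-a.example",      # hypothetical values throughout
    "ip": "192.0.2.10",
    "compute_nodes": ["compute-0-0"],
    "network": "site-a",
    "globus": "site-a gatekeeper",
    "sge": "site-a qmaster",
    "autodock": "installed",               # carried over unchanged
}

copy_vm = replicate_settings(original_vm, {
    "hostname": "afg.site-b.example",
    "ip": "198.51.100.20",
    "compute_nodes": ["vm-compute-0"],
    "network": "site-b",
    "globus": "site-b gatekeeper",
    "sge": "site-b qmaster",
})
```

Failing early on a missing override is a design choice of this sketch: a copy that still carries the original site's hostname or scheduler configuration would be hard to diagnose once it is running.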