2011 ieee international symposium on parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011...
TRANSCRIPT
Volume 2
IEEE Catalog Number: ISBN:
CFP1151J-PRT 978-1-61284-425-1
2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW 2011)
Anchorage, Alaska, USA 16-20 May 2011
Pages 1-715
1/3
2011 IEEE International Parallel &
Distributed Processing Symposium
IPDPS 2011
Table of Contents
Volume - 2
Message from the IPDPS General Chair to the Workshops..............................................................xxx
Message from the Workshops Chair.........................................................................................................xxxii
Message from the PhD Forum Chair.......................................................................................................xxxiii
Heterogeneity in Computing Workshop (HCW 2011)
HCW Introduction ..........................................................................................................................................................1
Anne Benoit
Message from the HCW Steering Committee Chair ......................................................................................................3
H.J. Siegel
Message from the HCW General Chair ..........................................................................................................................4
Anne Benoit
Message from the HCW Program Chair .........................................................................................................................5
Behrooz Shirazi
HCW Keynote ................................................................................................................................................................6
Yves Robert
Session I: Optimization and Evaluation of Heterogeneous Environments
Time Utility Functions for Modeling and Evaluating Resource Allocations in
a Heterogeneous Computing System ..............................................................................................................................7
Luis Diego Briceño, Bhavesh Khemka, Howard Jay Siegel, Anthony A. Maciejewski,
Christopher Groër, Gregory Koenig, Gene Okonski, and Steve Poole
Optimized Barriers for Heterogeneous Systems Using MPI ........................................................................................20
Jan C. Meyer and Anne C. Elster
Characterizing Task-Machine Affinity in Heterogeneous Computing Environments .................................................34
Abdulla M. Al-Qawasmeh, Anthony A. Maciejewski, Rodney G. Roberts, and Howard Jay Siegel
v
Session II: Scheduling and Resource Allocation in Heterogeneous Environments
Scheduling on Unspecified Heterogeneous Distributed Resources ..............................................................................45
Daniel Millot and Christian Parrot
MO-Greedy: An Extended Beam-Search Approach for Solving a Multi-criteria
Scheduling Problem on Heterogeneous Machines .......................................................................................................57
Louis-Claude Canon and Emmanuel
A Model-Based Schedule Representation for Heterogeneous Mapping of Dataflow
Graphs ...........................................................................................................................................................................70
Hsiang-Huang Wu, Chung-Ching Shen, Nimish Sane, William Plishker, and Shuvra S. Bhattacharyya
Session III: Heterogeneity: GPUs, Tools, and Data Management
A Waterfall Model to Achieve Energy Efficient Tasks Mapping for Large Scale GPU
Clusters .........................................................................................................................................................................82
Wenjie Liu, Zhihui Du, Yu Xiao, David A. Bader, and Chen Xu
An Agent-Based Approach to Reconciling Data Heterogeneity in Cyber-Physical
Systems .........................................................................................................................................................................93
Jing Lin, Sahra Sedigh, and Ali R. Hurson
NVCR: A Transparent Checkpoint-Restart Library for NVIDIA CUDA ..................................................................104
Akira Nukada, Hiroyuki Takizawa, and Satoshi Matsuoka
Use of Internet Embedding Tools for Heterogeneous Resources Aggregation ..........................................................114
Olivier Beaumont, Nicolas Bonichon, Philippe Duchon, and Hubert Larchevêque
Reconfigurable Architectures Workshop (RAW 2011)
RAW Introduction ......................................................................................................................................................125
Juergen Becker, Pascal Benoit, and René Cumplido
Session 1: Novel Approaches for MPSoC and Multicore Architectures
A High-Level Power Model for MPSoC on FPGA ....................................................................................................128
Roberta Piscitelli and Andy D. Pimentel
Modular Framework for Multi-level Multi-device MPSoC Simulation .....................................................................136
Christoph Roth, Gabriel Marchesan Almeida, Oliver Sander, Luciano Ost, Nicolas Hébert,
Gilles Sassatelli, Pascal Benoit, Lionel Torres, and Jürgen Becker
A Heterogeneous Multicore System on Chip with Run-Time Reconfigurable Virtual
FPGA Architecture .....................................................................................................................................................143
M. Hübner, P. Figuli, R. Girardey, D. Soudris, K. Siozios, and J. Becker
vi
Session 2: New Architectures for Reconfigurable Computing
A Scalable Microarchitecture Design that Enables Dynamic Code Execution
for Variable-Issue Clustered Processors .....................................................................................................................150
Ralf Koenig, Timo Stripf, Jan Heisswolf, and Juergen Becker
CRM: Configurable Range Memory for Fast Reconfigurable Computing ................................................................158
Jongkyung Paek, Jongeun Lee, and Kiyoung Choi
Session 3: Methods and Design Techniques for Partially Reconfigurable Architectures
Tuple Spaces in Hardware for Accelerated Implicit Routing .....................................................................................166
Zachary K. Baker and Justin L. Tripp
High Speed Partial Run-Time Reconfiguration Using Enhanced ICAP Hard Macro ................................................174
Simen Gimle Hansen, Dirk Koch, and Jim Torresen
Online Routing of FPGA Clock Networks for Module Relocation in Partial
Reconfigurable Multi Clock Designs .........................................................................................................................181
Christian Schuck, Bastian Haetzer, Michael Hübner, and Jürgen Becker
Session 4: Improving Security of Reconfigurable Systems
Securing Boot of an Embedded Linux on FPGA .......................................................................................................189
Florian Devic, Lionel Torres, and Benoît Badrignans
Hyperelliptic Curve Cryptoarchitecture for Fast Execution of Schnorr and Okamoto
Authentication Protocols ............................................................................................................................................196
Alexander Klimm, Sebastian Vogel, and Jürgen Becker
A Reconfigurable Multi-core Cryptoprocessor for Multi-channel Communication
Systems .......................................................................................................................................................................204
Michael Grand, Lilian Bossuet, Guy Gogniat, Bertrand Le Gal, Jean-Philippe Delahaye,
and Dominique Dallet
Session 5: Tools for Partially Reconfigurable FPGAs
Migrating Static Systems to Partially Reconfigurable Systems on Spartan-6 FPGAs ...............................................212
Christian Beckhoff, Dirk Koch, and Jim Torresen
ReBit: A Tool to Manage and Analyse FPGA-Based Reconfigurable Systems ........................................................220
M.D. Santambrogio, A. Cazzaniga, A. Bonetto, and Donatella Sciuto
OpenPR: An Open-Source Partial-Reconfiguration Toolkit for Xilinx FPGAs ........................................................228
Ali Asgar Sohanghpurwala, Peter Athanas, Tannous Frangieh, and Aaron Wood
vii
Session 6: Emerging Methods for Coarse Grain Reconfigurable Architectures
Occam-pi as a High-Level Language for Coarse-Grained Reconfigurable Architectures .........................................236
Zain-ul-Abdin and Bertil Svensson
Dynamic Reconfiguration for Irregular Code Using FNC-PAE Processor Cores .....................................................244
Eberhard Schüler, Martin Vorbach, Frank May, and Markus Weinhardt
Session 7: Task Management in Reconfigurable Systems
A Replacement Technique to Maximize Task Reuse in Reconfigurable Systems .....................................................250
Juan Antonio Clemente, Daniel Mozos, and Javier Resano
Integrated Temporal Planning, Module Selection and Placement of Tasks for Dynamic
Networks-on-Chip ......................................................................................................................................................258
Philipp Mahr, Steffen Christgau, Christian Haubelt, and Christophe Bobda
Enhancing Resource Utilization with Design Alternatives in Runtime Reconfigurable
Systems .......................................................................................................................................................................264
Alexander Wold, Dirk Koch, and Jim Torresen
Session 8: Improving Computing Efficiency in FPGAs Based Systems
Asymmetric Large Size Signed Multipliers Using Embedded Blocks in FPGAs ......................................................271
Shuli Gao, Dhamin Al-Khalili, and Noureddine Chabini
Just-in-Time Instruction Set Extension - Feasibility and Limitations for
an FPGA-Based Reconfigurable ASIP Architecture ..................................................................................................278
Mariusz Grad and Christian Plessl
Poster Session 1
Improving Reconfigurable Hardware Energy Efficiency and Robustness
via DVFS-Scaled Homogeneous MP-SoC .................................................................................................................286
Roberto Airoldi, Fabio Garzia, and Jari Nurmi
Compression Based Efficient and Agile Configuration Mechanism for Coarse Grained
Reconfigurable Architectures .....................................................................................................................................290
Syed. M.A.H. Jafri, Ahmed Hemani, Kolin Paul, Juha Plosila, and Hannu Tenhunen
Fast Hardware Computation of x Mod z ....................................................................................................................294
J.T. Butler and T. Sasao
Native Double Precision LINPACK Implementation on a Hybrid Reconfigurable CPU ..........................................298
Thang Viet Huynh, Manfred Mücke, and Wilfried N. Gansterer
FeatureVerilog: Extending Verilog to Support Feature-Oriented Programming .......................................................302
Jun Ye, Qingping Tan, Tun Li, and Guorong Cao
viii
An FPGA-Based Accelerator to Speed-Up Matrix Multiplication of Floating Point
Operations ...................................................................................................................................................................306
B. Holanda, R. Pimentel, J. Barbosa, R. Camarotti, A. Silva-Filho, L. João, V. Souza, J. Ferraz,
and M. Lima
Poster Session 2
The SecretBlaze: A Configurable and Cost-Effective Open-Source Soft-Core
Processor .....................................................................................................................................................................310
Lyonel Barthe, Luís Vitório Cargnini, Pascal Benoit, and Lionel Torres
Bitonic Sorting on Dynamically Reconfigurable Architectures .................................................................................314
J. Angermeier, E. Sibirko, R. Wanka, and J. Teich
Distributed Resource Reservation in Massively Parallel Processor Arrays ...............................................................318
Vahid Lari, Frank Hannig, and Jürgen Teich
Reconfigurable Instruction Decoding for a Wide-Control-Word Processor ..............................................................322
Alen Bardizbanyan, Magnus Själander, and Per Larsson-Edefors
Distributed Security for Communications and Memories in a Multiprocessor
Architecture ................................................................................................................................................................326
Pascal Cotret, Jérémie Crenne, Guy Gogniat, Jean-Philippe Diguet, Lubos Gaspar,
and Guillaume Duc
On a Hybrid and General Control Scheme for Algorithms Represented as a Polytope .............................................330
Roberto Pérez-Andrade, César Torres-Huitzil, René Cumplido, and Juan M. Campos
A Multi-level Reconfigurable Architecture for a Wireless Sensor Node Coprocessing
Unit .............................................................................................................................................................................334
François Philipp and Manfred Glesner
Workshop on Nature Inspired Distributed Computing (NIDISC 2011)
NIDISC Introduction ..................................................................................................................................................338
Pascal Bouvry, Francisek Seredynski, and El-Ghazali Talbi
Session 1: Swarm Intelligence
Parallelization Strategies for Ant Colony Optimisation on GPUs ..............................................................................339
José M. Cecilia, José M. García, Manuel Ujaldón, Andy Nisbet, and Martyn Amos
Multiple Particle Collision Algorithm Applied to Radiative Transference and Pollutant
Localization Inverse Problems ...................................................................................................................................347
Eduardo Fávero Pacheco da Luz, José Carlos Becceneri, and Haroldo Fraga de Campos Velho
A Shared-Memory ACO-Based Algorithm for Numerical Optimization ..................................................................352
Peter Korošec, Jurij Šilc, Marian Vajteršic, and Rade Kutil
Ant Colony Optimization Applied to Route Planning Using Link Travel Time
Predictions ..................................................................................................................................................................358
Rutger Claes and Tom Holvoet
ix
Session 2: Modern Metaheuristics and Real-world Applications
Nature-Inspired Evaluation of Data Classes for Trust Management in MANETs .....................................................366
Marcin Seredynski and Pascal Bouvry
Use of Meta-Heuristics for Design of Fuel Loading Pattern in Light Water Reactors
Comprising Some Radial and Axial Heterogeneities .................................................................................................374
Jean-Michel Do, Jean-Jacques Lautard, Anne-Marie Baudron, Siegfried Douce, and Gilles Arnaud
A Genetic Algorithm with a Penalty Function in the Selective Travelling Salesman
Problem on a Road Network ......................................................................................................................................381
Anna Piwonska and Franciszek Seredynski
Adaptive Neighborhoods for Cellular Genetic Algorithms ........................................................................................388
Bernabé Dorronsoro and Pascal Bouvry
Session 3: Parallel Metaheuristics and Large-scale Problems
Multi-environmental Cooperative Parallel Metaheuristics for Solving Dynamic
Optimization Problems ...............................................................................................................................................395
Mostepha Redouane Khouadjia, Briseida Sarasola, Enrique Alba, Laetitia Jourdan,
and El-Ghazali Talbi
An Energy-Efficient Kernel Framework for Large-Scale Data Modeling
and Classification .......................................................................................................................................................404
Paul D. Yoo, Jason W.P. Ng, and Albert Y. Zomaya
On the Resilience of [Distributed] EAs against Cheaters in Global Computing
Platforms .....................................................................................................................................................................409
Sebastien Varrette, Emilia Tantar, and Pascal Bouvry
Hybrid MPI/OpenMP Strategy for Biological Multiple Sequence Alignment
with DIALIGN-TX in Heterogeneous Multicore Clusters .........................................................................................418
Emerson de Araujo Macedo and Azzedine Boukerche
Workshop on High Performance Computational Biology (HiCOMB 2011)
HiCOMB Introduction ................................................................................................................................................426
Ananth Kalyanaraman, Srinivas Aluru, and David A. Bader
Session I: Genome-scale Analysis
An Ultrafast Scalable Many-Core Motif Discovery Algorithm for Multiple GPUs ..................................................428
Yongchao Liu, Bertil Schmidt, and Douglas L. Maskell
Parallel Mapping Approaches for GNUMAP ............................................................................................................435
Nathan L. Clement, Mark J. Clement, Quinn Snell, and W. Evan Johnson
Computing the Phylogenetic Likelihood Function Out-of-Core ................................................................................444
Fernando Izquierdo-Carrasco and Alexandros Stamatakis
x
Session II: Modeling and Structure Prediction
Algorithms for a Parallel Implementation of Hidden Markov Models with a Small
State Space ..................................................................................................................................................................452
Jesper Nielsen and Andreas Sand
Efficient Nonserial Polyadic Dynamic Programming on the Cell Processor .............................................................460
Li Liu, Mu Wang, Jinlei Jiang, Ruizhe Li, and Guangwen Yang
FENZI: GPU-Enabled Molecular Dynamics Simulations of Large Membrane Regions
Based on the CHARMM Force Field and PME .........................................................................................................472
Narayan Ganesan, Michela Taufer, Brad Bauer, and Sandeep Patel
Session III: Sequence Analysis
Parallelizing BLAST and SOM Algorithms with MapReduce-MPI Library .............................................................481
Seung-Jin Sul and Andrey Tovchigrechko
Improving CUDASW++, a Parallelization of Smith-Waterman for CUDA Enabled
Devices .......................................................................................................................................................................490
Doug Hains, Zach Cashero, Mark Ottenberg, Wim Bohm, and Sanjay Rajopadhye
Session IV: Systems Biology and Functional Genomics
Divide-and-Conquer Approach to the Parallel Computation of Elementary Flux Modes
in Metabolic Networks ...............................................................................................................................................502
Dimitrije Jevremovic, Daniel Boley, and Carlos P. Sosa
Sensitive and Specific Identification of Protein Complexes in “Perturbed” Protein
Interaction Networks from Noisy Pull-Down Data ....................................................................................................512
William Hendrix, Tatiana Karpinets, Byung-Hoon Park, Eric Schendel, Alok Choudhary,
and Nagiza F. Samatova
An Efficient and Scalable Implementation of SNP-Pair Interaction Testing for Genetic
Association Studies .....................................................................................................................................................523
Lars Koesterke, Dan Stanzione, Matt Vaughn, Stephen M. Welch, Waclaw Kusnierczyk,
Jinliang Yang, Cheng-Ting Yeh, Dan Nettleton, and Patrick S. Schnable
Advances in Parallel and Distributed Computing Models (APDCM 2011)
APDCM Introduction .................................................................................................................................................531
Oscar H. Ibarra
Session 1
CRT-Based DSP Decryption Using Montgomery Modular Multiplication on the FPGA .........................................532
Bo Song, Yasuaki Ito, and Koji Nakano
GALS-Based LPSP: Implementation of a Novel Architecture for Low Power High
Performance Security Processors ...............................................................................................................................542
Hala A. Farouk, Mahmoud T. El-Hadidi, and Ahmed Abou El Farag
xi
Using Memory Access Traces to Map Threads and Data on Hierarchical Multi-core
Platforms .....................................................................................................................................................................551
Eduardo Henrique Molina da Cruz, Marco Antonio Zanata Alves, Alexandre Carissimi,
Philippe Olivier Alexandre Navaux, Christiane Pousa Ribeiro, and Jean-François Méhaut
On the Performance of Window-Based Contention Managers for Transactional
Memory ......................................................................................................................................................................559
Gokarna Sharma and Costas Busch
Session 2
Performance Acceleration of Kernel Polynomial Method Applying Graphics
Processing Units .........................................................................................................................................................569
Shixun Zhang, Shinichi Yamagiwa, Masahiko Okumura, and Seiji Yunoki
Efficient Work-Stealing Strategies for Fine-Grain Task Parallelism .........................................................................577
Adnan and Mitsuhisa Sato
A Model for Coherent Distributed Memory for Race Condition Detection ...............................................................584
Franck Butelle and Camille Coti
Exploring Weak Dependencies in DAG Scheduling ..................................................................................................591
Nam Ma, Yinglong Xia, and Viktor K. Prasanna
Session 3
Heuristics and Evaluations of Energy-Aware Task Mapping on Heterogeneous
Multiprocessors ..........................................................................................................................................................599
Wei Sun and Tomoyoshi Sugawara
Reducing Download Times in Peer-to-Peer File Sharing Systems with Stochastic
Service Capacities .......................................................................................................................................................608
Keqin Li
Self-Stabilizing Master-Slave Token Circulation and Efficient Topology Computation
in a Tree of Arbitrary Size ..........................................................................................................................................618
Wayne Goddard and Pradip K. Srimani
Dynamic Monopolies in Colored Tori ........................................................................................................................626
Sara Brunetti, Elena Lodi, and Walter Quattrociocchi
A Matching Based Automata for Distributed Graph Algorithms ...............................................................................632
J. Paul Daigle and Sushil K. Prasad
Session 4
Execution and Time Models for Pervasive Sensor Networks ....................................................................................639
Ajay D. Kshemkalyani, Ashfaq A. Khokhar, and Min Shen
A Packet Aggregation Mechanism for Real Time Applications over Wireless
Networks .....................................................................................................................................................................648
Paulo H. Azevêdo, Marcos F. Caetano, and Jacir L. Bordim
xii
Alphabets: An Extended Polyhedral Equational Language ........................................................................................656
S. Rajopadhye, S. Gupta, and D.-G. Kim
Session 5
From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling ...............................................665
Sascha Hunold, Henri Casanova, and Frédéric Suter
Distributed Construction of Nested Clusters with Inter-cluster Routing ....................................................................673
Alain Bui, Simon Clavière, and Devan Sohier
Towards an Effective Unified Programming Model for Many-Cores ........................................................................681
Ana Lucia Varbanescu, Pieter Hijma, Rob van Nieuwpoort, and Henri Bal
Analysis of a Memory Bandwidth Limited Scenario for NUMA and GPU Systems ................................................693
Jens Breitbart
The Impact of Non-coherent Buffers on Lazy Hardware Transactional Memory
Systems .......................................................................................................................................................................700
Anurag Negi, Rubén Titos-Gil, Manuel E. Acacio, José M. García, and Per Stenstrom
Elimination Techniques of Redundant Data Transfers Among GPUs and CPU
on Recursive Stream-Based Applications ..................................................................................................................708
Pablo Lamilla Álvarez, Shinichi Yamagiwa, Masahiro Arai, and Koichi Wada
Communication Architecture for Scalable Systems (CASS 2011)
CASS Introduction .....................................................................................................................................................716
José Flich, Scott Pakin, and Craig Stunkel
Session I
Improving Performance of Deterministic Single-Path Routing on 2-Level Generalized
Fat-Trees .....................................................................................................................................................................718
Wickus Nienaber, Santosh Mahapatra, and Xin Yuan
A Unified Algorithm for Both Randomized Deterministic and Adaptive Routing
in Torus Networks ......................................................................................................................................................723
Keith D. Underwood and Eric Borch
Session II
Profile Guided MPI Protocol Selection for Point-to-Point Communication Calls .....................................................733
Aniruddha Marathe, David K. Lowenthal, Zheng Gu, Matthew Small, and Xin Yuan
Dynamic Time-Variant Connection Management for PGAS Models on InfiniBand ................................................740
Abhinav Vishnu, Manoj Krishnan, and Pavan Balaji
Audit: New Synchronization for the GET/PUT Protocol ...........................................................................................747
Atsushi Hori, Jinpil Lee, and Mitsuhisa Sato
xiii
Session III
Efficient and Contention-Free Virtualisation of Fat-Trees .........................................................................................754
Frank Olaf Sem-Jacobsen, Åshild Grønstad Solheim, Olav Lysne, Tor Skeie, and Thomas Sødring
Fat-Trees Routing and Node Ordering Providing Contention Free Traffic for MPI
Global Collectives ......................................................................................................................................................761
Eitan Zahavi
Session IV
Optimizing MPI Collectives Using Efficient Intra-node Communication Techniques
over the Blue Gene/P Supercomputer ........................................................................................................................771
Amith R. Mamidala, Daniel Faraj, Sameer Kumar, Douglas Miller, Michael Blocksome,
Thomas Gooding, Philip Heidelberger, and Gabor Dozsa
ConnectX-2 CORE-Direct Enabled Asynchronous Broadcast Collective
Communications .........................................................................................................................................................781
Manjunath Gorentla Venkata, Richard L. Graham, Joshua S. Ladd, Pavel Shamis,
Ishai Rabinovitz, Vasily Filipov, and Gilad Shainer
Dodging Non-uniform I/O Access in Hierarchical Collective Operations for Multicore
Clusters .......................................................................................................................................................................788
Brice Goglin and Stéphanie Moreaud
High-Performance, Power-Aware Computing (HPPAC 2011)
HPPAC Introduction ...................................................................................................................................................795
Rong Ge and Roberto Gioiosa
Session 1: Power Aware Scheduling
Temperature Aware Load Balancing for Parallel Applications: Preliminary Work ..................................................796
Osman Sarood, Abhishek Gupta, and Laxmikant V. Kalé
Design and Analysis of Heuristic Algorithms for Power-Aware Scheduling
of Precedence Constrained Tasks ...............................................................................................................................804
Keqin Li
Rack Aware Scheduling in HPC Data Centers: An Energy Conservation Strategy ...................................................814
Vikas Ashok Patil and Vipin Chaudhary
Session 2: Applications and Trends
Emerging Trends on the Evolving Green500: Year Three .........................................................................................822
Tom Scogland, Balaji Subramaniam, and Wu-chun Feng
Power Consumption of Mixed Precision in the Iterative Solution of Sparse Linear
Systems .......................................................................................................................................................................829
Hartwig Anzt, Vincent Heuveline, Björn Rocker, Maribel Castillo, Juan C. Fernández,
Rafael Mayo, and Enrique S. Quintana-Ortí
xiv
Dynamic Frequency Scaling and Energy Saving in Quantum Chemistry Applications ............................................837
Vaibhav Sundriyal, Masha Sosonkina, Fang Liu, and Michael W. Schmidt
Evaluation of the Energy Performance of Dense Linear Algebra Kernels on Multi-core
and Many-Core Processors .........................................................................................................................................846
Maribel Castillo, Manel Dolz, Juan C. Fernández, Rafael Mayo, Enrique S. Quintana-Ortí,
and Vicente Roca
Session 3: Low Power Hardware Components
LAPP: A Low Power Array Accelerator with Binary Compatibility .........................................................................854
Naveen Devisetti, Takuya Iwakami, Kazuhiro Yoshimura, Takashi Nakada, Jun Yao,
and Yasuhiko Nakashima
Performance, Power, and Thermal Analysis of Low-Power Processors for Scale-Out
Systems .......................................................................................................................................................................863
Phillip Stanley-Marbell and Victoria Caparrós Cabezas
PEARL and PEACH: A Novel PCI Express Direct Link and Its Implementation ....................................................871
Toshihiro Hanawa, Taisuke Boku, Shin’ichi Miura, Mitsuhisa Sato, and Kazutami Arimoto
High-Performance Grid and Cloud Computing Workshop (HPGC 2011)
HPGC Introduction .....................................................................................................................................................880
Eric Aubanel, Virendra C. Bhavsar, and Michael Alex Frumkin
Cloud Management: Challenges and Opportunities ...................................................................................................881
Tim Forell, Dejan Milojicic, and Vanish Talwar
Session 1: Grid/Cloud Infrastructure I
Energy-Aware Application-Centric VM Allocation for HPC Workloads .................................................................890
H. Viswanathan, E.K. Lee, I. Rodero, D. Pompili, M. Parashar, and M. Gamell
ENIGMA: Distributed Virtual Disks for Cloud Computing ......................................................................................898
Matteo Zola, Valerio Bioglio, Cosimo Anglano, Rossano Gaeta, Marco Grangetto,
and Matteo Sereno
Managed GridFTP ......................................................................................................................................................907
John Bresnahan, Michael Link, Rajkumar Kettimuthu, and Ian Foster
Session 2: Grid/Cloud Infrastructure II
RBAC for High Performance Computing Systems Integration in Grid Computing
and Cloud Computing .................................................................................................................................................914
Anil L. Pereira
e-Science Infrastructure Integration Invariants to Enable HTC and HPC
Interoperability Applications ......................................................................................................................................922
M. Riedel, M.S. Memon, A.S. Memon, D. Mallmann, Th. Lippert, D. Kranzlmüller, and A. Streit
xv
Cyberinfrastructure Usage Modalities on the TeraGrid .............................................................................................932
Daniel S. Katz, David Hart, Chris Jordan, Amit Majumdar, J.P. Navarro, Warren Smith,
John Towns, Von Welch, and Nancy Wilkins-Diehr
Session 3: Scheduling
A Framework for Elastic Execution of Existing MPI Programs ................................................................................940
Aarthi Raveendran, Tekin Bicer, and Gagan Agrawal
Optimal Load Distribution for Multiple Heterogeneous Blade Servers in a Cloud
Computing Environment ............................................................................................................................................948
Keqin Li
Using Hierarchical Dependency Data Flows to Enable Dynamic Scalability on Parallel
Patterns .......................................................................................................................................................................958
Jeremy Villalobos and Barry Wilkinson
Hybrid Heuristic for Scheduling Data Analytics Workflow Applications in Hybrid
Cloud Environment .....................................................................................................................................................966
Mustafizur Rahman, Xiaorong Li, and Henry Palit
Workshop on System Management Techniques, Processes, and Services (SMTPS 2011)
SMTPS Introduction ...................................................................................................................................................975
Kyung Dong Ryu
Session 1: Computing Platform Management
Parallax - A New Operating System for Scalable, Distributed, and Parallel Computing ..........................................976
Rao Mikkilineni and Ian Seyler
Reducing Shared Cache Contention by Scheduling Order Adjustment on Commodity
Multi-cores ..................................................................................................................................................................984
Yingxin Wang, Yan Cui, Pin Tao, Haining Fan, Yu Chen, and Yuanchun Shi
Session 2: Workload Management
Evaluating Load Generation in Virtualized Environments for Software Performance
Testing ........................................................................................................................................................................993
Marco A.S. Netto, Suzane Menon, Hugo V. Vieira, Leandro T. Costa, Flavio M. de Oliveira,
Rodrigo Saad, and Avelino Zorzo
New Metrics for Scheduling Jobs on Cluster of Virtual Machines ..........................................................................1001
Yanbin Liu, Norman Bobroff, Liana Fong, Seetharami Seelam, and Javier Delgado
Secondary Job Scheduling in the Cloud with Deadlines ..........................................................................................1009
Shiyao Chen, Ting He, Ho Yin Starsky Wong, Kang-Won Lee, and Lang Tong
xvi
Session 3: Communication and Security
Characterization of the Communication Patterns of Scientific Applications on Blue
Gene/P ......................................................................................................................................................................1017
Pier Giorgio Raponi, Fabrizio Petrini, Robert Walkup, and Fabio Checconi
Privacy Protection in Service Discovery for Large-Scale Distributed Computing
Systems .....................................................................................................................................................................1025
Jun Yeol Choi, Zhong Yuan Li, Hee Yong Youn, and Ohyoung Song
International Workshop on Data-Intensive Computing in the Clouds (DataCloud 2011)
DataCloud Introduction ............................................................................................................................................1033
Tevfik Kosar and Ioan Raicu
Session 1: Shared I/O
The Gfarm File System on Compute Clouds ............................................................................................................1034
Kenji Kobayashi, Shunsuke Mikami, Hiroki Kimura, and Osamu Tatebe
Evaluating Adaptive Compression to Mitigate the Effects of Shared I/O in Clouds ...............................................1042
Matthias Hovestadt, Odej Kao, Andreas Kliem, and Daniel Warneke
Session II: Workflows
Towards Intelligent Data Placement for Scientific Workflows in Collaborative Cloud
Environment .............................................................................................................................................................1052
Xin Liu and Anwitaman Datta
Designing the Cloud-Based DOE Systems Biology Knowledgebase ......................................................................1062
Carina Lansing, Yan Liu, Jian Yin, Abigail Corrigan, Zoe Guillen, Kerstin Kleese van Dam,
and Ian Gorton
XML Processing in the Cloud: Large-Scale Digital Preservation in Small Institutions ..........................................1072
Peter Wittek, Thierry Jacquin, Hervé Déjean, Jean-Pierre Chanod, and Sándor Darányi
Session III: Transaction Processing
Transactions a la carte - Implementation and Performance Evaluation of Transactional
Support on Top of Amazon S3 .................................................................................................................................1082
Francis Gropengießer and Kai-Uwe Sattler
Performance Evaluation of Range Queries in Key Value Stores .............................................................................1092
Pouria Pirzadeh, Junichi Tatemura, and Hakan Hacigumus
Towards Scalable One-Pass Analytics Using MapReduce ......................................................................................1102
Edward Mazur, Boduo Li, Yanlei Diao, and Prashant Shenoy
xvii
Session IV: MapReduce
iMapReduce: A Distributed Computing Framework for Iterative Computation ......................................................1112
Yanfeng Zhang, Qinxin Gao, Lixin Gao, and Cuirong Wang
An Adaptive Framework for the Execution of Data-Intensive MapReduce Applications
in the Cloud ..............................................................................................................................................................1122
Martin Koehler, Yuriy Kaniovskyi, and Siegfried Benkner
CloudClustering: Toward an Iterative Data Processing Pattern on the Cloud .........................................................1132
Ankur Dave, Wei Lu, Jared Jackson, and Roger Barga
NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-11)
NSF/TCPP Introduction ...........................................................................................................................................1138
Sushil K. Prasad
Workshop on High-Level Parallel Programming Models and Supportive Environments(HIPS 2011)
HIPS Introduction .....................................................................................................................................................1139
Torsten Hoefler
HIPS Keynote ...........................................................................................................................................................1141
John Mellor-Crummey
Message-Driven and Task-Based Environment
Using Shared Arrays in Message-Driven Parallel Programs ....................................................................................1142
Phil Miller, Aaron Becker, and Laxmikant Kalé
DAGuE: A Generic Distributed DAG Engine for High Performance Computing ..................................................1151
George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Pierre Lemarinier,
and Jack Dongarra
The Essence of Synchronisation in Asynchronous Data Flow .................................................................................1159
Clemens Grelck
GPU Computing and Compilation
Auto-generation of Parallel Finite-Differencing Code for MPI, TBB and CUDA ..................................................1168
D.P. Playne and K.A. Hawick
SkelCL - A Portable Skeleton Library for High-Level GPU Programming ............................................................1176
Michel Steuwer, Philipp Kegel, and Sergei Gorlatch
Harnessing the Power of GPUs without Losing Abstractions in SAC and ArrayOL: A
Comparative Study ...................................................................................................................................................1183
Jing Guo, Wendell Rodrigues, Jeyarajan Thiyagalingam, Frédéric Guyomarc’h, Pierre Boulet,
and Sven-Bodo Scholz
xviii
PGAS and MPI
Address Translation Optimization for Unified Parallel C Multi-dimensional Arrays .............................................1191
Olivier Serres, Ahmad Anbar, Saumil G. Merchant, Abdullah Kayi, and Tarek El-Ghazawi
A (Radical) Proposal Addressing the Non-scalability of the Irregular MPI Collective
Interfaces ..................................................................................................................................................................1199
Jesper Larsson Träff
Preserving Collective Performance across Process Failure for a Fault Tolerant MPI .............................................1208
Joshua Hursey and Richard L. Graham
Exploiting Hierarchical Parallelism Using UPC ......................................................................................................1216
Lingyuan Wang, Saumil Merchant, and Tarek El-Ghazawi
High-Level Programming
Parallel Object Contracts for High Performance Computing ...................................................................................1225
Kostadin Damevski and Tamara Dahlgren
Automatically Inserting Synchronization Statements in Divide-and-Conquer Programs ........................................1233
Pieter Hijma, Rob V. van Nieuwpoort, Ceriel J.H. Jacobs, and Henri E. Bal
Translating Chapel to Use FREERIDE: A Case Study in Using an HPC Language
for Data-Intensive Computing ..................................................................................................................................1242
Bin Ren, Gagan Agrawal, Brad Chamberlain, and Steve Deitz
Large Volume Testing for Executable Formal Specification Using Hadoop ...........................................................1250
Shigeru Kusakabe
Performance Analysis and Debugging
Towards a Methodology for Deliberate Sample-Based Statistical Performance
Analysis ....................................................................................................................................................................1258
Geoff Stoker and Jeffrey K. Hollingsworth
Monitoring and Debugging DryadLINQ Applications with Daphne .......................................................................1266
Vilas Jagannath, Zuoning Yin, and Mihai Budiu
Data Centric Techniques for Mapping Performance Measurements ........................................................................1274
Nick Rutar and Jeffrey K. Hollingsworth
Many- and Multicore Architectures and Optimizations
Toolchain for Programming, Simulating and Studying the XMT Many-Core
Architecture ..............................................................................................................................................................1282
Fuat Keceli, Alexandros Tzannes, George C. Caragea, Rajeev Barua, and Uzi Vishkin
Explicit Platform Descriptions for Heterogeneous Many-Core Architectures .........................................................1292
Martin Sandrieser, Siegfried Benkner, and Sabri Pllana
xix
An Optimized Reduction Design to Minimize Atomic Operations in Shared Memory
Multiprocessors ........................................................................................................................................................1300
Ettore Speziale, Andrea di Biagio, and Giovanni Agosta
Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC2011)
PDSEC Introduction .................................................................................................................................................1310
Peter Strazdins, Thomas Rauber, Gudula Rünger, John O’Donnell, and Laurence T. Yang
PDSEC Keynote .......................................................................................................................................................1312
Robert L. Clay
Session 1: Applications
A Graph Model for Minimizing the Storage Overhead of Distributing Data for
the Parallel Solution of Two-Phase Flows ...............................................................................................................1313
Oliver Fortmeier, Alin A. Bastea, and H. Martin Bücker
Profiling Methodology and Performance Tuning of the Met Office Unified Model
for Weather and Climate Simulations .......................................................................................................................1322
Peter E. Strazdins, Margaret Kahn, Joerg Henrichs, Tim Pugh, and Mike Rezny
Dynamic Adaptations in ab-initio Nuclear Physics Calculations on Multicore
Computer Architectures ............................................................................................................................................1332
Avinash Srinivasa, Masha Sosonkina, Pieter Maris, and James P. Vary
Session 2: Applications on GPUs
Efficiently Computing Tensor Eigenvalues on a GPU .............................................................................................1340
Grey Ballard, Tamara Kolda, and Todd Plantenga
Evaluation of Likelihood Functions for Data Analysis on Graphics Processing Units ............................................1349
Sverre Jarp, Alfio Lazzaro, Julien Leduc, Andrzej Nowak, and Felice Pantaleo
Deployment on GPUs of an Application in Computational Atomic Physics ...........................................................1359
P. Fortin, R. Habel, F. Jézéquel, J.L. Lamotte, and N.S. Scott
Parallel Automatic Registration of Large Scale Microscopic Images on Multiprocessor
CPUs and GPUs ........................................................................................................................................................1367
Lee Cooper, Kun Huang, and Manuel Ujaldon
Session 3: Performance Analysis and GPUs
Comprehensive Performance Monitoring for GPU Cluster Systems .......................................................................1377
Karl Fürlinger, Nicholas J. Wright, and David Skinner
GPU Accelerating for Rapid Multi-core Cache Simulation .....................................................................................1387
Wan Han, Long Xiang, Gao Xiaopeng, and Li Yi
xx
Session 4: Multicore and Manycore Processing
A Fast Heuristic for Scheduling Parallel Software with Respect to Energy and Timing
Constraints ................................................................................................................................................................1397
Margarete Sackmann, Peter Ebraert, and Dirk Janssens
Self-Adaptive Evidence Propagation on Manycore Processors ...............................................................................1407
Yinglong Xia and Viktor K. Prasanna
Speculative Contention Avoidance in Software Transactional Memory .................................................................1417
Ehsan Atoofian
Session 5: Algorithms
Hyperspectral Data Processing in a High Performance Computing Environment: A
Parallel Best Band Selection Algorithm ...................................................................................................................1424
Stefan A. Robila and Gerald Busardo
Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel
Architectures with DPLASMA ................................................................................................................................1432
George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar,
Thomas Herault, Jakub Kurzak, Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek,
Asim YarKhan, and Jack Dongarra
Exploiting Multi-core Architectures in Clusters for Enhancing the Performance of
the Parallel Bootstrap Simulation Algorithm ...........................................................................................................1442
César A.F. De Rose, Paulo Fernandes, Antonio M. Lima, Afonso Sales, and Thais Webber
Session 6: Distributed Computing and Middleware
An Efficient Search Algorithm without Memory for Peer-to-Peer Cloud Computing
Networks ...................................................................................................................................................................1452
Naixue Xiong, Yuhua Liu, Shishun Wu, Laurence T. Yang, and Kaihua Xu
Asynchronous Peer-to-peer Distributed Computing for Financial Applications .....................................................1458
Thierry Garcia, Ming Chau, The Tung Nguyen, Didier El-Baz, and Pierre Spiteri
The Large Scale Data Facility: Data Intensive Computing for Scientific Experiments ...........................................1467
Ariel O. García, Serguei Bourov, Ahmad Hammad, Jos van Wezel, Bernhard Neumair,
Achim Streit, Volker Hartmann, Thomas Jejkal, Patrick Neuberger, and Rainer Stotzka
Design and Performance of a Scalable, Parallel Statistics Toolkit ...........................................................................1475
Philippe Pébay, David Thompson, Janine Bennett, and Ajith Mascarenhas
Dependable Parallel, Distributed and Network-Centric Systems (DPDNS 2011)
DPDNS Introduction ................................................................................................................................................1485
Dimiter Avresky and Erik Maehle
DPDNS Keynote .......................................................................................................................................................1487
Franck Cappello
xxi
Network Algorithms
Solving k-Set Agreement with Stable Skeleton Graphs ...........................................................................................1488
Martin Biely, Peter Robinson, and Ulrich Schmid
Compact Route Computation: Improving Parallel BGP Route Processing for Scalable
Routers ......................................................................................................................................................................1496
Xuezhi Jiang, Mingwei Xu, and Qi Li
Towards Persistent Connections Using Failure Detectors ........................................................................................1502
Naohiro Hayashibara
Cloud Computing
A Monitoring and Audit Logging Architecture for Data Location Compliance
in Federated Cloud Infrastructures ...........................................................................................................................1510
Philippe Massonet, Syed Naqvi, Christophe Ponsard, Joseph Latanicki, Benny Rochwerger,
and Massimo Villari
Dependable Autonomic Cloud Computing with Information Proxies .....................................................................1518
D. Cenk Erdil
A Fault-Tolerant High Performance Cloud Strategy for Scientific Computing .......................................................1525
Ekpe Okorafor
High Performance Computing
Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC
Systems .....................................................................................................................................................................1533
Esteban Meneses, Greg Bronevetsky, and Laxmikant V. Kalé
Algorithm-Based Recovery for Newton’s Method without Checkpointing .............................................................1541
Hui Liu, Teresa Davies, Chong Ding, Christer Karlsson, and Zizhong Chen
Building a Fault Tolerant MPI Application: A Ring Communication Example ......................................................1549
Joshua Hursey and Richard L. Graham
Failure Analysis
Predicting Node Failure in High Performance Computing Systems from Failure
and Usage Logs ........................................................................................................................................................1557
Nithin Nakka, Ankit Agrawal, and Alok Choudhary
Achieving Target MTTF by Duplicating Reliability-Critical Components in High
Performance Computing Systems ............................................................................................................................1567
Nithin Nakka, Alok Choudhary, Gary Grider, John Bent, James Nunez, and Satsangat Khalsa
xxii
International Workshop on Hot Topics in Peer-to-Peer Systems (HOTP2P 2011)
HOTP2P Introduction ...............................................................................................................................................1577
Jiannong Cao
P2P Applications
Controlling P2P Applications via Address Harvesting: The Skype Story ...............................................................1579
Anat Bremler-Barr, Omer Dekel, Ran Goldschmidt, and Hanoch Levy
SeederTrading: Trading Swarm Capacity for Improving Content Distribution .......................................................1587
HyunYong Lee, Masahiro Yoshida, and Akihiro Nakao
Virtual Direction Multicast for Overlay Networks ...................................................................................................1595
Suat Mercan and Murat Yuksel
P2P Optimization I
Parallel Processing Framework on a P2P System Using Map and Reduce Primitives ............................................1602
Kyungyong Lee, Tae Woong Choi, Arijit Ganguly, David I. Wolinsky, P. Oscar Boykin,
and Renato Figueiredo
SPUN: A P2P Probabilistic Search Algorithm Based on Successful Paths
in Unstructured Networks .........................................................................................................................................1610
D.M. Rasanjalee Himali and Sushil K. Prasad
P2P Optimization II
Performance Prediction in a Decentralized Environment for Peer-to-Peer Computing ...........................................1618
Bogdan Florin Cornea, Julien Bourgeois, The Tung Nguyen, and Didier El-Baz
Betweenness Centrality Approximations for an Internet Deployed P2P Reputation
System ......................................................................................................................................................................1627
Dimitra Gkorou, Johan Pouwelse, and Dick Epema
When KAD Meets BitTorrent - Building a Stronger P2P Network .........................................................................1635
Juan Pablo Timpanaro, Thibault Cholez, Isabelle Chrisment, and Olivier Festor
P2P Security
An Adaptive Response Routing Mechanism to Improve DHT Performance in
the Presence of NATs ...............................................................................................................................................1643
David A. Bryan
A New Protocol to Determine the NAT Characteristics of a Host ...........................................................................1651
Sebastian Holzapfel, Matthäus Wander, Arno Wacker, Lorenz Schwittmann, and Torben Weis
xxiii
Workshop on Multi-Threaded Architectures and Applications (MTAAP 2011)
MTAAP Introduction ...............................................................................................................................................1659
Luiz DeRose and Jeffrey Vetter
Programming Models
Using Hardware Transactional Memory for High-Performance Computing ...........................................................1660
Karl Ljungkvist, Martin Tillenius, David Black-Schaffer, Sverker Holmgren, Martin Karlsson,
and Elisabeth Larsson
A Micro Threading Based Concurrency Model for Parallel Computing .................................................................1668
Qiang Yang, C.R. Jesshope, and Jian Fu
Comparison of Parallel Programming Models for Multicore Architectures ............................................................1675
Deepak Shekhar T.C., Kiran Varaganti, Rahul Suresh, Rahul Garg, and Ramalingam Ramamoorthy
Frameworks
An Extended Work-Stealing Framework for Mixed-Mode Parallel Applications ...................................................1683
Martin Wimmer and Jesper Larsson Träff
Tracking Structure of Streaming Social Networks ...................................................................................................1691
David Ediger, Jason Riedy, David A. Bader, and Henning Meyerhenke
Toward Parallel Document Clustering .....................................................................................................................1700
Jace A. Mogill and David J. Haglin
Algorithms and Applications
Evaluating In-Clique and Topological Parallelism Strategies for Junction Tree-Based
Bayesian Network Inference Algorithm on the Cray XMT .....................................................................................1710
George Chin Jr., Sutanay Choudhury, Lars Kangas, Sally McFarlane, and Andres Marquez
High Performance Data Mining Using R on Heterogeneous Platforms ...................................................................1720
Prabhat Kumar, Berkin Ozisikyilmaz, Wei-Keng Liao, Gokhan Memik, and Alok Choudhary
Bandwidth Reduction through Multithreaded Compression of Seismic Images ......................................................1730
Ahmed A. Aqrawi and Anne C. Elster
An Efficient k-Means Algorithm on CUDA ............................................................................................................1740
Jiadong Wu and Bo Hong
Workshop on Large-Scale Parallel Processing (LSPP 2011)
LSPP Introduction ....................................................................................................................................................1750
Darren J. Kerbyson, Ram Rajamony, and Charles Weems
xxiv
Session 1: Hybrid and I/O
Parallel Sparse Matrix-Vector Multiplication as a Test Case for Hybrid MPI+OpenMP
Programming ............................................................................................................................................................1751
Gerald Schubert, Georg Hager, Holger Fehske, and Gerhard Wellein
HIO: A Library for High Performance I/O and Data Management ..........................................................................1759
William W. Dai
Session 2: Multi-threaded Processors
Parallel Processor Core for Semantic Search Engines .............................................................................................1767
Suneil Mohan, Aalap Tripathy, Amitava Biswas, and Rabi Mahapatra
MTASC: A Multithreaded Associative SIMD Processor .........................................................................................1776
Kevin Schaffer and Robert A. Walker
High Precision Integer Multiplication with a GPU ..................................................................................................1781
Niall Emmart and Charles Weems
Session 3: Distributed Systems
Distributed B&B: A Pure Peer-to-Peer Approach ....................................................................................................1788
Mathieu Djamaï, Bilel Derbel, and Nouredine Melab
An Adaptive Framework for Large-Scale State Space Search .................................................................................1798
Yanhua Sun, Gengbin Zheng, Pritish Jetley, and Laxmikant V. Kalé
Fault-Tolerant Mechanism for Hierarchical Branch and Bound Algorithm ............................................................1806
A. Bendjoudi, N. Melab, and E.-G. Talbi
Session 4: Large-Scale Systems
Hierarchical Mapping for HPC Applications ...........................................................................................................1815
I-Hsin Chung, Che-Rung Lee, Jiazheng Zhou, and Yeh-Ching Chung
A Performance Model of Direct Numerical Simulation for Analyzing Large-Scale
Systems .....................................................................................................................................................................1824
Darren J. Kerbyson and Kevin J. Barker
Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application
Codes ........................................................................................................................................................................1831
Courtenay Vaughan, Mahesh Rajan, Richard Barrett, Doug Doerfler, and Kevin Pedretti
Workshop on Desktop Grids and Volunteer Computing Systems (PCGRID 2011)
PCGRID Introduction ...............................................................................................................................................1838
Gilles Fedak and Derrick Kondo
xxv
Session I: Resource Management and Scheduling
Emulating Volunteer Computing Scheduling Policies .............................................................................................1839
David P. Anderson
Distributed Results Checking for MapReduce in Volunteer Computing .................................................................1847
Mircea Moca, Gheorghe Cosmin Silaghi, and Gilles Fedak
Volunteer Cloud Computing: MapReduce over the Internet ....................................................................................1855
Fernando Costa, Luis Silva, and Michael Dahlin
Attic: A Case Study for Distributing Data in BOINC Projects ................................................................................1863
AbdelHamid Elwaer, Andrew Harrison, Ian Kelley, and Ian Taylor
Session II: Security and Participation
How to Make BOINC-Based Desktop Grids Even More Popular? .........................................................................1871
Peter Kacsuk
Increasing Participation in Volunteer Computing ....................................................................................................1878
David Toth, Russell Mayer, and Wendy Nichols
DGVCS Security from a Different Perspective: Challenges and Hurdles ...............................................................1883
Tobias Dussa
Session III: Applications and Infrastructure
Evolving N-Body Simulations to Determine the Origin and Structure of the Milky
Way Galaxy’s Halo Using Volunteer Computing ....................................................................................................1888
Travis Desell, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos A. Varela, Benjamin A. Willett,
Matthew Arsenault, and Heidi Newberg
Computing the Tree of Life: Leveraging the Power of Desktop and Service Grids ................................................1896
Adam L. Bazinet and Michael P. Cummings
libboincexec: A Generic Virtualization Approach for the BOINC Middleware ......................................................1903
Diogo Ferreira, Filipe Araujo, and Patricio Domingues
Building an Online Computing Service over Volunteer Grid Resources .................................................................1909
Mark Silberstein
Parallel Computing and Optimization (PCO 2011)
PCO Introduction ......................................................................................................................................................1918
Didier El Baz
Session I: Issues in Optimization of Parallel and Distributed Computing Systems
On-Line Optimization of Publish/Subscribe Overlays .............................................................................................1919
Eddy Caron, Benjamin Depardon, Ajoy K. Datta, and Lawrence L. Larmore
xxvi
Performance Optimization with Energy Constraint in Heterogeneous Multiple
Computer Systems ....................................................................................................................................................1930
Keqin Li
Session II: Parallel Optimization Algorithms and Applications
A Parallel Exact Solver for the Three-Index Quadratic Assignment Problem .........................................................1940
François Galea and Bertrand Le Cun
Distributed Bees Foraging-Based Algorithm for Large-Scale Problems .................................................................1950
Antonio Gómez-Iglesias, Francisco Castejón, and Miguel A. Vega-Rodríguez
Communication in Parallel Algorithms for Constraint-Based Local Search ............................................................1961
Yves Caniou and Philippe Codognet
Session III: Combinatorial Scientific Computing
Distributed-Memory Parallel Algorithms for Matching and Coloring .....................................................................1971
Ümit V. Çatalyürek, Florin Dobrian, Assefaw Gebremedhin, Mahantesh Halappanavar,
and Alex Pothen
Obtaining Simultaneous Equation Models through a Unified Shared-Memory Scheme
of Metaheuristics ......................................................................................................................................................1981
Francisco Almeida, Domingo Giménez, and Jose J. López-Espín
Session IV: New Trends in Parallel Computing for Linear Programming and GlobalOptimization
Automated, Parallel Optimization Algorithms for Stochastic Functions .................................................................1989
Dheeraj Chahal, Steven J. Stuart, Sebastian Goasguen, and Colin J. Trout
Efficient Implementation of the Simplex Method on a CPU-GPU System .............................................................1999
Mohamed Esseghir Lalami, Vincent Boyer, and Didier El-Baz
Workshop on Future Approaches to Data Centric Programming for Exascale (DCPM2011)
DCPM Introduction ..................................................................................................................................................2007
Adolfy Hoisie, Darren J. Kerbyson, and T.P. Straatsma
PhD Forum
Introduction ..............................................................................................................................................................2008
Luc Bougé
An Algorithm-Based Recovery Scheme for Extreme Scale Computing ..................................................................2010
Hui Liu
Lightweight Methods for Automated Design of Self-Stabilization ..........................................................................2014
Aly Farahat and Ali Ebnenasir
Communication Optimization Beyond MPI .............................................................................................................2018
Andrew Friedley and Andrew Lumsdaine
xxvii
Efficient Agreement Protocols in Asynchronous Distributed Systems ....................................................................2022
Izabela Moise
Efficient Verification Solutions for Message Passing Systems ................................................................................2026
Subodh Sharma and Ganesh Gopalakrishnan
Decentralized Network Bandwidth Prediction and Node Search .............................................................................2030
Sukhyun Song
Large-Scale Parallel Monte Carlo Tree Search on GPU ..........................................................................................2034
Kamil Rocki and Reiji Suda
Memory-Aware Algorithms and Scheduling Techniques: From Multicore Processors
to Petascale Supercomputers ....................................................................................................................................2038
Mathias Jacquelin
Memory Hierarchy Aware Parallel Priority Based Data Structures .........................................................................2042
Dinesh Agarwal
Parallel Algorithms for Bayesian Networks Structure Learning with Applications
to Systems Biology ...................................................................................................................................................2045
Olga Nikolova
Fault Tolerant Data Acquisition through Dynamic Load Balancing ........................................................................2049
Michal Simon
A Codesigned Fault Tolerance System for Heterogeneous Many-Core Processors ................................................2053
Keun Soo Yim and Ravishankar K. Iyer
Towards a Storage Backend Optimized for Atomic MPI-I/O for Parallel Scientific
Applications ..............................................................................................................................................................2057
Viet-Trung Tran
Programming Heterogeneous Systems .....................................................................................................................2061
David M. Kunzman and Laxmikant V. Kalé
Data Parallel Programming Model for Many-Core Architectures ...........................................................................2065
Yongpeng Zhang
Detection and Correction of Silent Data Corruption for Large-Scale High-Performance
Computing ................................................................................................................................................................2069
David Fiala
Improving Job Scheduling on Production Supercomputers .....................................................................................2073
Wei Tang, Zhiling Lan, and Narayan Desai
Towards a Self-Adaptive Data Management System for Cloud Environments .......................................................2077
Alexandra Carpen-Amarie
An Integrated Scratch Management Service for HPC Centers .................................................................................2081
Henry M. Monti
Policy Based Data Placement in High Performance Scientific Computing .............................................................2085
Muhammad Ali Amer
xxviii
Automatic Generation of Executable Communication Specifications from Parallel
Applications ..............................................................................................................................................................2089
Xing Wu, Frank Mueller, and Scott Pakin
Scout: High-Performance Heterogeneous Computing Made Simple .......................................................................2093
James A. Jablin, Patrick McCormick, and Maurice Herlihy
Building Dynamic Computing Infrastructures over Distributed Clouds ..................................................................2097
Pierre Riteau
Error Correction and Clustering Algorithms for Next Generation Sequencing .......................................................2101
Xiao Yang
Performance Analysis of Long-Running Applications ............................................................................................2105
Zoltán Szebenyi, Felix Wolf, and Brian J.N. Wylie
p_2Matlab: Productive Parallel Matlab for the Exascale .........................................................................................2109
Vipin Sachdeva
A Framework for Automated Performance Tuning and Code Verification on GPU
Computing Platforms ................................................................................................................................................2113
Allison S. Gehrke, Ilkyeun Ra, and Daniel A. Connors
Author Index - Workshops
xxix