2011 ieee international symposium on parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011...

26
IEEE Catalog Number: ISBN: CFP1151J-PRT 978-1-61284-425-1 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW 2011) Anchorage, Alaska, USA 16-20 May 2011 Pages /

Upload: hoangnhan

Post on 17-Nov-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Volume 2

IEEE Catalog Number: ISBN:

CFP1151J-PRT 978-1-61284-425-1

2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW 2011)

Anchorage, Alaska, USA 16-20 May 2011

Pages 1-715

1/3

Page 2: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

2011 IEEE International Parallel &

Distributed Processing Symposium

IPDPS 2011

Table of Contents

Volume - 2

Message from the IPDPS General Chair to the Workshops..............................................................xxx

Message from the Workshops Chair.........................................................................................................xxxii

Message from the PhD Forum Chair.......................................................................................................xxxiii

Heterogeneity in Computing Workshop (HCW 2011)

HCW Introduction ..........................................................................................................................................................1

Anne Benoit

Message from the HCW Steering Committee Chair ......................................................................................................3

H.J. Siegel

Message from the HCW General Chair ..........................................................................................................................4

Anne Benoit

Message from the HCW Program Chair .........................................................................................................................5

Behrooz Shirazi

HCW Keynote ................................................................................................................................................................6

Yves Robert

Session I: Optimization and Evaluation of Heterogeneous Environments

Time Utility Functions for Modeling and Evaluating Resource Allocations in

a Heterogeneous Computing System ..............................................................................................................................7

Luis Diego Briceño, Bhavesh Khemka, Howard Jay Siegel, Anthony A. Maciejewski,

Christopher Groër, Gregory Koenig, Gene Okonski, and Steve Poole

Optimized Barriers for Heterogeneous Systems Using MPI ........................................................................................20

Jan C. Meyer and Anne C. Elster

Characterizing Task-Machine Affinity in Heterogeneous Computing Environments .................................................34

Abdulla M. Al-Qawasmeh, Anthony A. Maciejewski, Rodney G. Roberts, and Howard Jay Siegel

v

Page 3: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session II: Scheduling and Resource Allocation in Heterogeneous Environments

Scheduling on Unspecified Heterogeneous Distributed Resources ..............................................................................45

Daniel Millot and Christian Parrot

MO-Greedy: An Extended Beam-Search Approach for Solving a Multi-criteria

Scheduling Problem on Heterogeneous Machines .......................................................................................................57

Louis-Claude Canon and Emmanuel

A Model-Based Schedule Representation for Heterogeneous Mapping of Dataflow

Graphs ...........................................................................................................................................................................70

Hsiang-Huang Wu, Chung-Ching Shen, Nimish Sane, William Plishker, and Shuvra S. Bhattacharyya

Session III: Heterogeneity: GPUs, Tools, and Data Management

A Waterfall Model to Achieve Energy Efficient Tasks Mapping for Large Scale GPU

Clusters .........................................................................................................................................................................82

Wenjie Liu, Zhihui Du, Yu Xiao, David A. Bader, and Chen Xu

An Agent-Based Approach to Reconciling Data Heterogeneity in Cyber-Physical

Systems .........................................................................................................................................................................93

Jing Lin, Sahra Sedigh, and Ali R. Hurson

NVCR: A Transparent Checkpoint-Restart Library for NVIDIA CUDA ..................................................................104

Akira Nukada, Hiroyuki Takizawa, and Satoshi Matsuoka

Use of Internet Embedding Tools for Heterogeneous Resources Aggregation ..........................................................114

Olivier Beaumont, Nicolas Bonichon, Philippe Duchon, and Hubert Larchevêque

Reconfigurable Architectures Workshop (RAW 2011)

RAW Introduction ......................................................................................................................................................125

Juergen Becker, Pascal Benoit, and René Cumplido

Session 1: Novel Approaches for MPSoC and Multicore Architectures

A High-Level Power Model for MPSoC on FPGA ....................................................................................................128

Roberta Piscitelli and Andy D. Pimentel

Modular Framework for Multi-level Multi-device MPSoC Simulation .....................................................................136

Christoph Roth, Gabriel Marchesan Almeida, Oliver Sander, Luciano Ost, Nicolas Hébert,

Gilles Sassatelli, Pascal Benoit, Lionel Torres, and Jürgen Becker

A Heterogeneous Multicore System on Chip with Run-Time Reconfigurable Virtual

FPGA Architecture .....................................................................................................................................................143

M. Hübner, P. Figuli, R. Girardey, D. Soudris, K. Siozios, and J. Becker

vi

Page 4: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session 2: New Architectures for Reconfigurable Computing

A Scalable Microarchitecture Design that Enables Dynamic Code Execution

for Variable-Issue Clustered Processors .....................................................................................................................150

Ralf Koenig, Timo Stripf, Jan Heisswolf, and Juergen Becker

CRM: Configurable Range Memory for Fast Reconfigurable Computing ................................................................158

Jongkyung Paek, Jongeun Lee, and Kiyoung Choi

Session 3: Methods and Design Techniques for Partially Reconfigurable Architectures

Tuple Spaces in Hardware for Accelerated Implicit Routing .....................................................................................166

Zachary K. Baker and Justin L. Tripp

High Speed Partial Run-Time Reconfiguration Using Enhanced ICAP Hard Macro ................................................174

Simen Gimle Hansen, Dirk Koch, and Jim Torresen

Online Routing of FPGA Clock Networks for Module Relocation in Partial

Reconfigurable Multi Clock Designs .........................................................................................................................181

Christian Schuck, Bastian Haetzer, Michael Hübner, and Jürgen Becker

Session 4: Improving Security of Reconfigurable Systems

Securing Boot of an Embedded Linux on FPGA .......................................................................................................189

Florian Devic, Lionel Torres, and Benoît Badrignans

Hyperelliptic Curve Cryptoarchitecture for Fast Execution of Schnorr and Okamoto

Authentication Protocols ............................................................................................................................................196

Alexander Klimm, Sebastian Vogel, and Jürgen Becker

A Reconfigurable Multi-core Cryptoprocessor for Multi-channel Communication

Systems .......................................................................................................................................................................204

Michael Grand, Lilian Bossuet, Guy Gogniat, Bertrand Le Gal, Jean-Philippe Delahaye,

and Dominique Dallet

Session 5: Tools for Partially Reconfigurable FPGAs

Migrating Static Systems to Partially Reconfigurable Systems on Spartan-6 FPGAs ...............................................212

Christian Beckhoff, Dirk Koch, and Jim Torresen

ReBit: A Tool to Manage and Analyse FPGA-Based Reconfigurable Systems ........................................................220

M.D. Santambrogio, A. Cazzaniga, A. Bonetto, and Donatella Sciuto

OpenPR: An Open-Source Partial-Reconfiguration Toolkit for Xilinx FPGAs ........................................................228

Ali Asgar Sohanghpurwala, Peter Athanas, Tannous Frangieh, and Aaron Wood

vii

Page 5: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session 6: Emerging Methods for Coarse Grain Reconfigurable Architectures

Occam-pi as a High-Level Language for Coarse-Grained Reconfigurable Architectures .........................................236

Zain-ul-Abdin and Bertil Svensson

Dynamic Reconfiguration for Irregular Code Using FNC-PAE Processor Cores .....................................................244

Eberhard Schüler, Martin Vorbach, Frank May, and Markus Weinhardt

Session 7: Task Management in Reconfigurable Systems

A Replacement Technique to Maximize Task Reuse in Reconfigurable Systems .....................................................250

Juan Antonio Clemente, Daniel Mozos, and Javier Resano

Integrated Temporal Planning, Module Selection and Placement of Tasks for Dynamic

Networks-on-Chip ......................................................................................................................................................258

Philipp Mahr, Steffen Christgau, Christian Haubelt, and Christophe Bobda

Enhancing Resource Utilization with Design Alternatives in Runtime Reconfigurable

Systems .......................................................................................................................................................................264

Alexander Wold, Dirk Koch, and Jim Torresen

Session 8: Improving Computing Efficiency in FPGAs Based Systems

Asymmetric Large Size Signed Multipliers Using Embedded Blocks in FPGAs ......................................................271

Shuli Gao, Dhamin Al-Khalili, and Noureddine Chabini

Just-in-Time Instruction Set Extension - Feasibility and Limitations for

an FPGA-Based Reconfigurable ASIP Architecture ..................................................................................................278

Mariusz Grad and Christian Plessl

Poster Session 1

Improving Reconfigurable Hardware Energy Efficiency and Robustness

via DVFS-Scaled Homogeneous MP-SoC .................................................................................................................286

Roberto Airoldi, Fabio Garzia, and Jari Nurmi

Compression Based Efficient and Agile Configuration Mechanism for Coarse Grained

Reconfigurable Architectures .....................................................................................................................................290

Syed. M.A.H. Jafri, Ahmed Hemani, Kolin Paul, Juha Plosila, and Hannu Tenhunen

Fast Hardware Computation of x Mod z ....................................................................................................................294

J.T. Butler and T. Sasao

Native Double Precision LINPACK Implementation on a Hybrid Reconfigurable CPU ..........................................298

Thang Viet Huynh, Manfred Mücke, and Wilfried N. Gansterer

FeatureVerilog: Extending Verilog to Support Feature-Oriented Programming .......................................................302

Jun Ye, Qingping Tan, Tun Li, and Guorong Cao

viii

Page 6: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

An FPGA-Based Accelerator to Speed-Up Matrix Multiplication of Floating Point

Operations ...................................................................................................................................................................306

B. Holanda, R. Pimentel, J. Barbosa, R. Camarotti, A. Silva-Filho, L. João, V. Souza, J. Ferraz,

and M. Lima

Poster Session 2

The SecretBlaze: A Configurable and Cost-Effective Open-Source Soft-Core

Processor .....................................................................................................................................................................310

Lyonel Barthe, Luís Vitório Cargnini, Pascal Benoit, and Lionel Torres

Bitonic Sorting on Dynamically Reconfigurable Architectures .................................................................................314

J. Angermeier, E. Sibirko, R. Wanka, and J. Teich

Distributed Resource Reservation in Massively Parallel Processor Arrays ...............................................................318

Vahid Lari, Frank Hannig, and Jürgen Teich

Reconfigurable Instruction Decoding for a Wide-Control-Word Processor ..............................................................322

Alen Bardizbanyan, Magnus Själander, and Per Larsson-Edefors

Distributed Security for Communications and Memories in a Multiprocessor

Architecture ................................................................................................................................................................326

Pascal Cotret, Jérémie Crenne, Guy Gogniat, Jean-Philippe Diguet, Lubos Gaspar,

and Guillaume Duc

On a Hybrid and General Control Scheme for Algorithms Represented as a Polytope .............................................330

Roberto Pérez-Andrade, César Torres-Huitzil, René Cumplido, and Juan M. Campos

A Multi-level Reconfigurable Architecture for a Wireless Sensor Node Coprocessing

Unit .............................................................................................................................................................................334

François Philipp and Manfred Glesner

Workshop on Nature Inspired Distributed Computing (NIDISC 2011)

NIDISC Introduction ..................................................................................................................................................338

Pascal Bouvry, Francisek Seredynski, and El-Ghazali Talbi

Session 1: Swarm Intelligence

Parallelization Strategies for Ant Colony Optimisation on GPUs ..............................................................................339

José M. Cecilia, José M. García, Manuel Ujaldón, Andy Nisbet, and Martyn Amos

Multiple Particle Collision Algorithm Applied to Radiative Transference and Pollutant

Localization Inverse Problems ...................................................................................................................................347

Eduardo Fávero Pacheco da Luz, José Carlos Becceneri, and Haroldo Fraga de Campos Velho

A Shared-Memory ACO-Based Algorithm for Numerical Optimization ..................................................................352

Peter Korošec, Jurij Šilc, Marian Vajteršic, and Rade Kutil

Ant Colony Optimization Applied to Route Planning Using Link Travel Time

Predictions ..................................................................................................................................................................358

Rutger Claes and Tom Holvoet

ix

Page 7: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session 2: Modern Metaheuristics and Real-world Applications

Nature-Inspired Evaluation of Data Classes for Trust Management in MANETs .....................................................366

Marcin Seredynski and Pascal Bouvry

Use of Meta-Heuristics for Design of Fuel Loading Pattern in Light Water Reactors

Comprising Some Radial and Axial Heterogeneities .................................................................................................374

Jean-Michel Do, Jean-Jacques Lautard, Anne-Marie Baudron, Siegfried Douce, and Gilles Arnaud

A Genetic Algorithm with a Penalty Function in the Selective Travelling Salesman

Problem on a Road Network ......................................................................................................................................381

Anna Piwonska and Franciszek Seredynski

Adaptive Neighborhoods for Cellular Genetic Algorithms ........................................................................................388

Bernabé Dorronsoro and Pascal Bouvry

Session 3: Parallel Metaheuristics and Large-scale Problems

Multi-environmental Cooperative Parallel Metaheuristics for Solving Dynamic

Optimization Problems ...............................................................................................................................................395

Mostepha Redouane Khouadjia, Briseida Sarasola, Enrique Alba, Laetitia Jourdan,

and El-Ghazali Talbi

An Energy-Efficient Kernel Framework for Large-Scale Data Modeling

and Classification .......................................................................................................................................................404

Paul D. Yoo, Jason W.P. Ng, and Albert Y. Zomaya

On the Resilience of [Distributed] EAs against Cheaters in Global Computing

Platforms .....................................................................................................................................................................409

Sebastien Varrette, Emilia Tantar, and Pascal Bouvry

Hybrid MPI/OpenMP Strategy for Biological Multiple Sequence Alignment

with DIALIGN-TX in Heterogeneous Multicore Clusters .........................................................................................418

Emerson de Araujo Macedo and Azzedine Boukerche

Workshop on High Performance Computational Biology (HiCOMB 2011)

HiCOMB Introduction ................................................................................................................................................426

Ananth Kalyanaraman, Srinivas Aluru, and David A. Bader

Session I: Genome-scale Analysis

An Ultrafast Scalable Many-Core Motif Discovery Algorithm for Multiple GPUs ..................................................428

Yongchao Liu, Bertil Schmidt, and Douglas L. Maskell

Parallel Mapping Approaches for GNUMAP ............................................................................................................435

Nathan L. Clement, Mark J. Clement, Quinn Snell, and W. Evan Johnson

Computing the Phylogenetic Likelihood Function Out-of-Core ................................................................................444

Fernando Izquierdo-Carrasco and Alexandros Stamatakis

x

Page 8: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session II: Modeling and Structure Prediction

Algorithms for a Parallel Implementation of Hidden Markov Models with a Small

State Space ..................................................................................................................................................................452

Jesper Nielsen and Andreas Sand

Efficient Nonserial Polyadic Dynamic Programming on the Cell Processor .............................................................460

Li Liu, Mu Wang, Jinlei Jiang, Ruizhe Li, and Guangwen Yang

FENZI: GPU-Enabled Molecular Dynamics Simulations of Large Membrane Regions

Based on the CHARMM Force Field and PME .........................................................................................................472

Narayan Ganesan, Michela Taufer, Brad Bauer, and Sandeep Patel

Session III: Sequence Analysis

Parallelizing BLAST and SOM Algorithms with MapReduce-MPI Library .............................................................481

Seung-Jin Sul and Andrey Tovchigrechko

Improving CUDASW++, a Parallelization of Smith-Waterman for CUDA Enabled

Devices .......................................................................................................................................................................490

Doug Hains, Zach Cashero, Mark Ottenberg, Wim Bohm, and Sanjay Rajopadhye

Session IV: Systems Biology and Functional Genomics

Divide-and-Conquer Approach to the Parallel Computation of Elementary Flux Modes

in Metabolic Networks ...............................................................................................................................................502

Dimitrije Jevremovic, Daniel Boley, and Carlos P. Sosa

Sensitive and Specific Identification of Protein Complexes in “Perturbed” Protein

Interaction Networks from Noisy Pull-Down Data ....................................................................................................512

William Hendrix, Tatiana Karpinets, Byung-Hoon Park, Eric Schendel, Alok Choudhary,

and Nagiza F. Samatova

An Efficient and Scalable Implementation of SNP-Pair Interaction Testing for Genetic

Association Studies .....................................................................................................................................................523

Lars Koesterke, Dan Stanzione, Matt Vaughn, Stephen M. Welch, Waclaw Kusnierczyk,

Jinliang Yang, Cheng-Ting Yeh, Dan Nettleton, and Patrick S. Schnable

Advances in Parallel and Distributed Computing Models (APDCM 2011)

APDCM Introduction .................................................................................................................................................531

Oscar H. Ibarra

Session 1

CRT-Based DSP Decryption Using Montgomery Modular Multiplication on the FPGA .........................................532

Bo Song, Yasuaki Ito, and Koji Nakano

GALS-Based LPSP: Implementation of a Novel Architecture for Low Power High

Performance Security Processors ...............................................................................................................................542

Hala A. Farouk, Mahmoud T. El-Hadidi, and Ahmed Abou El Farag

xi

Page 9: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Using Memory Access Traces to Map Threads and Data on Hierarchical Multi-core

Platforms .....................................................................................................................................................................551

Eduardo Henrique Molina da Cruz, Marco Antonio Zanata Alves, Alexandre Carissimi,

Philippe Olivier Alexandre Navaux, Christiane Pousa Ribeiro, and Jean-François Méhaut

On the Performance of Window-Based Contention Managers for Transactional

Memory ......................................................................................................................................................................559

Gokarna Sharma and Costas Busch

Session 2

Performance Acceleration of Kernel Polynomial Method Applying Graphics

Processing Units .........................................................................................................................................................569

Shixun Zhang, Shinichi Yamagiwa, Masahiko Okumura, and Seiji Yunoki

Efficient Work-Stealing Strategies for Fine-Grain Task Parallelism .........................................................................577

Adnan and Mitsuhisa Sato

A Model for Coherent Distributed Memory for Race Condition Detection ...............................................................584

Franck Butelle and Camille Coti

Exploring Weak Dependencies in DAG Scheduling ..................................................................................................591

Nam Ma, Yinglong Xia, and Viktor K. Prasanna

Session 3

Heuristics and Evaluations of Energy-Aware Task Mapping on Heterogeneous

Multiprocessors ..........................................................................................................................................................599

Wei Sun and Tomoyoshi Sugawara

Reducing Download Times in Peer-to-Peer File Sharing Systems with Stochastic

Service Capacities .......................................................................................................................................................608

Keqin Li

Self-Stabilizing Master-Slave Token Circulation and Efficient Topology Computation

in a Tree of Arbitrary Size ..........................................................................................................................................618

Wayne Goddard and Pradip K. Srimani

Dynamic Monopolies in Colored Tori ........................................................................................................................626

Sara Brunetti, Elena Lodi, and Walter Quattrociocchi

A Matching Based Automata for Distributed Graph Algorithms ...............................................................................632

J. Paul Daigle and Sushil K. Prasad

Session 4

Execution and Time Models for Pervasive Sensor Networks ....................................................................................639

Ajay D. Kshemkalyani, Ashfaq A. Khokhar, and Min Shen

A Packet Aggregation Mechanism for Real Time Applications over Wireless

Networks .....................................................................................................................................................................648

Paulo H. Azevêdo, Marcos F. Caetano, and Jacir L. Bordim

xii

Page 10: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Alphabets: An Extended Polyhedral Equational Language ........................................................................................656

S. Rajopadhye, S. Gupta, and D.-G. Kim

Session 5

From Simulation to Experiment: A Case Study on Multiprocessor Task Scheduling ...............................................665

Sascha Hunold, Henri Casanova, and Frédéric Suter

Distributed Construction of Nested Clusters with Inter-cluster Routing ....................................................................673

Alain Bui, Simon Clavière, and Devan Sohier

Towards an Effective Unified Programming Model for Many-Cores ........................................................................681

Ana Lucia Varbanescu, Pieter Hijma, Rob van Nieuwpoort, and Henri Bal

Analysis of a Memory Bandwidth Limited Scenario for NUMA and GPU Systems ................................................693

Jens Breitbart

The Impact of Non-coherent Buffers on Lazy Hardware Transactional Memory

Systems .......................................................................................................................................................................700

Anurag Negi, Rubén Titos-Gil, Manuel E. Acacio, José M. García, and Per Stenstrom

Elimination Techniques of Redundant Data Transfers Among GPUs and CPU

on Recursive Stream-Based Applications ..................................................................................................................708

Pablo Lamilla Álvarez, Shinichi Yamagiwa, Masahiro Arai, and Koichi Wada

Communication Architecture for Scalable Systems (CASS 2011)

CASS Introduction .....................................................................................................................................................716

José Flich, Scott Pakin, and Craig Stunkel

Session I

Improving Performance of Deterministic Single-Path Routing on 2-Level Generalized

Fat-Trees .....................................................................................................................................................................718

Wickus Nienaber, Santosh Mahapatra, and Xin Yuan

A Unified Algorithm for Both Randomized Deterministic and Adaptive Routing

in Torus Networks ......................................................................................................................................................723

Keith D. Underwood and Eric Borch

Session II

Profile Guided MPI Protocol Selection for Point-to-Point Communication Calls .....................................................733

Aniruddha Marathe, David K. Lowenthal, Zheng Gu, Matthew Small, and Xin Yuan

Dynamic Time-Variant Connection Management for PGAS Models on InfiniBand ................................................740

Abhinav Vishnu, Manoj Krishnan, and Pavan Balaji

Audit: New Synchronization for the GET/PUT Protocol ...........................................................................................747

Atsushi Hori, Jinpil Lee, and Mitsuhisa Sato

xiii

Page 11: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session III

Efficient and Contention-Free Virtualisation of Fat-Trees .........................................................................................754

Frank Olaf Sem-Jacobsen, Åshild Grønstad Solheim, Olav Lysne, Tor Skeie, and Thomas Sødring

Fat-Trees Routing and Node Ordering Providing Contention Free Traffic for MPI

Global Collectives ......................................................................................................................................................761

Eitan Zahavi

Session IV

Optimizing MPI Collectives Using Efficient Intra-node Communication Techniques

over the Blue Gene/P Supercomputer ........................................................................................................................771

Amith R. Mamidala, Daniel Faraj, Sameer Kumar, Douglas Miller, Michael Blocksome,

Thomas Gooding, Philip Heidelberger, and Gabor Dozsa

ConnectX-2 CORE-Direct Enabled Asynchronous Broadcast Collective

Communications .........................................................................................................................................................781

Manjunath Gorentla Venkata, Richard L. Graham, Joshua S. Ladd, Pavel Shamis,

Ishai Rabinovitz, Vasily Filipov, and Gilad Shainer

Dodging Non-uniform I/O Access in Hierarchical Collective Operations for Multicore

Clusters .......................................................................................................................................................................788

Brice Goglin and Stéphanie Moreaud

High-Performance, Power-Aware Computing (HPPAC 2011)

HPPAC Introduction ...................................................................................................................................................795

Rong Ge and Roberto Gioiosa

Session 1: Power Aware Scheduling

Temperature Aware Load Balancing for Parallel Applications: Preliminary Work ..................................................796

Osman Sarood, Abhishek Gupta, and Laxmikant V. Kalé

Design and Analysis of Heuristic Algorithms for Power-Aware Scheduling

of Precedence Constrained Tasks ...............................................................................................................................804

Keqin Li

Rack Aware Scheduling in HPC Data Centers: An Energy Conservation Strategy ...................................................814

Vikas Ashok Patil and Vipin Chaudhary

Session 2: Applications and Trends

Emerging Trends on the Evolving Green500: Year Three .........................................................................................822

Tom Scogland, Balaji Subramaniam, and Wu-chun Feng

Power Consumption of Mixed Precision in the Iterative Solution of Sparse Linear

Systems .......................................................................................................................................................................829

Hartwig Anzt, Vincent Heuveline, Björn Rocker, Maribel Castillo, Juan C. Fernández,

Rafael Mayo, and Enrique S. Quintana-Ortí

xiv

Page 12: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Dynamic Frequency Scaling and Energy Saving in Quantum Chemistry Applications ............................................837

Vaibhav Sundriyal, Masha Sosonkina, Fang Liu, and Michael W. Schmidt

Evaluation of the Energy Performance of Dense Linear Algebra Kernels on Multi-core

and Many-Core Processors .........................................................................................................................................846

Maribel Castillo, Manel Dolz, Juan C. Fernández, Rafael Mayo, Enrique S. Quintana-Ortí,

and Vicente Roca

Session 3: Low Power Hardware Components

LAPP: A Low Power Array Accelerator with Binary Compatibility .........................................................................854

Naveen Devisetti, Takuya Iwakami, Kazuhiro Yoshimura, Takashi Nakada, Jun Yao,

and Yasuhiko Nakashima

Performance, Power, and Thermal Analysis of Low-Power Processors for Scale-Out

Systems .......................................................................................................................................................................863

Phillip Stanley-Marbell and Victoria Caparrós Cabezas

PEARL and PEACH: A Novel PCI Express Direct Link and Its Implementation ....................................................871

Toshihiro Hanawa, Taisuke Boku, Shin’ichi Miura, Mitsuhisa Sato, and Kazutami Arimoto

High-Performance Grid and Cloud Computing Workshop (HPGC 2011)

HPGC Introduction .....................................................................................................................................................880

Eric Aubanel, Virendra C. Bhavsar, and Michael Alex Frumkin

Cloud Management: Challenges and Opportunities ...................................................................................................881

Tim Forell, Dejan Milojicic, and Vanish Talwar

Session 1: Grid/Cloud Infrastructure I

Energy-Aware Application-Centric VM Allocation for HPC Workloads .................................................................890

H. Viswanathan, E.K. Lee, I. Rodero, D. Pompili, M. Parashar, and M. Gamell

ENIGMA: Distributed Virtual Disks for Cloud Computing ......................................................................................898

Matteo Zola, Valerio Bioglio, Cosimo Anglano, Rossano Gaeta, Marco Grangetto,

and Matteo Sereno

Managed GridFTP ......................................................................................................................................................907

John Bresnahan, Michael Link, Rajkumar Kettimuthu, and Ian Foster

Session 2: Grid/Cloud Infrastructure II

RBAC for High Performance Computing Systems Integration in Grid Computing

and Cloud Computing .................................................................................................................................................914

Anil L. Pereira

e-Science Infrastructure Integration Invariants to Enable HTC and HPC

Interoperability Applications ......................................................................................................................................922

M. Riedel, M.S. Memon, A.S. Memon, D. Mallmann, Th. Lippert, D. Kranzlmüller, and A. Streit

xv

Page 13: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Cyberinfrastructure Usage Modalities on the TeraGrid .............................................................................................932

Daniel S. Katz, David Hart, Chris Jordan, Amit Majumdar, J.P. Navarro, Warren Smith,

John Towns, Von Welch, and Nancy Wilkins-Diehr

Session 3: Scheduling

A Framework for Elastic Execution of Existing MPI Programs ................................................................................940

Aarthi Raveendran, Tekin Bicer, and Gagan Agrawal

Optimal Load Distribution for Multiple Heterogeneous Blade Servers in a Cloud

Computing Environment ............................................................................................................................................948

Keqin Li

Using Hierarchical Dependency Data Flows to Enable Dynamic Scalability on Parallel

Patterns .......................................................................................................................................................................958

Jeremy Villalobos and Barry Wilkinson

Hybrid Heuristic for Scheduling Data Analytics Workflow Applications in Hybrid

Cloud Environment .....................................................................................................................................................966

Mustafizur Rahman, Xiaorong Li, and Henry Palit

Workshop on System Management Techniques, Processes, and Services (SMTPS 2011)

SMTPS Introduction ...................................................................................................................................................975

Kyung Dong Ryu

Session 1: Computing Platform Management

Parallax - A New Operating System for Scalable, Distributed, and Parallel Computing ..........................................976

Rao Mikkilineni and Ian Seyler

Reducing Shared Cache Contention by Scheduling Order Adjustment on Commodity

Multi-cores ..................................................................................................................................................................984

Yingxin Wang, Yan Cui, Pin Tao, Haining Fan, Yu Chen, and Yuanchun Shi

Session 2: Workload Management

Evaluating Load Generation in Virtualized Environments for Software Performance

Testing ........................................................................................................................................................................993

Marco A.S. Netto, Suzane Menon, Hugo V. Vieira, Leandro T. Costa, Flavio M. de Oliveira,

Rodrigo Saad, and Avelino Zorzo

New Metrics for Scheduling Jobs on Cluster of Virtual Machines ..........................................................................1001

Yanbin Liu, Norman Bobroff, Liana Fong, Seetharami Seelam, and Javier Delgado

Secondary Job Scheduling in the Cloud with Deadlines ..........................................................................................1009

Shiyao Chen, Ting He, Ho Yin Starsky Wong, Kang-Won Lee, and Lang Tong

xvi

Page 14: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session 3: Communication and Security

Characterization of the Communication Patterns of Scientific Applications on Blue

Gene/P ......................................................................................................................................................................1017

Pier Giorgio Raponi, Fabrizio Petrini, Robert Walkup, and Fabio Checconi

Privacy Protection in Service Discovery for Large-Scale Distributed Computing

Systems .....................................................................................................................................................................1025

Jun Yeol Choi, Zhong Yuan Li, Hee Yong Youn, and Ohyoung Song

International Workshop on Data-Intensive Computing in the Clouds (DataCloud 2011)

DataCloud Introduction ............................................................................................................................................1033

Tevfik Kosar and Ioan Raicu

Session 1: Shared I/O

The Gfarm File System on Compute Clouds ............................................................................................................1034

Kenji Kobayashi, Shunsuke Mikami, Hiroki Kimura, and Osamu Tatebe

Evaluating Adaptive Compression to Mitigate the Effects of Shared I/O in Clouds ...............................................1042

Matthias Hovestadt, Odej Kao, Andreas Kliem, and Daniel Warneke

Session II: Workflows

Towards Intelligent Data Placement for Scientific Workflows in Collaborative Cloud

Environment .............................................................................................................................................................1052

Xin Liu and Anwitaman Datta

Designing the Cloud-Based DOE Systems Biology Knowledgebase ......................................................................1062

Carina Lansing, Yan Liu, Jian Yin, Abigail Corrigan, Zoe Guillen, Kerstin Kleese van Dam,

and Ian Gorton

XML Processing in the Cloud: Large-Scale Digital Preservation in Small Institutions ..........................................1072

Peter Wittek, Thierry Jacquin, Hervé Déjean, Jean-Pierre Chanod, and Sándor Darányi

Session III: Transaction Processing

Transactions a la carte - Implementation and Performance Evaluation of Transactional

Support on Top of Amazon S3 .................................................................................................................................1082

Francis Gropengießer and Kai-Uwe Sattler

Performance Evaluation of Range Queries in Key Value Stores .............................................................................1092

Pouria Pirzadeh, Junichi Tatemura, and Hakan Hacigumus

Towards Scalable One-Pass Analytics Using MapReduce ......................................................................................1102

Edward Mazur, Boduo Li, Yanlei Diao, and Prashant Shenoy

xvii

Page 15: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session IV: MapReduce

iMapReduce: A Distributed Computing Framework for Iterative Computation ......................................................1112

Yanfeng Zhang, Qinxin Gao, Lixin Gao, and Cuirong Wang

An Adaptive Framework for the Execution of Data-Intensive MapReduce Applications

in the Cloud ..............................................................................................................................................................1122

Martin Koehler, Yuriy Kaniovskyi, and Siegfried Benkner

CloudClustering: Toward an Iterative Data Processing Pattern on the Cloud .........................................................1132

Ankur Dave, Wei Lu, Jared Jackson, and Roger Barga

NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-11)

NSF/TCPP Introduction ...........................................................................................................................................1138

Sushil K. Prasad

Workshop on High-Level Parallel Programming Models and Supportive Environments(HIPS 2011)

HIPS Introduction .....................................................................................................................................................1139

Torsten Hoefler

HIPS Keynote ...........................................................................................................................................................1141

John Mellor-Crummey

Message-Driven and Task-Based Environment

Using Shared Arrays in Message-Driven Parallel Programs ....................................................................................1142

Phil Miller, Aaron Becker, and Laxmikant Kalé

DAGuE: A Generic Distributed DAG Engine for High Performance Computing ..................................................1151

George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Pierre Lemarinier,

and Jack Dongarra

The Essence of Synchronisation in Asynchronous Data Flow .................................................................................1159

Clemens Grelck

GPU Computing and Compilation

Auto-generation of Parallel Finite-Differencing Code for MPI, TBB and CUDA ..................................................1168

D.P. Playne and K.A. Hawick

SkelCL - A Portable Skeleton Library for High-Level GPU Programming ............................................................1176

Michel Steuwer, Philipp Kegel, and Sergei Gorlatch

Harnessing the Power of GPUs without Losing Abstractions in SAC and ArrayOL: A

Comparative Study ...................................................................................................................................................1183

Jing Guo, Wendell Rodrigues, Jeyarajan Thiyagalingam, Frédéric Guyomarc’h, Pierre Boulet,

and Sven-Bodo Scholz

xviii

Page 16: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

PGAS and MPI

Address Translation Optimization for Unified Parallel C Multi-dimensional Arrays .............................................1191

Olivier Serres, Ahmad Anbar, Saumil G. Merchant, Abdullah Kayi, and Tarek El-Ghazawi

A (Radical) Proposal Addressing the Non-scalability of the Irregular MPI Collective

Interfaces ..................................................................................................................................................................1199

Jesper Larsson Träff

Preserving Collective Performance across Process Failure for a Fault Tolerant MPI .............................................1208

Joshua Hursey and Richard L. Graham

Exploiting Hierarchical Parallelism Using UPC ......................................................................................................1216

Lingyuan Wang, Saumil Merchant, and Tarek El-Ghazawi

High-Level Programming

Parallel Object Contracts for High Performance Computing ...................................................................................1225

Kostadin Damevski and Tamara Dahlgren

Automatically Inserting Synchronization Statements in Divide-and-Conquer Programs ........................................1233

Pieter Hijma, Rob V. van Nieuwpoort, Ceriel J.H. Jacobs, and Henri E. Bal

Translating Chapel to Use FREERIDE: A Case Study in Using an HPC Language

for Data-Intensive Computing ..................................................................................................................................1242

Bin Ren, Gagan Agrawal, Brad Chamberlain, and Steve Deitz

Large Volume Testing for Executable Formal Specification Using Hadoop ...........................................................1250

Shigeru Kusakabe

Performance Analysis and Debugging

Towards a Methodology for Deliberate Sample-Based Statistical Performance

Analysis ....................................................................................................................................................................1258

Geoff Stoker and Jeffrey K. Hollingsworth

Monitoring and Debugging DryadLINQ Applications with Daphne .......................................................................1266

Vilas Jagannath, Zuoning Yin, and Mihai Budiu

Data Centric Techniques for Mapping Performance Measurements ........................................................................1274

Nick Rutar and Jeffrey K. Hollingsworth

Many- and Multicore Architectures and Optimizations

Toolchain for Programming, Simulating and Studying the XMT Many-Core

Architecture ..............................................................................................................................................................1282

Fuat Keceli, Alexandros Tzannes, George C. Caragea, Rajeev Barua, and Uzi Vishkin

Explicit Platform Descriptions for Heterogeneous Many-Core Architectures .........................................................1292

Martin Sandrieser, Siegfried Benkner, and Sabri Pllana

xix

Page 17: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

An Optimized Reduction Design to Minimize Atomic Operations in Shared Memory

Multiprocessors ........................................................................................................................................................1300

Ettore Speziale, Andrea di Biagio, and Giovanni Agosta

Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC2011)

PDSEC Introduction .................................................................................................................................................1310

Peter Strazdins, Thomas Rauber, Gudula Rünger, John O’Donnell, and Laurence T. Yang

PDSEC Keynote .......................................................................................................................................................1312

Robert L. Clay

Session 1: Applications

A Graph Model for Minimizing the Storage Overhead of Distributing Data for

the Parallel Solution of Two-Phase Flows ...............................................................................................................1313

Oliver Fortmeier, Alin A. Bastea, and H. Martin Bücker

Profiling Methodology and Performance Tuning of the Met Office Unified Model

for Weather and Climate Simulations .......................................................................................................................1322

Peter E. Strazdins, Margaret Kahn, Joerg Henrichs, Tim Pugh, and Mike Rezny

Dynamic Adaptations in ab-initio Nuclear Physics Calculations on Multicore

Computer Architectures ............................................................................................................................................1332

Avinash Srinivasa, Masha Sosonkina, Pieter Maris, and James P. Vary

Session 2: Applications on GPUs

Efficiently Computing Tensor Eigenvalues on a GPU .............................................................................................1340

Grey Ballard, Tamara Kolda, and Todd Plantenga

Evaluation of Likelihood Functions for Data Analysis on Graphics Processing Units ............................................1349

Sverre Jarp, Alfio Lazzaro, Julien Leduc, Andrzej Nowak, and Felice Pantaleo

Deployment on GPUs of an Application in Computational Atomic Physics ...........................................................1359

P. Fortin, R. Habel, F. Jézéquel, J.L. Lamotte, and N.S. Scott

Parallel Automatic Registration of Large Scale Microscopic Images on Multiprocessor

CPUs and GPUs ........................................................................................................................................................1367

Lee Cooper, Kun Huang, and Manuel Ujaldon

Session 3: Performance Analysis and GPUs

Comprehensive Performance Monitoring for GPU Cluster Systems .......................................................................1377

Karl Fürlinger, Nicholas J. Wright, and David Skinner

GPU Accelerating for Rapid Multi-core Cache Simulation .....................................................................................1387

Wan Han, Long Xiang, Gao Xiaopeng, and Li Yi

xx

Page 18: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session 4: Multicore and Manycore Processing

A Fast Heuristic for Scheduling Parallel Software with Respect to Energy and Timing

Constraints ................................................................................................................................................................1397

Margarete Sackmann, Peter Ebraert, and Dirk Janssens

Self-Adaptive Evidence Propagation on Manycore Processors ...............................................................................1407

Yinglong Xia and Viktor K. Prasanna

Speculative Contention Avoidance in Software Transactional Memory .................................................................1417

Ehsan Atoofian

Session 5: Algorithms

Hyperspectral Data Processing in a High Performance Computing Environment: A

Parallel Best Band Selection Algorithm ...................................................................................................................1424

Stefan A. Robila and Gerald Busardo

Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel

Architectures with DPLASMA ................................................................................................................................1432

George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar,

Thomas Herault, Jakub Kurzak, Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek,

Asim YarKhan, and Jack Dongarra

Exploiting Multi-core Architectures in Clusters for Enhancing the Performance of

the Parallel Bootstrap Simulation Algorithm ...........................................................................................................1442

César A.F. De Rose, Paulo Fernandes, Antonio M. Lima, Afonso Sales, and Thais Webber

Session 6: Distributed Computing and Middleware

An Efficient Search Algorithm without Memory for Peer-to-Peer Cloud Computing

Networks ...................................................................................................................................................................1452

Naixue Xiong, Yuhua Liu, Shishun Wu, Laurence T. Yang, and Kaihua Xu

Asynchronous Peer-to-peer Distributed Computing for Financial Applications .....................................................1458

Thierry Garcia, Ming Chau, The Tung Nguyen, Didier El-Baz, and Pierre Spiteri

The Large Scale Data Facility: Data Intensive Computing for Scientific Experiments ...........................................1467

Ariel O. García, Serguei Bourov, Ahmad Hammad, Jos van Wezel, Bernhard Neumair,

Achim Streit, Volker Hartmann, Thomas Jejkal, Patrick Neuberger, and Rainer Stotzka

Design and Performance of a Scalable, Parallel Statistics Toolkit ...........................................................................1475

Philippe Pébay, David Thompson, Janine Bennett, and Ajith Mascarenhas

Dependable Parallel, Distributed and Network-Centric Systems (DPDNS 2011)

DPDNS Introduction ................................................................................................................................................1485

Dimiter Avresky and Erik Maehle

DPDNS Keynote .......................................................................................................................................................1487

Franck Cappello

xxi

Page 19: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Network Algorithms

Solving k-Set Agreement with Stable Skeleton Graphs ...........................................................................................1488

Martin Biely, Peter Robinson, and Ulrich Schmid

Compact Route Computation: Improving Parallel BGP Route Processing for Scalable

Routers ......................................................................................................................................................................1496

Xuezhi Jiang, Mingwei Xu, and Qi Li

Towards Persistent Connections Using Failure Detectors ........................................................................................1502

Naohiro Hayashibara

Cloud Computing

A Monitoring and Audit Logging Architecture for Data Location Compliance

in Federated Cloud Infrastructures ...........................................................................................................................1510

Philippe Massonet, Syed Naqvi, Christophe Ponsard, Joseph Latanicki, Benny Rochwerger,

and Massimo Villari

Dependable Autonomic Cloud Computing with Information Proxies .....................................................................1518

D. Cenk Erdil

A Fault-Tolerant High Performance Cloud Strategy for Scientific Computing .......................................................1525

Ekpe Okorafor

High Performance Computing

Evaluation of Simple Causal Message Logging for Large-Scale Fault Tolerant HPC

Systems .....................................................................................................................................................................1533

Esteban Meneses, Greg Bronevetsky, and Laxmikant V. Kalé

Algorithm-Based Recovery for Newton’s Method without Checkpointing .............................................................1541

Hui Liu, Teresa Davies, Chong Ding, Christer Karlsson, and Zizhong Chen

Building a Fault Tolerant MPI Application: A Ring Communication Example ......................................................1549

Joshua Hursey and Richard L. Graham

Failure Analysis

Predicting Node Failure in High Performance Computing Systems from Failure

and Usage Logs ........................................................................................................................................................1557

Nithin Nakka, Ankit Agrawal, and Alok Choudhary

Achieving Target MTTF by Duplicating Reliability-Critical Components in High

Performance Computing Systems ............................................................................................................................1567

Nithin Nakka, Alok Choudhary, Gary Grider, John Bent, James Nunez, and Satsangat Khalsa

xxii

Page 20: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

International Workshop on Hot Topics in Peer-to-Peer Systems (HOTP2P 2011)

HOTP2P Introduction ...............................................................................................................................................1577

Jiannong Cao

P2P Applications

Controlling P2P Applications via Address Harvesting: The Skype Story ...............................................................1579

Anat Bremler-Barr, Omer Dekel, Ran Goldschmidt, and Hanoch Levy

SeederTrading: Trading Swarm Capacity for Improving Content Distribution .......................................................1587

HyunYong Lee, Masahiro Yoshida, and Akihiro Nakao

Virtual Direction Multicast for Overlay Networks ...................................................................................................1595

Suat Mercan and Murat Yuksel

P2P Optimization I

Parallel Processing Framework on a P2P System Using Map and Reduce Primitives ............................................1602

Kyungyong Lee, Tae Woong Choi, Arijit Ganguly, David I. Wolinsky, P. Oscar Boykin,

and Renato Figueiredo

SPUN: A P2P Probabilistic Search Algorithm Based on Successful Paths

in Unstructured Networks .........................................................................................................................................1610

D.M. Rasanjalee Himali and Sushil K. Prasad

P2P Optimization II

Performance Prediction in a Decentralized Environment for Peer-to-Peer Computing ...........................................1618

Bogdan Florin Cornea, Julien Bourgeois, The Tung Nguyen, and Didier El-Baz

Betweenness Centrality Approximations for an Internet Deployed P2P Reputation

System ......................................................................................................................................................................1627

Dimitra Gkorou, Johan Pouwelse, and Dick Epema

When KAD Meets BitTorrent - Building a Stronger P2P Network .........................................................................1635

Juan Pablo Timpanaro, Thibault Cholez, Isabelle Chrisment, and Olivier Festor

P2P Security

An Adaptive Response Routing Mechanism to Improve DHT Performance in

the Presence of NATs ...............................................................................................................................................1643

David A. Bryan

A New Protocol to Determine the NAT Characteristics of a Host ...........................................................................1651

Sebastian Holzapfel, Matthäus Wander, Arno Wacker, Lorenz Schwittmann, and Torben Weis

xxiii

Page 21: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Workshop on Multi-Threaded Architectures and Applications (MTAAP 2011)

MTAAP Introduction ...............................................................................................................................................1659

Luiz DeRose and Jeffrey Vetter

Programming Models

Using Hardware Transactional Memory for High-Performance Computing ...........................................................1660

Karl Ljungkvist, Martin Tillenius, David Black-Schaffer, Sverker Holmgren, Martin Karlsson,

and Elisabeth Larsson

A Micro Threading Based Concurrency Model for Parallel Computing .................................................................1668

Qiang Yang, C.R. Jesshope, and Jian Fu

Comparison of Parallel Programming Models for Multicore Architectures ............................................................1675

Deepak Shekhar T.C., Kiran Varaganti, Rahul Suresh, Rahul Garg, and Ramalingam Ramamoorthy

Frameworks

An Extended Work-Stealing Framework for Mixed-Mode Parallel Applications ...................................................1683

Martin Wimmer and Jesper Larsson Träff

Tracking Structure of Streaming Social Networks ...................................................................................................1691

David Ediger, Jason Riedy, David A. Bader, and Henning Meyerhenke

Toward Parallel Document Clustering .....................................................................................................................1700

Jace A. Mogill and David J. Haglin

Algorithms and Applications

Evaluating In-Clique and Topological Parallelism Strategies for Junction Tree-Based

Bayesian Network Inference Algorithm on the Cray XMT .....................................................................................1710

George Chin Jr., Sutanay Choudhury, Lars Kangas, Sally McFarlane, and Andres Marquez

High Performance Data Mining Using R on Heterogeneous Platforms ...................................................................1720

Prabhat Kumar, Berkin Ozisikyilmaz, Wei-Keng Liao, Gokhan Memik, and Alok Choudhary

Bandwidth Reduction through Multithreaded Compression of Seismic Images ......................................................1730

Ahmed A. Aqrawi and Anne C. Elster

An Efficient k-Means Algorithm on CUDA ............................................................................................................1740

Jiadong Wu and Bo Hong

Workshop on Large-Scale Parallel Processing (LSPP 2011)

LSPP Introduction ....................................................................................................................................................1750

Darren J. Kerbyson, Ram Rajamony, and Charles Weems

xxiv

Page 22: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session 1: Hybrid and I/O

Parallel Sparse Matrix-Vector Multiplication as a Test Case for Hybrid MPI+OpenMP

Programming ............................................................................................................................................................1751

Gerald Schubert, Georg Hager, Holger Fehske, and Gerhard Wellein

HIO: A Library for High Performance I/O and Data Management ..........................................................................1759

William W. Dai

Session 2: Multi-threaded Processors

Parallel Processor Core for Semantic Search Engines .............................................................................................1767

Suneil Mohan, Aalap Tripathy, Amitava Biswas, and Rabi Mahapatra

MTASC: A Multithreaded Associative SIMD Processor .........................................................................................1776

Kevin Schaffer and Robert A. Walker

High Precision Integer Multiplication with a GPU ..................................................................................................1781

Niall Emmart and Charles Weems

Session 3: Distributed Systems

Distributed B&B: A Pure Peer-to-Peer Approach ....................................................................................................1788

Mathieu Djamaï, Bilel Derbel, and Nouredine Melab

An Adaptive Framework for Large-Scale State Space Search .................................................................................1798

Yanhua Sun, Gengbin Zheng, Pritish Jetley, and Laxmikant V. Kalé

Fault-Tolerant Mechanism for Hierarchical Branch and Bound Algorithm ............................................................1806

A. Bendjoudi, N. Melab, and E.-G. Talbi

Session 4: Large-Scale Systems

Hierarchical Mapping for HPC Applications ...........................................................................................................1815

I-Hsin Chung, Che-Rung Lee, Jiazheng Zhou, and Yeh-Ching Chung

A Performance Model of Direct Numerical Simulation for Analyzing Large-Scale

Systems .....................................................................................................................................................................1824

Darren J. Kerbyson and Kevin J. Barker

Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application

Codes ........................................................................................................................................................................1831

Courtenay Vaughan, Mahesh Rajan, Richard Barrett, Doug Doerfler, and Kevin Pedretti

Workshop on Desktop Grids and Volunteer Computing Systems (PCGRID 2011)

PCGRID Introduction ...............................................................................................................................................1838

Gilles Fedak and Derrick Kondo

xxv

Page 23: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Session I: Resource Management and Scheduling

Emulating Volunteer Computing Scheduling Policies .............................................................................................1839

David P. Anderson

Distributed Results Checking for MapReduce in Volunteer Computing .................................................................1847

Mircea Moca, Gheorghe Cosmin Silaghi, and Gilles Fedak

Volunteer Cloud Computing: MapReduce over the Internet ....................................................................................1855

Fernando Costa, Luis Silva, and Michael Dahlin

Attic: A Case Study for Distributing Data in BOINC Projects ................................................................................1863

AbdelHamid Elwaer, Andrew Harrison, Ian Kelley, and Ian Taylor

Session II: Security and Participation

How to Make BOINC-Based Desktop Grids Even More Popular? .........................................................................1871

Peter Kacsuk

Increasing Participation in Volunteer Computing ....................................................................................................1878

David Toth, Russell Mayer, and Wendy Nichols

DGVCS Security from a Different Perspective: Challenges and Hurdles ...............................................................1883

Tobias Dussa

Session III: Applications and Infrastructure

Evolving N-Body Simulations to Determine the Origin and Structure of the Milky

Way Galaxy’s Halo Using Volunteer Computing ....................................................................................................1888

Travis Desell, Malik Magdon-Ismail, Boleslaw Szymanski, Carlos A. Varela, Benjamin A. Willett,

Matthew Arsenault, and Heidi Newberg

Computing the Tree of Life: Leveraging the Power of Desktop and Service Grids ................................................1896

Adam L. Bazinet and Michael P. Cummings

libboincexec: A Generic Virtualization Approach for the BOINC Middleware ......................................................1903

Diogo Ferreira, Filipe Araujo, and Patricio Domingues

Building an Online Computing Service over Volunteer Grid Resources .................................................................1909

Mark Silberstein

Parallel Computing and Optimization (PCO 2011)

PCO Introduction ......................................................................................................................................................1918

Didier El Baz

Session I: Issues in Optimization of Parallel and Distributed Computing Systems

On-Line Optimization of Publish/Subscribe Overlays .............................................................................................1919

Eddy Caron, Benjamin Depardon, Ajoy K. Datta, and Lawrence L. Larmore

xxvi

Page 24: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Performance Optimization with Energy Constraint in Heterogeneous Multiple

Computer Systems ....................................................................................................................................................1930

Keqin Li

Session II: Parallel Optimization Algorithms and Applications

A Parallel Exact Solver for the Three-Index Quadratic Assignment Problem .........................................................1940

François Galea and Bertrand Le Cun

Distributed Bees Foraging-Based Algorithm for Large-Scale Problems .................................................................1950

Antonio Gómez-Iglesias, Francisco Castejón, and Miguel A. Vega-Rodríguez

Communication in Parallel Algorithms for Constraint-Based Local Search ............................................................1961

Yves Caniou and Philippe Codognet

Session III: Combinatorial Scientific Computing

Distributed-Memory Parallel Algorithms for Matching and Coloring .....................................................................1971

Ümit V. Çatalyürek, Florin Dobrian, Assefaw Gebremedhin, Mahantesh Halappanavar,

and Alex Pothen

Obtaining Simultaneous Equation Models through a Unified Shared-Memory Scheme

of Metaheuristics ......................................................................................................................................................1981

Francisco Almeida, Domingo Giménez, and Jose J. López-Espín

Session IV: New Trends in Parallel Computing for Linear Programming and GlobalOptimization

Automated, Parallel Optimization Algorithms for Stochastic Functions .................................................................1989

Dheeraj Chahal, Steven J. Stuart, Sebastian Goasguen, and Colin J. Trout

Efficient Implementation of the Simplex Method on a CPU-GPU System .............................................................1999

Mohamed Esseghir Lalami, Vincent Boyer, and Didier El-Baz

Workshop on Future Approaches to Data Centric Programming for Exascale (DCPM2011)

DCPM Introduction ..................................................................................................................................................2007

Adolfy Hoisie, Darren J. Kerbyson, and T.P. Straatsma

PhD Forum

Introduction ..............................................................................................................................................................2008

Luc Bougé

An Algorithm-Based Recovery Scheme for Extreme Scale Computing ..................................................................2010

Hui Liu

Lightweight Methods for Automated Design of Self-Stabilization ..........................................................................2014

Aly Farahat and Ali Ebnenasir

Communication Optimization Beyond MPI .............................................................................................................2018

Andrew Friedley and Andrew Lumsdaine

xxvii

Page 25: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Efficient Agreement Protocols in Asynchronous Distributed Systems ....................................................................2022

Izabela Moise

Efficient Verification Solutions for Message Passing Systems ................................................................................2026

Subodh Sharma and Ganesh Gopalakrishnan

Decentralized Network Bandwidth Prediction and Node Search .............................................................................2030

Sukhyun Song

Large-Scale Parallel Monte Carlo Tree Search on GPU ..........................................................................................2034

Kamil Rocki and Reiji Suda

Memory-Aware Algorithms and Scheduling Techniques: From Multicore Processors

to Petascale Supercomputers ....................................................................................................................................2038

Mathias Jacquelin

Memory Hierarchy Aware Parallel Priority Based Data Structures .........................................................................2042

Dinesh Agarwal

Parallel Algorithms for Bayesian Networks Structure Learning with Applications

to Systems Biology ...................................................................................................................................................2045

Olga Nikolova

Fault Tolerant Data Acquisition through Dynamic Load Balancing ........................................................................2049

Michal Simon

A Codesigned Fault Tolerance System for Heterogeneous Many-Core Processors ................................................2053

Keun Soo Yim and Ravishankar K. Iyer

Towards a Storage Backend Optimized for Atomic MPI-I/O for Parallel Scientific

Applications ..............................................................................................................................................................2057

Viet-Trung Tran

Programming Heterogeneous Systems .....................................................................................................................2061

David M. Kunzman and Laxmikant V. Kalé

Data Parallel Programming Model for Many-Core Architectures ...........................................................................2065

Yongpeng Zhang

Detection and Correction of Silent Data Corruption for Large-Scale High-Performance

Computing ................................................................................................................................................................2069

David Fiala

Improving Job Scheduling on Production Supercomputers .....................................................................................2073

Wei Tang, Zhiling Lan, and Narayan Desai

Towards a Self-Adaptive Data Management System for Cloud Environments .......................................................2077

Alexandra Carpen-Amarie

An Integrated Scratch Management Service for HPC Centers .................................................................................2081

Henry M. Monti

Policy Based Data Placement in High Performance Scientific Computing .............................................................2085

Muhammad Ali Amer

xxviii

Page 26: 2011 IEEE International Symposium on Parallel and ...toc.proceedings.com/12697webtoc.pdf · 2011 IEEE International Symposium on Parallel and ... Roberto Pérez-Andrade, ... and Haroldo

Automatic Generation of Executable Communication Specifications from Parallel

Applications ..............................................................................................................................................................2089

Xing Wu, Frank Mueller, and Scott Pakin

Scout: High-Performance Heterogeneous Computing Made Simple .......................................................................2093

James A. Jablin, Patrick McCormick, and Maurice Herlihy

Building Dynamic Computing Infrastructures over Distributed Clouds ..................................................................2097

Pierre Riteau

Error Correction and Clustering Algorithms for Next Generation Sequencing .......................................................2101

Xiao Yang

Performance Analysis of Long-Running Applications ............................................................................................2105

Zoltán Szebenyi, Felix Wolf, and Brian J.N. Wylie

p_2Matlab: Productive Parallel Matlab for the Exascale .........................................................................................2109

Vipin Sachdeva

A Framework for Automated Performance Tuning and Code Verification on GPU

Computing Platforms ................................................................................................................................................2113

Allison S. Gehrke, Ilkyeun Ra, and Daniel A. Connors

Author Index - Workshops

xxix