high performance computing with the cell broadband...

Scientific Programming 17 (2009) 1–2
DOI 10.3233/SPR-2009-0279
IOS Press

Guest Editorial

High Performance Computing with the Cell Broadband Engine

The Cell Broadband Engine was conceived to enable the design of novel and highly efficient systems for compute-intensive applications. The Cell/B.E. departs from prior architectures by adopting a heterogeneous chip multiprocessor architecture with novel accelerator cores and an explicitly managed memory hierarchy. The increased computing density of the design improves peak performance as well as power efficiency.

Beyond the Sony PlayStation 3, its first target system, the Cell Broadband Engine architecture has been used in systems ranging from compute-intensive embedded systems to high-end compute servers, demonstrating the versatility and power of its new concept. In 2008 the IBM RoadRunner supercomputer, incorporating nearly 13,000 Cell/B.E. processors, became the first system in the world to exceed the 1 Petaflop per second performance barrier as measured by the TOP500 list of the world’s most powerful supercomputers. In the same year, Cell-based systems were also the world’s most power-efficient systems as measured by the Green 500 list.

The Cell/B.E.’s departure from a traditional homogeneous SMP system design with hardware-managed caches changes algorithms, programming models, compilers and run-time systems. The Cell/B.E. architecture offers new ways for software to control hardware resources, which challenge software architects and algorithm researchers to rethink algorithms and computing paradigms to exploit its greatly increased performance potential.

Peak performance is obtained when vector operations are fully utilized in each accelerator core (SPE), all cores are simultaneously fully engaged, and communication is overlapped with computation using bulk DMA transfers between main memory and the SPEs. This benefits from a contiguous data layout of arrays, as bulk DMA transfers are most efficient on contiguous data residing in memory. Such a layout for a 2-D array is square block format. This choice leads to very high Cell/B.E. performance of kernel routines that operate on these square block subarrays simultaneously in the SPEs. In particular, this issue describes nearly perfectly performing matrix multiplication kernel routines. By exploiting these array layouts and overlapping Cell/B.E. bulk DMA transfers with computation, so that kernel routines always have their new submatrix operands in their SPE local stores ready for their next invocations, the Cell/B.E. usually reaches its full potential on the most important dense linear algebra algorithms.
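To make the square block format concrete, the following is a minimal sketch in Python, not code from any paper in this issue. It assumes an n × n matrix stored as a flat row-major list with n divisible by the block size nb; the names to_square_blocks, block_offset and multiply_blocked are hypothetical. Each nb × nb block becomes one contiguous run of nb·nb elements, which is the kind of region a single bulk DMA transfer could move into an SPE local store. The sketch shows only the layout and a block-wise multiplication; DMA transfers, their overlap with computation, and SIMD vectorization are not modeled.

```python
def to_square_blocks(a, n, nb):
    """Repack a flat row-major n x n matrix so each nb x nb block is contiguous.

    Blocks are stored in row-major block order; elements inside a block
    are stored row-major as well. Assumes nb divides n.
    """
    assert n % nb == 0, "this sketch assumes nb divides n"
    blocks_per_side = n // nb
    out = []
    for bi in range(blocks_per_side):
        for bj in range(blocks_per_side):
            for i in range(nb):
                row_start = (bi * nb + i) * n + bj * nb
                out.extend(a[row_start:row_start + nb])
    return out


def block_offset(bi, bj, n, nb):
    """Index of the first element of block (bi, bj) in block format."""
    return (bi * (n // nb) + bj) * nb * nb


def multiply_blocked(a_blk, b_blk, n, nb):
    """Return C = A * B, with A, B and C all held in square block format."""
    c_blk = [0.0] * (n * n)
    blocks_per_side = n // nb
    for bi in range(blocks_per_side):
        for bj in range(blocks_per_side):
            c0 = block_offset(bi, bj, n, nb)
            for bk in range(blocks_per_side):
                a0 = block_offset(bi, bk, n, nb)
                b0 = block_offset(bk, bj, n, nb)
                # Kernel: every operand tile is one contiguous run of
                # nb * nb elements starting at a0, b0 or c0.
                for i in range(nb):
                    for k in range(nb):
                        aik = a_blk[a0 + i * nb + k]
                        for j in range(nb):
                            c_blk[c0 + i * nb + j] += aik * b_blk[b0 + k * nb + j]
    return c_blk
```

In this layout the kernel never strides across rows of the full matrix: each tile it reads or updates is a single contiguous slice, so on real hardware the operand tiles for the next kernel invocation could in principle be fetched while the current one runs.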

The ultimate measure of the success of an architecture is the applications that it enables and the scientific breakthroughs that become possible with those applications. Thus, the spotlight in this issue is on innovative algorithms, programming models and application architectures to exploit this new hardware design.

The papers included in this special issue fall in three categories, covering performance engineering for key algorithms and library functions; programming models and runtime systems; and, most importantly, applications that build on these foundations to enable scientific discoveries across a wide range of disciplines.

We received 42 submissions from 16 countries in response to our call for papers. To select the best of these papers, we relied on an Editorial Board consisting of experts covering all areas of high-performance computing as well as on external subject matter experts. Together, they produced 138 reviews, or an average of over 3 reviews per paper. Based on these reviews, we selected twelve papers for this special issue.

In the performance engineering and library development category:

– Implementing a Parallel Matrix Factorization Library on the Cell Broadband Engine

– QR Factorization for the CELL Processor

– Programming the Linpack Benchmark for the IBM PowerXCell 8i Processor

1058-9244/09/$17.00 © 2009 – IOS Press and the authors. All rights reserved

In the programming models and runtime systems category:

– Available Task-level Parallelism on the Cell/B.E.

– CellSs: scheduling techniques to better exploit memory hierarchy

In the applications category:

– High Performance Protein Sequence Database Scanning on the Cell/B.E. Processor

– Building High-Resolution Sky Images using the Cell/B.E.

– Implementation of scientific computing applications on the Cell Broadband Engine

– Efficient SIMDization and Data Management of the Lattice QCD Computation on the Cell Broadband Engine

– Streaming Model Based Volume Ray Casting Implementation for Cell Broadband Engine

– 3D Seismic Imaging through Reverse-Time Migration on Homogeneous and Heterogeneous Multi-core Processors

– Implementation and Performance Modeling of Deterministic Particle Transport (Sweep3d) on the IBM Cell/B.E.

We want to thank all authors for their excellent contributions, only a few of which we were able to accommodate in this special issue. We would like to thank the members of the Editorial Board and external experts who guided us in the selection of these papers for their many excellent reviews, expert suggestions and insights.

Finally, we want to thank the Editor-in-Chief of Scientific Programming, Prof. Boleslaw Szymanski, who guided us through the entire process of preparing the issue with encouragement and expert advice.

We hope you have as much enjoyment in reading this issue as we had in preparing it!

Michael Gschwind

Fred Gustavson

Jan F. Prins

December 2008
