
Distribution Category:

Mathematics and Computers

(UC-32)

ANL-86-34

DE87 000282

ARGONNE NATIONAL LABORATORY

9700 South Cass Avenue

Argonne, Illinois 60439

Activities and Operations of the Advanced Computing Research Facility

Tina Mihaly and Gail W. Pieper

Mathematics and Computer Science Division
Argonne National Laboratory

Argonne, Illinois

January 1985 - July 1986

This work was supported by the Applied Mathematical Sciences subprogram of the Office of Energy Research of the U.S. Department of Energy, under contract W-31-109-Eng-38.


Contents

1. Summary

2. Installations
   2.1 Four New Multiprocessors
   2.2 Other Computer Systems

3. Operations

4. User Facility Activities
   4.1 Projects
   4.2 Proposals
   4.3 User Education
   4.4 Visitors
   4.5 Seminars

5. Advanced Scientific Computing Research in the ACRF
   5.1 Algorithms and Software
   5.2 Parallel Programming Methodologies
   5.3 Programming Languages
   5.4 Advanced Computer Architectures

6. Publications

References

Appendix


Staff of the Advanced Computing Research Facility


From left to right, the ACRF staff members are as follows:

Ross Overbeek. An expert in automated deduction systems, Ross is actively involved in developing a parallel Warren Abstract Machine for logic programming. Together with his colleagues, he has extended Warren's original design to exploit the availability of multiprocessing systems and the design of new inference engines.

Jack Dongarra, ACRF Scientific Director. Jack's primary research interests are devising linear algebra algorithms and developing portable algorithms for high-performance computers. Jack is also working with researchers at the Center for Supercomputing Research and Development at the University of Illinois. He has been involved in the LINPACK and EISPACK projects and is codesigner of the Level 2 BLAS, which exploit the special features of advanced architectures.


Eugene Rackow. Gene is responsible for maintaining the software and hardware for the ACRF computers and the Division's VAX 11/780. Working with the computer vendors, he resolves problems that may arise with the new systems. Gene answers technical questions and helps set up new accounts for ACRF computer users.

Ewing Lusk, ACRF Deputy Scientific Director. Focusing on parallel programming methodologies, Ewing was one of the developers of a new approach that implements synchronization through monitors written as macros. At present, he is investigating the use of logic programming as a language to control parallelism in numerical programs.

Teri Huml. As ACRF secretary, Teri coordinates the numerous seminars, workshops, and classes that staff members offer to encourage use of the ACRF computers. She also handles requests for information about the ACRF and mails out appropriate documentation to interested users.

Danny Sorensen. Danny was instrumental in starting the Division's research program in advanced computing methods; his primary concern has been to devise algorithms that achieve both portability and high performance. Toward this goal, he and a colleague have designed new algorithms and restructured existing ones for parallel computers. Danny is also working with researchers at the University of Illinois Center for Supercomputing Research and Development on applications such as circuit simulation on advanced computer architectures.

Rick Stevens (not pictured). Rick oversees the daily operations of the ACRF. His particular concerns are documentation and system utilities. Rick also conducts performance measurements of the machines in the ACRF; for example, he is currently benchmarking the Sequent 21000, for which Argonne is serving as a beta test site.


Activities and Operations of the Advanced Computing Research Facility

prepared by

Tina Mihaly and Gail W. Pieper

1. Summary

This report discusses research activities and operations of the Advanced Computing Research Facility (ACRF) at Argonne National Laboratory from January 1985 through June 1986. During this period, the Mathematics and Computer Science Division (MCS) at Argonne received incremental funding from the Applied Mathematical Sciences program of the DOE Office of Energy Research to operate computers with innovative designs that promise to be useful for advanced scientific research. Over a five-month period, four new commercial multiprocessors (an Encore Multimax, a Sequent Balance 21000, an Alliant FX/8, and an Intel iPSC/d5) were installed in the ACRF, creating a new wave of research projects concerning computer systems with parallel and vector architectures. The Lemur, a locally designed multiprocessor, continues to be available for experimentation; the HEP, used for numerous studies during 1984-1985, was returned to Denelcor, Inc., in October 1985.

To ensure effective use of the ACRF computers, MCS established the positions of scientific director and deputy scientific director, and added support staff to coordinate ACRF activities.

Since the inception of the ACRF, the Division has sponsored a variety of classes, workshops, and seminars to train researchers on computing techniques for the advanced computer systems at the Advanced Computing Research Facility. In 1985, we taught courses on the use of parallel MIMD (multiple-instruction multiple-data) computers, focusing on the HEP in particular. More recently, we have offered three classes on writing programs for parallel computer systems and a symposium for Chicago-area scientists and engineers active in scientific computing research or management.


A two-day workshop on programming languages for advanced architecture computers was held on June 30 and July 1, 1986. This forum for information exchange among commercial vendors of advanced computers was aimed at facilitating the rapid evolution of languages for advanced computers with parallel or vector capabilities.

2. Installations

Since establishing the ACRF in 1984, we have continued to select and operate computers with innovative designs that are likely to be effective for a wide range of scientific computing tasks. We will continue to upgrade our existing computers and plan to add new ones with unique capabilities as they become available.

2.1. Four New Multiprocessors

Since late 1985, the Advanced Computing Research Facility has acquired four new multiprocessors. The Encore Multimax, installed in December, features 20 processors sharing 20 megabytes of memory. Its system architecture will permit adding a "cluster" feature, which would enable groups of processors that directly share memory to be further connected to each other. A Sequent Balance 8000, with 12 processors and 16 megabytes of shared memory, was installed in January 1986. Recently, the Sequent Balance 8000 was upgraded to a Sequent 21000, with 24 processors sharing 16 megabytes of memory. Both the Encore and the Sequent are used to investigate parallel programming methodologies, to study automated reasoning, to develop performance measurement techniques, and to develop tools for program transformation.

Figure 1. The Encore Multimax, installed in December 1985, is one of four new multiprocessor systems added to the ACRF in this past year.

An Intel iPSC/d5, installed in March 1986, features a five-dimensional hypercube architecture with 32 nodes, each having a 0.5-megabyte memory. The hypercube design enables researchers to develop algorithms for processors that communicate through message passing rather than shared memory. This system will be enhanced in the fall of 1986 with 16 vector processors. The fourth multiprocessor, added to the ACRF in March 1986, is an Alliant FX/8 system. The Alliant features 8 vector processors sharing 32 megabytes of memory.


The vector capability of this machine makes it ideal for research in high-performance numerical computations. All the multiprocessors run a version of Unix, with Fortran and C compilers. In addition, the Sequent Balance 21000 is equipped with Pascal and Ada compilers, and the Encore Multimax has a Pascal compiler.
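
The contrast between the iPSC's message-passing style and the shared memory of the other three machines can be illustrated with a small example. The sketch below uses today's MPI library, which postdates this report, in place of the iPSC node calls of the time; the program and the names in it are ours, not the facility's.

    /* Minimal sketch of the message-passing style: each node computes a
     * partial result and sends it to node 0; no memory is shared between
     * nodes.  Written with modern MPI, not the 1986 iPSC node library. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double partial = (double)rank;          /* stand-in for local work */

        if (rank != 0) {
            /* send the local result to node 0 (message tag 0) */
            MPI_Send(&partial, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        } else {
            double total = partial, incoming;
            for (int src = 1; src < size; src++) {
                MPI_Recv(&incoming, 1, MPI_DOUBLE, src, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                total += incoming;
            }
            printf("sum over %d nodes = %g\n", size, total);
        }

        MPI_Finalize();
        return 0;
    }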

The machines are connected to the Argonne intralaboratory network, enabling users to take advantage of the available graphics output equipment, tape drives, and mass storage. All four experimental computers are linked to each other and to the MCS Division's VAX 11/780 dual-processor system on an Ethernet-based local area network (LAN) using the TCP/IP protocols. Several other MCS Division systems (e.g., Sun Model 3 and Ridge 32) and Argonne's MFENET Gateway VAX are also on this LAN. Access to TYMNET is available through terminal concentrators. The VAX is also a host on the ARPANET/MILNET. In addition, the VAX is connected via local network to Argonne's central computing facility.


Figure 2. The four ACRF multiprocessors are linked directly to the MCS Division's VAX 11/780 and indirectly to ARPANET/MILNET, TYMNET, and MFENET.

2.2. Other Computer Systems

As described in earlier progress reports (see References 2 and 3), the Division installed a Denelcor HEP computer in 1984. The HEP, used for studies in linear algebra, automated reasoning, graphics, transformation systems, and fluid flow modeling, was returned to Denelcor in October of 1985.

In 1985, we added the Lemur, a locally designed and built parallel computer with eight processors sharing eight megabytes of memory. The Lemur, attached to the MCS Division's VAX 11/780, successfully ran demonstration programs verifying full parallelism and synchronization via interlocked memory-to-memory transfers. We wrote a small operating system for the Lemur to run on the host. At present, the Lemur is available for experimental use.

3. Operations

During January 1986, the Mathematics and Computer Science Division established several new positions in the ACRF. Jack Dongarra and Ewing Lusk, computer scientists in the MCS Division, were appointed scientific director and deputy scientific director, respectively.


Dongarra and Lusk formulate the research directions of the Advanced Computing Research Facility, plan for the selection and installation of other computers in the ACRF, encourage interactions with outside researchers, establish guidelines and procedures for obtaining time on the ACRF computers, and formulate policy for support services (e.g., consulting and documentation). Other duties include arranging seminars and coordinating symposia on high-performance computing.

Eugene Rackow and Rick Stevens oversee the day-to-day operations of the computing facility, such as maintaining software and hardware, preparing documentation, establishing network connections, and coordinating system administration (e.g., user accounts and file allocation). Acting as the liaison between the various computer vendors and the ACRF, Rackow resolves problems that arise with the hardware and software, and answers technical questions that ACRF system users may have. Stevens keeps track of the documentation and conducts performance measurements of the new machines.

4. User Facility Activities

The Mathematics and Computer Science Division encourages the use of the ACRF's specialized resources. The ACRF is intended to be a national user facility focused on carrying out advanced computing research. Judging from the breadth of research being conducted at present and the variety and scope of the new proposals the MCS Division has received from computing scientists interested in studying advanced computers, the Advanced Computing Research Facility is well on its way to becoming an international resource.

4.1. Projects

During the past eighteen months, the ACRF gained many new users. The following list gives the names, affiliations, and research projects of these new ACRF participants. Refer to two previous reports (References 2 and 3) for information on completed research projects.

K. A. Ariyawansa - Washington State University
Parallel algorithms for stochastic linear programming

R. Babb - Oregon Graduate Center
Programming system for multiprocessors

M. W. Berry - University of Illinois at Urbana
Parallel algorithms for finite element structural analysis

Shahid Bokhari - Institute for Computer Applications in Science and Engineering
A parallel tree search algorithm for the bin packing problem

Barbara Bonar - Queens University of Belfast, Ireland
Checking example applications programs on the hypercube machine

Ralph Butler - University of North Florida
Parallel version of the Warren Abstract Machine

R. M. Butler and A. R. DeKock - University of North Florida
Design and implementation of parallel subsumption algorithms

Julio Diaz - University of Oklahoma
Parallel algorithms for oil reservoir simulation


Terry Disz - University of Illinois at Chicago
Parallel graphics

Jeremy DuCroz - Numerical Algorithm Group Ltd., Oxford, England
Design of the Level 2 BLAS

Iain Duff - Atomic Energy Research Establishment, Harwell, England
Parallel implementation of multifrontal schemes

David A. Friedman - High Energy Physics Division, Argonne National Laboratory
Interaction of a GUT magnetic monopole with a hydrogen atom

M. Heath - Oak Ridge National Laboratory
Pipelined Givens method for the QR factorization of a sparse matrix

Lennart Johnsson - Yale University
Solving banded systems of linear equations in parallel

James Kohl - Purdue University
Benchmarking timing studies

Brad Lucier - Purdue University
Parallel algorithms for nonlinear evolution equations

Peter Mayes - Numerical Algorithm Group, Ltd.
Portable software for optimization and Fast Fourier Transforms for parallel machines

William Newman - Hastings College
Parallel algorithms for number theoretic quadrature rules

Jorge Nocedal - Northwestern University
Eigenvalue problems

Robert Olson - Lisle Sr. High School
Parallel version of the Warren Abstract Machine

Lothar Reichel - University of Kentucky
Developing algorithms for the unitary eigenvalue problem

Jeff Scroggs - University of Illinois at Chicago
Parallel algorithms for computational fluid dynamics

Chris Thompson - Numerical Algorithm Group, Inc.
Resource allocation within a prototypical linear algebra library

N. S. Vlachos - University of Illinois at Urbana
Numerical modeling of turbulent transport problems


4.2. Proposals

As classes and workshops familiarize researchers with the new machines in the ACRF, potential users propose new applications and techniques to implement on our advanced computers. The first step in obtaining access to the high-performance computers of the ACRF is to submit an informal proposal to the reviewers in the MCS Division. Contact the ACRF scientific director for more information on the submission of proposals. Listed below are the titles of the latest proposals received by the MCS Division, along with the authors and their affiliations.

Argonne Users

Sam Bowen - Materials Science and Technology Division
Properties of Electronic States in Solids

Charlotte Fischer - Physics Division
Testing Multiprocessor Algorithms

James Kennedy - Reactor Analysis and Safety Division
Explicit Finite Element Structural Codes on the Hypercube

Off-Site Users

Kenneth Summers - Air Force Weapons Laboratory/SIP
Applicability of Multiprocessors to Air Force Weapons Laboratory Computation Needs

Daniel Lin - Bell Communications Research
Analysis and Reconstruction of Auditory Signals

John Lewis - Boeing Computer Services
Sparse Matrix Operations on Parallel Architectures

Paul Concus - Lawrence Berkeley Laboratory
Choosing Multiprocessors for DOE AMS Research

Dan Pierce - North Carolina State University
Parallel Methods to Solve the Least Squares Problem

R. Bruce Mattingly - North Carolina State University
Development of Parallel Numerical Linear Algebra Algorithms

Karl Knapp and Jeff Kvam - Numerical Algorithms Group, Inc.
The NAG Library for Multiprocessors

Robert Babb - Oregon Graduate Center
Use of Alliant for Teaching Parallel Processing

Vivek Sarkar - Stanford University, Computer Systems Laboratory
Partitioning Program Graphs for Execution on Multiprocessors


John de Pillis - University of California
Development of a Theory for Solving Large Sparse Linear Systems

Nora Sabelli - University of Illinois at Chicago
Modifying Large Quantum Chemistry Codes with New Algorithms

M. Muralidharan - University of Kentucky
Design and Development of Combinatorial Algorithms

Paul Reynolds - University of Virginia
Incremental Detection of Synchronization Errors

Richard Borgioli - Vitesse Electronics
Parallel Enhancement Mechanisms

4.3. User Education

Classes on parallel computing were held March 17-19, May 12-14, and June 16-18. The attendees, totaling over twenty per class, represented universities, industry, and various research laboratories located in Canada and in North Carolina, Louisiana, California, Virginia, New York, Kentucky, and other states. The intent of the classes was to familiarize the attendees with the ACRF environment, to offer ample hands-on experience on the parallel computer systems, and to apply parallel programming to each attendee's area of research. During the classes, the attendees were taught how to write and run several programs, with Fortran and C being the primary programming languages. Session topics included monitors and their implementation with macros, and the environment of multiprocessors such as the Sequent and Alliant. Additional classes will be scheduled for late summer and fall.

Figure 3. Participants in a recent parallel computing class listen as Ewing Lusk discusses the unique capabilities of the ACRF multiprocessors.


At the two-day workshop on programming languages for advanced-architecture computers, attendees from fourteen different major computer corporations represented industry; the others represented Argonne, universities, and various research centers. Other workshops planned by MCS include a summer institute on parallel programming for graduate and postdoctoral researchers and a workshop on performance evaluation of high-performance computers.

A one-day symposium on advanced computing for leaders in industrial and academic research in the Chicago area was held May 6, 1986. In various sessions, ACRF scientists discussed topics relating to advanced architectures, including portability issues in parallel programming, reasons why supercomputers are not as fast as advertised, the direction of research in the Mathematics and Computer Science Division, and computational physics on supercomputers. Of the nearly 60 persons who attended the symposium, 30 represented nearby universities, 10 represented research laboratories (AT&T Bell Laboratories, Fermilab, Borg-Warner Research Center, and Los Alamos National Laboratory), eight represented divisions at Argonne National Laboratory, and nine participants were from industry (AT&T Information Systems, Amphenol Products, Cray Research, Illinois Bell Telephone Company, and Numerical Algorithms Group, Inc.).

4.4. Visitors

The MCS Division invites scientists from industry, universities, and other research laboratories to participate in the various ACRF research projects. Graduate and undergraduate students, postdoctoral candidates, and faculty stay for periods ranging from two weeks to several months. Since January 1985, approximately 33 researchers have participated in ACRF-related projects. At the present time, twelve visiting scientists are carrying out advanced computing research at the ACRF. The visitors program creates an opportunity for scientists from outside Argonne to contribute to the development of software, programming methodologies, and algorithms for advanced computer architectures. Most important, MCS benefits from these interactions, sharing ideas and expertise with researchers throughout the world.

4.5. Seminars

We continue to offer a series of seminars on high-performance computing. Jointly, MCS and Computing Services have sponsored approximately 50 speakers during the past eighteen months. Topics have included attempts at parallelizing multigrid-type methods, solution of nonlinear collocation equations, asymptotics of eigenvalues for indefinite problems, algorithms for computational fluid dynamics on the hypercube, and implicit methods for numerical weather prediction. The Appendix contains a complete listing of seminar titles and speakers.

5. Advanced Scientific Computing Research in the ACRF

The advanced computing research program, focusing on parallel architecture, aims to create portable algorithms, software, and programming techniques for both numeric and reasoning tasks. The following sections highlight the advanced scientific computing research in the ACRF, which is divided into four areas: (1) algorithms and software, (2) parallel programming methodologies, (3) programming languages, and (4) advanced computer architectures.

5.1. Algorithms and Software

A major goal of our research is to create algorithms and software that achieve high performance and portability on advanced computer architectures. Part of our effort has focused on designing new parallel algorithms. For example, one significant achievement during this past year was creating an algorithm for solving the symmetric tridiagonal eigenvalue problem in a parallel setting.


Not only is this algorithm excellent on parallel architectures, but its performance is superior to the standard algorithm on sequential machines.

In addition to designing new parallel algorithms, we analyzed the most frequently used and time-consuming algorithms relating to eigenvalue problems and linear equations for dense matrices, then reorganized them to utilize matrix-vector operations. The results were impressive: we achieved supervector speeds on a CRAY X-MP and an Alliant FX/1 and, with minimal reprogramming, have been able to port the algorithms to diverse parallel processors such as the Sequent Balance 21000, the Encore Multimax, and the Alliant FX/8.
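
To make the reorganization concrete, the sketch below (ours, not the library code itself) writes the elimination step of dense LU factorization so that essentially all of its arithmetic falls into a single matrix-vector-class kernel, the rank-1 update known in the BLAS as GER; a vendor-tuned version of that one kernel then carries the performance to each machine.

    /* Sketch only: unblocked LU factorization without pivoting, organized so
       that nearly all the work is one matrix-vector-class kernel (a rank-1
       update of the trailing submatrix). */
    void lu_rank1(int n, double a[n][n])
    {
        for (int k = 0; k < n - 1; k++) {
            for (int i = k + 1; i < n; i++)        /* scale the pivot column  */
                a[i][k] /= a[k][k];
            for (int i = k + 1; i < n; i++)        /* rank-1 update of the    */
                for (int j = k + 1; j < n; j++)    /* trailing submatrix      */
                    a[i][j] -= a[i][k] * a[k][j];
        }
    }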

We also restructured algorithms for dealing with banded systems of linear equations, then implemented them successfully on the CRAY X-MP and the Alliant FX/8.

5.2. Parallel Programming Methodologies

Closely associated with our work on algorithms and software is research on parallel programming methodologies.

We developed a monitors/macros approach that implements synchronization through monitors written as macros, which are in turn recoded for each machine. We demonstrated the portability of the approach by successfully moving to the Lemur a number of our algorithms developed on the HEP. During the past year, we implemented Fortran and C versions of the original monitors/macros package for the Encore Multimax, the Sequent Balance 8000, and the Alliant FX/8 machines, as well as the CRAY-2 and CRAY X-MP-4.
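
The flavor of the approach is sketched below. The macro names and their expansion into POSIX mutex calls are ours for illustration only; the actual package expands its macros into each target machine's own locking primitives.

    /* Illustration only: synchronization hidden behind monitor macros so that
       user code contains no machine-specific primitives.  Here the macros
       expand to POSIX mutex calls; on each ACRF machine they would instead
       expand to that machine's native lock operations. */
    #include <pthread.h>

    #define MONITOR_DECL(m)   pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER
    #define MONITOR_ENTER(m)  pthread_mutex_lock(&(m))
    #define MONITOR_EXIT(m)   pthread_mutex_unlock(&(m))

    MONITOR_DECL(sum_monitor);
    static double global_sum = 0.0;

    void add_partial_sum(double s)
    {
        MONITOR_ENTER(sum_monitor);   /* enter the critical section */
        global_sum += s;              /* shared data touched only inside it */
        MONITOR_EXIT(sum_monitor);    /* leave the critical section */
    }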

Another area investigated was the dynamic allocation of resources to a library on a parallel computer. In 1985, we developed a package (called SCHEDULE) of Fortran-callable subroutines that aid in programming explicitly parallel algorithms in Fortran. One significant advantage of SCHEDULE is that no machine-dependent statements or extensions are required in the user's code.

On several parallel computers, we implemented the transformer component of the TAMPR automated program transformation system by automatically transforming the pure Lisp specification to parallel Fortran. The parallel version of the transformer was highly successful: it achieved a speedup of 12.5 with 16 processes for a real application.

We approached the problem by first defining a sequence of language levels, or models of computation, leading from pure Lisp to parallel Fortran. We chose "parallel Fortran with an unbounded number of processes" as the central level. Although having an unbounded number of processes is unrealistic in terms of real machines, this model is useful for proving that pure Lisp programs executed in parallel do not "deadlock." We then developed program transformations that convert the pure Lisp program to this level in several stages. Next, to get an executable parallel program, we wrote additional transformations that implement the unbounded-number-of-processes model in terms of a process queue and a finite, fixed number of "server" processes.
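
A minimal sketch of that last level, written here in C with POSIX threads rather than the parallel Fortran actually generated, shows the structure: each logical process becomes a queued task, and a fixed pool of server threads repeatedly removes and runs tasks. The names and sizes below are illustrative only.

    /* Sketch: an unbounded supply of logical "processes" represented as a task
       queue served by a fixed number of worker ("server") threads.  For
       simplicity the ring buffer assumes at most QCAP tasks are outstanding. */
    #include <pthread.h>

    #define NSERVERS  4
    #define QCAP    256

    typedef void (*task_fn)(void *arg);

    static struct { task_fn fn; void *arg; } queue[QCAP];
    static int head = 0, tail = 0;
    static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

    void spawn(task_fn fn, void *arg)          /* create a logical process */
    {
        pthread_mutex_lock(&qlock);
        queue[tail % QCAP].fn  = fn;
        queue[tail % QCAP].arg = arg;
        tail++;
        pthread_cond_signal(&qcond);
        pthread_mutex_unlock(&qlock);
    }

    static void *server(void *unused)          /* one fixed server process */
    {
        (void)unused;
        for (;;) {
            pthread_mutex_lock(&qlock);
            while (head == tail)               /* wait until a task arrives */
                pthread_cond_wait(&qcond, &qlock);
            task_fn fn = queue[head % QCAP].fn;
            void *arg  = queue[head % QCAP].arg;
            head++;
            pthread_mutex_unlock(&qlock);
            fn(arg);                           /* run the task outside the lock */
        }
        return NULL;
    }

    void start_servers(void)
    {
        pthread_t t;
        for (int i = 0; i < NSERVERS; i++)
            pthread_create(&t, NULL, server, NULL);
    }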

We applied these transformations to the TAMPR transformation interpreter, a large list-processing program in which the available parallelism, and hence the speedup, depends on the input data. The resulting Fortran program was run on a Denelcor HEP. For data having minimal parallelism, we obtained speedups of 2.4, while for data permitting high parallelism, we obtained speedups of 12.8.

We also implemented the parallel version of the TAMPR transformer on the Encore Multimax and Sequent Balance 8000 computers. Only two transformations had to be modified to produce these implementations. Again, good speedups were achieved: 6.6 for 8 processors and 11.5 for 16 processors. We are currently investigating alternative implementations that permit larger-grained parallelism.


In a new study of the behavior of parallel programs, we developed a measurement technique that requires only one run, instead of the usual two, to ascertain speedup. The technique has the additional virtue of reporting, in the same run, the speedup lost due to contention for locks and lack of parallel work. In a related study, we determined the percentage of time that various numbers of processes had work to do. The insight gained from these studies enabled us to appreciate better the influence of algorithm design on achievable parallelism and to devise strategies for reducing synchronization bottlenecks and improving speedup.
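
The idea behind a one-run measurement can be sketched as follows (our illustration, not the ACRF instrumentation itself): each process accumulates its useful-work time, its lock-wait time, and its idle time, and a single instrumented run then yields both an estimated speedup and an account of where potential speedup was lost.

    /* Sketch of one-run speedup measurement.  At the end of the run,
       estimated speedup = total work time / elapsed wall time, and the
       lock-wait and idle totals show, in "processors," how much potential
       speedup each cause consumed.  Assumes the timing calls are cheap
       relative to the work being measured. */
    #include <stdio.h>
    #include <time.h>

    typedef struct { double work, lock_wait, idle; } timing_t;

    static double now(void)                       /* seconds on a monotonic clock */
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + 1e-9 * ts.tv_nsec;
    }

    /* Each worker wraps its activities like this:
           double t0 = now();  ... do useful work ...;   charge(t0, &my.work);
           double t1 = now();  ... wait for a lock ...;  charge(t1, &my.lock_wait);  */
    void charge(double start, double *bucket) { *bucket += now() - start; }

    /* After the run, the per-worker records are summed by the caller and reported. */
    void report(timing_t total, double wall_time)
    {
        printf("estimated speedup:       %.2f\n", total.work      / wall_time);
        printf("lost to lock contention: %.2f processors\n", total.lock_wait / wall_time);
        printf("lost to lack of work:    %.2f processors\n", total.idle      / wall_time);
    }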

In addition, we developed an approach using mathematical modeling, computer simulation, and graphical methods to analyze the performance of machines that achieve parallelism either with multiple processors or with an execution pipeline. We showed that a finite-queue model adequately characterizes the performance of a Denelcor HEP and that an infinite-queue model describes the performance of an Alliant FX/8. We are now attempting to extend these results to other computer architectures.

5.3. Programming Languages

We designed and implemented a parallel version of the Warren Abstract Machine to study logic programming as a language suitable for parallel computers. We investigated several dialects, primarily through extended visits by researchers from the groups at the Weizmann Institute (Concurrent Prolog), the University of Lisbon (Delta-Prolog), and Imperial College, London (Parlog). As part of this effort, the MCS Division held a workshop on the Warren Abstract Machine, which brought together the primary implementors of high-performance logic-programming systems.

We also began investigating the use of logic programming as a language for controlling parallelism in numerical programs whose floating-point calculations are expressed in C. Preliminary experiments on the Encore Multimax indicate that this mixed-language approach is feasible; however, the coarse granularity that this approach requires may be a problem.

Another language being analyzed for programming multiprocessor systems is Ada, which is available on the Sequent Balance 21000. Unlike most other programming languages, Ada has built-in, high-level features for synchronization that make it especially attractive for multiprocessing.

5.4. Advanced Computer Architectures

We have been studying various advanced computers to evaluate their software environments and to gain an understanding of their performance potential.

One study analyzed the results of running MACHAR and the ELEFUNT suite of transportable Fortran test programs (Reference 1) on the Encore Multimax and the Alliant. Overall, the library proved to be accurate, but a weakness in error handling was evident. As a result of this work, Alliant Computer Systems Corporation and Encore Computer Corporation have agreed to modify their software libraries.
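
As a reminder of the kind of machine parameter MACHAR determines dynamically, the small fragment below (an illustration only, not part of MACHAR or ELEFUNT) estimates the machine epsilon of the arithmetic it is compiled for.

    /* Illustration only: estimate machine epsilon by halving a value until
       adding it to 1.0 no longer changes the result. */
    #include <stdio.h>

    int main(void)
    {
        volatile double one_plus = 2.0;   /* volatile keeps the sum in double precision */
        double eps = 1.0;
        while (one_plus > 1.0) {
            eps /= 2.0;
            one_plus = 1.0 + eps;
        }
        printf("estimated machine epsilon: %g\n", 2.0 * eps);
        return 0;
    }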

Another project focuses on developing a Fortran-oriented programming environment comprising Unix, Toolpack, and graphics software, running on a workstation with graphics capability, and accessing one or more advanced computers. The initial prototype involved a VAX connected to advanced-architecture machines; we are now working on a more realistic prototype using a Sun workstation connected to the machines in the ACRF.

In collaboration with scientists at the University of Illinois Center for Supercomputing Research and Development, we have been involved in the CEDAR project. Our main role in the CEDAR project, aimed at constructing a multiprocessor for large-scale scientific computation, is to devise numerical algorithms and mathematical software for this architecture.


We have been studying various application programs for circuit simulation, computational fluid dynamics, and traffic equilibrium problems.

Finally, to assimilate the available information on commercial and experimental high-performance computer systems, we prepared an extensive survey of advanced-architecture machines. The report has been widely distributed throughout the United States and Europe.

6. Publications

Listed below are publications relating to advanced computing research that are authored by Mathematics and Computer Science Division staff.

R. Butler, E. Lusk, W. McCune, and R. Overbeek, "Parallel Logic Programming for Numeric Applications," MCS-TM-72, Argonne National Laboratory (April 1986).

W. J. Cody, "ELEFUNT Test Results under X1.4 on the Encore Multimax," MCS-TM-68, Argonne National Laboratory (April 1986).

W. R. Cowell and C. P. Thompson, "Transforming Fortran DO Loops to Improve Performance on Vector Architectures," ANL-85-63, Argonne National Laboratory (May 1986).

R. L. Crane, M. Minkoff, K. E. Hillstrom, and S. D. King, "Performance Modelling of Large-Grained Parallelism," MCS-TM-63, Argonne National Laboratory (March 1986).

J. J. Dongarra, "How Do the 'Minisupers' Stack Up?" Computer, 19(3), March 1986, pp. 93, 100.

J. J. Dongarra, "Performance of Various Computers Using Standard Linear Equations Software in a Fortran Environment," MCS-TM-23, Argonne National Laboratory (June 1986).

J. J. Dongarra and I. S. Duff, "Advanced Architecture Computers," MCS-TM-57, Argonne National Laboratory (October 1985).

J. J. Dongarra and T. Hewitt, "Implementing Dense Algebra Algorithms Using Multitasking on the CRAY X-MP-4," SIAM J. Sci. Stat. Comput., 7(1), January 1986, pp. 347-350.

J. J. Dongarra and A. Hinds, "Comparison of the CRAY X-MP-4, Fujitsu VP-200, and Hitachi S-810/20: An Argonne Perspective," ANL-85-19, Argonne National Laboratory (October 1985).

J. J. Dongarra, L. Kaufman, and S. Hammarling, "Squeezing the Most Out of Eigenvalue Solvers on High-Performance Computers," Linear Algebra and Its Applications, 77, 1986, pp. 113-136.

J. Dongarra, B. T. Smith, and D. Sorensen, "Algorithm Design for Different Computer Architectures," IEEE Software, 2(4), July 1985, pp. 79-80.


J. Dongarra and D. Sorensen, "A Fully Parallel Algorithm for the Symmetric Eigenvalue Problem," MCS-TM-62, Argonne National Laboratory (January 1986).

J. Dongarra and D. Sorensen, "A Parallel Linear Algebra Library for the Denelcor HEP," Parallel MIMD Computation: The HEP Supercomputer and Its Applications, ed. J. S. Kowalik, The MIT Press, 1985.

J. J. Dongarra and D. C. Sorensen, "Linear Algebra on High-Performance Computers," Parallel Computing 85, ed. M. Feilmeier, G. Joubert, and U. Schendel, Elsevier, 1986, pp. 3-32.

I. S. Duff, "Parallel Implementation of Multifrontal Schemes," MCS-TM-49, Argonne National Laboratory (March 1985).

J. Gabriel, T. Lindholm, E. L. Lusk, and R. A. Overbeek, "A Tutorial on the Warren Abstract Machine for Computational Logic," ANL-84-84, Argonne National Laboratory (June 1985).

B. W. Glickfeld and R. A. Overbeek, "Quasi-Automatic Parallelization: A Simplified Approach to Multiprocessing," ANL-85-70, Argonne National Laboratory (October 1985).

M. T. Heath and D. C. Sorensen, "A Pipelined Givens Method for Computing the QR Factorization of a Sparse Matrix," Linear Algebra and Its Applications, 77, 1986, pp. 189-203.

B. Lucier and R. Overbeek, "Parallel Adaptive Numerical Schemes for Hyperbolic Systems of Conservation Laws," Purdue University Technical Report 3 (November 1985).

E. Lusk, J. Gabriel, T. Lindholm, and R. Overbeek, "Logic Programming on the HEP," Parallel MIMD Computation: The HEP Supercomputer and Its Applications, ed. J. S. Kowalik, The MIT Press, 1985.

E. L. Lusk and R. A. Overbeek, "The Tradeoffs among Portability, Complexity, and Efficiency in Multiprocessing Environments," Proceedings of the Workshop on Parallel Processing Using the Heterogeneous Element Processor (Norman, Oklahoma, March 20-21, 1985), pp. 245-260.

D. C. Sorensen, "Analysis of Pairwise Pivoting in Gaussian Elimination," IEEE Trans. on Computers, C-34(3), March 1985, pp. 274-278.

References

1. W. J. Cody and W. Waite, Software Manual for the Elementary Functions. Englewood Cliffs, New Jersey: Prentice-Hall, 1980.

2. P. C. Messina and D. C. Sorensen, unpublished information, 1986.

3. D. C. Sorensen, unpublished information, 1985.


Appendix: High-Performance Computing Seminars
Sponsored by the Mathematics and Computer Science Division and Computing Services

R. D. Rettberg
The Butterfly Multiprocessor
BBN Laboratories
January 24, 1985

Sidney Fernbach
Supercomputer Systems: Present and Future
Alamo, California
January 31, 1985

John Van Rosendale
The Blaze Language: A Parallel Language for Scientific Programming
NASA Langley Research Center
February 7, 1985

Ilse Ipsen
Efficient Parallel Solution of Linear Systems with Hyperbolic Rotations
Yale University
February 14, 1985

Lawrence Samartin
The FLEX/32 Multicomputing Environment
Flexible Computer Corporation
February 28, 1985

John Gustafson
Multicomputing with FPS Scientific Computers
Floating Point Systems, Inc.
May 2, 1985

Ian Gladwell
Portable Almost Block-Diagonal Solvers on Pipeline Processors
University of Manchester, England
May 3, 1985


A. Faustini
CHRONOS, A Language for Formal Description of Real Time Systems and Their Programming
Arizona State University
May 14, 1985

Nicholas Gould
The Accurate Determination of Search Directions for Simple Differentiable Penalty Functions
University of Waterloo
May 16, 1985

George Bader
Solutions of Nonlinear Collocation Equations
Simon Fraser University
May 17, 1985

Mark Kon
Sobolev Smoothing Properties of Analytic Functions of Elliptic Operators
Boston University
May 23, 1985

Jacqueline Fleckinger
Asymptotics of Eigenvalues for Indefinite Problems
University Paul Sabatier
June 4, 1985

Silvia Solmi
Mathematical Aspects in the Adjustment of Astronomic Data and the Reconstruction of the Celestial Sphere
Stanford University
July 2, 1985

Gerard Meurant
Multitasking Experiments on the CRAY X-MP-4
Centre d'Etudes de Limeil, France
July 15, 1985

Douglas R. Smith
Semiautomatic Algorithm Design
Kestrel Institute
July 18, 1985

Lennart Johnsson
Band Matrix Solvers for Hypercube Architectures
Yale University
July 30, 1985

Eric Van de Velde
Algorithms for Computational Fluid Dynamics on the Hypercube
Courant Institute
August 20, 1985


Julio Diaz
Development of a Block Preconditioned Conjugate Gradient Iterative Solver for Sparse Linear Systems
University of Oklahoma
August 22, 1985

Gordon Bell
The Encore Continuum: A Complete Distributed Workstation-Multiprocessor Computing Environment
Encore Computer
August 27, 1985

Herman te Riele
Vector Research at the CWI
Centrum voor Wiskunde en Informatica, Amsterdam
September 3, 1985

John Rollwagen
CRAY Research Today and Tomorrow
Cray Research Inc.
September 9, 1985

Anne Greenbaum
Attempts at Parallelizing Multigrid-Type Methods
Lawrence Livermore National Laboratory
September 18, 1985

E. Mulbaney, R. Hausman, and P. Cannon
ST-100 Array Processor - Description and Usage
STAR Technologies, Inc.
September 18, 1985

Russell R. Barton
Using Pseudo Functions for Testing Unconstrained Optimization Algorithms
RCA Research Laboratories
September 19, 1985

Jurgen Batt
The Nonlinear Vlasov-Poisson System of Partial Differential Equations in Stellar Dynamics and Plasma Physics
Mathematisches Institut der Universitaet Muenchen, West Germany
September 26, 1985

Jacob Levy
Concurrent Prolog and Alternative Dialects
Weizmann Institute of Science
September 27, 1985

Chris Anderson
A Look at Domain Decomposition for Elliptic Partial Differential Equations
Stanford University
October 3, 1985


Cleve Moler
Matrix Computation on the Intel Hypercube
Intel Scientific Computers
October 11, 1985

Joel H. Ferziger
Computational Approaches to Turbulence Research
Stanford University
October 16, 1985

Florian Potra
Newton-Like Methods for Non-Linear Boundary Value Problems
University of Iowa
October 22, 1985

Alexandru Nicolau
ESP: An Environment for Scientific Parallel-Programming
Cornell University
October 24, 1985

George Byrne
Experiments in Numerical Methods for a Problem in Combustion Modeling
Exxon Research and Engineering Co.
November 4, 1985

Steve Gregory
The Sequential PARLOG Machine
Imperial College of Science and Technology, University of London
November 22, 1985

Daniel A. Reed
Stencils and Problem Partitionings: Their Influence on the Performance of Multiple Processor Systems
University of Illinois
February 13, 1986

Eric Van de Velde
Hypercube Algorithms and Implementations
Courant Institute of Mathematical Sciences
February 14, 1986

Avi Lin
On a Parallel Algorithm for Linear Recurrence Systems with Minimum Communication
Technion Institute of Technology, Israel
February 26, 1986


John Bolstad
A Multigrid Continuation Method for Elliptic Problems with Folds
Lawrence Livermore National Laboratory
March 11, 1986

David Kamowitz
Theoretical and Computational Results for MGR[v] Multigrid Methods
University of Wisconsin
March 17, 1986

Gerald Hedstrom
Numerical Methods for Multiple-Scale Problems
Lawrence Livermore National Laboratory
March 18, 1986

Patrick Worley
Minimal Information Algorithms and Parallel Computation
Stanford University
March 26, 1986

Ibrahim K. Abu-Shumays
Vectorization of Transport and Diffusion Computations on the CDC CYBER 205
Bettis Atomic Power Laboratory
March 27, 1986

Jeffrey M. Augenbaum
Implicit Methods for Numerical Weather Prediction
NASA/Goddard Space Flight Center
March 31, 1986

Steve Otto
Irregular Finite Elements on a Hypercube
California Institute of Technology
April 3, 1986

Tony F. Chan
Numerical Algorithms on Hypercube Multiprocessors
Yale University
April 11, 1986

Ridgeway Scott
What Do You Want to Say?
University of Michigan
May 15, 1986


Distribution for ANL-86-34

Internal:
C. Adams
L. Amiot
J. M. Beumer (2)
T. Fields
N. Goetz
K. L. Kliewer
A. B. Krisciunas
P. C. Messina
T. M. Mihaly (8)
G. W. Pieper (1148)
E. Porlier
J. Unik
ANL Patent Department
ANL Contract File
ANL Libraries
TIS Files (5)

External:
DOE-TIC, for distribution per UC-32 (168)
Manager, Chicago Operations Office, DOE
Mathematics and Computer Science Division Review Committee:
  J. L. Bona, Pennsylvania State U.
  T. L. Brown, U. of Illinois, Urbana
  P. Concus, LBL
  K. Cramer, DOE-CH
  S. Gerhart, MCC, Austin, Texas
  G. Golub, Stanford University
  W. C. Lynch, Xerox Corp., Palo Alto
  J. A. Nohel, U. of Wisconsin, Madison
D. Austin, DOE-ER
R. Barrow, DOE OADPM
D. Baugatz, Alliant
D. Cole, Intel
R. Coppinger, Intel
F. Darema-Rogers, IBM
J. Decker, DOE-ER
K. Foote, Sequent
A. Hayes, LANL
K. Hopper, Alliant
R. Huddleston, LLNL
A. Karp, IBM
M. McNeill, Encore
G. Michael, LLNL
D. Micciche, Alliant
C. Moler, Intel
C. Mundie, Alliant
R. Parsons, Sequent
L. Petzold, LLNL
G. Ringstad, Encore
F. Ris, IBM
M. Scott, SNLA
M. Scott, DOE OADPM
C. Thomas, DOE OADPM
R. Ward, ORNL