

MACROSCOPE

Where's the Real Bottleneck in Scientific Computing?

Gregory V. Wilson

Scientists would do well to pick up some tools widely used in the software industry

WHEN I FIRST started doing computational science in 1986, a new generation of fast, cheap chips had just ushered in the current era of low-cost supercomputers, in which multiple processors work in parallel on a single problem. Suddenly, it seemed as though everyone who took number crunching seriously was rewriting his or her software to take advantage of these new machines. Sure, it hurt—the compilers that translated programs to run on parallel computers were flaky, debugging tools were nonexistent, and thinking about how to solve problems in parallel was often like trying to solve a thousand crossword puzzles at once—but the potential payoff seemed enormous. Many investigators were positive that within a few years, computer modeling would let scientists investigate a whole range of phenomena that were too big, too small, too fast, too slow, too dangerous or too complicated to examine in the lab or to analyze with pencil and paper.

Greg Wilson is an adjunct professor of computer science at the University of Toronto. His course material is available at http://www.third-bit.com/swc. On February 17, as part of the 2006 Annual Meeting of the American Association for the Advancement of Science in St. Louis, Wilson will be running a workshop on scientific programming skills, how investigators should acquire them, and the changes needed in publication and tenure procedures to ensure quality in computational work. Address: Room 5230, Bahen Centre for Information Technology, University of Toronto, Toronto, Ontario, M5S 2E4. Internet: [email protected]

But by the mid-1990s, I had a nagging feeling that something was wrong. For every successful simulation of global climate, there were a dozen or more groups struggling just to get their program to run. Their work was never quite ready to showcase at conferences or on the cover of their local supercomputing center's newsletter. Many struggled on for months or years, tweaking and tinkering until their code did something more interesting than grinding to a halt or dividing by zero. For some reason, getting to computational heaven was taking a lot longer than expected.

I therefore started asking scientists how they wrote their programs. The answers were sobering. Whereas a few knew more than most of the commercial software developers I'd worked with, the overwhelming majority were still using ancient text editors like Vi and Notepad, sharing files with colleagues by emailing them around and testing by, well, actually, not testing their programs systematically at all.

I finally asked a friend who was pursuing a doctorate in particle physics why he insisted on doing everything the hard way. Why not use an integrated development environment with a symbolic debugger? Why not write unit tests? Why not use a version-control system? His answer was, "What's a version-control system?"

A version-control system, I explained, is a piece of software that monitors changes to files—programs, Web pages, grant proposals and pretty much anything else. It works like the "undo" button on your favorite editor: At any point, you can go back to an older version of the file or see the differences between the way the file was then and the way it is now. You can also determine who else has edited the file or find conflicts between their changes and the ones you've just made. Version control is as fundamental to programming as accurate notes about lab procedures are to experimental science. It's what lets you say, "This is how I produced these results," rather than, "Um, I think we were using the new algorithm for that graph—I mean, the old new algorithm, not the new new algorithm."

www.americanscientist.org 2006 January-February 5
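The "differences between then and now" that a version-control system reports can be sketched with Python's standard difflib module. This is only an illustration of the idea, not any particular tool's internals; the file name and its contents here are invented:

```python
import difflib

# Two hypothetical snapshots of the same script (contents are invented):
old = ["x = compute(data)\n", "print(x)\n"]
new = ["x = compute(data, tol=1e-6)\n", "print(x)\n"]

# unified_diff yields the familiar "-old line / +new line" view
# that version-control tools show between two revisions of a file.
diff = list(difflib.unified_diff(old, new,
                                 fromfile="model.py (then)",
                                 tofile="model.py (now)"))
print("".join(diff))
```

A real system such as CVS or Subversion (the mainstream tools at the time) stores every revision so this comparison works across the whole history, not just two copies you happen to have kept.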

My friend was intelligent and intimately familiar with the problems of writing large programs—he had inherited more than 100,000 lines of computer code and had already added 20,000 more. Discovering that he didn't even know what version control meant was like finding a chemist who didn't realize she needed to clean her test tubes between experiments. It wasn't a happy conversation for him either. Halfway through my explanation, he sighed and said, "Couldn't you have told me this three years ago?"

As the Twig Is Bent...

Once I knew to look, I saw this "computational illiteracy" everywhere. Most scientists had simply never been shown how to program efficiently. After a generic freshman programming course in C or Java, and possibly a course on statistics or numerical methods in their junior or senior year, they were expected to discover or reinvent everything else themselves, which is about as reasonable as showing someone how to differentiate polynomials and then telling them to go and do some tensor calculus.

Yes, the relevant information was all on the Web, but it was, and is, scattered across hundreds of different sites. More important, people would have to invest months or years acquiring background knowledge before they could make sense of it all. As another physicist (somewhat older and more cynical than my friend) said to me when I suggested that he take a couple of weeks and learn some Perl, "Sure, just as soon as you take a couple of weeks and learn some quantum chromodynamics so that you can do my job."

His comment points at another reason why many scientists haven't adopted better working practices. After being run over by one bandwagon after another, these investigators are justifiably skeptical when someone says, "I'm from computer science, and I'm here to help you." From object-oriented languages to today's craze for "agile" programming, scientists have suffered through one fad after another without their lives becoming noticeably better.

Scientists are also often frustrated by the "accidental complexity" of what computer science has to offer. For example, every modern programming language provides a library for regular expressions, which are patterns used to find data in text files. However, each language's rules for how those patterns actually work are slightly different. When something as fundamental as the Unix operating system itself has three or four slightly different notations for the same concept, it's no wonder that so many scientists throw up their hands in despair and stick to lowest common denominators.
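As a concrete illustration of what a regular expression does, here is Python's version of the idea; the data line is invented. The same pattern would need minor changes to work in, say, classic grep or sed, which is exactly the notational drift the paragraph above complains about:

```python
import re

# A made-up line of instrument output:
line = "T=300.5 K  pressure=101.3 kPa"

# \d+\.\d+ means "one or more digits, a literal dot, one or more digits".
# In basic grep the same pattern would be written [0-9]\{1,\}\.[0-9]\{1,\}.
numbers = re.findall(r"\d+\.\d+", line)
print(numbers)  # ['300.5', '101.3']
```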

Just how big an impact is the lack of programming savvy among scientists having? To get a handle on the answer, consider a variation on one of the fundamental rules of computer architecture, known as Amdahl's Law. Suppose that it takes six months to write and debug a program that then has to run for another six months on today's hardware to generate publishable results. Even an infinitely fast computer (perhaps one thrown backward in time by some future physics experiment gone wrong) would only cut the mean time between publications in half, because it would only eliminate one restriction in the pipeline. Increasingly, the real limit on what computational scientists can accomplish is how quickly and reliably they can translate their ideas into working code.

A Little Knowledge

In 1998, Brent Gorda (now at Lawrence Livermore National Laboratory) and I started trying to address this issue by teaching a short course on software-development skills to scientists at Los Alamos National Laboratory. Our aim wasn't to turn LANL's physicists and metallurgists into computer scientists. Instead, we wanted to show them the 10 percent of modern software engineering that would handle 90 percent of their needs.

The first few rounds had their ups and downs, but from what participants said, and from what they did after the course was over, it was clear that we were on the right track. A few techniques, and an introduction to the tools that supported them, could save scientists immense frustration. What's more, we found that most scientists were very open to these ideas, which probably shouldn't have surprised us as much as it did. After all, the importance of being methodical had been drilled into them from their first undergraduate lab.

Six years and one dot-com boom later, I received funding from the Python Software Foundation to bring that course up to date and to make it available on the Web under an open license so that anyone who wants to use it is free to do so. It covers tools and working practices that can improve both the quality of what scientific programmers produce, and the speed with which they produce it, so that they can spend less time wrestling with their programs and more doing their research. Topics include version control, automating repetitive tasks, systematic testing, coding style and reading code, some basic data crunching and Web programming, and a quick survey of how to manage development in a small, geographically distributed team. None of this is rocket science—it's just the programming equivalent of knowing how to titrate a solution or calibrate an oscilloscope.
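Of the topics just listed, "systematic testing" is perhaps the easiest to show in miniature. A unit test is nothing more than code that checks other code against answers worked out by hand; the function and the chosen test values below are illustrative, not taken from the course:

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

# Systematic checks: a typical input, a single value, and a case
# with negatives, each with its expected answer computed by hand.
assert mean([1.0, 2.0, 3.0]) == 2.0
assert mean([42.0]) == 42.0
assert mean([-1.0, 1.0]) == 0.0
```

Kept alongside the program and rerun after every change, a handful of assertions like these catches the "grinding to a halt or dividing by zero" failures described earlier before they cost months.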

On the Hard Drives of Giants

Science is much more than just a body of knowledge. It's a way of doing things that lets people separated by oceans, decades, languages and ideologies build on one another's discoveries. Computers are playing an ever-larger role in research with every passing year, but few scientific programs meet the methodological standards that pioneers like Lavoisier and Faraday set for experimental science more than 200 years ago.

Better education is obviously key to closing this gap, but it won't be enough on its own. Journals need to start insisting that scientists' computational work meet the same quality and reproducibility standards as their laboratory work. At the same time, we urgently need more journals willing to publish descriptions of how scientists develop software, and how that software functions. Faster chips and more sophisticated algorithms aren't enough—if we really want computational science to come into its own, we have to tackle the bottleneck between our ears.

6 American Scientist, Volume 94