ASCR Overview and Perspective on Semiconductor Technologies
Bill Harrod, DOE/ASCR
March 24, 2016

Relevant Websites
• ASCR: science.energy.gov/ascr/
• ASCR Workshops and Conferences: science.energy.gov/ascr/news-and-resources/workshops-and-conferences/
• SciDAC: www.scidac.gov
• INCITE: science.energy.gov/ascr/facilities/incite/
Advanced Scientific Computing Research (ASCR) at a Glance
Office of Advanced Scientific Computing Research
Associate Director – Steve Binkley
Phone: 301-903-7486
E-mail: [email protected]
Research
Division Director – William Harrod
Phone: 301-903-5800
E-mail: [email protected]
Facilities
Division Director – Barbara Helland
Phone: 301-903-9958
E-mail: [email protected]
ASCR Research Division
• Applied Mathematics
– Emphasizes scalable numerical methods for complex systems, uncertainty quantification, large-scale data analysis, and exascale algorithms
• Computer Science
– Exascale computing (architecture, parallelism, power aware, fault tolerance), operating systems, compilers, performance tools, productivity, scientific data management, analysis and visualization for petabyte to exabyte data sets
• Partnerships
– Co-Design and partnerships to pioneer the future of scientific applications
• Next Generation Networks for Science
– Tools for the future of distributed science
• Research and Evaluation Prototypes
– Fast Forward and Design Forward partnerships with industry and Non-Recurring Engineering for the planned facility upgrades
NATIONAL STRATEGIC COMPUTING INITIATIVE
July 29, 2015
EXECUTIVE ORDER
- - - - - - -
CREATING A NATIONAL STRATEGIC COMPUTING INITIATIVE
By the authority vested in me as President by the Constitution and the laws of the United States of America, and to maximize benefits of high-performance computing (HPC) research, development, and deployment, it is hereby ordered as follows:
The NSCI is a whole-of-government effort designed to create a cohesive, multi-agency strategic vision and Federal investment strategy, executed in collaboration with industry and academia, to maximize the benefits of HPC for the United States.
https://www.whitehouse.gov/the-press-office/2015/07/29/executive-order-creating-national-strategic-computing-initiative
https://www.whitehouse.gov/sites/default/files/microsites/ostp/nsci_fact_sheet.pdf
March 31, 2016 DOE EEE Workshop
NSCI Intent
• National
– “Whole-of-government” and “whole-of-Nation” approach
– Public/private partnership with industry and academia
• Strategic
– Leverage beyond individual programs (a key “platform” technology)
– Long time horizon (decade or more)
• Computing
– HPC = most advanced, capable computing technology available in a given era
– Multiple styles of computing and all necessary infrastructure
– Scope includes everything necessary for a fully integrated capability
– Theory and practice, software and hardware
• Initiative
– Above baseline effort
– Link and lift efforts
Enhance U.S. strategic advantage in HPC for economic competitiveness and scientific discovery
Key Themes
• Strive for convergence of numerically intensive and data-intensive computing
• Keep the U.S. at the forefront of HPC capabilities
• Streamline HPC application development
• Make HPC readily usable and accessible
• Establish hardware technology for future HPC systems
Convergence of Compute and Data-intensive Science Critical to 21st Century Science
Genomics
• Sequencer data volume increasing 12x, next 3 years
• Sequencer cost decreasing by 10x over same period
High Energy Physics
• LHC experiments produce petabytes of data/year
• Peak data rates increase 3-5x over 5 years
Light Sources
• Many detectors on Moore’s Law curve
• Data volumes rendering previous models obsolete
Climate
• By 2020, climate data expected to be exabytes
• Significant challenges in data management & analysis
“Very few large scale applications of practical importance are NOT data intensive.” Alok Choudhary, IESP, Kobe, Japan, April 2012
• DOE missions require computational environments that address both compute-intensive and data-intensive science simultaneously
• Data-intensive science faces many of the same technology challenges as extreme computing
– Some are even worse for “big data”
– Energy use is the grand challenge (e.g. the Square Kilometer Array estimates 100 MW needed for computing)
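System attributes: current systems (NERSC, OLCF, ALCF) and planned upgrades
• Edison (NERSC, now)
– System peak: 2.6 PF
– Peak power: 2 MW
– Total system memory: 357 TB
– Node performance: 0.460 TF
– Node processors: Intel Ivy Bridge
– System size: 5,600 nodes
– System interconnect: Aries
– File system: 7.6 PB, 168 GB/s, Lustre®
• Titan (OLCF, now)
– System peak: 27 PF
– Peak power: 9 MW
– Total system memory: 710 TB
– Node performance: 1.452 TF
– Node processors: AMD Opteron, Nvidia Kepler
– System size: 18,688 nodes
– System interconnect: Gemini
– File system: 32 PB, 1 TB/s, Lustre®
• Mira (ALCF, now)
– System peak: 10 PF
– Peak power: 4.8 MW
– Total system memory: 768 TB
– Node performance: 0.204 TF
– Node processors: 64-bit PowerPC A2
– System size: 49,152 nodes
– System interconnect: 5D Torus
– File system: 26 PB, 300 GB/s, GPFS™
• Cori (NERSC upgrade, installation 2016)
– System peak: >30 PF
– Peak power: <3.7 MW
– Total system memory: ~1 PB DDR4 + High Bandwidth Memory (HBM) + 1.5 PB persistent memory
– Node performance: >3 TF
– Node processors: Intel Knights Landing Xeon Phi; Intel Haswell CPU in data partition
– System size: 9,300 nodes; 1,900 nodes in data partition
– System interconnect: Aries
– File system: 28 PB, 744 GB/s, Lustre®
• Summit (OLCF CORAL upgrade, installation 2017-2018)
– System peak: 150 PF
– Peak power: 10 MW
– Total system memory: >1.74 PB DDR4 + HBM + 2.8 PB persistent memory
– Node performance: >40 TF
– Node processors: IBM Power9 CPU, Nvidia Volta GPUs
– System size: ~3,500 nodes
– System interconnect: Dual Rail EDR-IB
– File system: 120 PB, 1 TB/s, GPFS™
• Theta (ALCF CORAL upgrade, installation 2016)
– System peak: >8.5 PF
– Peak power: 1.7 MW
– Total system memory: >480 TB DDR4 + High Bandwidth Memory (HBM)
– Node performance: >3 TF
– Node processors: Intel Knights Landing Xeon Phi
– System size: >2,500 nodes
– System interconnect: Aries
– File system: 10 PB, 210 GB/s, Lustre®
• Aurora (ALCF CORAL upgrade, installation 2018-2019)
– System peak: 180 PF
– Peak power: 13 MW
– Total system memory: >7 PB High Bandwidth On-Package Memory, Local Memory and Persistent Memory
– Node performance: >17 times Mira
– Node processors: Intel Knights Hill Xeon Phi
– System size: >50,000 nodes
– System interconnect: 2nd Generation Intel Omni-Path Architecture
– File system: 150 PB, 1 TB/s, Lustre®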
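The peak-performance and peak-power rows of the system table can be combined into a rough energy-efficiency comparison. The sketch below is illustrative only: the figures are copied from the slide, the function names are hypothetical, and bounds such as ">30" or "<3.7" are treated as exact values, which is an assumption.

```python
# Peak performance (PF) and peak power (MW) per system, copied from the slide.
# Assumption: bounds such as ">30" or "<3.7" are used as if they were exact.
SYSTEMS = {
    "Edison": (2.6, 2.0),
    "Titan": (27.0, 9.0),
    "Mira": (10.0, 4.8),
    "Cori": (30.0, 3.7),
    "Summit": (150.0, 10.0),
    "Theta": (8.5, 1.7),
    "Aurora": (180.0, 13.0),
}


def gflops_per_watt(peak_pf: float, power_mw: float) -> float:
    """Peak GFLOP/s per watt of peak power."""
    peak_gflops = peak_pf * 1e6   # 1 PF = 1e6 GF
    power_watts = power_mw * 1e6  # 1 MW = 1e6 W
    return peak_gflops / power_watts


def picojoules_per_flop(peak_pf: float, power_mw: float) -> float:
    """Energy per floating-point operation at peak, in picojoules."""
    joules_per_flop = (power_mw * 1e6) / (peak_pf * 1e15)
    return joules_per_flop * 1e12


for name, (pf, mw) in SYSTEMS.items():
    print(f"{name:8s} {gflops_per_watt(pf, mw):6.2f} GF/W "
          f"{picojoules_per_flop(pf, mw):8.1f} pJ/flop")
```

These are peak-to-peak ratios, so they overstate what real applications achieve; they only indicate the direction of the trend from the current systems to the planned upgrades.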
High Performance Computing (HPC) US Federal Government Investments
• US federal High Performance Computing (HPC) investments
– Made pivotal investments in the computer industry at critical times
– During stable times, no investment is required or requested
– Today, we have reached a critical period: confluence of digitalization of our economy and society and the end of Dennard Scaling
• Previous US federal HPC investments fueled major HPC advances
– 1946 ENIAC: start of electronic digital computing
– 1951 ERA-1101: technical computing
– 1972 ILLIAC IV: parallel computing
– 1993 Cray T3D: massively parallel computing
– 2004 IBM BG: low power computing
– 2011 Cray XC30 & IBM POWER7: productivity computing
– 2023 Exascale: energy efficiency computing
Uncertainty Threatens US Economic Growth
• The world has changed – technology is changing at a dramatic rate
– Dennard Scaling has ended
– End of Moore's Law looming
• The IT marketplace is also changing dramatically
– PC sales have flattened
– Handhelds dominate growth, H/W and S/W
– HPC vendor uncertainty
• Need to drive innovations at all levels of technology
– Node and system designs
– System and development software
– Workflows
– Algorithms

IDC Worldwide IT Data: IT Spending ($M)
Row Labels             2015        2016        2017        2018        2019
Devices                795,402     807,257     810,643     814,677     809,947
Enterprise Hardware    249,493     256,669     264,770     272,898     280,366
Software               434,727     464,242     496,264     530,551     568,049
Services               668,305     690,734     713,829     738,019     762,725
Total IT               2,147,927   2,218,902   2,285,507   2,356,145   2,421,087

HPC Only:              2015        2016        2017        2018        2019
Revenue ($M)           11,434      12,327      13,286      14,160      15,262
Source: IDC 2016
Source: Intel
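To put the IDC figures in perspective, the implied compound annual growth rate (CAGR) of total IT spending and of HPC revenue can be computed from the 2015 and 2019 columns. This is a minimal sketch using only the numbers on the slide; the helper name `cagr` is hypothetical.

```python
def cagr(first: float, last: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (last / first) ** (1.0 / years) - 1.0


# 2015 and 2019 values in $M, copied from the IDC table on the slide.
TOTAL_IT = {2015: 2_147_927, 2019: 2_421_087}
HPC_REVENUE = {2015: 11_434, 2019: 15_262}

total_it_growth = cagr(TOTAL_IT[2015], TOTAL_IT[2019], 2019 - 2015)
hpc_growth = cagr(HPC_REVENUE[2015], HPC_REVENUE[2019], 2019 - 2015)
hpc_share_2015 = HPC_REVENUE[2015] / TOTAL_IT[2015]

print(f"Total IT CAGR 2015-2019: {total_it_growth:.2%}")
print(f"HPC CAGR 2015-2019:      {hpc_growth:.2%}")
print(f"HPC share of IT in 2015: {hpc_share_2015:.2%}")
```

On these numbers, HPC revenue is projected to grow faster than overall IT spending while remaining well under one percent of the total.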
Challenges – across the IT market space
• Increasing Performance/Value
• Efficiency
– Energy efficiency: reduce energy per operation (pJ/op)
– Hardware efficiency: massively parallel architectures, down to the processor level
– Software efficiency: effectively exploit H/W parallelism
– Current efficiencies on conventional machines are <10% for many real-world applications
• Memory/Storage
– Make effective use of data movement (this is the dominant energy cost)
• Reliability
– Successfully complete execution through system failures
• Productivity
– Programming environment that makes HPC machines accessible to everyone
– Reduce time to solution
• Cost / Affordability
From Giga to Exa, via Tera & Peta
[Figure: relative transistor performance (log scale, 1 to 1000) versus year (1986 to 2016), with the Giga, Tera, Peta, and Exa milestones marked. Annotations, in order: 32x from transistor, 32x from parallelism; 8x from transistor, 128x from parallelism; 1.5x from transistor, 670x from parallelism. Additional label: "Basic compute loop". Source: Shekhar Borkar, Intel]
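The three annotation pairs in the figure each multiply out to roughly a 1000x step, and the share coming from parallelism grows with every step. The sketch below checks that arithmetic. Assigning the pairs to the Giga-to-Tera, Tera-to-Peta, and Peta-to-Exa transitions is an assumption based on the order in which they appear.

```python
import math

# (transition, gain from transistor, gain from parallelism), as annotated on the figure.
# Assumption: the pairs map to the transitions in the order they appear.
STEPS = [
    ("Giga to Tera", 32.0, 32.0),
    ("Tera to Peta", 8.0, 128.0),
    ("Peta to Exa", 1.5, 670.0),
]


def total_gain(transistor: float, parallelism: float) -> float:
    """Overall speedup when the two sources of gain multiply."""
    return transistor * parallelism


def parallelism_share(transistor: float, parallelism: float) -> float:
    """Fraction of the total gain due to parallelism, measured on a log scale."""
    return math.log(parallelism) / math.log(total_gain(transistor, parallelism))


for name, transistor, parallelism in STEPS:
    print(f"{name}: total {total_gain(transistor, parallelism):.0f}x, "
          f"parallelism share {parallelism_share(transistor, parallelism):.0%}")
```

The log-scale share is one reasonable way to split a multiplicative gain; other attributions are possible.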
Performance Factors - SLOWER
Definition of SLOWER terms
• Starvation
– Insufficiency of concurrency of work
– Impacts scalability and latency hiding
– Affects programmability
• Latency
– Time-measured distance for remote access and services
– Impacts efficiency
• Overhead
– Critical-time additional work to manage tasks & resources
– Impacts efficiency and granularity for scalability
• Waiting for contention resolution
– Delays due to simultaneous access requests to shared physical or logical resources
P = s(S) × e(L, O, W) × U(E) × a(R)
P – average performance (ops)
e – efficiency (0 < e < 1)
s – application’s average parallelism
a – availability (0 < a < 1)
U – normalization factor/compute unit
E – watts per average compute unit
R – reliability (0 < R < 1)
Thomas Sterling, IU
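The SLOWER relation is a plain product of four factors, so its behavior is easy to explore numerically. The sketch below evaluates it with each factor already reduced to a number. The function name and all input values are hypothetical, chosen only to illustrate how a low efficiency or availability scales the result; the slide does not give the functional forms of s(S), e(L, O, W), U(E), or a(R).

```python
def average_performance(s: float, e: float, u: float, a: float) -> float:
    """Evaluate P = s * e * U * a with each factor already reduced to a number.

    s: application's average parallelism
    e: efficiency, strictly between 0 and 1
    u: normalization factor per compute unit (ops/s per unit)
    a: availability, strictly between 0 and 1
    """
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency e must satisfy 0 < e < 1")
    if not 0.0 < a < 1.0:
        raise ValueError("availability a must satisfy 0 < a < 1")
    if s <= 0.0 or u <= 0.0:
        raise ValueError("parallelism s and normalization u must be positive")
    return s * e * u * a


# Hypothetical inputs: 1000-way parallelism, 10% efficiency,
# 1e9 ops/s per compute unit, 99% availability.
p = average_performance(s=1000.0, e=0.10, u=1e9, a=0.99)
print(f"P = {p:.3e} ops/s")
```

Because the factors multiply, halving any one of them halves P, which is why the model treats starvation, latency, overhead, and contention as equally important targets.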
Hardware Architecture Impact on SLOWER metrics
• Starvation
– Support for fine grain parallelism and lightweight messaging, eliminate global barriers
• Latency
– Put memory and computational elements in close proximity, support message driven computation
• Overhead
– Reduce times for thread creation and context switching, support for GAS
• Waiting for contention resolution
– Increase bandwidths for memory, networks, and ALUs with adaptive scheduling, routing, and resource allocation
Algorithms & Applications
“I'm supposed to be a scientific person but I use intuition more than logic in making basic decisions.” Seymour Cray
Algorithms & Applications
The post-2025 applications, software and hardware efforts desperately require the utilization of an application-driven co-design process
Application-driven co-design is the process by which:
• Scientific problem requirements guide the computer architecture and system software design
• Technology capabilities and constraints inform formulation and design of algorithms, applications and software
Hardware Architecture Research Areas
• Architectures driven by new devices
• Specialization
• Responding to real-world heterogeneity
• Improving performance, efficiency, and productivity
Architectures driven by new devices
• ARPA-e: SWITCHES
– SWITCHES projects aim to find innovative semiconductor materials, device architectures, and device fabrication processes that will enable increased switching frequency
• IBM Research Initiative
– Research programs are aimed at “7 nanometer and beyond” silicon technology and developing alternative technologies for post-silicon-era chips using entirely different approaches ($3 billion investment)
• Stanford N3XT Project
– Stanford-led skyscraper-style chip design boosts electronic performance by factor of a thousand
– Carbon nanotube transistors
Improving performance, efficiency, and productivity
• Photonics switch
• HMC
• Shekhar’s node design
Silicon photonics (SiP): Bandwidth
Near Threshold Voltage (NTV): Energy Efficiency
Circuits: Power Reduction
Specialization
Integrating current and future processing technology into the computing fabric – heterogeneous processing
[Figure: block diagram of a heterogeneous chip with Core A, Core B, Core C, and Core D tiles, two MC blocks, and a NIC]
Final Words
• The semiconductor industry is crucial to the U.S. economy – drives a >$2T/yr IT market
• What’s after CMOS? – efficient CMOS & substantially improved HW/SW architectures
• There is an exponentially increasing demand for information technology – new technologies are limited by cost to develop and manufacture, not innovations
• It isn’t clear what the enabling technology is for the “Post CMOS” | “Post Moore’s Law” epoch – requires involvement from many different organizations
• We need a significant investment in HW/SW architecture that utilizes the co-design process – $$$
• We need to stop trying to make tomorrow's computers look and operate like yesterday’s computers – don’t be constrained by the von Neumann paradigm
• The ultimate challenge isn’t finding the technical solutions – it’s accepting that a change in how we think about computers is required to enable the next major advances