bridging the assessment modeling gap the perfect...
TRANSCRIPT
![Page 1: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/1.jpg)
August 8, 2014 1
Bridging the Assessment Modeling Gap the PERFECT Way KEVIN J BARKER, DARREN J KERBYSON, ADOLFY HOISIE, ANDRES MARQUEZ, JOSEPH MANZANO, ROBERTO GIOIOSA, ANTONINO TUMEO, NATHAN TALLENT, GOKCEN KESTOR, LEON SONG
Pacific Northwest National Laboratory Modeling and Simulation Workshop, Seattle 2014
![Page 2: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/2.jpg)
Test and Validation (TAV) in DARPA PERFECT
August 10, 2014 2
! DARPA’s PERFECT (Power Efficiency Revolution for Embedded Computing Technologies) ! Spearheading R&D into a multitude of diverse technologies ! 75 Gflops/W for general-purpose embedding computing ! Envisioning 7nm technologies in 2018 – 2020 timeframe
! Program split across 3 phases, beginning at the end of 2012
! 17 initial Performer teams ! Device technology, Architecture, Systems, Software, and Optimization
! Test and Validation (TAV) team ! Quantitatively assess PERFECT technologies individually with respect to
the overall system & overall performance and power goals ! Use a combination of benchmarking, modeling and simulation
![Page 3: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/3.jpg)
PERFECT Assessment: Challenges and Strategy
August 8, 2014 3
! PERFECT Challenge ! Goal: Embedded system delivering “75
GFLOPS/W” ! Performers contribute only part of a
system (architecture to algorithms) ! TAV must assess Performer’s
contribution w.r.t. entire system
! Three pillars to assessment strategy ! Baseline Architectures: quantifying
today’s state of the art ! PERFECT Suite: defining a workload ! Proxy Architecture: modeling
framework
NTV (PVT) MUGFETS
Memory Cells
Compute Cores General Accelerators
Mem
ory
EC
C
Register Files
ALUs
Processing Stages Pipelining Caches
Concurrency Management
Power M
anagement
CommunicaCon Avoidance
Locality
Resilience
Uncertainty
V /F GaC
ng/Scalin
g
Overfetch
Throughput
Models
Simulator
Benchmarks
Benchmarks
Emulator
xRA
M
SEE Hardening
Metrics
3D Stacking
Signaling
Models
Emulator
Emulator
Models
Simulator
Simulator
Benchmarks
Legend Circuit Design
Circuit Modeling & Simula5on
Architecture Modeling & Sim
System So9ware
Architecture Design
Algorithm Modeling & Sim
Algorithms
![Page 4: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/4.jpg)
Modeling and Simulation Challenges
August 11, 2014 4
! Integrating Performance and Power Prediction ! Overall PERFECT goals stipulate efficiency ! PERFECT TAV effort has defined methodologies for Performance and
Power prediction thrusts ! Defining a suitable interface with Performer teams
! What information is required to parameterize the models? ! What is the appropriate level of architectural abstraction?
! Although not the goal of PERFECT, how can these methodologies be extended to large-scale systems? ! Potential for combining with existing scalability modeling methodologies
with PERFECT TAV tools
![Page 5: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/5.jpg)
TAV Overall Approach: 3 Pillars
August 10, 2014 5
Performance Analysis
Proxy Archite
cture Fram
ework
Architecture Probe
Performance Kernel Trace
Technology Model Circuit Model
Architecture Model
Tran
slat
or
Power Analysis
Proxy Architecture
InterpolaCon 1 InterpolaCon 2 Back ProjecCon
System Solver Inner Product Outer Product
Image RegistraCon
Change DetecCon Debayer
DWT Hist. EqualizaCon
2D ConvoluCon
Architectural Probes
SAR
WAMI
STAP
PFAP-‐1
SORT
2D
FFT
1D FFT
Benchmark Suite
Performer Technology Developments
Evalua
Con of Future Im
pact
nCore BrownDwarf Y-‐Class TI Keystone II
NVIDIA Kayla
IBM POWER7
Intel x86 Haswell
Baseline Architectures
Modeling and Simula5on focus
![Page 6: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/6.jpg)
PERFECT TAV Pillar 1: Baseline Architectures ! Architectures that reflect state-of-
the-art systems in Phase 1 ! Real world data points for power/
performance model validation ! Calibration of Performer’s modeling
and simulation environments ! Validation of TAV-developed
models on existing platforms
Pla^orm # Cores (Threads)
Peak Perf
(Gflops)
Clock (GHz)
Peak Power (Waas)
Gflops per Waa
Memory (GB)
nCore BD-‐Y TI Keystone II
16+96 (ARM + DSP)
614.4 (SP)
1.2 + 1.4 36-‐56 17.1 –
11 (SP) 56
Nvidia Kayla 4+2(384) (ARM+SMX)
300 (SP) 1.2 / 1.05 27 11.1 2/1
IBM Power7 8(32) 265 (DP) 4.2 240 1.1 16
Intel X86 Haswell 4(8) 295 (SP) 2.3 45 6.5 16
! Power Instrumentation: ! Level 1: Watts-up power meters ! Level 2: Internal architecture-supplied counters (e.g., RAPL, Ameester, etc.) ! Level 3: High-fidelity DAQ
! Baseline Architectures in place in EHPC lab at PNNL and are being used
nCore Kayla Power7 Haswell
![Page 7: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/7.jpg)
PERFECT TAV Pillar 2: PERFECT Suite
August 10, 2014 7
! Kernels and applications that represent application domains
! Benchmark-specific models will be validated on Baseline Architectures and used to predict performance and power consumption of Performer architectures
! Selection criteria ! PERFECT’s domain of interest ! Alignment of app/kernels to Performer’s projects ! Reasonable input data set sizes selected
1. Synthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time Adaptive Processing (STAP) 4. PERFECT APPLICATION 1 (PFAP-1) 5. 3 “Core” Kernels (Sort, 1D and 2D FFTs)
! All serial and CUDA reference kernels available ! http://hpc.pnnl.gov/projects/PERFECT
InterpolaCon 1 InterpolaCon 2 Back ProjecCon
System Solver Inner Product Outer Product
Image RegistraCon
Change DetecCon Debayer
DWT Hist. EqualizaCon
2D ConvoluCon
Architectural Probes
SAR
WAMI
STAP
PFAP-‐1
SORT
2D
FFT
1D FFT
Benchmark Suite
![Page 8: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/8.jpg)
PERFECT TAV Pillar 3: Proxy Architecture Modeling Framework
8
! An integrated approach to assist in the assessment of component technologies in the context of a complete architecture
! The Proxy Architecture is a mechanism that accommodates solutions from diverse technology research (architectures to algorithms)
Two thrusts: ! Performance Analysis
! Benchmark-specific models parameterized in terms of architecture performance capabilities to derive predicted performance ranges
! Solution providers define architectural operations, latencies, and throughputs ! Power Analysis
! Utilizes McPAT, an open-source Power, Area, and Timing modeling framework ! Input from Performer teams to define technology, circuit, and architecture
parameters
Performance Analysis
Proxy Archite
cture Fram
ework
Architecture Probe
Performance Kernel Trace
Technology Model Circuit Model
Architecture Model
Tran
slat
or
Power Analysis
Proxy Architecture
![Page 9: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/9.jpg)
Current Status of PNNL TAV Activities
August 10, 2014 9
! Baseline Architectures installed in the EHPC lab at PNNL and are being remotely accessed by Performers
! Benchmark Suite in use by Performers and available to the community ! Kernels available now; full scale applications available shortly
! Calibration of Performer’s simulation environments in progress ! Calibration against Baseline Architectures ! Provides confidence in each simulation environment
! Proxy Architecture ! Power thrust: P-McPAT Maintenance and Infrastructure
! TAV P-McPAT Architectural Validation ! Performance thrust: Internal validation of modeling methodology utilizing
kernels from the TAV Benchmark Suite and Baseline Architectures ! Initial decomposition of selected kernel into low-level operations ! Architectural micro-kernel framework to measure latencies and throughputs
![Page 10: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/10.jpg)
PERFECT TAV Summary
August 10, 2014 10
! Key challenge being tackled by TAV in DARPA PERFECT is: ! To quantitatively assess PERFECT technologies individually and with
respect to the programs overall performance and power goals ! Approach is to use a combination of benchmarking, modeling, and
simulation in 3 pillars: ! Baseline Architectures ! Benchmark Suite ! Proxy Architecture
! Modeling and simulation work is in the Proxy Architecture pillar, and is the primary focus of current research
! Approach is not restricted to DARPA PERFECT, but is generally applicable for assessing the potential of future technologies ! Unified approach capturing performance and power impacts ! Current effort on integrating resilience modeling as well
![Page 11: Bridging the Assessment Modeling Gap the PERFECT Wayhpc.pnl.gov/modsim/2014/Presentations/Barker.pdfSynthetic Aperture Radar (SAR) 2. Wide Area Motion Imaging (WAMI) 3. Space Time](https://reader034.vdocuments.net/reader034/viewer/2022050110/5f480704b2337e2f29031861/html5/thumbnails/11.jpg)
Gaps and Opportunities
August 10, 2014 11
! PERFECT TAV strategy defines a consistent evaluation methodology ! Targeting diverse embedded computing technologies and architectures ! However, we are not limited to embedded systems; strategy can be
applied to systems across scales
! What are the gaps? ! Architectural specification and benchmarking (e.g., metrics) ! Third component of PPR – Resilience
! What are the opportunities? ! Opportunity for large-scale modeling tool kit from “first principles” ! Need well-defined interfaces between modeling layers
! “Bag of tools” approach will allow different capabilities to be “plugged in” ! Modeling tools selected based on desired levels of abstraction ! Encourage interaction between researchers, designers, and vendors