Luca Rastello, Application Engineer, Verification Group
The case for a system-level approach
Addressing Exascale Emulation Debug Complexity
© 2019 Synopsys, Inc. 2
Agenda
• Exascale Debug Challenges
• ZeBu Exascale Debug
Exascale Debug Challenges
Emergence of Exascale Debug Complexity
[Figure: gates (up to 10^9) plotted against cycles (up to 10^9), rising from IP/Subsystems to GPU, Server Networking, Mobile, and AI designs. Exascale Debug Complexity: Exa = 10^18.]
Exascale Debug Requires Higher Levels of Abstraction
Scope of Emulation Debug Abstraction
• Waveform-Level Debug: IP / Subsystem
• System-Level Debug: SoC + SW
• Application-Level Debug: SoC + SW + System
• Scale grows from millions of cycles and millions of gates to billions of cycles and billions of gates
• Two parts of the figure are labeled “Today’s Presentation”
Real World Example for Exascale Debug Challenge
• Run timeline, measured in CPU clocks (axis marks: few millions, 1 billion, 2 billion)
– Reset
– Boot OS (Bootloader, Kernel, Init, VM, etc.)
– System Services (Pwr, Batt, NetStat, Bluetooth, etc.)
– User Applications (Internet Browser)
• Root cause: cache coherency error
• Failure: OS hangs
• Problem: the failure happens far later than the bug, so the root cause cannot be traced
• Run length is 2+ billion clocks; traditional waveform depth is 1-2M samples
Iterative waveform-level debug leads to long, unpredictable time to root cause
Why Waveform-level Debug is No Longer Enough
A parallel drawn from graph theory
• Iterative waveform-level debug is similar to Depth First Search
– In-depth (detailed) analysis of a node: waveform-based debug of a small window (1-2M cycles) of interest
– Node-to-node traversal: waveform-based debug of a different small window of interest
– Repeat “n” times for graph traversal
– Issue is root caused
• The number of iterations becomes too large for exascale debug problems
– Time to root cause increases exponentially with iterative waveform-level debug
Exascale debug requires a higher level of debug abstraction
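To put a number on "too many iterations", here is a rough sketch that counts how many fixed-depth waveform windows it takes to span a whole run. The 2 billion clock run length and the 1-2M window depth come from the real-world example; the function name is hypothetical, and treating one waveform sample as one clock is an assumption.

```python
def windows_to_cover(total_cycles: int, window_cycles: int) -> int:
    """Return how many fixed-depth waveform windows are needed to span a run."""
    if total_cycles < 0 or window_cycles <= 0:
        raise ValueError("total_cycles must be >= 0 and window_cycles > 0")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_cycles // window_cycles)


RUN_CYCLES = 2_000_000_000  # "2+ billion clocks" from the example

for depth in (1_000_000, 2_000_000):  # "1-2M samples" waveform depth
    count = windows_to_cover(RUN_CYCLES, depth)
    print(f"{depth:>9,} cycles per window: {count:,} windows")
```

With these figures, a blind walk through the run needs 1,000 to 2,000 separate windows, and each one is its own dump, expansion and load iteration.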
Why High Throughput Emulation Can Cause Non-Determinism
Typical emulation with multiple testbench interfaces
• Low throughput emulation
– DUT clock stops
– Throughput reduction by up to 5X
• High throughput emulation
– DUT clock doesn’t stop
– Speedup by parallel TB and DUT execution
• Non-determinism in high throughput emulation
– TB transactions arrive at different times in subsequent runs
– Nearly impossible to reproduce failure!
[Figure: TB and DUT activity over time for three runs. Regression Run: Failure. Debug Run #1: No Failure. Debug Run #2: No Failure.]
Deterministic failure reproduction is a key debug requirement for high throughput emulation
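The effect can be illustrated with a toy model, which is not ZeBu behavior but a sketch under stated assumptions: each testbench transaction has a nominal arrival cycle plus a host-side delay that varies from run to run. The function name and all latency values are made up for illustration.

```python
def arrival_cycles(nominal_cycles, host_latency_cycles, dut_clock_stops):
    """Return the DUT cycle at which each testbench transaction is applied.

    nominal_cycles: cycle at which each transaction is intended to arrive.
    host_latency_cycles: hypothetical per-transaction testbench delay,
        expressed in DUT cycles, which varies from run to run.
    dut_clock_stops: True models low throughput emulation, False high.
    """
    if len(nominal_cycles) != len(host_latency_cycles):
        raise ValueError("one latency value is required per transaction")
    if dut_clock_stops:
        # DUT waits for the TB: latency costs wall-clock time, not DUT cycles.
        return list(nominal_cycles)
    # DUT free-runs: the delay shows up as DUT cycles elapsed before arrival.
    return [n + d for n, d in zip(nominal_cycles, host_latency_cycles)]


nominal = [100, 200, 300, 400]
regression_latency = [3, 0, 7, 2]  # made-up values for the regression run
debug_latency = [1, 4, 0, 6]       # made-up values for a later debug run

for stops in (True, False):
    first = arrival_cycles(nominal, regression_latency, stops)
    second = arrival_cycles(nominal, debug_latency, stops)
    mode = "clock stops" if stops else "clock free-runs"
    verdict = "identical" if first == second else "different"
    print(f"{mode}: {first} vs {second} ({verdict})")
```

In this model, stopping the DUT clock hides the host delay from the DUT, so both runs match. With a free-running clock the same stimuli land on different cycles, which is why a failure seen in regression may not reappear in a debug run.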
Why BGate SoCs Dramatically Slow Waveform-level Debug
Flow: Emulation ➔ Data Dump ➔ Raw Debug Data ➔ Expansion ➔ Expanded Debug Data ➔ Load ➔ Waveform Debugger
• IP / Subsystem: raw debug data in MB, expansion in mins, expanded debug data in GB, load in min
• SoC + SW: raw debug data in GB, expansion in an hour, expanded debug data in GBs, load in mins➔hr (too slow)
Fast waveform expansion and load time are essential for rapid root cause of billion gate SoCs
Three Requirements for Exascale Debug
Challenge: Reduce time to root cause by eliminating iterative waveform-level debug
Requirement 1: Higher-level analysis of the entire design over the entire billion-cycle run
– Abstract analysis of entire graph to identify node (Breadth First Search)
Challenge: Deterministic bug reproduction in high throughput emulation
Requirement 2: Ability to reproduce failure deterministically in subsequent runs
– Consistent reproductions of the graph
Challenge: Scalability of waveform-based debug for billion gate SoC
Requirement 3: Next generation tools for waveform-based analysis
– Detailed analysis of specific node
ZeBu Exascale Debug
System-level Abstraction
Deterministic Replay
Waveform Scalability
System-level Abstraction Debug with ZeBu
High-level analysis of the entire design over the entire billion-cycle run
• ZeBu offers streaming capability to extract system-level data
– Abstract logs (monitors/checkers) of all key interfaces
– Dump key signals
– Language features like assertions, system tasks, DPI, etc.
– At-speed execution
– Infinite depth covering entire test run
– No throughput impact: the emulation clock doesn’t stop
• System-level data analysis to identify failure window
– Checkers for coarse grain search
– Monitors, key signal waveform for refining the window
[Figure: ZeBu Server 4 streams a system log of monitors, checkers and key events. Checkers, monitors and key signals narrow the search from BCycles to MCycles.]
ZeBu system-level debug enables failure window identification in a single pass
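The coarse-then-refine search can be sketched offline on streamed logs. This is a minimal illustration, not a ZeBu API: the log shapes (checker events as `(cycle, passed)` tuples, monitor findings as a sorted list of suspicious cycles), both function names, and all sample values are assumptions.

```python
from bisect import bisect_left


def coarse_window(checker_events):
    """Return (last_pass_cycle, first_fail_cycle), or None if nothing failed.

    checker_events: list of (cycle, passed) tuples sorted by cycle.
    """
    last_pass = 0
    for cycle, passed in checker_events:
        if not passed:
            return (last_pass, cycle)
        last_pass = cycle
    return None


def refine_window(window, suspicious_cycles, depth):
    """Shrink a coarse window to at most `depth` cycles around the first
    suspicious monitor event inside it; return it unchanged if there is none.

    suspicious_cycles: sorted cycles flagged from monitor logs or key signals.
    """
    start, end = window
    pos = bisect_left(suspicious_cycles, start)
    if pos == len(suspicious_cycles) or suspicious_cycles[pos] > end:
        return window
    center = suspicious_cycles[pos]
    low = max(start, center - depth // 2)
    return (low, min(end, low + depth))


# Made-up logs for a run of about 2 billion cycles.
checkers = [(500_000_000, True), (1_000_000_000, True),
            (1_500_000_000, False), (2_000_000_000, False)]
suspicious = [400_000_000, 1_200_345_000, 1_400_000_000]

coarse = coarse_window(checkers)
if coarse is not None:
    print("coarse window:", coarse)
    print("refined window:", refine_window(coarse, suspicious, 2_000_000))
```

Here the checkers bound the problem to a 500M-cycle span in one pass over the log, and the monitor data cuts that to a 2M-cycle window that fits a traditional waveform dump.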
Deterministic Error Reproduction in ZeBu
• ZeBu Record/Replay
– Applicable to the testbench
– Eliminates testbench non-determinism
– In a subsequent run, the testbench is replayed
• ZeBu Save/Restore
– Applicable to the DUT
– Eliminates the need to restart from time 0
– Run can start close to the failure point
• Application
– Main run with stimuli recording
– DUT save during main run
– Restore, deterministic replay and debug data dump
[Figure: on ZeBu Server 4, the main run records testbench stimuli (#0 … #N … #M …) and saves DUT state at Time 0, Time N, Time M, …, with the root cause and test end marked on the time axis. The debug run restores the DUT state saved at Time N, replays stimuli #N, and dumps debug data.]
ZeBu record/replay and save/restore enable deterministic generation of failure debug data
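The three application steps can be mimicked with a toy model. This sketch assumes a DUT that is a pure function of its state and input; `step`, `main_run`, `latest_checkpoint` and `replay` are hypothetical names for illustration, not ZeBu commands.

```python
from bisect import bisect_right


def step(state, value):
    """Toy DUT: a deterministic state update standing in for real hardware."""
    return (state * 31 + value) % 1_000_003


def main_run(stimuli, save_every):
    """Main run: apply the recorded stimuli and save DUT state periodically.

    Returns (final_state, checkpoints); checkpoints maps a stimulus index
    to the DUT state just before that stimulus is applied.
    """
    state, checkpoints = 0, {}
    for i, value in enumerate(stimuli):
        if i % save_every == 0:
            checkpoints[i] = state
        state = step(state, value)
    return state, checkpoints


def latest_checkpoint(checkpoints, target):
    """Return the last saved index at or before `target`."""
    keys = sorted(checkpoints)
    pos = bisect_right(keys, target)
    if pos == 0:
        raise ValueError("no checkpoint at or before target")
    return keys[pos - 1]


def replay(stimuli, checkpoints, start, stop):
    """Restore the state saved at `start`, replay stimuli [start, stop),
    and collect a per-step trace standing in for the debug data dump."""
    state, trace = checkpoints[start], []
    for i in range(start, stop):
        state = step(state, stimuli[i])
        trace.append((i, state))
    return state, trace


stimuli = [(7 * i + 3) % 11 for i in range(100)]  # recorded TB stimuli
final, saves = main_run(stimuli, save_every=25)
start = latest_checkpoint(saves, 60)              # failure window begins near 60
replayed_final, trace = replay(stimuli, saves, start, len(stimuli))
print("restored at", start, "matches main run:", replayed_final == final)
```

Because the inputs are replayed from the recording and the state comes from a save, the rerun reaches the same final state as the main run while skipping everything before the chosen checkpoint.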
10X Faster Expansion and Load with ZeBu and Verdi
Next generation tools for waveform-level debug
[Figure: ZeBu Server 4 sends raw debug signals in native ZeBu format over a high bandwidth I/F (2TB/s); they are expanded into expanded debug signals for Verdi.]
• High performance parallel expansion: 750MGates, 1M cycles, <10min
• High performance interactive expansion: 2.5BGates, 500k cycles, each signal drop <1sec
Scalable solution for complex billion gate SoC waveform-level debug
ZeBu Exascale Debug
• Run billion cycle workloads at MHz speed
• Stream system-level data, record stimuli and save DUT states
• Analyze system-level data to identify failure window
• Deterministic rerun of failure window to dump million cycles of debug data
• High performance waveform expansion and debug in Verdi
[Figure: ZeBu emulation flow. Run: stream system-level data, record testbench stimuli, save DUT state. Analyze: select the testbench stimuli and DUT state to use. Replay and restore, then waveform expansion and Verdi debug.]
Fastest way to root cause bugs in complex BG SoCs with BCycle workloads
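As a back-of-envelope check on "billion cycle workloads at MHz speed", this sketch converts a cycle count into wall-clock run time. The slides do not state an exact emulation clock, so the 1 MHz and 2 MHz values are assumptions chosen only for illustration, and the function name is hypothetical.

```python
def emulation_seconds(cycles: int, clock_hz: float) -> float:
    """Wall-clock seconds needed to run `cycles` at a steady `clock_hz`."""
    if cycles < 0 or clock_hz <= 0:
        raise ValueError("cycles must be >= 0 and clock_hz > 0")
    return cycles / clock_hz


WORKLOAD_CYCLES = 2_000_000_000  # the 2 billion clock example run

for mhz in (1, 2):  # assumed emulation clock rates, not measured figures
    seconds = emulation_seconds(WORKLOAD_CYCLES, mhz * 1_000_000)
    print(f"{mhz} MHz: {seconds:,.0f} s ({seconds / 60:.1f} min)")
```

Under these assumptions the full run takes tens of minutes, so a single streamed pass over the whole workload is practical, while the detailed dump is reserved for the million-cycle failure window.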
Thank You