04 debugging profiling tools
TRANSCRIPT
-
8/14/2019 04 Debugging Profiling Tools
CUDA Debugging and Profiling Tools
Mark Harris, NVIDIA
-
CUDA-GDB Debugger
Extended version of GDB with support for CUDA C
Supported on 32-bit and 64-bit Linux systems
Seamlessly debug both the host (CPU) and device (GPU)
Set breakpoints on any source line or symbol name
Single-step executes only one warp, except on __syncthreads()
Access and print all CUDA memory allocations: local, global, constant and shared variables
-
Linux GDB Integration with Emacs
-
Linux GDB Integration with DDD
-
CUDA-MemCheck
Detects/tracks memory errors
Out of bounds accesses
Misaligned accesses (types must be aligned on their size)
Linux and WinXP (included with CUDA Toolkit)
Integrated into CUDA-GDB on Linux
Also standalone command-line tool on Linux and Windows
Win7 and Vista support coming
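The two error classes listed on this slide can be illustrated with a small host-side model. This is a minimal Python sketch, not how CUDA-MemCheck itself is implemented: the helper names `is_aligned` and `in_bounds` and the example addresses are hypothetical, and the alignment check only encodes the rule stated above that a type must be aligned on its size.

```python
def is_aligned(address: int, access_size: int) -> bool:
    """True if an access of access_size bytes at address is aligned on its size."""
    return address % access_size == 0


def in_bounds(base: int, alloc_size: int, address: int, access_size: int) -> bool:
    """True if the whole access lies inside the allocation [base, base + alloc_size)."""
    return base <= address and address + access_size <= base + alloc_size


# Hypothetical allocation: 256 bytes starting at address 0x1000.
BASE, SIZE = 0x1000, 256

print(is_aligned(0x1004, 4))             # 4-byte access at a multiple of 4
print(is_aligned(0x1002, 4))             # 4-byte access at an address that is not a multiple of 4
print(in_bounds(BASE, SIZE, 0x10FC, 4))  # last valid 4-byte element of the allocation
print(in_bounds(BASE, SIZE, 0x1100, 4))  # first 4-byte element past the end
```

An access that fails either check is the kind of error the tool is described as detecting: the second call models a misaligned access and the fourth an out-of-bounds access.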
-
CUDA Driver Command-line Profiling
1. Set environment variables
export CUDA_PROFILE=1
export CUDA_PROFILE_CSV=1
export CUDA_PROFILE_CONFIG=config.txt
export CUDA_PROFILE_LOG=profile.csv
2. Set configuration file
FILE "config.txt":
gpustarttimestamp
instructions
3. Run application
matrixMul
4. View profiler output
FILE "profile.csv":
# CUDA_PROFILE_LOG_VERSION 1.5
# CUDA_DEVICE 0 GeForce 8800 GT
# CUDA_PROFILE_CSV 1
# TIMESTAMPFACTOR fa292bb1ea2c12c
gpustarttimestamp,method,gputime,cputime,occ
115f4eaa10e3b220,memcpyHtoD,7.328,12.000
115f4eaa10e5dac0,memcpyHtoD,5.664,4.000
115f4eaa10e95ce0,memcpyHtoD,7.328,6.000
115f4eaa10f2ea60,_Z10dmatrixmulPfiiS_iiS_,1952
115f4eaa10f443a0,memcpyDtoH,7.776,36.000
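Once the log from step 4 exists, it can be post-processed with a few lines of script. The following is a minimal Python sketch that sums the gputime column per method. It assumes the log is plain CSV with `#` comment lines, as in the excerpt above. The sample embeds only the four complete memcpy rows from the slide; the kernel row and the last header column are cut off in the source and are left out here. The function name `total_gpu_time_by_method` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the slide's profile.csv; the truncated kernel row is omitted.
SAMPLE_LOG = """\
# CUDA_PROFILE_LOG_VERSION 1.5
# CUDA_DEVICE 0 GeForce 8800 GT
# CUDA_PROFILE_CSV 1
gpustarttimestamp,method,gputime,cputime
115f4eaa10e3b220,memcpyHtoD,7.328,12.000
115f4eaa10e5dac0,memcpyHtoD,5.664,4.000
115f4eaa10e95ce0,memcpyHtoD,7.328,6.000
115f4eaa10f443a0,memcpyDtoH,7.776,36.000
"""


def total_gpu_time_by_method(log_text: str) -> dict:
    """Sum the gputime column per method, skipping blank and '#' comment lines."""
    data_lines = [ln for ln in log_text.splitlines()
                  if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)))
    totals = defaultdict(float)
    for row in reader:
        totals[row["method"]] += float(row["gputime"])
    return dict(totals)


totals = total_gpu_time_by_method(SAMPLE_LOG)
for method, gpu_time in sorted(totals.items()):
    print(f"{method}: {gpu_time:.3f}")
```

To run it on a real log, read the file named by CUDA_PROFILE_LOG and pass its text to the function instead of `SAMPLE_LOG`.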
-
CUDA Visual Profiler
Performance analysis for CUDA apps
Linux, Windows, Mac
Execute app and collect profiling data
Hardware performance counters
Profile all kernels and memory transfers
Profiling data analysis
-
CUDA Visual Profiler kernel data
-
CUDA Visual Profiler computed kernel data
Instruction throughput
(achieved instruction rate) / (peak single-issue instruction rate)
Global memory read throughput (GB/s)
Global memory write throughput (GB/s)
Overall global memory access throughput (GB/s)
Global memory load efficiency
Global memory store efficiency
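The throughput figures listed on this slide are simple ratios. The sketch below is a minimal Python illustration and not the profiler's own code. The function names, the input numbers and the choice of 1 GB = 1e9 bytes are all assumptions made for the example. The efficiency metrics are not defined on the slide and are left out.

```python
def instruction_throughput(achieved_rate: float, peak_single_issue_rate: float) -> float:
    """Ratio from the slide: achieved instruction rate / peak single-issue rate."""
    return achieved_rate / peak_single_issue_rate


def throughput_gb_per_s(num_bytes: int, seconds: float) -> float:
    """Bytes moved per second, expressed in GB/s (assuming 1 GB = 1e9 bytes)."""
    return num_bytes / seconds / 1e9


# Illustrative numbers only, not measurements.
read_bytes = 64 * 1024 * 1024    # bytes read from global memory
write_bytes = 32 * 1024 * 1024   # bytes written to global memory
kernel_seconds = 2e-3            # kernel duration in seconds

read_tp = throughput_gb_per_s(read_bytes, kernel_seconds)
write_tp = throughput_gb_per_s(write_bytes, kernel_seconds)
# Overall throughput counts reads and writes over the same time interval.
overall_tp = throughput_gb_per_s(read_bytes + write_bytes, kernel_seconds)

print(f"read {read_tp:.2f} GB/s, write {write_tp:.2f} GB/s, overall {overall_tp:.2f} GB/s")
print(f"instruction throughput {instruction_throughput(0.6e9, 1.0e9):.2f}")
```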
-
CUDA Visual Profiler memory transfer data
Memory transfer type and direction (D=Device, H=Host, A=cuArray)
e.g. H to D: Host to Device
Synchronous / Asynchronous
Memory transfer size (bytes)
Stream ID
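The direction letters above can be decoded mechanically. The following minimal Python sketch assumes transfer names of the form `memcpyXtoY`, as seen in the command-line profiler log (memcpyHtoD, memcpyDtoH). Any other naming variant is not handled, and the function name `describe_transfer` is hypothetical.

```python
# Endpoint letters as listed on the slide.
ENDPOINTS = {"D": "Device", "H": "Host", "A": "cuArray"}


def describe_transfer(method: str) -> str:
    """Turn a name such as 'memcpyHtoD' into 'Host to Device'."""
    prefix = "memcpy"
    if not method.startswith(prefix):
        raise ValueError(f"not a memcpy method name: {method!r}")
    # Split the remainder, e.g. 'HtoD', around the literal 'to'.
    src, sep, dst = method[len(prefix):].partition("to")
    if not sep or src not in ENDPOINTS or dst not in ENDPOINTS:
        raise ValueError(f"unrecognized transfer direction: {method!r}")
    return f"{ENDPOINTS[src]} to {ENDPOINTS[dst]}"


for name in ("memcpyHtoD", "memcpyDtoH"):
    print(name, "->", describe_transfer(name))
```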
-
CUDA Visual Profiler data analysis views
Views:
Summary table
Kernel table
Memcpy table
Summary plot
GPU Time Height plot
GPU Time Width plot
Profiler counter plot
Profiler table column plot
Multi-device plot
Multi-stream plot
Analyze profiler counters
Analyze kernel occupancy
-
CUDA Visual Profiler Misc.
Multiple sessions
Compare views for different sessions
Comparison Summary plot
Profiler projects save & load
Import/Export profiler data (.CSV format)
-
NVIDIA Parallel Nsight
Accelerates GPU + CPU application development
The industry's first development environment for massively parallel applications
Complete Visual Studio-integrated development environment
-
Parallel Nsight 1.0
Nsight Parallel Debugger
GPU source code debugging
Variable & memory inspection
Nsight Analyzer
Platform-level Analysis
For the CPU and GPU
Nsight Graphics Inspector
Visualize and debug graphics content
-
Source Debugging
Supports CUDA C and HLSL (Direct3D Shading Language)
Hardware breakpoints
GPU memory and variable views
Nsight menu and toolbars
-
Parallel Nsight IDE - Debugging
-
Analysis
View a correlated trace timeline with both CPU and GPU events.
-
Analysis
Detailed tooltips are available for every event on the timeline.
-
Parallel Nsight 1.0 System Requirements
Operating System
Windows Server 2008 R2
Windows 7 / Vista
32 or 64-bit
Hardware
GeForce 9 series or higher
Tesla C1060/S1070 or higher
Quadro (G9x or higher)
Visual Studio
Visual Studio 2008 SP1
-
Supported System Configurations
#1: Single machine, Single GPU
Analyzer
Graphics Inspector
#2: Two machines connected over the network (TCP/IP)
Debugger
Analyzer
Graphics Inspector
#3: Single machine, dual GPUs
Debugger
Analyzer
Graphics Inspector
-
Parallel Nsight 1.0 Versions
Standard (free)
GPU Source Debugger
Graphics Inspector
Professional ($349)
Analyzer
Data Breakpoints
Premium ticket-based support
Volume and Site Licensing available