04 debugging profiling tools

Upload: jramirezcr

Post on 04-Jun-2018


  • 8/14/2019 04 Debugging Profiling Tools

    1/21

    CUDA Debugging and Profiling Tools

    Mark Harris, NVIDIA



    CUDA-GDB Debugger

    Extended version of GDB with support for CUDA C

    Supported on 32-bit and 64-bit Linux systems

    Seamlessly debug both the host (CPU) and device (GPU)

    Set breakpoints on any source line or symbol name

    Single step executes only one warp except on __syncthre

    Access and print all CUDA memory allocations: local, global, constant and shared vars.


    Linux GDB integration with EMACS


    Linux GDB integration with DDD


    CUDA-MemCheck

    Detects/tracks memory errors

    Out of bounds accesses

    Misaligned accesses (types must be aligned on their size)

    Linux and WinXP (included with CUDA Toolkit)

    Integrated into CUDA-GDB on Linux

    Also standalone command-line tool on Linux and Windows

    Win7 and Vista support coming
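    The two error classes on this slide come down to simple arithmetic on addresses. The Python sketch below only illustrates the rules (an access must lie inside its allocation, and a type must be aligned on its size); it says nothing about how CUDA-MemCheck is implemented, and the function names and sample addresses are hypothetical.

```python
def is_aligned(address: int, access_size: int) -> bool:
    """True if address is a multiple of access_size (both in bytes)."""
    return address % access_size == 0


def is_in_bounds(offset: int, access_size: int, allocation_size: int) -> bool:
    """True if the access [offset, offset + access_size) fits in the allocation."""
    return 0 <= offset and offset + access_size <= allocation_size


# Hypothetical 4-byte accesses.
print(is_aligned(0x1000, 4))        # address is a multiple of 4
print(is_aligned(0x1002, 4))        # address is 2 bytes past a multiple of 4
print(is_in_bounds(1020, 4, 1024))  # last 4 bytes of a 1024-byte allocation
print(is_in_bounds(1022, 4, 1024))  # runs 2 bytes past the end
```

    The first and third checks pass; the second is a misaligned access and the fourth is an out-of-bounds access.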


    CUDA Driver Command-line Profiling

    1. Set environment variables

        export CUDA_PROFILE=1
        export CUDA_PROFILE_CSV=1
        export CUDA_PROFILE_CONFIG=config.txt
        export CUDA_PROFILE_LOG=profile.csv

    2. Set configuration file

        FILE "config.txt":

        gpustarttimestamp
        instructions

    3. Run application

        matrixMul

    4. View profiler output

        FILE "profile.csv":

        # CUDA_PROFILE_LOG_VERSION 1.5
        # CUDA_DEVICE 0 GeForce 8800 GT
        # CUDA_PROFILE_CSV 1
        # TIMESTAMPFACTOR fa292bb1ea2c12c
        gpustarttimestamp,method,gputime,cputime,occ
        115f4eaa10e3b220,memcpyHtoD,7.328,12.000
        115f4eaa10e5dac0,memcpyHtoD,5.664,4.000
        115f4eaa10e95ce0,memcpyHtoD,7.328,6.000
        115f4eaa10f2ea60,_Z10dmatrixmulPfiiS_iiS_,1952
        115f4eaa10f443a0,memcpyDtoH,7.776,36.000
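    A CSV log in this shape is easy to post-process. The Python sketch below totals the gputime column per method. It assumes that metadata lines start with "#" and that the first remaining line is the column header. The sample text reuses the memcpy rows from the slide and leaves out the kernel row and the last header column, because both are cut off in the slide. The function name gpu_time_by_method is hypothetical.

```python
import csv
from collections import defaultdict

# Sample log; the memcpy rows are copied from the slide. The kernel row and
# the final header column are omitted because the slide truncates them.
SAMPLE_LOG = """\
# CUDA_PROFILE_LOG_VERSION 1.5
# CUDA_DEVICE 0 GeForce 8800 GT
# CUDA_PROFILE_CSV 1
gpustarttimestamp,method,gputime,cputime
115f4eaa10e3b220,memcpyHtoD,7.328,12.000
115f4eaa10e5dac0,memcpyHtoD,5.664,4.000
115f4eaa10e95ce0,memcpyHtoD,7.328,6.000
115f4eaa10f443a0,memcpyDtoH,7.776,36.000
"""


def gpu_time_by_method(log_text: str) -> dict:
    """Return {method: summed gputime} for a CSV-format profiler log."""
    # Skip blank lines and '#' metadata; the next line is the column header.
    rows = [line for line in log_text.splitlines()
            if line and not line.startswith("#")]
    totals = defaultdict(float)
    for record in csv.DictReader(rows):
        totals[record["method"]] += float(record["gputime"])
    return dict(totals)


totals = gpu_time_by_method(SAMPLE_LOG)
for method, total in sorted(totals.items()):
    print(f"{method}: {total:.3f}")
```

    To run it on a real log, pass the contents of the file named by CUDA_PROFILE_LOG instead of SAMPLE_LOG.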


    CUDA Visual Profiler

    Performance analysis for CUDA apps

    Linux, Windows, Mac

    Execute app and collect profiling data

    Hardware performance counters

    Profile all kernels and memory transfers

    Profiling data analysis


    CUDA Visual Profiler kernel data


    CUDA Visual Profiler computed kernel data

    Instruction throughput

    (achieved instruction rate) / (peak single-issue instruction rate)

    Global memory read throughput (GB/s)

    Global memory write throughput (GB/s)

    Overall global memory access throughput (GB/s)

    Global memory load efficiency

    Global memory store efficiency
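    The computed values on this slide are ratios. The Python sketch below works through the arithmetic under stated assumptions: GPU time is given in microseconds, 1 GB is taken as 10^9 bytes, and the overall throughput is the read and write bytes together divided by the same time. The byte counts, the time and the function names are hypothetical, and the Visual Profiler's own definitions may differ in detail.

```python
def memory_throughput_gb_per_s(bytes_moved: int, gpu_time_us: float) -> float:
    """Bytes moved per second, in GB/s (assumes 1 GB = 1e9 bytes)."""
    seconds = gpu_time_us * 1e-6
    return bytes_moved / seconds / 1e9


def instruction_throughput(achieved_rate: float, peak_rate: float) -> float:
    """Achieved instruction rate as a fraction of the peak single-issue rate."""
    return achieved_rate / peak_rate


# Hypothetical kernel: reads 64 MiB and writes 32 MiB in 2000 microseconds.
bytes_read = 64 * 1024 * 1024
bytes_written = 32 * 1024 * 1024
gpu_time_us = 2000.0

read_gbps = memory_throughput_gb_per_s(bytes_read, gpu_time_us)
write_gbps = memory_throughput_gb_per_s(bytes_written, gpu_time_us)
overall_gbps = memory_throughput_gb_per_s(bytes_read + bytes_written, gpu_time_us)

print(f"read:    {read_gbps:.2f} GB/s")
print(f"write:   {write_gbps:.2f} GB/s")
print(f"overall: {overall_gbps:.2f} GB/s")

# Hypothetical rates in the same unit, e.g. instructions per clock.
print(f"instruction throughput: {instruction_throughput(0.6, 1.0):.2f}")
```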


    CUDA Visual Profiler memory transfer data

    Memory transfer type and direction (D=Device, H=Host, A=cuArray)

    e.g. H to D: Host to Device

    Synchronous / Asynchronous

    Memory transfer size (bytes)

    Stream ID


    CUDA Visual Profiler data analysis views

    Views:

    Summary table

    Kernel table

    Memcpy table

    Summary plot

    GPU Time Height plot

    GPU Time Width plot

    Profiler counter plot

    Profiler table column plot

    Multi-device plot

    Multi-stream plot

    Analyze profiler counters

    Analyze kernel occupancy


    CUDA Visual Profiler Misc.

    Multiple sessions

    Compare views for different sessions

    Comparison Summary plot

    Profiler projects save & load

    Import/Export profiler data (.CSV format)


    NVIDIA Parallel Nsight

    Accelerates GPU + CPU application development

    The industry's 1st Development Environment for massively parallel applications

    Complete Visual Studio-integrated development environment


    Parallel Nsight 1.0

    Nsight Parallel Debugger

    GPU source code debugging

    Variable & memory inspection

    Nsight Analyzer

    Platform-level Analysis

    For the CPU and GPU

    Nsight Graphics Inspector

    Visualize and debug graphics content


    Source Debugging

    Supports CUDA C and HLSL (Direct3D Shading Language)

    Hardware breakpoints

    GPU memory and variable views

    Nsight menu and toolbars


    Parallel Nsight IDE - Debugging


    Analysis

    View a correlated trace timeline with both CPU and GPU events.


    Analysis

    Detailed tooltips are available for every event on the timeline.


    Parallel Nsight 1.0 System Requirements

    Operating System:

    Windows Server 2008 R2

    Windows 7 / Vista

    32 or 64-bit

    Hardware:

    GeForce 9 series or higher

    Tesla C1060/S1070 or higher

    Quadro (G9x or higher)

    Visual Studio:

    Visual Studio 2008 SP1


    Supported System Configurations

    #1: Single machine, Single GPU

    Analyzer

    Graphics Inspector

    #2: Two machines connected over the network

    Debugger

    Analyzer

    Graphics Inspector

    TCP/IP

    #3: Single machine, dual GPUs

    Debugger

    Analyzer

    Graphics Inspector


    Parallel Nsight 1.0 Versions

    Standard (free):

    GPU Source Debugger

    Graphics Inspector

    Professional ($349):

    Analyzer

    Data Breakpoints

    Premium ticket-based support

    Volume and Site Licensing available