![Page 1: GPU Power Model Nandhini Sudarsanan sudar003@umn.edusudar003@umn.edu Nathan Vanderby vande501@umn.edu Neeraj Mishra mish0088@umn.edu Usha Vinodh kuma0253@un.edu](https://reader036.vdocuments.net/reader036/viewer/2022081420/5697bf791a28abf838c822fc/html5/thumbnails/1.jpg)
GPU Power Model
Nandhini Sudarsanan sudar003@umn.edu
Nathan Vanderby vande501@umn.edu
Neeraj Mishra mish0088@umn.edu
Usha Vinodh kuma0253@un.edu
Chi Xu
Outline
• Introduction and Motivation
• Analytical Model Description
• Experiment Setup
• Results
• Conclusion and Further Work
Introduction
Motivation
Outline
• Introduction and Motivation
• Analytical Model Description
  o Parser
  o Power Model
• Experiment Setup
• Results
• Conclusion and Further Work
Parser
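The slides do not describe the parser's internals. As a minimal sketch, assuming its job is to tally how often each instruction type appears in a kernel's PTX, something like the following would do. The PTX fragment, the regular expression, and the `count_opcodes` name are all illustrative assumptions, not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical PTX fragment, written only to exercise the counter.
SAMPLE_PTX = """
    ld.global.f32   %f1, [%rd1];
    ld.global.f32   %f2, [%rd2];
    mul.f32         %f3, %f1, %f2;
    add.f32         %f4, %f3, %f1;
    div.rn.f32      %f5, %f4, %f2;
    st.global.f32   [%rd3], %f5;
"""

# An instruction may start with a predicate guard (@%p1 or @!%p1);
# the base opcode is the identifier that follows, up to the first dot.
INSTRUCTION = re.compile(r"^(?:@!?%\w+\s+)?([a-z][a-z0-9]*)")


def count_opcodes(ptx_text):
    """Return a Counter mapping base opcode (e.g. 'ld', 'add') to its count."""
    counts = Counter()
    for line in ptx_text.splitlines():
        line = line.split("//")[0].strip()  # drop trailing comments
        if not line or line in ("{", "}"):
            continue  # blank lines and braces
        if line.startswith(".") or line.endswith(":"):
            continue  # directives (.reg, .entry, ...) and labels
        match = INSTRUCTION.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for opcode, n in sorted(count_opcodes(SAMPLE_PTX).items()):
        print(opcode, n)
```

Grouping by base opcode is one possible granularity; a finer model could keep the type and state-space suffixes (for example `ld.global` versus `ld.shared`) as separate categories.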
Power Model
• PTX Level
Power Model
• Assembly Level
Experiment Setup - Hardware
• Measure Power Consumption and Temperature
  o Current Clamp for PCIe & GPU Power Cable
    Data Acquisition Card @ 100 Hz
  o GPU Performance Counter
  o Sample Temperature @ 10 Hz, GPU sensor
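The measurement setup above can be turned into power and energy figures roughly as follows. This is a sketch under stated assumptions: both clamps are taken to sit on 12 V conductors (the slides do not give the rail voltage), the sample values are made up for illustration, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 100.0   # data acquisition rate stated on the slide
RAIL_VOLTAGE_V = 12.0    # assumption: both clamped conductors carry 12 V


def total_power_trace(pcie_current_a, cable_current_a, voltage_v=RAIL_VOLTAGE_V):
    """Instantaneous board power (W) per sample: P = V * (I_pcie + I_cable)."""
    return [voltage_v * (i_p + i_c)
            for i_p, i_c in zip(pcie_current_a, cable_current_a)]


def average_power(power_w):
    """Mean power over the run, in watts."""
    return sum(power_w) / len(power_w)


def energy_joules(power_w, sample_rate_hz=SAMPLE_RATE_HZ):
    """Energy as a rectangle-rule sum: E = sum(P_k) * dt, with dt = 1 / rate."""
    return sum(power_w) / sample_rate_hz


if __name__ == "__main__":
    # Illustrative current samples in amperes, not measured data.
    pcie = [2.0, 2.5, 3.0, 2.5]
    cable = [8.0, 9.0, 10.0, 9.0]
    trace = total_power_trace(pcie, cable)
    print("average power (W):", average_power(trace))
    print("energy (J):", energy_joules(trace))
```

At 100 Hz each sample covers 10 ms, which is why the benchmarks need to run long enough to collect many samples per kernel.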
Experiment Setup - Software
• Driver API
• Generate and Modify PTX code
  o Minimize control loops
• CUDA 4.0
  o Built in Binary -> Assembly Converter (cuobjdump)
• MATLAB to build model
• Remote login
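The slides say the model was built in MATLAB but do not state its form. As a hedged illustration only, the sketch below assumes a linear model, power = w0 + sum of w_k times the rate of instruction class k, fitted by ordinary least squares. It is written in plain Python rather than MATLAB, the training rows are synthetic numbers generated from known weights, and every name in it is hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_power_model(features, measured_power_w):
    """Least-squares weights [w0, w1, ...] via the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend the constant term
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, measured_power_w))
           for i in range(n)]
    return solve_linear_system(xtx, xty)


def predict_power(weights, feature_row):
    """Evaluate the fitted linear model for one feature row."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature_row))


if __name__ == "__main__":
    # Synthetic data: (arithmetic rate, memory rate) in arbitrary units,
    # generated from power = 50 + 2 * arith + 5 * mem. Not measurements.
    features = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 2)]
    power = [52.0, 55.0, 57.0, 59.0, 66.0]
    weights = fit_power_model(features, power)
    print("weights:", weights)
    print("prediction for (2, 2):", predict_power(weights, (2, 2)))
```

The normal equations are adequate for a handful of well-separated features; with many correlated instruction classes a QR-based solver would be the safer choice.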
CUDA - Fermi Architecture
• Third Generation Streaming Multiprocessor (SM)
  o 32 CUDA cores per SM, 4x over GT200
  o 1024 thread block size, 2x over GT200
  o Unified address space enables full C++ support
  o Improved Memory Subsystem
Benchmarks
• Small number of overhead operations (loop counters, initialization, etc.).
• Computationally intensive work, so that each experiment runs long enough for accurate current measurement.
• High utilization of the CUDA cores, with as few data hazards as possible.
• Grid and block sizes chosen so that all SMs are used, since idle SMs still leak power.
• Accordingly, 7 benchmarks were selected from the CUDA SDK.
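The grid and block sizing criterion can be checked with a few lines of arithmetic. The sketch below assumes a one-dimensional launch and a hypothetical SM count of 16, since the slides do not name the exact GPU; the 1024-thread limit is the Fermi figure quoted earlier. Having at least one block per SM is a necessary condition for keeping every SM busy, not a guarantee, because block placement is up to the hardware scheduler.

```python
import math

MAX_THREADS_PER_BLOCK = 1024  # Fermi limit quoted on the architecture slide
WARP_SIZE = 32


def blocks_for(n_elements, threads_per_block):
    """Number of blocks needed to give every element one thread."""
    return math.ceil(n_elements / threads_per_block)


def pick_block_size(n_elements, num_sms):
    """Largest power-of-two block size that yields at least one block per SM.

    Returns None when even warp-sized blocks cannot cover all SMs.
    """
    size = MAX_THREADS_PER_BLOCK
    while size >= WARP_SIZE:
        if blocks_for(n_elements, size) >= num_sms:
            return size
        size //= 2
    return None


if __name__ == "__main__":
    num_sms = 16  # hypothetical SM count, chosen for illustration
    for n in (8192, 1 << 20, 100):
        size = pick_block_size(n, num_sms)
        if size is None:
            print(n, "elements: too little work to occupy", num_sms, "SMs")
        else:
            print(n, "elements:", blocks_for(n, size), "blocks of", size)
```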
Benchmarks
For this project we tested the following benchmarks:
• 2D Convolution
• Matrix Multiplication
• Vector Addition
• Vector Reduction
• Scalar Product
• DCT 8x8
• 3DFD
Limitations of PTX
• Higher level than assembly
  o Divide & Sqrt: 1 PTX line, library in assembly
• Compiler optimizations from PTX -> assembly
• Doesn't reflect RAW dependencies
• Performance counters use assembly
Results
Conclusion and Further Work
• Conclusion
• Further Work
  o Take into account context switches
  o Consider multiple kernels running simultaneously
The End
Thanks
Q&A