Benjamin Perry and Martin Swany
University of Delaware
Computer Information Science
Outline: Background, The problem, The solution, The results, Conclusions and future work
MPI programs communicate via MPI data types
MPI data types are usually modeled after native data types
Payloads are often arrays of MPI data types
The sending MPI library packs payload into contiguous block
The receiving MPI library unpacks payload into original form
Non-contiguous blocks incur a copy penalty
SPMD programs, particularly in homogeneous environments, can use optimized packing
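To make the copy penalty concrete, here is a minimal sketch of packing and unpacking a non-contiguous payload by hand. It assumes a row-major matrix whose single column is the payload; `pack_column` and `unpack_column` are hypothetical helpers written for illustration and are not how any particular MPI library implements packing.

```c
#include <assert.h>
#include <stddef.h>

/* Sender side: gather column `col` of a row-major rows x cols matrix
 * into the contiguous buffer `out`. Returns the number of doubles copied. */
size_t pack_column(const double *matrix, size_t rows, size_t cols,
                   size_t col, double *out)
{
    assert(matrix != NULL && out != NULL && col < cols);
    for (size_t r = 0; r < rows; ++r)
        out[r] = matrix[r * cols + col];   /* strided read, contiguous write */
    return rows;
}

/* Receiver side: scatter the contiguous buffer back into column `col`. */
void unpack_column(const double *in, size_t rows, size_t cols,
                   size_t col, double *matrix)
{
    assert(matrix != NULL && in != NULL && col < cols);
    for (size_t r = 0; r < rows; ++r)
        matrix[r * cols + col] = in[r];    /* contiguous read, strided write */
}
```

Every element is touched once on each side before or after the transfer. A contiguous payload would need neither loop, because the original buffer could be handed to the transport as it is.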
Outline: Background, The problem, The solution, The results, Conclusions and future work
Users model MPI types after native types
Some fields do not need to be transmitted
Users often replace dead fields with a gap in the MPI type to align with native type
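As an illustration of such a gap, the sketch below describes only the transmitted fields of a native struct, using the same kind of block-length and displacement arrays a user would pass when building an MPI struct type. The `particle_t` struct, its field names, and the `particle_layout_t` helper are assumptions made up for this example; no MPI call is made, so the sketch compiles without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical native type: the middle field is local bookkeeping only. */
typedef struct {
    double position[3];  /* transmitted */
    double scratch[2];   /* dead field: never needs to be sent */
    double velocity[3];  /* transmitted */
} particle_t;

/* Hypothetical mirror of the block-length and displacement arguments
 * that describe an MPI struct type with two blocks of doubles. */
typedef struct {
    int    blocklen[2];  /* consecutive doubles in each block */
    size_t displ[2];     /* byte offset of each block in particle_t */
} particle_layout_t;

/* Describe only the transmitted fields; scratch is skipped. */
particle_layout_t describe_particle(void)
{
    particle_layout_t l;
    l.blocklen[0] = 3;
    l.displ[0]    = offsetof(particle_t, position);
    l.blocklen[1] = 3;
    l.displ[1]    = offsetof(particle_t, velocity);
    return l;
}

/* Bytes skipped between the end of block 0 and the start of block 1. */
size_t gap_bytes(const particle_layout_t *l)
{
    size_t end0 = l->displ[0] + (size_t)l->blocklen[0] * sizeof(double);
    assert(l->displ[1] >= end0);
    return l->displ[1] - end0;
}
```

Here the second displacement jumps over `scratch`, so the described type covers 48 bytes of data spread over two separate blocks. That jump is the gap that makes the type non-contiguous.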
Smaller payload... but MPI type is non-contiguous
◦ Copy penalty during packing and unpacking
Multi-core machines and high-performance networks feel the cost depending on payload
Multi-core machines are becoming ubiquitous
◦ SPMD applications are ideal for these platforms
Outline: Background, The problem, The solution, The results, Conclusions and future work
Applies only to SPMD applications
Static analysis to locate MPI data types
◦ MPI_Type_struct()
Build internal representation of MPI data type
◦ MPI data type defined via library call at runtime
◦ Parameters indicate base types, consecutive instances, and displacements
◦ Def/use analysis to determine static definition
Look for gaps in displacement array
◦ Compare each displacement with the end of the previous block (size of base type multiplied by consecutive instances)
Match MPI type to native type
◦ Analyze the types of the payload
◦ MPI type must be subset of native data structure
◦ All sends and receives with MPI type handle must also share same base types
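The gap search can be sketched as a single pass over the blocks of a type description. This is only an illustration under stated assumptions: `block_t`, `gap_t`, and `find_gaps` are hypothetical names, the blocks are assumed to be sorted by displacement and non-overlapping, and the sketch does not try to tell compiler-inserted alignment padding apart from a gap the user introduced on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* One block of a struct-like type description. */
typedef struct {
    size_t displ;      /* byte displacement of the block */
    size_t count;      /* consecutive instances of the base type */
    size_t base_size;  /* size of the base type in bytes */
} block_t;

/* A hole between two described blocks. */
typedef struct {
    size_t offset;     /* first byte that is not transmitted */
    size_t length;     /* number of skipped bytes */
} gap_t;

/* Walk the blocks in displacement order and record every place where a
 * block starts later than the previous block ended. Returns the number of
 * gaps found; at most max_gaps of them are written to `gaps`. */
size_t find_gaps(const block_t *blocks, size_t nblocks,
                 gap_t *gaps, size_t max_gaps)
{
    size_t found = 0;
    size_t expected = 0;  /* where the next block starts if contiguous */

    for (size_t i = 0; i < nblocks; ++i) {
        assert(blocks[i].displ >= expected);  /* sorted, no overlap */
        if (blocks[i].displ > expected) {
            if (found < max_gaps) {
                gaps[found].offset = expected;
                gaps[found].length = blocks[i].displ - expected;
            }
            ++found;
        }
        expected = blocks[i].displ + blocks[i].count * blocks[i].base_size;
    }
    return found;
}
```

For two blocks of three 8-byte values at displacements 0 and 40, the pass reports one gap of 16 bytes starting at offset 24.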
Perform transformation on MPI type and native type
◦ Adjust parameters in MPI_Type_struct
◦ Relocate non-transmitted fields to bottom of type
End goal: improve library performance of packing large arrays
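A before-and-after view of the field relocation might look like the sketch below. The two structs and `pack_after` are hypothetical and hand-written; in the talk the rewrite is performed by a compiler pass, which this example does not attempt to reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Before: the dead field sits between the transmitted fields. */
typedef struct {
    double position[3];
    double scratch[2];   /* not transmitted */
    double velocity[3];
} particle_before_t;

/* After: the dead field is moved to the bottom, so the transmitted
 * fields form one gap-free prefix of the struct. */
typedef struct {
    double position[3];
    double velocity[3];
    double scratch[2];   /* not transmitted */
} particle_after_t;

enum { SENT_DOUBLES = 6 };  /* position[3] followed by velocity[3] */

/* Pack n transformed elements: one block copy per element instead of
 * one copy per transmitted field. Returns the number of doubles packed. */
size_t pack_after(const particle_after_t *src, size_t n, double *dst)
{
    /* The sketch relies on the transmitted fields being adjacent. */
    assert(offsetof(particle_after_t, velocity) == 3 * sizeof(double));
    for (size_t i = 0; i < n; ++i)
        memcpy(dst + i * SENT_DOUBLES, &src[i],
               SENT_DOUBLES * sizeof(double));
    return n * SENT_DOUBLES;
}
```

The struct size is unchanged, so an array of elements still has a stride between the transmitted prefixes. The gain in this sketch is that each element becomes a single block rather than two.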
Safety check
◦ Cast to a type
◦ Address-of (except for computing displacement)
◦ Non-local types
Profitability
◦ Sends / receives within loops
◦ Large arrays of MPI types in sends / receives
◦ Cost of cache misses and lost locality from adjusting the native type when the native type is used in loops
Outline: Background, The problem, The solution, The results, Conclusions and future work
LLVM compiler pass
Open MPI
Intel Core 2 Quad, 2.4 GHz
Ubuntu
Control: sending un-optimized data type with gap using payloads of various sizes
Tested: rearranging gap in MPI type and native type using payloads of various sizes
Outline: Background, The problem, The solution, The results, Conclusions and future work
MPI data types modeled after native data types
Users introduce gaps, making data non-contiguous and costly to pack on fast networks
Discover this scenario at compile time
Fix it if safe and profitable
Greatly improves multi-core performance; InfiniBand also receives a boost
Data type fission with user-injected gaps
◦ Separate transmitted fields from non-transmitted fields
◦ Completely eliminates data copy during packing
Data type fission with unused fields
◦ Perform analysis on receiving end to see which fields are actually being used
◦ Cull unused fields from data type; perform fission
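One way to picture data type fission is to split the native struct into a transmitted part and a local part, each stored in its own array. The sketch below is a guess at such a layout, not the authors' design; `particle_wire_t`, `particle_local_t`, and the `particle_set_*` helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Transmitted fields only: an array of these has no holes. */
typedef struct {
    double position[3];
    double velocity[3];
} particle_wire_t;

/* Non-transmitted fields live in a parallel array. */
typedef struct {
    double scratch[2];
} particle_local_t;

typedef struct {
    particle_wire_t  *wire;   /* contiguous payload */
    particle_local_t *local;  /* never leaves the process */
    size_t            count;
} particle_set_t;

/* Allocate both arrays, zero-initialized. Returns 0 on success, -1 on failure. */
int particle_set_init(particle_set_t *set, size_t count)
{
    assert(set != NULL && count > 0);
    set->wire  = calloc(count, sizeof *set->wire);
    set->local = calloc(count, sizeof *set->local);
    set->count = count;
    if (set->wire == NULL || set->local == NULL) {
        free(set->wire);
        free(set->local);
        set->wire = NULL;
        set->local = NULL;
        set->count = 0;
        return -1;
    }
    return 0;
}

void particle_set_free(particle_set_t *set)
{
    free(set->wire);
    free(set->local);
    set->wire = NULL;
    set->local = NULL;
    set->count = 0;
}

/* The payload is the wire array itself: no pack buffer, no copy.
 * Stores the buffer address in *buf and returns its size in bytes. */
size_t particle_set_payload(const particle_set_t *set, const void **buf)
{
    *buf = set->wire;
    return set->count * sizeof *set->wire;
}
```

With this layout the send buffer is the application's own array, which is what removing the copy during packing would require. The cost is that code touching both parts of an element now reads from two arrays.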
Questions?