Topics in Computing for Astronomy: Parallel Programming with OpenMP (I)
Rich Townsend, University of Wisconsin-Madison
Preliminaries
‣ The Mad SDK (contains gfortran, an OpenMP-capable Fortran 90/95/2003/2008 compiler):
  http://www.astro.wisc.edu/~townsend/static.php?ref=madsdk
‣ This talk & accompanying code examples:
  http://www.astro.wisc.edu/~townsend/static.php?ref=talks
What is OpenMP?
‣ An Application Programming Interface (API) for writing multithreaded programs
‣ Consists of a set of compiler directives, library routines and environment variables
‣ Supports Fortran, C and C++
‣ Developed by vendors (AMD, IBM, Intel, …)
‣ Mainly targeted at Symmetric Multiprocessing systems
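As a rough illustration of how the three ingredients fit together, the sketch below combines a compiler directive (!$omp parallel), library routines from omp_lib, and the OMP_NUM_THREADS environment variable. It is not part of the original talk; the program name omp_ingredients is made up for this example.

```fortran
program omp_ingredients

  ! Library routines: made available through the omp_lib module
  use omp_lib

  implicit none

  ! Upper bound on the team size for the next parallel region;
  ! by default this follows the OMP_NUM_THREADS environment variable, if set
  print *, 'Max threads:', omp_get_max_threads()

  ! Compiler directive: start a parallel region
  !$omp parallel
  ! Let only thread 0 report, so the line is printed once
  if (omp_get_thread_num() == 0) then
     print *, 'Team size:', omp_get_num_threads()
  end if
  !$omp end parallel

end program omp_ingredients
```

One way to try it, assuming gfortran is installed, is to compile with `gfortran -fopenmp -o omp_ingredients omp_ingredients.f90` and run with `env OMP_NUM_THREADS=2 ./omp_ingredients`; changing the value should change the reported numbers without recompiling.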
What is Symmetric Multiprocessing (SMP)?
‣ A multiprocessor system with
  ‣ centralized, shared memory
  ‣ a single operating system
  ‣ two or more homogeneous processors
‣ Modern desktops/laptops are SMP systems:
  ‣ Multiple cores in each physical processor
  ‣ Multiple physical processors
What is Multithreading?
‣ The ability of a program to have multiple, independent threads executing different sequences of instructions
‣ The threads can access the resources (e.g., memory) of the parent process
‣ The threads run concurrently on a multiprocessor system
OpenMP Core Syntax
‣ Constructs use compiler directives specified using special Fortran comments:
    !$omp parallel       (Fortran 90+)
    C$omp parallel do    (Fortran 77)
‣ Most constructs apply to a “structured block” with one point of entry and one point of exit
‣ Additional library functions and subroutines accessed through the omp_lib module
Example: Hello World
hello_world.f90

    program hello_world

      ! Make OpenMP routines available
      use omp_lib

      implicit none

      ! Start a parallel region with 4 threads
      !$omp parallel num_threads(4)
      ! Get the ID of the thread
      print *, 'Hello from thread', omp_get_thread_num()
      !$omp end parallel

    end program hello_world

    % gfortran -fopenmp -o hello_world hello_world.f90
    % ./hello_world
    Hello from thread 0
    Hello from thread 2
    Hello from thread 3
    Hello from thread 1

‣ -fopenmp: tell gfortran to recognize !$omp directives
‣ The output is not sequential: OpenMP does not guarantee execution order!
The Fork-Join Thread Model
‣ At the start of a program, there is a single thread: the master thread
‣ When entering a parallel region, the master thread ‘forks’ into a team of threads
‣ When exiting a parallel region, the team joins back into a single master thread
[Figure: fork-join diagram; credit: Wikipedia]
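The fork and join can be observed directly. The following minimal sketch (not from the original talk; the program name fork_join is made up) queries omp_in_parallel() and omp_get_num_threads() before, inside, and after a parallel region.

```fortran
program fork_join

  use omp_lib

  implicit none

  ! Before the region: only the master thread is running
  print *, 'Before: in parallel =', omp_in_parallel(), &
           ' team size =', omp_get_num_threads()

  ! Fork: the master thread becomes thread 0 of a team
  !$omp parallel num_threads(4)
  ! Every thread in the team executes this block; only thread 0 reports
  if (omp_get_thread_num() == 0) then
     print *, 'Inside: in parallel =', omp_in_parallel(), &
              ' team size =', omp_get_num_threads()
  end if
  !$omp end parallel

  ! Join: the team has collapsed back to the master thread
  print *, 'After:  in parallel =', omp_in_parallel(), &
           ' team size =', omp_get_num_threads()

end program fork_join
```

Outside the parallel region the team size should be reported as 1. Inside, it should be 4 provided the runtime actually grants the four threads requested.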
Sharing Variables
hello_world_shared.f90

    program hello_world_shared

      use omp_lib

      implicit none

      ! The id variable is by default shared amongst threads
      integer :: id

      !$omp parallel num_threads(4)
      ! This overwrites the same memory location!
      id = omp_get_thread_num()
      print *, 'Hello from thread', id
      !$omp end parallel

    end program hello_world_shared
Not Sharing Variables
hello_world_private.f90

    program hello_world_private

      use omp_lib

      implicit none

      integer :: id

      ! The id variable is private; each thread has its own copy
      !$omp parallel num_threads(4) private(id)
      id = omp_get_thread_num()
      print *, 'Hello from thread', id
      !$omp end parallel

    end program hello_world_private
Read/Write Races
count_threads.f90

    program count_threads

      use omp_lib

      implicit none

      integer :: n

      n = 0

      !$omp parallel num_threads(4)
      ! Bad: another thread could update n between read and write
      n = n + 1
      !$omp end parallel

      print *, 'There were', n, 'threads'

    end program count_threads
Atomic Operations
count_threads_atomic.f90

    program count_threads_atomic

      use omp_lib

      implicit none

      integer :: n

      n = 0

      !$omp parallel num_threads(4)
      ! atomic directive: the read and write in the following
      ! statement must be executed as one indivisible operation
      !$omp atomic
      n = n + 1
      !$omp end parallel

      print *, 'There were', n, 'threads'

    end program count_threads_atomic
Synchronization Directives
‣ !$omp atomic is an example of a synchronization directive
‣ Other examples:
  ‣ !$omp critical: only execute on one thread at a time
  ‣ !$omp master: only execute on master thread
  ‣ !$omp barrier: wait for all threads
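The three directives listed above can be combined in one short program. This is a hedged sketch rather than code from the talk; the program name sync_demo and the variable total are made up for illustration.

```fortran
program sync_demo

  use omp_lib

  implicit none

  integer :: total

  total = 0

  !$omp parallel num_threads(4)

  ! critical: only one thread at a time may update the shared total
  !$omp critical
  total = total + omp_get_thread_num() + 1
  !$omp end critical

  ! barrier: wait until every thread has added its contribution
  !$omp barrier

  ! master: only the master thread (thread 0) prints the result
  !$omp master
  print *, 'Total:', total
  !$omp end master

  !$omp end parallel

end program sync_demo
```

With a full team of four threads the contributions are 1, 2, 3 and 4, so the total should be 10. Without the barrier, the master thread could print before the other threads have finished their updates.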
Sharing the Load
square_nums.f90

    program square_nums

      use omp_lib

      implicit none

      integer :: i, j(4)

      ! i must be private, otherwise the threads overwrite each other's index
      !$omp parallel num_threads(4) private(i)
      ! Calculate array index from thread id
      i = omp_get_thread_num() + 1
      j(i) = i**2
      !$omp end parallel

      print *, 'Square numbers:', j

    end program square_nums
Parallelizing DO loops
square_nums_do.f90

    program square_nums_do

      use omp_lib

      implicit none

      ! Will work with array of any size
      integer :: i, j(13)

      !$omp parallel num_threads(4)
      ! do directive distributes loop iterations amongst threads
      !$omp do
      do i = 1, SIZE(j)
         j(i) = i**2
      enddo
      !$omp end parallel

      print *, 'Square numbers:', j

    end program square_nums_do
A Shorthand: parallel do
square_nums_pardo.f90

    program square_nums_pardo

      use omp_lib

      implicit none

      integer :: i, j(13)

      ! parallel do directive creates parallel region AND
      ! distributes loop iterations amongst threads
      !$omp parallel do num_threads(4)
      do i = 1, SIZE(j)
         j(i) = i**2
      enddo
      !$omp end parallel do

      print *, 'Square numbers:', j

    end program square_nums_pardo
Work-Sharing Constructs
‣ !$omp do is an example of a work-sharing construct
‣ A work-sharing construct divides the execution of the enclosed region among the members of the thread team that encounters it
‣ Other examples:
  ‣ !$omp sections: divide multiple enclosed blocks amongst threads of team
  ‣ !$omp workshare: divide single enclosed block amongst threads of team
  ‣ !$omp single: execute all work on a single thread
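Of the constructs listed, single is the only one without its own example in this talk. A minimal sketch, assuming a made-up program name single_demo, uses single for one-off setup followed by a work-shared loop.

```fortran
program single_demo

  use omp_lib

  implicit none

  integer :: i, j(8)

  !$omp parallel num_threads(4)

  ! single: exactly one thread of the team performs the setup
  !$omp single
  print *, 'Setup done by thread', omp_get_thread_num()
  j = 0
  !$omp end single
  ! (implicit barrier here: the other threads wait for the setup to finish)

  ! do: the loop iterations are divided amongst the team
  !$omp do
  do i = 1, SIZE(j)
     j(i) = j(i) + i**2
  enddo
  !$omp end do

  !$omp end parallel

  print *, 'Square numbers:', j

end program single_demo
```

Which thread runs the single block is not specified, so the reported thread number may differ from run to run.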
The workshare Construct
square_nums_workshr.f90

    program square_nums_workshr

      use omp_lib

      implicit none

      integer :: i, j(13)

      ! Initialize j using an array assignment
      j = [(i,i=1,SIZE(j))]

      ! These array arithmetic/assignment operations are
      ! distributed amongst the team
      !$omp parallel workshare num_threads(4)
      j = j**2
      !$omp end parallel workshare

      print *, 'Square numbers:', j

    end program square_nums_workshr
Restrictions on workshare Constructs
‣ Only the following operations permitted:
  ‣ array assignments
  ‣ scalar assignments
  ‣ FORALL statements
  ‣ FORALL constructs
  ‣ WHERE statements
  ‣ WHERE constructs
  ‣ atomic constructs
  ‣ critical constructs
  ‣ parallel constructs
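To show two of the permitted operations together, the sketch below places an array assignment and a WHERE construct inside one workshare region. It is an illustration only, not code from the talk; the program name workshare_where and the array k are made up.

```fortran
program workshare_where

  use omp_lib

  implicit none

  integer :: i, j(13), k(13)

  j = [(i,i=1,SIZE(j))]

  !$omp parallel workshare num_threads(4)
  ! Array assignment: permitted inside workshare
  k = j**2
  ! WHERE construct: also permitted; flips the sign of the even-index squares
  where (mod(j, 2) == 0)
     k = -k
  end where
  !$omp end parallel workshare

  print *, 'Signed squares:', k

end program workshare_where
```

An ordinary DO loop or a print statement is not on the permitted list, so it would not be valid inside the workshare region; such work belongs in a do or single construct instead.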
The sections Construct
hello_sections.f90

    program hello_sections

      use omp_lib

      implicit none

      !$omp parallel sections num_threads(4)
      !$omp section
      print *, 'Hello from thread', omp_get_thread_num()
      !$omp section
      print *, 'Hola from thread', omp_get_thread_num()
      !$omp section
      print *, 'Aloha from thread', omp_get_thread_num()
      !$omp end parallel sections

    end program hello_sections

‣ Each block delimited by !$omp section is executed once, by one thread of the team (usually a different thread for each block)