Parallel Programming with OpenMP
Alejandro Duran
Barcelona Supercomputing Center
Agenda
- Thursday
  - 10:00 - 11:15 OpenMP Basics
  - 11:00 - 11:30 Break
  - 11:30 - 13:00 Hands-on (I)
  - 13:00 - 14:30 Lunch
  - 14:30 - 15:15 Task parallelism in OpenMP
  - 15:15 - 17:00 Hands-on (II)
- Friday
  - 10:00 - 11:00 Data parallelism in OpenMP
  - 11:00 - 11:30 Break
  - 11:30 - 13:00 Hands-on (III)
  - 13:00 - 14:30 Lunch
  - 14:30 - 15:00 Other OpenMP topics
  - 15:00 - 16:00 Hands-on (IV)
  - 16:00 - 16:30 OpenMP in the future
Alex Duran (BSC) Advanced Programming with OpenMP February 2, 2013 2 / 217
Part I
OpenMP Basics
Outline
OpenMP Overview
The OpenMP model
Writing OpenMP programs
Creating Threads
Data-sharing attributes
Synchronization
OpenMP Overview
What is OpenMP?
- It’s an API extension to the C, C++ and Fortran languages to write parallel programs for shared memory machines
- Current version is 3.0 (May 2008)
- Supported by most compiler vendors
  - Intel, IBM, PGI, Sun, Cray, Fujitsu, HP, GCC, ...
- Maintained by the Architecture Review Board (ARB), a consortium of industry and academia
- http://www.openmp.org
A bit of history
- 1997: OpenMP Fortran 1.0
- 1998: OpenMP C/C++ 1.0
- 1999: OpenMP Fortran 1.1
- 2000: OpenMP Fortran 2.0
- 2002: OpenMP C/C++ 2.0
- 2005: OpenMP 2.5
- 2008: OpenMP 3.0
- 2011: OpenMP 3.1
Advantages of OpenMP
- Mature standard and implementations
  - Standardizes practice of the last 20 years
- Good performance and scalability
- Portable across architectures
- Incremental parallelization
- Maintains sequential version
- (mostly) High level language
  - Some people may say a medium level language :-)
- Supports both task and data parallelism
- Communication is implicit
Disadvantages of OpenMP
- Communication is implicit
- Flat memory model
- Incremental parallelization creates false sense of glory/failure
- No support for accelerators
- No error recovery capabilities
- Difficult to compose
- Lacks high-level algorithms and structures
- Does not run on clusters
The OpenMP model
OpenMP at a glance
OpenMP components
[Figure: layered diagram of the OpenMP components. Labels: Constructs, OpenMP API, Environment Variables, Compiler, OpenMP Exec, OpenMP Runtime Library, ICVs, OS Threading Libraries, SMP (CPUs)]
Execution model
Fork-join model
- OpenMP uses a fork-join model
  - The master thread spawns a team of threads that joins at the end of the parallel region
  - Threads in the same team can collaborate to do work
[Figure: fork-join diagram. Labels: Master Thread, Parallel Region, Nested Parallel Region]
Memory model
- OpenMP defines a relaxed memory model
  - Threads can see different values for the same variable
  - Memory consistency is only guaranteed at specific points
  - Luckily, the default points are usually enough
- Variables can be shared or private to each thread
Writing OpenMP programs
OpenMP directives syntax
In Fortran
Through a specially formatted comment:
    sentinel construct [clauses]
where sentinel is one of:
- !$OMP or C$OMP or *$OMP in fixed format
- !$OMP in free format
In C/C++
Through a compiler directive:
    #pragma omp construct [clauses]
OpenMP syntax is ignored if the compiler does not recognize OpenMP
We’ll be using C/C++ syntax throughout this tutorial
Headers/Macros
C/C++ only
- omp.h contains the API prototypes and data type definitions
- The _OPENMP macro is defined by OpenMP-enabled compilers
  - Allows conditional compilation of OpenMP code
Fortran only
- The omp_lib module contains the subroutine and function definitions
Structured Block
Definition
Most directives apply to a structured block:
- Block of one or more statements
- One entry point, one exit point
  - No branching in or out allowed
- Terminating the program is allowed
Hello world!
Example
    int id;
    char *message = "Hello world!";
    #pragma omp parallel private(id)
    {
        id = omp_get_thread_num();
        printf("Thread %d says: %s\n", id, message);
    }
- Directive: #pragma omp parallel
- Clause: private(id)
- API call: omp_get_thread_num()
- Structured block: the braces and the statements they enclose
Creating Threads
The parallel construct
Directive
    #pragma omp parallel [clauses]
        structured block
where clauses can be:
- num_threads(expression)
- if(expression)
- shared(var-list) (coming shortly)
- private(var-list) (coming shortly)
- firstprivate(var-list) (coming shortly)
- default(none | shared | private | firstprivate) (coming shortly; private and firstprivate only in Fortran)
- reduction(var-list) (we’ll see it later)
- copyin(var-list) (not today)
Specifying the number of threads
- The number of threads is controlled by an internal control variable (ICV) called nthreads-var.
- When a parallel construct is found, a parallel region with a maximum of nthreads-var threads is created
  - Parallel constructs can be nested, creating nested parallelism
- The nthreads-var ICV can be modified through
  - the omp_set_num_threads API call
  - the OMP_NUM_THREADS environment variable
- Additionally, the num_threads clause causes the implementation to ignore the ICV and use the value of the clause for that region.
Avoiding parallel regions
- Sometimes we only want to run in parallel under certain conditions
  - E.g., enough input data, not running already in parallel, ...
- The if clause allows specifying an expression. When it evaluates to false, the parallel construct will only use 1 thread
  - Note that it still creates a new team and data environment
Hello world!
Example
    int id;
    char *message = "Hello world!";
    #pragma omp parallel private(id)
    {
        id = omp_get_thread_num();
        printf("Thread %d says: %s\n", id, message);
    }
- Creates a parallel region of OMP_NUM_THREADS threads
- All threads execute the same code
- id is private to each thread
- Each thread gets its id in the team
- message is shared among all threads
Putting it together
Example
    int main(void) {
        #pragma omp parallel
        ...    /* An unknown number of threads here. Use OMP_NUM_THREADS */
        omp_set_num_threads(2);
        #pragma omp parallel
        ...    /* A team of two threads here. */
        #pragma omp parallel num_threads(random() % 4 + 1) if(0)
        ...    /* A team of 1 thread here. */
    }
API calls
Other useful routines
- int omp_get_num_threads(): returns the number of threads in the current team
- int omp_get_thread_num(): returns the id of the thread in the current team
- int omp_get_num_procs(): returns the number of processors in the machine
- int omp_get_max_threads(): returns the maximum number of threads that will be used in the next parallel region
- double omp_get_wtime(): returns the number of seconds since an arbitrary point in the past
Data-sharing attributes
Data environment
A number of clauses are related to building the data environment that the construct will use when executing.
- shared
- private
- firstprivate
- default
- threadprivate
- lastprivate, reduction (we’ll see them later)
- copyin, copyprivate (out of our scope today)
Shared
When a variable is marked as shared, the variable inside the construct is the same as the one outside the construct.
- In a parallel construct this means all threads see the same variable
  - but not necessarily the same value
- Usually need some kind of synchronization to update them correctly
  - OpenMP has consistency points at synchronizations
Example
    int x = 1;
    #pragma omp parallel shared(x) num_threads(2)
    {
        x++;
        printf("%d\n", x);
    }
    printf("%d\n", x);    /* Prints 2 or 3 */
Private
When a variable is marked as private, the variable inside the construct is a new variable of the same type with an undefined value.
- In a parallel construct this means all threads have a different variable
- Can be accessed without any kind of synchronization
Example
    int x = 1;
    #pragma omp parallel private(x) num_threads(2)
    {
        x++;
        printf("%d\n", x);    /* Can print anything */
    }
    printf("%d\n", x);        /* Prints 1 */
Firstprivate
When a variable is marked as firstprivate, the variable inside the construct is a new variable of the same type but it is initialized to the original variable value.
- In a parallel construct this means all threads have a different variable with the same initial value
- Can be accessed without any kind of synchronization
Example
    int x = 1;
    #pragma omp parallel firstprivate(x) num_threads(2)
    {
        x++;
        printf("%d\n", x);    /* Prints 2 (twice) */
    }
    printf("%d\n", x);        /* Prints 1 */
What is the default?
- Static/global storage is shared
- Heap-allocated storage is shared
- Stack-allocated storage inside the construct is private
- Others
  - If there is a default clause, what the clause says
    - none means that the compiler will issue an error if the attribute is not explicitly set by the programmer
  - Otherwise, depends on the construct
    - For the parallel region the default is shared
Example
    int x, y;
    #pragma omp parallel private(y)
    {
        x = ...;    /* x is shared */
        y = ...;    /* y is private */
        #pragma omp parallel private(x)
        {
            x = ...;    /* x is private */
            y = ...;    /* y is shared */
        }
    }
Threadprivate storage
The threadprivate construct
    #pragma omp threadprivate(var-list)
- Can be applied to:
  - Global variables
  - Static variables
  - Class-static members
- Allows creating a per-thread copy of “global” variables.
- threadprivate storage persists across parallel regions if the number of threads is the same
Threadprivate persistence across nested regions is complex
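The persistence rule can be sketched as follows. This is only an illustration, assuming dynamic thread adjustment is disabled and both regions ask for the same number of threads; the names counter and threadprivate_mismatches are hypothetical, and the serial fallbacks exist only so the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void) { return 0; }
static void omp_set_dynamic(int flag) { (void)flag; }
#endif

static int counter = 0;              /* hypothetical "global" variable */
#pragma omp threadprivate(counter)   /* one copy of counter per thread */

/* Returns the number of threads whose copy of counter lost its value
   between two consecutive parallel regions. */
int threadprivate_mismatches(void)
{
    int mismatches = 0;

    omp_set_dynamic(0);              /* assumption: keep the team size fixed */

    #pragma omp parallel num_threads(2)
    counter = omp_get_thread_num() + 1;   /* each thread writes its own copy */

    #pragma omp parallel num_threads(2)
    {
        /* Same team size as before, so each copy should still be there. */
        if (counter != omp_get_thread_num() + 1) {
            #pragma omp atomic
            mismatches++;
        }
    }
    return mismatches;
}
```

Calling threadprivate_mismatches() from a serial part of the program should report no mismatches under the stated assumptions.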
Data-sharing attributes
Threadprivate storage
Example
char *foo()
{
    static char buffer[BUF_SIZE];
    ...
    return buffer;
}
void bar()
{
    #pragma omp parallel
    {
        char *str = foo();
        str[0] = random();
    }
}
Unsafe. All threads access the same buffer.
Data-sharing attributes
Threadprivate storage
Example
char *foo()
{
    static char buffer[BUF_SIZE];
    #pragma omp threadprivate(buffer)    // Creates one static copy of buffer per thread
    ...
    return buffer;
}
void bar()
{
    #pragma omp parallel
    {
        char *str = foo();
        str[0] = random();
    }
}
Now foo can be called safely by multiple threads at the same time.
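A runnable variant of the example above, as a hedged sketch: BUF_SIZE is given a hypothetical value because the slide leaves it unspecified, the body of foo is reduced to returning the buffer, and count_clobbered_buffers is a made-up helper that checks whether any thread saw another thread's data.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define BUF_SIZE 16   /* hypothetical size; not given in the slides */

char *foo(void)
{
    static char buffer[BUF_SIZE];
    #pragma omp threadprivate(buffer)   /* one static copy per thread */
    return buffer;
}

/* Returns how many threads found another thread's data in their buffer. */
int count_clobbered_buffers(void)
{
    int clobbered = 0;

    #pragma omp parallel num_threads(4)
    {
        char tag = (char)('A' + omp_get_thread_num());
        char *str = foo();

        str[0] = tag;                /* write into this thread's copy */
        #pragma omp barrier          /* every thread has written by now */
        if (str[0] != tag) {         /* would differ if the buffer were shared */
            #pragma omp atomic
            clobbered++;
        }
    }
    return clobbered;
}
```

Removing the threadprivate line turns buffer back into a single shared array, and the check can then fail whenever more than one thread runs.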
Synchronization
Outline
OpenMP Overview
The OpenMP model
Writing OpenMP programs
Creating Threads
Data-sharing attributes
Synchronization
Synchronization
Why synchronization?
Mechanisms
Threads need to synchronize to impose some ordering on the sequence of actions of the threads. OpenMP provides different synchronization mechanisms:
- barrier
- critical
- atomic
- taskwait
- ordered
- locks
(We'll see taskwait, ordered and locks later.)
Synchronization
Thread Barrier
The barrier construct
#pragma omp barrier
Threads cannot proceed past a barrier point until all threads reach the barrier AND all previously generated work is completed.
Some constructs have an implicit barrier at the end
- E.g., the parallel construct
Synchronization
Barrier
Example
#pragma omp parallel
{
    foo();
    #pragma omp barrier    // Forces all foo occurrences to happen before all bar occurrences
    bar();
}                          // Implicit barrier at the end of the parallel region
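To make the ordering guarantee concrete, here is a small sketch in which the role of foo is "write my slot" and the role of bar is "read my neighbour's slot". The function name barrier_stale_reads and the fixed team size of 4 are assumptions made for the illustration.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Returns the number of threads that read a slot before it was written. */
int barrier_stale_reads(void)
{
    int slot[4] = {0, 0, 0, 0};      /* one slot per possible thread */
    int stale = 0;

    #pragma omp parallel num_threads(4)
    {
        int me = omp_get_thread_num();
        int n = omp_get_num_threads();
        int neighbour = (me + 1) % n;

        slot[me] = me + 1;           /* phase 1: publish my value */
        #pragma omp barrier          /* all slots are written past this point */
        if (slot[neighbour] != neighbour + 1) {   /* phase 2: read neighbour */
            #pragma omp atomic
            stale++;
        }
    }
    return stale;
}
```

Without the barrier, a fast thread could read its neighbour's slot while it still holds 0.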
Synchronization
Exclusive access
The critical construct
#pragma omp critical [(name)]
    structured block
Provides a region of mutual exclusion where only one thread can be working at any given time.
By default all critical regions are the same, but you can provide them with names
- Only those with the same name synchronize
Synchronization
Critical construct
Example
int x = 1;
#pragma omp parallel num_threads(2)
{
    #pragma omp critical
    x++;                  // Only one thread at a time here
}
printf("%d\n", x);        // Prints 3!
Synchronization
Critical construct
Example
int x = 1, y = 0;
#pragma omp parallel num_threads(4)
{
    #pragma omp critical (x)
    x++;
    #pragma omp critical (y)
    y++;
}
Different names: one thread can update x while another updates y.
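The named-critical example can be wrapped into a function whose result is checkable. This is a sketch only: named_critical_demo and its output parameters are hypothetical, and the single construct used to record the team size is covered later in the tutorial.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP enabled. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Runs the named-critical example and reports x, y and the team size. */
void named_critical_demo(int *x_out, int *y_out, int *team_out)
{
    int x = 1, y = 0, team = 1;

    #pragma omp parallel num_threads(4)
    {
        #pragma omp critical (x)     /* serializes only updates of x */
        x++;
        #pragma omp critical (y)     /* independent of the region named x */
        y++;
        #pragma omp single           /* one thread records the team size */
        team = omp_get_num_threads();
    }
    *x_out = x;
    *y_out = y;
    *team_out = team;
}
```

Whatever team size the runtime actually grants, x should end up as 1 plus the team size and y as the team size, since every thread performs each increment exactly once.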
Synchronization
Exclusive access
The atomic construct
#pragma omp atomic
    expression
Provides a special mechanism of mutual exclusion to do read & update operations
- Only supports simple read & update expressions
  - E.g., x++, x -= foo()
- Only protects the read & update part
  - foo() not protected
- Usually much more efficient than a critical construct
- Not compatible with critical
Synchronization
Atomic construct
Example
int x = 1;
#pragma omp parallel num_threads(2)
{
    #pragma omp atomic
    x++;                  // Only one thread at a time updates x here
}
printf("%d\n", x);        // Prints 3!
Synchronization
Atomic construct
Example
int x = 1;
#pragma omp parallel num_threads(2)
{
    #pragma omp critical
    x++;
    #pragma omp atomic
    x++;
}
printf("%d\n", x);        // Prints 3, 4 or 5 :(
Different threads can update x at the same time!
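One way to remove the race in the example above is to protect every update of x with the same mechanism. The sketch below uses atomic for both increments; the function name consistent_atomic_updates is hypothetical, and using critical for both updates would be an equally valid fix.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP enabled. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Both updates of x use atomic, so they exclude each other.
   Returns the final x; *team_out receives the actual team size. */
int consistent_atomic_updates(int *team_out)
{
    int x = 1, team = 1;

    #pragma omp parallel num_threads(2)
    {
        #pragma omp atomic
        x++;
        #pragma omp atomic           /* same mechanism as the first update */
        x++;
        #pragma omp single
        team = omp_get_num_threads();
    }
    *team_out = team;
    return x;
}
```

With two threads the function returns 5 every time, because no increment can be lost.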
Break
Coffee time! :-)
Part II
Hands-on (I)
Outline
Setup
Hello world!
Other
Setup
Hands-on preparation
Environment
We'll be using an SGI Altix 4700 system:
- 128 dual-core Montecito (IA-64) CPUs. Each one of the 256 cores works at 1.6 GHz, with an 8 MB L3 cache and a 533 MHz bus.
  - Unfortunately we will be using just 8 of them :-)
- 2.5 TB RAM
- 2 internal SAS disks of 146 GB at 15000 RPM
- 12 external SAS disks of 300 GB at 10000 RPM
Intel's compiler version 11.0
- Full support of OpenMP 3.0
- Other vendors that support 3.0: PGI, IBM, SUN, GCC
Log into the system with the provided username and password
Setup
Hands-on preparation
Ready...
Copy the exercises from my home:
$ cp -a ~aduran/Prace_OpenMP_Handson_1/hello .
Go!
Now enter the hello directory to start the fun :-)
Hello world!
First exercise
Hello world!
Compile
1. Edit the Makefile in the directory and answer the following questions:
   - What is the compiler name?
   - Which flag activates OpenMP?
2. Run make and check that it generates a hello program.
Hello world!
First exercise
Hello world!
Run
1. Edit the file hello.c and try to figure out what is going to be the output of the following commands:
$ ./hello
$ OMP_NUM_THREADS=2 ./hello
$ OMP_NUM_THREADS=4 ./hello
2. Now run them. Were you right?
Hello world!
First exercise
Hello world!
Being oneself
Now modify our hello program so that each thread generates a message with its id
Tip: Use omp_get_thread_num()
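One possible shape for this exercise is sketched below. The contents of the course's hello.c are not shown in the slides, so the function hello_with_ids, its seen array and the message text are all assumptions; the team size is left to OMP_NUM_THREADS as in the previous step.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Prints one greeting per thread, records each id in seen[] (up to
   capacity entries) and returns the size of the team. */
int hello_with_ids(int *seen, int capacity)
{
    int team = 1;

    #pragma omp parallel
    {
        int id = omp_get_thread_num();   /* 0 .. team-1, unique per thread */

        printf("Hello world from thread %d\n", id);
        if (id < capacity)
            seen[id] = id;
        #pragma omp single
        team = omp_get_num_threads();
    }
    return team;
}
```

The order of the printed lines can change from run to run, since nothing orders the threads relative to each other.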
Hello world!
First exercise
Hello world!
Generate extra info
Now modify our hello program so that, before any thread says hello, it outputs the following information:
1. The number of processors in the system
2. The number of threads that will be available in the parallel region
Hello world!
First exercise
Hello world!
Measuring time
Measure the time that it takes to execute the parallel region and output it at the end of the program.
Tip: Use omp_get_wtime()
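A hedged sketch of the timing pattern: take a timestamp before and after the region and subtract. The function time_parallel_region and the placeholder loop are made up for the illustration, and the clock()-based fallback is only there so the sketch builds without OpenMP (it measures CPU time rather than wall-clock time).

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Fallback without OpenMP: CPU time instead of wall-clock time. */
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Returns the seconds elapsed while executing a parallel region. */
double time_parallel_region(void)
{
    double start = omp_get_wtime();

    #pragma omp parallel
    {
        /* Placeholder work; the hello program would print here instead. */
        volatile double sink = 0.0;
        for (int i = 0; i < 100000; i++)
            sink += i * 0.5;
    }

    double elapsed = omp_get_wtime() - start;
    printf("Parallel region took %f seconds\n", elapsed);
    return elapsed;
}
```

Because the second timestamp is taken after the implicit barrier of the region, the measurement covers the slowest thread.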
Hello world!
First exercise
One at a time!
Extend the program so that each thread uses C rand to get a random number. Accumulate those numbers in a shared variable and output the result at the end of the program.
Should the result always be the same given the same seed and number of threads?
Other
Second exercise
1. Edit the sync.c file
2. Is the access to the variable x correct?
3. Fix it using a critical construct. Compile it:
$ make sync
4. Run it with 1 to 4 threads and observe how the average time changes
5. Now replace the critical construct with an atomic one.
6. Run it with 1 to 4 threads. How do the average times compare to the previous ones?
Other
Some more...
One for each thread
1. Compile the tp.c program:
$ make tp
2. The program is supposed to print the thread id three times
3. Run it with 4 threads. Observe the results
4. Edit tp.c and fix it so it behaves correctly
5. How did you solve the problem for x?
6. How did you solve the problem for y?
7. If you solved them in the same way, then rethink what you did for x
Break
Bon appétit!*
*Disclaimer: actual food may differ from the image! :-)
Part III
Part IV
The OpenMP Tasking Model
Outline
OpenMP tasks
Task synchronization
The single construct
Task clauses
Common tasking problems
OpenMP tasks
Task parallelism in OpenMP
Task parallelism model
[Figure: Team, Task pool]
Parallelism is extracted from “several” pieces of code
Allows parallelizing very unstructured parallelism
- Unbounded loops, recursive functions, ...
OpenMP tasks
What is a task in OpenMP ?
Tasks are work units whose execution may be deferred
- they can also be executed immediately
Tasks are composed of:
- code to execute
- a data environment
  - Initialized at creation time
- internal control variables (ICVs)
Threads of the team cooperate to execute them
OpenMP tasks
Creating tasks
The task construct
#pragma omp task [clauses]
    structured block
Where clauses can be:
- shared
- private
- firstprivate
  - Values are captured at creation time
- default
- if(expression)
- untied
OpenMP tasks
When are tasks created?
Parallel regions create tasks
- One implicit task is created and assigned to each thread
  - So all task concepts make sense inside the parallel region
Each thread that encounters a task construct
- Packages the code and data
- Creates a new explicit task
OpenMP tasks
Default task data-sharing attributes
When there are no clauses...
If no default clause
- Implicit rules apply
  - e.g., global variables are shared
- Otherwise...
  - firstprivate
  - shared attribute is lexically inherited
OpenMP tasks
Task default data-sharing attributes
In practice...
Example
int a;
void foo() {
    int b, c;
    #pragma omp parallel shared(b)
    #pragma omp parallel private(b)
    {
        int d;
        #pragma omp task
        {
            int e;

            a = shared
            b = firstprivate
            c = shared
            d = firstprivate
            e = private
        }
    }
}
Tip: default(none) is your friend if you do not see it clearly
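To make the tip concrete, here is a minimal sketch with default(none) on both constructs, so the compiler rejects any variable whose attribute is not spelled out. It is a simplification of the slide, not the author's code: it uses a single parallel region, and the name foo_explicit, the initial values, the result variable and the single construct (added so that exactly one task is created) are all assumptions made for illustration.

```c
int a = 1;                      /* file scope */

/* Hypothetical helper: returns a + b + c + d + e, computed inside a task. */
int foo_explicit(void)
{
    int b = 2, c = 3;
    int result = 0;

    /* default(none): every variable referenced in the region must be listed. */
    #pragma omp parallel default(none) shared(a, c, result) firstprivate(b)
    {
        int d = 4;              /* declared in the region: private to each thread */

        /* One thread creates the task. */
        #pragma omp single
        {
            #pragma omp task default(none) shared(a, c, result) firstprivate(b, d)
            {
                int e = 5;      /* declared in the task: private to it */
                result = a + b + c + d + e;
            }
        }                       /* implicit barrier: the task has completed */
    }
    return result;
}
```

Removing any variable from the clauses should turn a silent default into a compile-time error, which is the point of the tip.
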
List traversal
Example

    void traverse_list(List l)
    {
        Element e;
        for (e = l->first; e; e = e->next)
            #pragma omp task
            process(e);
    }

e is firstprivate

Task synchronization
There are two main constructs to synchronize tasks:
- barrier
  Remember: all previous work (including tasks) must be completed
- taskwait

Waiting for children
The taskwait construct

    #pragma omp taskwait

Suspends the current task until all of its child tasks have completed
- Just direct children, not descendants

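As a rough illustration of the "direct children only" rule, the sketch below uses a recursive Fibonacci, which is not part of the slides. The names fib and fib_parallel are hypothetical. Each call waits only for the two tasks it created itself; the deeper levels are covered only because every child executes its own taskwait before it finishes.

```c
/* Hypothetical example: each invocation waits for its own two child tasks. */
static long fib(int n)
{
    long x = 0, y = 0;

    if (n < 2)
        return n;

    /* n is firstprivate by default; x and y must be shared to return results. */
    #pragma omp task shared(x)
    x = fib(n - 1);

    #pragma omp task shared(y)
    y = fib(n - 2);

    /* Waits for the two tasks above, not for their descendants. */
    #pragma omp taskwait
    return x + y;
}

long fib_parallel(int n)
{
    long result = 0;

    #pragma omp parallel
    #pragma omp single
    result = fib(n);

    return result;
}
```

Without the taskwait, x and y could be read before the child tasks have written them, and they would also go out of scope while the children are still running.
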
Taskwait
Example

    void traverse_list(List l)
    {
        Element e;
        for (e = l->first; e; e = e->next)
            #pragma omp task
            process(e);

        #pragma omp taskwait
        /* All tasks guaranteed to be completed here */
    }

Now we need some threads to execute the tasks

List traversal
Completing the picture
Example

    List l;

    #pragma omp parallel
    traverse_list(l);

This will generate multiple traversals: every thread of the team calls traverse_list
We need a way to have a single thread execute traverse_list

The single construct
Giving work to just one thread

    #pragma omp single [clauses]
        structured block

where clauses can be:
- private
- firstprivate
- nowait (we'll see it later)
- copyprivate (not today)

Only one thread of the team executes the structured block
There is an implicit barrier at the end

The single construct
Example

    int main(int argc, char **argv)
    {
        #pragma omp parallel
        {
            #pragma omp single
            {
                printf("Hello world!\n");
            }
        }
    }

This program outputs just one "Hello world"

List traversal
Completing the picture
Example

    List l;

    #pragma omp parallel
    #pragma omp single
    traverse_list(l);

One thread creates the tasks of the traversal
All threads cooperate to execute them

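Putting the pieces together, a self-contained version of the traversal might look like the sketch below. The slides never define List, Element or process, so the types, the doubling done in process and the traverse_demo driver are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical list types; the slides do not define List or Element. */
typedef struct Element {
    int value;
    int result;
    struct Element *next;
} Element;

typedef struct {
    Element *first;
} List;

/* Stand-in for the slides' process(): any independent per-element work. */
static void process(Element *e)
{
    e->result = 2 * e->value;
}

static void traverse_list(List *l)
{
    Element *e;

    for (e = l->first; e; e = e->next) {
        /* e is captured by value when the task is created. */
        #pragma omp task firstprivate(e)
        process(e);
    }
    /* All tasks created above are complete after this point. */
    #pragma omp taskwait
}

/* Builds a 4-element list, traverses it, and returns the sum of the results. */
int traverse_demo(void)
{
    Element nodes[4];
    List l;
    int i, sum = 0;

    for (i = 0; i < 4; i++) {
        nodes[i].value = i + 1;
        nodes[i].result = 0;
        nodes[i].next = (i < 3) ? &nodes[i + 1] : NULL;
    }
    l.first = &nodes[0];

    /* The team is created here; one thread generates the tasks. */
    #pragma omp parallel
    #pragma omp single
    traverse_list(&l);

    for (i = 0; i < 4; i++)
        sum += nodes[i].result;
    return sum;
}
```

Because each task touches a different element, no further synchronization is needed in this sketch.
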
Task clauses
Task scheduling
How does it work?
- Tasks are tied by default
- Once started, a tied task is always executed by the same thread
  - Not necessarily the creator
- Tied tasks have scheduling restrictions
  - Deterministic scheduling points (creation, synchronization, ...)
  - Tasks can be suspended/resumed at these points
  - Another constraint to avoid deadlock problems
- Tied tasks may run into performance problems

The untied clause
A task that has been marked as untied has none of the previous scheduling restrictions:
- Can potentially switch to any thread
- Can potentially switch at any moment
- Bad mix with thread-based features
  - thread-id, critical regions, threadprivate

Gives the runtime more flexibility to schedule tasks

The if clause
If the expression of an if clause evaluates to false:
- The encountering task is suspended
- The new task is executed immediately
  - with its own data environment
  - it is still a different task with respect to synchronization
- The parent task resumes when the new task finishes
- Allows implementations to optimize task creation

For very fine-grained tasks you may need to do your own if

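A small sketch of the if clause used as a cutoff, assuming a recursive array sum that does not appear in the slides. CUTOFF, sum_range and sum_demo are hypothetical names, and the threshold value is arbitrary. Below the cutoff the tasks run immediately in the encountering thread, but they are still tasks, so the same taskwait covers both cases.

```c
#define CUTOFF 64   /* hypothetical threshold; tune per application */

static long sum_range(const int *v, long lo, long hi)
{
    long n = hi - lo;
    long mid, left = 0, right = 0;

    if (n <= 0)
        return 0;
    if (n == 1)
        return v[lo];

    mid = lo + n / 2;

    /* Deferred only while the range is large; otherwise executed immediately. */
    #pragma omp task shared(left) if(n > CUTOFF)
    left = sum_range(v, lo, mid);

    #pragma omp task shared(right) if(n > CUTOFF)
    right = sum_range(v, mid, hi);

    /* Undeferred tasks are still child tasks with respect to synchronization. */
    #pragma omp taskwait
    return left + right;
}

/* Sums the integers 1..1000 stored in an array. */
long sum_demo(void)
{
    static int data[1000];
    long total = 0;
    int i;

    for (i = 0; i < 1000; i++)
        data[i] = i + 1;

    #pragma omp parallel
    #pragma omp single
    total = sum_range(data, 0, 1000);

    return total;
}
```

The task construct is still encountered for every small range, which is why the slide suggests a manual if for very fine-grained work.
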
Common tasking problems
Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            solutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
        {
            state[j] = i;
            if (ok(j+1, state)) {
                search(n, j+1, state);
            }
        }
    }

Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            solutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
            #pragma omp task
            {
                state[j] = i;
                if (ok(j+1, state)) {
                    search(n, j+1, state);
                }
            }
    }

Data scoping
Because it's an orphaned task, all variables are firstprivate

State is not captured
Just the pointer is captured, not the pointed-to data

Problem #1
Incorrectly capturing pointed data

Problem #1
Incorrectly capturing pointed data

Problem
firstprivate does not capture data that is reached through a pointer; only the pointer itself is copied

Solutions
1. Capture it manually
2. Copy it to an array and capture the array with firstprivate

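The second solution can be sketched as follows, under the assumption that the data fits in a fixed-size array. The names snapshot_demo, state and seen and the size N are invented for this example. Because state is an array rather than a pointer, firstprivate copies its contents when each task is created, so every task sees the values from its own creation time even though the parent keeps overwriting the original.

```c
#define N 4   /* hypothetical fixed size */

/* Each task records the value state[0] had when that task was created. */
int snapshot_demo(void)
{
    int state[N] = {0, 0, 0, 0};
    int seen[N] = {0, 0, 0, 0};
    int i, sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        int k;
        for (k = 0; k < N; k++) {
            state[0] = k + 1;   /* the parent keeps overwriting its array */

            /* The whole array is copied into the task at creation time. */
            #pragma omp task firstprivate(state, k) shared(seen)
            seen[k] = state[0];
        }
    }   /* implicit barrier: all tasks have completed */

    for (i = 0; i < N; i++)
        sum += seen[i];
    return sum;
}
```

Had the tasks captured an int pointer to the same storage instead, each one could observe whatever value happened to be there when it finally ran.
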
Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            solutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
            #pragma omp task
            {
                bool *new_state = alloca(sizeof(bool)*n);
                memcpy(new_state, state, sizeof(bool)*n);
                new_state[j] = i;
                if (ok(j+1, new_state)) {
                    search(n, j+1, new_state);
                }
            }
    }

Caution!
Will state still be valid by the time memcpy is executed?

Problem #2
Data can go out of scope!

Problem #2
Out-of-scope data

Problem
Stack-allocated parent data can become invalid before being used by child tasks
- Only if not captured with firstprivate

Solutions
1. Use firstprivate when possible
2. Allocate it in the heap
   - Not always easy (we also need to free it)
3. Put additional synchronizations
   - May reduce the available parallelism

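One possible shape for the heap-allocation solution is sketched below. It is not taken from the slides: spawn_on_copy, consume, heap_copy_demo and the total counter are hypothetical. The parent copies its stack data to the heap before creating the task and returns without waiting; the task owns the copy and frees it, which is the extra bookkeeping the slide warns about.

```c
#include <stdlib.h>
#include <string.h>

static long total = 0;

/* Hypothetical consumer: adds up the elements and accumulates atomically. */
static void consume(const int *data, int n)
{
    long s = 0;
    int i;

    for (i = 0; i < n; i++)
        s += data[i];

    #pragma omp atomic
    total += s;
}

/* Creates a task that works on a heap copy of src and returns without waiting. */
static void spawn_on_copy(const int *src, int n)
{
    int *copy = malloc(n * sizeof *copy);

    if (copy == NULL)
        return;
    memcpy(copy, src, n * sizeof *copy);

    /* Only the pointer is firstprivate; the heap block outlives this call. */
    #pragma omp task firstprivate(copy, n)
    {
        consume(copy, n);
        free(copy);             /* the task owns the copy and releases it */
    }
}

long heap_copy_demo(void)
{
    total = 0;

    #pragma omp parallel
    #pragma omp single
    {
        int local[3] = {1, 2, 3};

        spawn_on_copy(local, 3);
        local[0] = 100;         /* later changes do not reach the task */
    }
    return total;
}
```

The copy is made by the parent, before the task exists, so it no longer matters when the task actually runs.
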
Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            solutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
            #pragma omp task
            {
                bool *new_state = alloca(sizeof(bool)*n);
                memcpy(new_state, state, sizeof(bool)*n);
                new_state[j] = i;
                if (ok(j+1, new_state)) {
                    search(n, j+1, new_state);
                }
            }

        #pragma omp taskwait
    }

The shared variable solutions needs protected access

Solutions
- Use critical
- Use atomic
- Use threadprivate

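The first two options can be sketched on a stripped-down counter. The functions count_atomic and count_critical and the critical-section name are hypothetical, and the tasks do nothing except record one "solution" each. atomic protects just the single update, while critical serializes an arbitrary block.

```c
/* Hypothetical demo: ntasks tasks each record one "solution" with atomic. */
long count_atomic(int ntasks)
{
    long solutions = 0;
    int i;

    #pragma omp parallel
    #pragma omp single
    {
        for (i = 0; i < ntasks; i++) {
            #pragma omp task shared(solutions)
            {
                /* Protects only this read-modify-write. */
                #pragma omp atomic
                solutions++;
            }
        }
    }   /* implicit barrier: all tasks have completed */
    return solutions;
}

/* Same counter, protected by a named critical section instead. */
long count_critical(int ntasks)
{
    long solutions = 0;
    int i;

    #pragma omp parallel
    #pragma omp single
    {
        for (i = 0; i < ntasks; i++) {
            #pragma omp task shared(solutions)
            {
                /* Only one thread at a time may execute this block. */
                #pragma omp critical(solutions_update)
                solutions++;
            }
        }
    }
    return solutions;
}
```

For a plain increment atomic is usually the lighter choice; critical is the fallback when the protected region is more than one simple update.
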
Reductions for tasks
Example

    int solutions = 0;
    int mysolutions = 0;
    #pragma omp threadprivate(mysolutions)

    void start_search()
    {
        #pragma omp parallel
        {
            #pragma omp single
            {
                bool initial_state[n];
                search(n, 0, initial_state);
            }
            #pragma omp atomic
            solutions += mysolutions;
        }
    }

Use a separate counter for each thread
Accumulate them at the end

Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            mysolutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
            #pragma omp task
            {
                bool *new_state = alloca(sizeof(bool)*n);
                memcpy(new_state, state, sizeof(bool)*n);
                new_state[j] = i;
                if (ok(j+1, new_state)) {
                    search(n, j+1, new_state);
                }
            }

        #pragma omp taskwait
    }

Pruning mechanism potentially introduces imbalance in the tree

Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            mysolutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
            #pragma omp task untied
            {
                bool *new_state = alloca(sizeof(bool)*n);
                memcpy(new_state, state, sizeof(bool)*n);
                new_state[j] = i;
                if (ok(j+1, new_state)) {
                    search(n, j+1, new_state);
                }
            }

        #pragma omp taskwait
    }

Untied clause
Allows the implementation to balance the load more easily

Because of untied, the update of the threadprivate mysolutions is not safe!

Pitfall #3
Unsafe use of untied tasks

Problem
Because untied tasks can migrate between threads at any point, thread-centric constructs can yield unexpected results

Remember
When using untied tasks avoid:
- Threadprivate variables
- Any thread-id uses

And be very careful with:
- Critical regions (and locks)

Simple solution
Create a tied task region with #pragma omp task if(0)

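The simple solution can be tried in isolation with a sketch like the one below, which assumes a per-thread counter in the style of the reduction example. The names my_hits, total_hits and count_from_untied are hypothetical. The outer task is untied and could in principle move between threads, so the threadprivate update is wrapped in an undeferred tied task.

```c
static int total_hits = 0;
static int my_hits = 0;
#pragma omp threadprivate(my_hits)

/* Hypothetical demo: ntasks untied tasks each record one hit. */
int count_from_untied(int ntasks)
{
    total_hits = 0;

    #pragma omp parallel
    {
        my_hits = 0;            /* each thread resets its own copy */

        #pragma omp single
        {
            int i;
            for (i = 0; i < ntasks; i++) {
                #pragma omp task untied
                {
                    /* Work placed here could resume on a different thread. */

                    /* Tied, undeferred region: safe for threadprivate data. */
                    #pragma omp task if(0)
                    my_hits++;
                }
            }
        }   /* implicit barrier: all tasks have completed */

        /* Accumulate the per-thread counters at the end. */
        #pragma omp atomic
        total_hits += my_hits;
    }
    return total_hits;
}
```

The inner task is tied by default and runs to completion on whichever thread encounters it, so the read and the write of my_hits refer to the same thread's copy.
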
Search problem
Example

    void search(int n, int j, bool *state)
    {
        int i, res;

        if (n == j) {
            /* good solution, count it */
            #pragma omp task if(0)
            mysolutions++;
            return;
        }

        /* try each possible solution */
        for (i = 0; i < n; i++)
            #pragma omp task untied
            {
                bool *new_state = alloca(sizeof(bool)*n);
                memcpy(new_state, state, sizeof(bool)*n);
                new_state[j] = i;
                if (ok(j+1, new_state)) {
                    search(n, j+1, new_state);
                }
            }

        #pragma omp taskwait
    }

Now the mysolutions++ statement is tied and safe

Task granularity
Granularity is a key performance factor
- Tasks tend to be fine-grained
- Try to "group" tasks together
  - Use if clause or manual transformations

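One manual transformation is to hand each task a chunk of iterations instead of a single element. The sketch below assumes a simple array-scaling loop that is not part of the slides; CHUNK, scale_all and scale_demo are hypothetical names and the chunk size is arbitrary.

```c
#define CHUNK 64   /* hypothetical grain size: elements handled per task */

/* Multiplies every element of v by factor, one task per chunk. */
static void scale_all(double *v, long n, double factor)
{
    #pragma omp parallel
    #pragma omp single
    {
        long start;
        for (start = 0; start < n; start += CHUNK) {
            long end = (start + CHUNK < n) ? start + CHUNK : n;

            /* One task per chunk rather than one task per element. */
            #pragma omp task firstprivate(start, end) shared(v)
            {
                long i;
                for (i = start; i < end; i++)
                    v[i] *= factor;
            }
        }
    }   /* implicit barrier: all chunks have been processed */
}

/* Scales 200 ones by 2.0 and returns their sum. */
double scale_demo(void)
{
    static double data[200];
    double sum = 0.0;
    int i;

    for (i = 0; i < 200; i++)
        data[i] = 1.0;

    scale_all(data, 200, 2.0);

    for (i = 0; i < 200; i++)
        sum += data[i];
    return sum;
}
```

A larger CHUNK means fewer tasks and less overhead but also less parallelism, so the right value depends on how expensive the per-element work is.
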
![Page 152: Parallel Programming with OpenMP › courses › csep524 › 13wi › omp_t… · Writing OpenMP programs Headers/Macros C/C++ only omp.hcontains the API prototypes and data types](https://reader035.vdocuments.net/reader035/viewer/2022081402/5f147779bc2890553518876e/html5/thumbnails/152.jpg)
Common tasking problems
Using the if clause
Example
void search (int n, int j, bool *state, int depth)
{
    int i, res;

    if (n == j) {
        /* good solution, count it */
        #pragma omp task if(0)
        mysolutions++;
        return;
    }

    /* try each possible solution */
    for (i = 0; i < n; i++)
        #pragma omp task untied if(depth < MAX_DEPTH)
        {
            bool *new_state = alloca(sizeof(bool)*n);
            memcpy(new_state, state, sizeof(bool)*n);
            new_state[j] = i;
            if (ok(j+1, new_state)) {
                search(n, j+1, new_state, depth+1);
            }
        }
    #pragma omp taskwait
}
Using an if statement
Example
void search (int n, int j, bool *state, int depth)
{
    int i, res;

    if (n == j) {
        /* good solution, count it */
        #pragma omp task if(0)
        mysolutions++;
        return;
    }

    /* try each possible solution */
    for (i = 0; i < n; i++)
        #pragma omp task untied
        {
            bool *new_state = alloca(sizeof(bool)*n);
            memcpy(new_state, state, sizeof(bool)*n);
            new_state[j] = i;
            if (ok(j+1, new_state)) {
                if (depth < MAX_DEPTH)
                    search(n, j+1, new_state, depth+1);
                else
                    search_serial(n, j+1, new_state);
            }
        }
    #pragma omp taskwait
}
Part V
Hands-on (II)
Outline
List traversal
Computing Pi
Finding Fibonacci
Before you start
Copy the exercises to your directory:
$ cp -a ~aduran/Prace_OpenMP_Handson_1/tasking .
Enter the tasking directory to do the following exercises.
List traversal
Examine the code
Take a look at the list.cc file, which implements a parallel list traversal with OpenMP.
1. What should be the output of executing this program?
2. Run it with one thread:
$ ./list
3. Do you get the expected result?
4. Run it with two threads:
$ OMP_NUM_THREADS=2 ./list
5. Does it work?
List traversal
Fix it
Fix the list traversal so it gets the correct result with two (or more) threads. Use the following questions as a guide:
1. How many tasks are being generated?
2. What is the data scoping in each construct?
3. Are memory accesses properly synchronized?
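The three questions above can be illustrated with a small sketch. It is not the contents of list.cc, which is not reproduced here: the node type and the list_sum function are hypothetical names chosen for illustration, and the real exercise may do different work per node. The sketch assumes one thread generates one task per node, the node pointer is captured by value, and the shared accumulator is updated atomically.

```c
#include <stddef.h>

/* Hypothetical node type; the one in list.cc may differ. */
struct node {
    int value;
    struct node *next;
};

/* Sums the values of a list using one task per node. */
long list_sum(struct node *head)
{
    long total = 0;

    #pragma omp parallel
    {
        /* Only one thread walks the list and generates the tasks;
           the other threads of the team execute them. */
        #pragma omp single
        {
            struct node *p;
            for (p = head; p != NULL; p = p->next) {
                /* p is captured by value, total stays shared. */
                #pragma omp task firstprivate(p) shared(total)
                {
                    /* The shared accumulator needs synchronization. */
                    #pragma omp atomic
                    total += p->value;
                }
            }
        }
        /* The implicit barrier at the end of single guarantees
           that all generated tasks have completed. */
    }
    return total;
}
```

Without the single construct every thread would traverse the whole list and each node would be processed once per thread, which is one plausible reason for wrong results with two threads.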
Computing Pi
Our algorithm
We will use an algorithm that computes the number pi through numerical integration.
Take a look at the pi.c file
Because iterations are independent, we will create one task per iteration
When you run make it will generate two programs: pi.serial and pi.omp. We will use the serial version to evaluate our parallel version.
Computing Pi
Measuring time
To get reliable execution times we will use the Altix batch system. Use the following command to launch your executions:
$ make run-$program-$threads
It sets up OMP_NUM_THREADS for you
It will generate an output file in your directory when it finishes.
You can check your status with mnq
Run both versions with one thread
$ make run-pi.ser-1
$ make run-pi.omp-1
When they finish, compare the results. Now run it with 2 threads. What do you observe? How is this possible?
Computing Pi
Problems
Our version of pi has two main problems:
Tasks are too fine-grained. The overhead associated with creating a task cannot be amortized.
There is too much synchronization. Hidden synchronization and communication are a common source of performance problems.
Computing Pi
Increase the granularity
1. Modify the pi program so that each task executes a chunk of N iterations.
2. Experiment with different values of N and see how the execution time changes.
What would be the optimal value of N?
Computing Pi
Reduce the number of synchronizations
1. Modify the pi program so that it uses an atomic construct instead of critical.
Does the execution time improve?
2. We can improve it further by reducing the number of atomic accesses.
Use a private variable and do only one atomic update at the end of the task.
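Both transformations (one task per chunk of iterations, and a single atomic update per task) can be combined as in the following sketch. It is an assumption about what the result could look like, not the contents of pi.c: the function name compute_pi, its parameters and the midpoint-rule integrand 4/(1+x^2) are chosen here for illustration.

```c
/* Midpoint-rule integration of 4/(1+x^2) over [0,1].
   Hypothetical signature: steps = total iterations, chunk = iterations per task. */
double compute_pi(long steps, long chunk)
{
    double sum = 0.0;
    const double dx = 1.0 / (double)steps;

    #pragma omp parallel
    #pragma omp single
    {
        long start;
        for (start = 0; start < steps; start += chunk) {
            /* One task per chunk instead of one task per iteration. */
            #pragma omp task firstprivate(start) shared(sum)
            {
                long end = (start + chunk < steps) ? start + chunk : steps;
                double local = 0.0;   /* private partial sum of this task */
                long i;
                for (i = start; i < end; i++) {
                    double x = (i + 0.5) * dx;
                    local += 4.0 / (1.0 + x * x);
                }
                /* A single synchronized update per task. */
                #pragma omp atomic
                sum += local;
            }
        }
    }
    /* All tasks have finished at the end of the parallel region. */
    return sum * dx;
}
```

The chunk parameter plays the role of N in the exercise, so the same function can be used to experiment with different granularities.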
Computing Pi
Final numbers
1. Run our improved version with up to 8 threads.
Does it scale?
How does it compare to the serial version?
2. Now increase the total number of iterations by a factor of 10 and run it again.
How does it behave now?
Computing Pi
Some conclusions
It’s difficult to go further than this with tasks
Task parallelism is very flexible, but we need to overcome the overheads
Beware of hidden communication and synchronization
OpenMP parallelization is an incremental process
As with every other paradigm, sometimes we need effort to obtain optimal performance
We’ll see later how to further improve our pi program
Finding Fibonacci
Fibonacci
The algorithm
We use a recursive implementation to find the Fibonacci number in the fib.c file.
It’s very inefficient
But useful for educational purposes :-)
To compile it use:
$ make fib
To submit jobs use:
$ make run-fib-threads
Fibonacci
First
Complete the code so that all the branches are computed in parallel
Use the serial version to check that you have the correct result
Add code to measure the time it takes to compute the number
To be more precise, put the code inside the single region
Fibonacci
Evaluate
1. Run the code with 1 to 8 threads.
2. Compare it to the time of the serial version.
3. What do you observe?
Fibonacci
Increasing granularity
As in the pi program, Fibonacci, because of its recursive nature, ends up generating tasks that are too fine-grained.
1. Modify the program so it does not generate tasks at all when n is too small (e.g. 20)
2. Run this improved version again with up to 8 threads
3. How does it compare with the serial version?
4. Try changing the cut-off value from 20 and see how it affects performance
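A cut-off of this kind might look like the sketch below. It is an illustration under assumptions, not the contents of fib.c: the names fib, fib_task, fib_serial and the CUTOFF macro are hypothetical, and the value 20 is only the starting point suggested in the exercise.

```c
#define CUTOFF 20   /* assumed threshold; tune it experimentally */

/* Plain recursive version, used below the cut-off. */
long fib_serial(int n)
{
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Task-parallel version; meant to be called from inside a parallel region. */
long fib_task(int n)
{
    long a, b;

    if (n < CUTOFF)
        return fib_serial(n);   /* no tasks are generated for small n */

    /* n is firstprivate by default; a and b must be shared
       so the parent can read the results after the taskwait. */
    #pragma omp task shared(a)
    a = fib_task(n - 1);
    #pragma omp task shared(b)
    b = fib_task(n - 2);
    #pragma omp taskwait
    return a + b;
}

/* Entry point: one thread starts the recursion, the team runs the tasks. */
long fib(int n)
{
    long result = 0;
    #pragma omp parallel
    #pragma omp single
    result = fib_task(n);
    return result;
}
```

Changing CUTOFF trades task-creation overhead against the amount of available parallelism, which is what step 4 asks you to observe.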
Part VI
Data Parallelism in OpenMP
Outline
The worksharing concept
Loop worksharing
The worksharing concept
Worksharings
Worksharing constructs divide the execution of a code region among the threads of a team
Threads cooperate to do some work
Better way to split work than using thread-ids
Lower overhead than using tasks
But, less flexible
In OpenMP, there are four worksharing constructs:
single
loop worksharing
sections
workshare
We’ll see them later
Restriction: worksharings cannot be nested
Loop worksharing
Loop parallelism
The for construct
#pragma omp for [clauses]
for (init-expr; test-expr; inc-expr)
where clauses can be:
private
firstprivate
lastprivate(variable-list)
reduction(operator:variable-list)
schedule(schedule-kind)
nowait
collapse(n)
ordered (we’ll see it later)
The for construct
How does it work?
The iterations of the loop(s) associated with the construct are divided among the threads of the team.
Loop iterations must be independent
Loops must follow a form that allows the number of iterations to be computed
Valid data types for induction variables are: integer types, pointers and random access iterators (in C++)
The induction variable(s) are automatically privatized
The default data-sharing attribute is shared
It can be merged with the parallel construct:
#pragma omp parallel for
The for construct
Example
void foo (int **m, int N, int M)
{
    int i, j;
    #pragma omp parallel for private(j)
    for (i = 0; i < N; i++)
        for (j = 0; j < M; j++)
            m[i][j] = 0;
}
Newly created threads cooperate to execute all the iterations of the loop
The i variable is automatically privatized
The j variable must be explicitly privatized
The for construct
Example
void foo (std::vector<int> &v)
{
    #pragma omp parallel for
    for (std::vector<int>::iterator it = v.begin();
         it < v.end();
         it++)
        *it = 0;
}
Random access iterators (and pointers) are valid types
!= cannot be used in the test expression
Removing dependences
Example
x = 0;
for (i = 0; i < n; i++)
{
    v[i] = x;
    x += dx;
}
In each iteration, x depends on the previous one. Can’t be parallelized
Removing dependences
Example
x = 0;
for (i = 0; i < n; i++)
{
    x = i * dx;
    v[i] = x;
}
But x can be rewritten in terms of i. Now it can be parallelized
Removing dependences
Example
x = 0;
#pragma omp parallel for private(x)
for (i = 0; i < n; i++)
{
    x = i * dx;
    v[i] = x;
}
The lastprivate clause
When a variable is declared lastprivate, a private copy is generated for each thread. Then the value of the variable in the last iteration of the loop is copied back to the original variable.
A variable can be both firstprivate and lastprivate
The lastprivate clause
Example
int i;
#pragma omp for lastprivate(i)
for (i = 0; i < 100; i++)
    v[i] = 0;
printf("i=%d\n", i);
prints i=100
The reduction clause
A very common pattern is where all threads accumulate some values into a shared variable
E.g., n += v[i], our pi program, ...
Using critical or atomic is not good enough
Besides being error prone and cumbersome
Instead we can use the reduction clause for basic types.
Valid operators for C/C++: +,-,*,|,||,&,&&,^
Valid operators for Fortran: +,-,*,.and.,.or.,.eqv.,.neqv.,max,min
Fortran also supports reductions of arrays
The compiler creates a private copy that is properly initialized
At the end of the region, the compiler ensures that the shared variable is properly (and safely) updated.
We can also specify reduction variables in the parallel construct.
The reduction clause
Example
int vector_sum (int n, int v[n])
{
    int i, sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
At the start of the loop, the private copy is initialized to the identity value
At the end of the loop, the shared variable is updated with the partial values of each thread
Also in parallel
Example
int nt = 0;
#pragma omp parallel reduction(+:nt)
nt++;
printf("%d\n", nt);
reduction is available in parallel as well
Prints the number of threads
The schedule clause
The schedule clause determines which iterations are executed by each thread.
If no schedule clause is present, the schedule is implementation defined
There are several possible options as schedule:
STATIC
STATIC,chunk
DYNAMIC[,chunk]
GUIDED[,chunk]
AUTO
RUNTIME
The schedule clause
Static schedule
The iteration space is broken into chunks of approximately size N/num_threads (here N is the total number of iterations). Then these chunks are assigned to the threads in a round-robin fashion.
Static,N schedule (Interleaved)
The iteration space is broken into chunks of size N (here N is the chunk size). Then these chunks are assigned to the threads in a round-robin fashion.
Characteristics of static schedules
Low overhead
Good locality (usually)
Can have load imbalance problems
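The round-robin assignment of the static,N schedule can be written down as plain arithmetic. The following sketch only models the mapping described above; the function names are hypothetical helpers introduced here, iterations are numbered from 0, and the model does not cover the plain static schedule, whose exact split is left to the implementation.

```c
/* Thread that runs logical iteration i (0-based) under schedule(static, chunk):
   chunk number i / chunk is dealt out round-robin over nthreads threads. */
int static_chunk_owner(long i, long chunk, int nthreads)
{
    return (int)((i / chunk) % nthreads);
}

/* Number of iterations that thread tid receives out of n under the same schedule. */
long static_iterations_for(int tid, long n, long chunk, int nthreads)
{
    long i, count = 0;
    for (i = 0; i < n; i++)
        if (static_chunk_owner(i, chunk, nthreads) == tid)
            count++;
    return count;
}
```

With 10 iterations, chunks of 4 and 2 threads, thread 0 receives iterations 0 to 3 and 8 to 9 while thread 1 receives 4 to 7, a small instance of the load imbalance mentioned above.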
The schedule clause
Dynamic,N schedule
Threads dynamically grab chunks of N iterations until all iterations have been executed. If no chunk is specified, N = 1.
Guided,N schedule
A variant of dynamic. The size of the chunks decreases as the threads grab iterations, but it is at least N. If no chunk is specified, N = 1.
Characteristics of dynamic schedules
- Higher overhead
- Not very good locality (usually)
- Can solve imbalance problems
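A minimal sketch of a dynamic schedule on an unbalanced loop could look as follows. The functions work and unbalanced_sum are hypothetical names chosen for illustration, and the chunk size of 4 is an arbitrary assumption, not a recommendation.

```c
/* Hypothetical unbalanced work item: the cost grows with i. */
static long work(int i)
{
    long s = 0;
    for (int j = 0; j < i; j++)
        s += j;
    return s;
}

long unbalanced_sum(int n)
{
    long total = 0;
    /* Threads grab 4 iterations at a time until none remain. */
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; i++)
        total += work(i);
    return total;
}
```

Replacing dynamic with guided in the clause is enough to try the guided variant on the same loop.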
Loop worksharing
The schedule clause
Auto schedule
In this case, the implementation is allowed to do whatever it wishes.
- Do not expect much of it as of now
Runtime schedule
The decision is delayed until the program is run, and it is taken from the run-sched-var ICV. It can be set with:
- The OMP_SCHEDULE environment variable
- The omp_set_schedule() API call
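As a sketch of the runtime schedule, the function below selects the schedule through the API and then runs a loop with schedule(runtime). The function name and the choice of guided with a chunk of 8 are assumptions made for illustration; the _OPENMP guard only exists so the sketch also builds, sequentially, without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

long sum_with_runtime_schedule(int n)
{
    long total = 0;
#ifdef _OPENMP
    /* Sets the schedule used by schedule(runtime) loops from now on;
     * this replaces whatever OMP_SCHEDULE provided at startup. */
    omp_set_schedule(omp_sched_guided, 8);
#endif
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; i++)
        total += i;
    return total;
}
```

Alternatively, the omp_set_schedule call can be left out and the schedule chosen from the shell, for example by setting OMP_SCHEDULE to "dynamic,4" before launching the program.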
Loop worksharing
False sharing
- When a thread writes to a cache location and another thread reads the same location, the coherence protocol copies the data from one cache to the other. This is called true sharing.
- But this communication can also happen even if the two threads are not working on the same memory address, as long as both addresses fall in the same cache line. This is false sharing.
[Figure: Cpu1 and Cpu2, variables x and y, invalidations between the two caches]
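One common way to avoid false sharing is to pad per-thread data so that each item sits on its own cache line. The sketch below assumes a 64-byte line (check your hardware); padded_counter, NSLOTS and count_even are hypothetical names introduced only for this illustration.

```c
#define CACHE_LINE 64   /* assumed cache line size in bytes */
#define NSLOTS 4        /* number of independent counters in this sketch */

/* Each slot spans a full (assumed) cache line, so the value fields of
 * neighbouring slots can never share a line. */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

/* Counts the even numbers in data using one padded counter per slot. */
long count_even(const int *data, int n)
{
    struct padded_counter slots[NSLOTS] = {{0}};

    /* Each value of s is handled by exactly one thread, so slots[s]
     * is only ever written by that thread. */
    #pragma omp parallel for
    for (int s = 0; s < NSLOTS; s++)
        for (int i = s; i < n; i += NSLOTS)
            if (data[i] % 2 == 0)
                slots[s].value++;

    long total = 0;
    for (int s = 0; s < NSLOTS; s++)
        total += slots[s].value;
    return total;
}
```

Without the pad member the four counters would be adjacent in memory and every increment could invalidate the line held by the other threads.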
Loop worksharing
Scheduling
Example
    int v[N];
    #pragma omp for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < i; j++)
            v[i] += j;
- The i loop is quite unbalanced
- A dynamic schedule?
- Lots of false sharing!
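One possible way to soften the trade-off in this example is sketched below: accumulate into a local variable so that v[i] is written only once, and hand out iterations in larger chunks so that neighbouring elements of v mostly stay with one thread. The function name fill_triangular and the chunk size of 16 are assumptions for illustration; the sketch also uses a combined parallel for so that it stands on its own.

```c
/* Adds 0 + 1 + ... + (i - 1) to every v[i]. */
void fill_triangular(int *v, int n)
{
    /* Chunks of 16 iterations: still dynamic, but neighbouring
     * elements of v are usually updated by the same thread. */
    #pragma omp parallel for schedule(dynamic, 16)
    for (int i = 0; i < n; i++) {
        int acc = v[i];              /* thread-private accumulator */
        for (int j = 0; j < i; j++)
            acc += j;
        v[i] = acc;                  /* one write per element instead of i */
    }
}
```

The best chunk size depends on the cache line size and on how unbalanced the loop really is, so it is worth measuring a few values.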
Loop worksharing
The nowait clause
- When a worksharing construct has a nowait clause, the implicit barrier at the end of the loop is removed.
- This allows overlapping the execution of non-dependent loops/tasks/worksharings.
Loop worksharing
The nowait clause
Example
    #pragma omp for nowait
    for (i = 0; i < n; i++)
        v[i] = 0;
    #pragma omp for
    for (i = 0; i < n; i++)
        a[i] = 0;
The first and second loops are independent, so we can overlap them.
On a side note, you would be better off fusing the loops in this case.
Loop worksharing
The nowait clause
Example
    #pragma omp for nowait
    for (i = 0; i < n; i++)
        v[i] = 0;
    #pragma omp for
    for (i = 0; i < n; i++)
        a[i] = v[i] * v[i];
The first and second loops are dependent! There is no guarantee that the first-loop iteration that writes v[i] has finished when the second loop reads it.
Loop worksharing
The nowait clause
Exception: static schedules
The nowait is safe if the two (or more) loops have the same static schedule and all have the same number of iterations, because each thread then executes the same iterations in every loop.
Example
    #pragma omp for schedule(static, 2) nowait
    for (i = 0; i < n; i++)
        v[i] = 0;
    #pragma omp for schedule(static, 2)
    for (i = 0; i < n; i++)
        a[i] = v[i] * v[i];
Loop worksharing
The collapse clause
- Allows distributing the work of a set of n nested loops.
- The loops must be perfectly nested
- The nest must traverse a rectangular iteration space
Example
    #pragma omp for collapse(2)
    for (i = 0; i < N; i++)
        for (j = 0; j < M; j++)
            foo(i, j);
The i and j loops are folded and their iterations are distributed among all threads. Both i and j are privatized.
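Conceptually, folding the two loops is similar to iterating over a single index of N*M values and recovering i and j from it. The sketch below does this by hand; visit_all and the grid array are hypothetical stand-ins for the foo(i, j) call, and an actual implementation of collapse is free to do the folding differently.

```c
/* Manual equivalent of collapsing an n x m loop nest into one loop.
 * grid must have room for n * m ints; it stands in for foo(i, j). */
void visit_all(int n, int m, int *grid)
{
    #pragma omp parallel for
    for (int k = 0; k < n * m; k++) {
        int i = k / m;              /* recovered outer index */
        int j = k % m;              /* recovered inner index */
        grid[i * m + j] = i + j;    /* placeholder for foo(i, j) */
    }
}
```

Because the folded loop has n * m iterations, there is work to distribute even when n alone is smaller than the number of threads.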
Break
Coffee time! :-)
Part VII
Hands-on (III)
Outline
Matrix Multiply
Computing Pi (revisited)
Mandelbrot
Before you start
Copy the exercises to your directory:
$ cp -a ~aduran/Prace_OpenMP_Handson_2/worksharing .
Enter the worksharing directory to do the following exercises.
Matrix Multiply
Matrix Multiply
Parallel loops
The file matmul implements a sequential matrix multiply.
1. Use OpenMP worksharings to parallelize the application.
   - Check the init_mat and matmul functions
2. Run it with up to 8 threads to check the scalability
Remember: to submit it, use make run-matmul.omp-$threads
Matrix Multiply
Matrix Multiply
Memory matters!
To optimize cache accesses in this kind of algorithm, it is common practice to “logically” split the matrix into blocks of size BxB and do the computation block by block instead of going through the whole matrix at once.
1. Implement such a blocking scheme for our matrix multiply
2. Experiment with different sizes of B
3. Run it with up to 8 threads and compare the results with the previous version
Tip: You need three additional inner loops
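One possible shape for such a blocking scheme is sketched below. It is not the exercise's reference solution: the function name matmul_blocked, the flat row-major layout, and the assumption that the block size bs divides n evenly are all choices made for this illustration.

```c
/* Sketch: C += A * B for n x n row-major matrices, in blocks of bs x bs.
 * Assumes bs divides n evenly. */
void matmul_blocked(int n, int bs, const double *A, const double *B, double *C)
{
    /* Each thread owns whole block-rows of C, so there are no write conflicts. */
    #pragma omp parallel for
    for (int ii = 0; ii < n; ii += bs)
        for (int jj = 0; jj < n; jj += bs)
            for (int kk = 0; kk < n; kk += bs)
                /* The three additional inner loops stay inside one block. */
                for (int i = ii; i < ii + bs; i++)
                    for (int j = jj; j < jj + bs; j++) {
                        double acc = C[i * n + j];
                        for (int k = kk; k < kk + bs; k++)
                            acc += A[i * n + k] * B[k * n + j];
                        C[i * n + j] = acc;
                    }
}
```

The outer three loops walk over blocks and the inner three over the elements of one block, so each block of A and B is reused while it is still likely to be in cache.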
Computing Pi (revisited)
Computing Pi
Using data parallelism
1. Complete the implementation of our pi algorithm using data parallelism
2. Execute with 1 and 2 threads.
   - Does it scale?
   - How does it compare to our previous implementation with tasks?
   - What is the problem?
Computing Pi (revisited)
Computing Pi
Problem
The number of synchronizations is still too high for this program to scale.
Using reduction
1. Change the program to make use of the reduction clause
2. Run it with up to 8 threads
3. How does it compare to the previous version?
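A minimal sketch of a reduction-based version is shown below. It assumes the usual midpoint integration of 4/(1+x^2) over [0, 1], which may differ from the pi code used in the exercise; compute_pi and its parameter are hypothetical names.

```c
/* Approximates pi by midpoint integration of 4 / (1 + x^2) on [0, 1]. */
double compute_pi(long steps)
{
    double step = 1.0 / (double)steps;
    double sum = 0.0;

    /* Each thread accumulates into a private copy of sum; the copies are
     * combined once at the end, so no per-iteration synchronization is needed. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < steps; i++) {
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * step;
}
```

Compared with protecting every update of sum with a critical or atomic construct, the only synchronization left is the final combination of the private copies.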
Mandelbrot
Mandelbrot
More data parallelism
We will now parallelize an algorithm that generates sections of the Mandelbrot set.
1. Edit the file mandel.c and complete the parallelization in the function mandel
   - Note that there is a dependence on the variable x
Mandelbrot
Mandelbrot
Uncover load imbalance
We can see that each point in the final output is computed by the mandel_point function. If we check the code of that function, we can see that the number of iterations it takes differs from one point to another.
We want to know how many iterations each thread does (this also happens to be the result of mandel_point).
1. Add a private counter to each thread
2. Add to this counter the result of each mandel_point call made by that thread
3. Output the count for each thread at the end of the parallel region
4. What do you observe?
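The counting steps above can be sketched as follows. Since the exercise's mandel_point is not shown here, point_cost is a hypothetical stand-in with an uneven cost; count_work and MAX_THREADS are likewise assumptions, and the fallback definition only lets the sketch build without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_THREADS 256   /* assumed upper bound on the team size */

/* Hypothetical stand-in for mandel_point: returns an uneven cost. */
static int point_cost(int i) { return i % 97 + 1; }

/* Stores the work done by each thread in per_thread and returns the total. */
long count_work(int n, long per_thread[MAX_THREADS])
{
    for (int t = 0; t < MAX_THREADS; t++)
        per_thread[t] = 0;

    #pragma omp parallel
    {
        long mine = 0;                      /* private counter */
        #pragma omp for schedule(static)
        for (int i = 0; i < n; i++)
            mine += point_cost(i);

        int tid = omp_get_thread_num();
        if (tid < MAX_THREADS)
            per_thread[tid] = mine;
        printf("thread %d did %ld iterations\n", tid, mine);
    }

    long total = 0;
    for (int t = 0; t < MAX_THREADS; t++)
        total += per_thread[t];
    return total;
}
```

Declaring the counter inside the parallel region makes it private without any extra clause, and printing after the worksharing loop reports each thread's share once.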
Mandelbrot
Mandelbrot
Playing with schedules
To overcome the observed load imbalance we can use a different loop schedule.
- Use the clause schedule(runtime) so the schedule is not fixed at compile time
- Now run different experiments with different schedules and numbers of threads
  - Try at least static, dynamic and guided
- Which one obtains the best result?
Tip: Change OMP_SCHEDULE before doing make run-...
Part VIII
Other OpenMP Topics
Outline
The master construct
Other synchronization mechanisms
Nested parallelism
Other worksharings
Other environment variables and API calls
The master construct
Only the master thread
The master construct
    #pragma omp master
        structured block
- The structured block is only executed by the master thread
- Useful when we always want the same thread to execute something
- No implicit barrier at the end
The master construct
Master construct
Example
    void foo()
    {
        #pragma omp parallel
        {
            #pragma omp single
            printf("I am %d\n", omp_get_thread_num());   /* can be any thread */
            #pragma omp master
            printf("I am %d\n", omp_get_thread_num());   /* it's always thread 0 */
        }
    }
Other synchronization mechanisms
Ordering
The ordered construct
    #pragma omp ordered
        structured block
- Must appear in the dynamic extent of a loop worksharing
- The worksharing must also have the ordered clause
- The structured block is executed in the iterations' sequential order
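A small sketch of the ordered construct is shown below: the computation of each square may run in any order, while the append to the output array happens in iteration order. The function name squares_in_order and the dynamic schedule are assumptions made for this illustration.

```c
/* Writes i * i for i = 0..n-1 into out, in iteration order.
 * Returns the number of elements written. */
int squares_in_order(int n, int *out)
{
    int next = 0;   /* shared write position, only touched inside ordered */

    /* The loop needs the ordered clause for the ordered construct to be legal. */
    #pragma omp parallel for ordered schedule(dynamic)
    for (int i = 0; i < n; i++) {
        int sq = i * i;             /* may execute out of order */
        #pragma omp ordered
        {
            out[next] = sq;         /* executes in sequential i order */
            next++;
        }
    }
    return next;
}
```

Only the ordered block is serialized, so keeping it short limits the loss of parallelism.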
Other synchronization mechanisms
Locks
OpenMP provides lock primitives for low-level synchronization:
- omp_init_lock: initializes the lock
- omp_set_lock: acquires the lock
- omp_unset_lock: releases the lock
- omp_test_lock: tries to acquire the lock (won't block)
- omp_destroy_lock: frees the lock resources
OpenMP also provides nested locks, where the thread owning the lock can reacquire it without blocking.
Other synchronization mechanisms
Locks
Example
    #include <omp.h>
    void foo()
    {
        omp_lock_t lock;

        omp_init_lock(&lock);       /* lock must be initialized before being used */
        #pragma omp parallel
        {
            omp_set_lock(&lock);
            // mutual exclusion region: only one thread at a time here
            omp_unset_lock(&lock);
        }
        omp_destroy_lock(&lock);
    }
Other synchronization mechanisms
Locks
Example
    #include <omp.h>

    omp_lock_t lock;

    void foo()
    {
        omp_set_lock(&lock);
    }

    void bar()
    {
        omp_unset_lock(&lock);
    }
Locks are unstructured: a lock can be acquired in one function and released in another.
Nested parallelism
Nested parallelism
- OpenMP parallel constructs can be dynamically nested. This creates a hierarchy of teams that is called nested parallelism.
- Useful when not enough parallelism is available with a single level of parallelism
- More difficult to understand and manage
- Implementations are not required to support it
Nested parallelism
Controlling nested parallelism
Related Internal Control Variables
- The ICV nest-var controls whether nested parallelism is enabled or not.
  - Set with the OMP_NESTED environment variable
  - Set with the omp_set_nested API call
  - The current value can be retrieved with omp_get_nested.
- The ICV max-active-levels-var controls the maximum number of nested active parallel regions
  - Set with the OMP_MAX_ACTIVE_LEVELS environment variable
  - Set with the omp_set_max_active_levels API call
  - The current value can be retrieved with omp_get_max_active_levels.
Nested parallelism
Nested parallelism info API
To obtain information about nested parallelism:
- How many nested parallel regions at this point?
  omp_get_level()
- How many active (with 2 or more threads) regions?
  omp_get_active_level()
- Which thread-id was my ancestor?
  omp_get_ancestor_thread_num(level)
- How many threads are there at a previous region?
  omp_get_team_size(level)
Other worksharings
Static tasks
The sections construct
    #pragma omp sections [clauses]
    {
        #pragma omp section
            structured block
        ...
    }
- The different sections are distributed among the threads
- There is an implicit barrier at the end
- Clauses can be:
  - private
  - lastprivate
  - firstprivate
  - reduction
  - nowait
Other worksharings
Sections
Example
    #pragma omp parallel sections num_threads(3)   /* combined construct */
    {
        /* sections are distributed among the threads */
        #pragma omp section
        read(data);
        #pragma omp section
        #pragma omp parallel                        /* nested parallel region */
        work(data);
        #pragma omp section
        write(data);
    }
Other worksharings
Supporting array syntax
The workshare construct
!$OMP WORKSHARE
    array syntax
!$OMP END WORKSHARE [NOWAIT]
Only for Fortran
The array operation is distributed among threads
Example
!$OMP WORKSHARE
    A(1:M) = A(1:M) * B(1:M)
!$OMP END WORKSHARE NOWAIT
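C has no array syntax and no workshare construct, so the closest C counterpart is an explicit loop with a loop worksharing construct. The sketch below is an analogue written for illustration, not something taken from the slides; the function name is hypothetical.

```c
/* Element-wise a[i] = a[i] * b[i]; the iterations are split among threads. */
void multiply_in_place(double *a, const double *b, int m)
{
    #pragma omp parallel for
    for (int i = 0; i < m; i++)
        a[i] = a[i] * b[i];
}
```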
Other environment variables and API calls
Other Environment variables
OMP_STACKSIZE Controls the stack size of created threads
OMP_WAIT_POLICY Controls the behaviour of idle threads
OMP_THREAD_LIMIT Limit of threads that can be created
OMP_DYNAMIC Turns on/off thread dynamic adjusting
Other environment variables and API calls
Other API calls
omp_in_parallel Returns true if inside a parallel region
omp_get_wtick Returns the precision of the wtime clock
omp_get_thread_limit Returns the limit of threads
omp_set_dynamic Turns thread dynamic adjusting on or off
omp_get_dynamic Returns the current value of dynamic adjusting
omp_get_schedule Returns the current loop schedule
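A short sketch of how the query calls listed above might be used together. The function name report_runtime is hypothetical; the _OPENMP guard is there only so the file also builds when OpenMP is not enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Prints a few runtime settings. Returns nonzero if called from inside
   an active parallel region and 0 otherwise. */
int report_runtime(void)
{
#ifdef _OPENMP
    omp_sched_t kind;
    int chunk;

    omp_get_schedule(&kind, &chunk);   /* schedule used by schedule(runtime) loops */
    printf("timer precision : %g s\n", omp_get_wtick());
    printf("thread limit    : %d\n", omp_get_thread_limit());
    printf("dynamic adjust  : %d\n", omp_get_dynamic());
    printf("schedule kind   : %d, chunk %d\n", (int)kind, chunk);
    return omp_in_parallel();
#else
    puts("compiled without OpenMP");
    return 0;
#endif
}
```

Calling report_runtime() from the sequential part of a program returns 0; calling it from inside a parallel region that runs with more than one thread returns nonzero.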
Part IX
Hands-on (IV)
Before you start
Copy the exercises to your directory:
$ cp -a ~aduran/Prace_OpenMP_Handson_2/other .
Enter the other directory to do the following exercises.
Nested parallelism
First take
1 Edit the file nested.c and try to understand what it does
2 Run make
3 Execute the program nested with different numbers of threads
How many messages are printed? Does it match your expectations?
4 Run the program again, this time defining the OMP_NESTED variable. E.g.:
$ OMP_NUM_THREADS=2 OMP_NESTED=true ./nested
5 What is the difference? Why?
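The contents of nested.c are not shown in the slides, so the following is only a guess at the kind of program involved: a two-level nest in which every thread of every inner region reports once. Instead of printing, this sketch counts the messages; the function name is hypothetical.

```c
/* Counts how many messages a two-level nest would print:
   one per thread of every inner parallel region. */
int count_nested_messages(void)
{
    int messages = 0;

    #pragma omp parallel            /* outer level */
    {
        #pragma omp parallel        /* inner level */
        {
            #pragma omp atomic
            messages++;
        }
    }
    return messages;
}
```

When nested parallelism is disabled each inner region runs with a single thread, so the count equals the size of the outer team. With nesting enabled each outer thread can get its own inner team, and the count grows accordingly, subject to what the runtime actually grants.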
Nested parallelism
Shaping the tree
1 Now, change the code so the nested level only creates as many threads as the parent id+1
Thread 0 creates a nested parallel region of 1
Thread 1 creates a nested parallel region of 2
...
Tip: Use either omp_set_num_threads or num_threads
Locks
Exclusive access
1 Edit the file lock.c and take a look at the code
2 Parallelize the first two loops of the application
3 Now run it several times with different numbers of threads
4 We see that the result differs because of improper synchronization
5 Use critical to fix it
What problem do we have?
Locks
Locks to the rescue
1 Use locks to implement a fine-grain locking scheme
2 Assign a lock to each position of the array a
3 Then use it to lock only that position in the main loop
Does it work better?
4 Now compare it to an implementation using atomic
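lock.c itself is not reproduced in the slides, so the sketch below uses a made-up workload (counting how often each array position is hit) to illustrate the two variants the exercise asks for: one lock per position of a, and the same update with atomic. NBINS, the function names and the index array are assumptions; the serial fallbacks exist only so the code also builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

#define NBINS 8   /* hypothetical size of the array a */

/* Adds 1 to a[idx[i] % NBINS] for every i (idx values must be
   non-negative), protecting each position with its own lock. */
void count_with_locks(const int *idx, int n, int *a)
{
    omp_lock_t locks[NBINS];
    for (int b = 0; b < NBINS; b++)
        omp_init_lock(&locks[b]);

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int b = idx[i] % NBINS;
        omp_set_lock(&locks[b]);      /* blocks only updates to position b */
        a[b]++;
        omp_unset_lock(&locks[b]);
    }

    for (int b = 0; b < NBINS; b++)
        omp_destroy_lock(&locks[b]);
}

/* The same update expressed with atomic instead of explicit locks. */
void count_with_atomic(const int *idx, int n, int *a)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        a[idx[i] % NBINS]++;
    }
}
```

Unlike a single critical section, both variants let threads that touch different positions proceed at the same time.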
Part X
OpenMP in the future
Outline
How OpenMP evolves
OpenMP 3.1
OpenMP 4.0
OpenMP is Open
How OpenMP evolves
The OpenMP Language Committee
Body that prepares new standard versions for the ARB.
Composed of representatives of all ARB members
Led by Bronis de Supinski from LLNL
Integrates the information from the different subcommittees
Currently working on OpenMP 3.1
How OpenMP evolves
The OpenMP Subcommittees
When a topic is deemed important or too complex, a separate group is usually formed (typically with a subset of the same people).
Currently, the following subcommittees exist:
1 Error model subcommittee
In charge of defining an error model for OpenMP
2 Tasking subcommittee
In charge of defining new extensions to the tasking model
3 Affinity subcommittee
In charge of breaking the flat memory model
4 Accelerators subcommittee
In charge of integrating accelerator computing into OpenMP
5 Interoperability and Composability subcommittee
How OpenMP evolves
What can we expect in the future?
Disclaimer
These are my subjective impressions.
All these dates and topics are my guesses.
They might or might not happen.
Tentative Timeline
November 2010 3.1 Public comment version
May 2011 3.1 Final version
June 2012 4.0 Public comment version
November 2012 4.0 Final version
OpenMP 3.1
Clarifications
Several clarifications to different parts of the specification
Nothing exciting but needs to be done
OpenMP 3.1
Atomic extensions
Extensions to the atomic construct to allow:
to do atomic writes
#pragma omp atomic
x = value;
to capture the value before/after the atomic update
#pragma omp atomic
v = x, x--;
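The snippets above show the idea without a clause on the atomic directive. In the OpenMP 3.1 specification as published, these two forms are spelled with the write and capture clauses. A minimal sketch under that syntax, with a hypothetical ticket-counter workload:

```c
/* Hands out tickets 0..n-1 with atomic capture and raises a flag with
   atomic write. Returns the sum of all tickets taken; *flag_out receives
   the final value of the flag. */
int take_tickets(int n, int *flag_out)
{
    int next = 0;   /* shared ticket counter */
    int flag = 0;   /* shared flag */
    int sum = 0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        int mine;

        #pragma omp atomic capture
        mine = next++;          /* read the old value and increment in one step */

        sum += mine;

        #pragma omp atomic write
        flag = 1;               /* plain store, performed atomically */
    }

    *flag_out = flag;
    return sum;
}
```

Every ticket value is handed out exactly once, so the sum is 0 + 1 + ... + (n-1) whatever the interleaving of the threads.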
OpenMP 3.1
User-defined reductions
Allow the users to extend reductions to cope with non-basic types and non-standard operators.
In 3.1
Including pointer reductions in C
Including class members and operators in C++
In 4.0
Array for C
Template reductions for C++
OpenMP 3.1
User-defined reductions
Example
#pragma omp declare reduction(+ : std::string : omp_out += omp_in)
void foo()
{
    std::string s;
    #pragma omp parallel reduction(+ : s)
    {
        s += "I'm a thread";
    }
    std::cout << s << std::endl;
}
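For comparison, here is a C sketch that uses the declare reduction directive in the form that was eventually standardized in OpenMP 4.0. The range_t type, the reduction name widen and the helper functions are hypothetical; the reduction merges per-thread minimum/maximum pairs.

```c
#include <float.h>

typedef struct { double lo, hi; } range_t;

/* Combiner: widen *out so that it also covers *in. */
static void range_merge(range_t *out, const range_t *in)
{
    if (in->lo < out->lo) out->lo = in->lo;
    if (in->hi > out->hi) out->hi = in->hi;
}

/* User-defined reduction "widen" on range_t; each private copy starts
   as an empty range. */
#pragma omp declare reduction(widen : range_t : range_merge(&omp_out, &omp_in)) \
        initializer(omp_priv = { DBL_MAX, -DBL_MAX })

/* Returns the smallest and largest value in x[0..n-1]. */
range_t find_range(const double *x, int n)
{
    range_t r = { DBL_MAX, -DBL_MAX };

    #pragma omp parallel for reduction(widen : r)
    for (int i = 0; i < n; i++) {
        if (x[i] < r.lo) r.lo = x[i];
        if (x[i] > r.hi) r.hi = x[i];
    }
    return r;
}
```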
OpenMP 3.1
Affinity extensions
New environment variables
OMP_PROCBIND=true, false
Portable mechanism to bind threads
Extend OMP_NUM_THREADS to support multiple levels of parallelism
OMP_AFFINITY=scatter,compact
Specifies how threads should be distributed in the machine
OMP_MEMORY_PLACEMENT=first_touch|round_robin|random
Portable mechanisms to specify memory placement policies
OpenMP 3.1
Tasking extensions
New constructs/clauses
the taskyield construct to allow user-defined scheduling points
the final clause to allow the optimization of leaf tasks
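A sketch of how the final clause might be used to cut off task creation near the leaves of a recursion. The Fibonacci workload and the cutoff value of 10 are illustrative choices, not taken from the slides; taskyield is not shown here.

```c
/* Recursive Fibonacci with tasks. Once n drops to the cutoff the tasks
   are final, so their descendants run immediately in the same thread
   instead of being deferred. */
static long fib_task(int n)
{
    long a, b;

    if (n < 2)
        return n;

    #pragma omp task shared(a) final(n <= 10)
    a = fib_task(n - 1);

    #pragma omp task shared(b) final(n <= 10)
    b = fib_task(n - 2);

    #pragma omp taskwait
    return a + b;
}

/* Starts the recursion from a single thread inside a parallel region. */
long fib(int n)
{
    long result = 0;

    #pragma omp parallel
    #pragma omp single
    result = fib_task(n);

    return result;
}
```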
OpenMP 4.0
Error model
Allow the programmer to catch and react to runtime errors
Integrate C++ exceptions into this model
Allow the programmer to cleanly cancel the parallel computation
It looks like we are leaning towards a model based on callbacks
OpenMP 4.0
Error model
Example
void error_handler(omp_err_info_t *info, int *nths)
{
    if (omp_get_error_type(info) == OMP_ERR_NOT_ENOUGH_THREADS)
        *nths = *nths > 1 ? *nths - 1 : 1;
    return OMP_RETRY;
}

nths = 4;
#pragma omp parallel onerror(error_handler, &nths) num_threads(nths)
{
    ....
}
OpenMP 4.0
Other tasking improvements
Tasking reductions
Add a reduction clause to the task construct
Tasking dependences
Allow finer tasking synchronizations by means of expressing data dependences among tasks
Scheduling hints for the runtime
Allow the programmer to express some kind of task priority
OpenMP 4.0
Task dependences
Example
for (;;) {
    char *buffer;
    #pragma omp task output(buffer)
    {
        buffer = malloc(...);
        stage1(buffer);
    }
    #pragma omp task inout(buffer)
    {
        stage2(buffer);
    }
    #pragma omp task input(buffer)
    {
        stage3(buffer);
    }
}
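The output/inout/input clauses above are the proposal's spelling. In OpenMP 4.0 as published, the same idea is written with the depend clause (depend(out: ...), depend(inout: ...), depend(in: ...)). A sketch of one pass of the pipeline in that syntax follows; the stage bodies and the function name are hypothetical, and the buffer is allocated up front to keep error handling simple.

```c
#include <stdlib.h>

/* Runs a three-stage pipeline on one buffer. The depend clauses make
   the runtime execute the stages in order even though each is a task.
   Returns the sum computed by the last stage, or -1 if allocation fails. */
int run_pipeline(int n)
{
    int result = 0;
    int *buffer = malloc((size_t)n * sizeof *buffer);

    if (buffer == NULL)
        return -1;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: buffer) shared(buffer)
        {   /* stage 1: produce */
            for (int i = 0; i < n; i++) buffer[i] = i;
        }
        #pragma omp task depend(inout: buffer) shared(buffer)
        {   /* stage 2: transform */
            for (int i = 0; i < n; i++) buffer[i] *= 2;
        }
        #pragma omp task depend(in: buffer) shared(buffer, result)
        {   /* stage 3: consume */
            for (int i = 0; i < n; i++) result += buffer[i];
        }
        #pragma omp taskwait
    }

    free(buffer);
    return result;
}
```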
OpenMP 4.0
Accelerators support
Discussion is in the very early stages.
Several proposals on the table
Cover both data and task parallelism
Will probably take care of the backend compilation
OpenMP 4.0
A glimpse into BSC proposal
Example
int main(void) {
    for (int i = 0; i < NB; i++)
        for (int j = 0; j < NB; j++)
            for (int k = 0; k < NB; k++)
                #pragma omp target device(smp, cell) \
                    copy_in([BS][BS] A, [BS][BS] B, [BS][BS] C) \
                    copy_out([BS][BS] C)
                #pragma omp task inout([BS][BS] C)
                matmul(A[i][k], B[k][j], C[i][j]);
}
```
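The example above uses the BSC proposal's syntax. For orientation only, the sketch below shows the accelerator syntax that OpenMP 4.0 ended up with, which differs from the proposal: a target construct with map clauses that describe the data movement. The saxpy workload and the function name are illustrative and not taken from the slides.

```c
/* y[i] += alpha * x[i]. The target region is offloaded when a device is
   available and otherwise executes on the host. */
void saxpy_target(int n, float alpha, const float *x, float *y)
{
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] += alpha * x[i];
}
```

Here map(to: ...) plays the role of copy_in, and map(tofrom: ...) combines copy_in with copy_out.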
OpenMP is Open
OpenMP is Open
Compunity
Compunity represents the OpenMP User's Group.
It is a special ARB member
Representative: Barbara Chapman from Univ of Houston
Anyone can join and participate and also give feedback
OpenMP Forum
Forum overseen by ARB members
OpenMP usage forum
Spec clarifications forum
Several 3.1 clarifications have their origin in comments from users
OpenMP is Open
Where to go now?
http://www.openmp.org
http://www.compunity.org
http://nanos.ac.upc.edu