OpenMP Summary and Outline

You write C programs in OpenMP by using a combination of
   I)   compiler directives,
   II)  library functions,
   III) environment variables.

============== I. Compiler Directives ==============

There are three important terms that we need to define in order to understand
the semantics of OpenMP compiler directives.

i.)   Directive - The one-line compiler pragma itself.

ii.)  Construct - The compiler directive combined with the expression, statement,
      or "structured block" that immediately follows it. Notice that this is a
      lexical (compile-time) concept.

iii.) Region - This is a dynamic (run-time) concept. At run time, the "region"
      associated with a construct is all the code encountered during the
      execution of the construct. In particular, the body of any function called
      from within a construct is part of the region associated with that
      instance of the construct. Notice that different run-time instances of a
      construct can lead to different regions.

====== Parallel directive ======

NOTE: This directive creates threads. By default, unless told otherwise by
work-sharing directives, every thread does all of the work contained in the
"structured block".

#pragma omp parallel [optional clauses ...]
{
   // a "structured block" with a single entry point
   // at the top and a single exit point at the bottom
}

====== Work-sharing directives ======

NOTE: These directives do not create threads, so they can lead to parallelized
code only if they are executed within a "parallel region" (which is not the
same thing as being lexically nested inside of a "parallel construct").

#pragma omp for [optional clauses ...]
<for-loop>

#pragma omp sections [optional clauses ...]
{
   #pragma omp section
   {
      // a "structured block"
   }
   #pragma omp section
   {
      // a "structured block"
   }
   // additional section directives
}

#pragma omp single [optional clauses ...]
{
   // a "structured block" that is
   // executed only by a single thread
}

====== Combined parallel work-sharing directives ======

NOTE: These two directives both create threads and decide how the work
contained in their parallel region will be shared among those threads.

#pragma omp parallel for [optional clauses ...]
<for-loop>

#pragma omp parallel sections [optional clauses ...]
{
   #pragma omp section
   {
      // a "structured block"
   }
   #pragma omp section
   {
      // a "structured block"
   }
   // additional section directives
}

====== Synchronization directives ======

#pragma omp barrier

#pragma omp critical [(<name>)]
{
   // a "structured block"
}

#pragma omp atomic
<expression-statement>

#pragma omp flush [(<list-of-variables>)]

#pragma omp master
{
   // a "structured block" that is
   // executed only by the master thread
}

#pragma omp ordered
{
   // a "structured block" within a for-loop
   // whose directive has the ordered clause
}

====== Data environment ======

#pragma omp threadprivate( <list-of-variables> )
// a declarative directive that gives each thread its own copy of the listed
// global or static variables; it is not followed by a structured block

=========== Clauses ===========

Here is a list of the available clauses that can be used in the above
directives. (Not all clauses can be used in every directive.)

default( none | shared )
shared( <list-of-variables> )
private( <list-of-variables> )
firstprivate( <list-of-variables> )
lastprivate( <list-of-variables> )
copyin( <list-of-variables> )
copyprivate( <list-of-variables> )
if( <scalar-expression> )
collapse( <n> )
ordered
untied
nowait
num_threads( <integer-expression> )
schedule( static | dynamic | guided | runtime [, <chunk_size>] )   // loop directives only
reduction( <operator> : <list-of-variables> )

A reduction clause can use the following operators.

   +   *   -   &   |   ^   &&   ||
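The following short program is a sketch (not part of the original outline) of
how a combined parallel work-sharing directive and its clauses fit together:
a parallel for loop with a schedule clause and a reduction clause that sums an
array. The array size N and its contents are illustrative assumptions; with
GCC, for example, the file would be compiled with the -fopenmp flag.

#include <stdio.h>
#include <omp.h>

#define N 1000

int main( void )
{
   double a[ N ];
   double sum = 0.0;
   int i;

   for ( i = 0; i < N; i++ )      // fill the array serially
      a[ i ] = (double) i;

   // The team of threads splits the loop iterations among themselves
   // (static schedule). Each thread gets a private copy of sum, and the
   // private copies are combined with + at the implicit barrier that
   // ends the loop.
   #pragma omp parallel for schedule( static ) reduction( + : sum )
   for ( i = 0; i < N; i++ )
   {
      sum += a[ i ];
   }

   printf( "sum = %f\n", sum );   // same result as the serial loop
   return 0;
}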
=========== II. Runtime Library Functions ===========

These functions are declared in the header file <omp.h>.

// Execution Environment Functions
int omp_get_num_threads( void )
int omp_get_thread_num( void )
int omp_get_num_procs( void )
int omp_in_parallel( void )

// get & set "internal control variables"
void omp_set_num_threads( int num_threads )
int omp_get_max_threads( void )
void omp_set_dynamic( int dynamic_threads )
int omp_get_dynamic( void )
void omp_set_schedule( omp_sched_t kind, int modifier )
void omp_get_schedule( omp_sched_t *kind, int *modifier )
int omp_get_thread_limit( void )
void omp_set_nested( int nested )
int omp_get_nested( void )
void omp_set_max_active_levels( int max_active_levels )
int omp_get_max_active_levels( void )

// nested parallelism functions (in addition to the previous four functions)
int omp_get_level( void )
int omp_get_ancestor_thread_num( int level )
int omp_get_team_size( int level )
int omp_get_active_level( void )

// Lock Functions

// Simple Lock Functions                      // Nestable Lock Functions
void omp_init_lock( omp_lock_t *lock )        void omp_init_nest_lock( omp_nest_lock_t *lock )
void omp_destroy_lock( omp_lock_t *lock )     void omp_destroy_nest_lock( omp_nest_lock_t *lock )
void omp_set_lock( omp_lock_t *lock )         void omp_set_nest_lock( omp_nest_lock_t *lock )
void omp_unset_lock( omp_lock_t *lock )       void omp_unset_nest_lock( omp_nest_lock_t *lock )
int omp_test_lock( omp_lock_t *lock )         int omp_test_nest_lock( omp_nest_lock_t *lock )

// Timing Functions
double omp_get_wtime( void )
double omp_get_wtick( void )

============= III. Environment Variables =============

Each environment variable sets some "internal control variable" (ICV).
Several of these ICVs can also be set (and read) by the library functions
listed above.

OMP_NUM_THREADS
OMP_DYNAMIC
OMP_NESTED
OMP_SCHEDULE
OMP_THREAD_LIMIT
OMP_MAX_ACTIVE_LEVELS
OMP_STACKSIZE
OMP_WAIT_POLICY
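As a rough sketch (not from the original outline) of how a few of these
library functions behave, the following program times a parallel region and
has each thread report its id. The choice of 4 threads is an illustrative
assumption.

#include <stdio.h>
#include <omp.h>

int main( void )
{
   double start = omp_get_wtime();    // wall-clock time, in seconds

   omp_set_num_threads( 4 );          // overrides the OMP_NUM_THREADS ICV

   #pragma omp parallel
   {
      // each thread in the team has its own id
      int id       = omp_get_thread_num();
      int nthreads = omp_get_num_threads();

      #pragma omp critical
      printf( "thread %d of %d (omp_in_parallel() = %d)\n",
              id, nthreads, omp_in_parallel() );
   }

   printf( "elapsed time: %f seconds (timer resolution %g seconds)\n",
           omp_get_wtime() - start, omp_get_wtick() );
   return 0;
}

If the call to omp_set_num_threads() is removed, the number of threads can
instead be chosen from the shell, for example OMP_NUM_THREADS=8 ./a.out
(the executable name here is just an assumption).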