From e491ed09b31dc8ab89cf84cc5f94b3ee02792d12 Mon Sep 17 00:00:00 2001 From: Johannes Singler Date: Thu, 15 May 2008 07:31:50 +0000 Subject: [PATCH] parallel_mode.xml: General revision and documentation of new compile-time options for sorting. 2008-05-15 Johannes Singler * doc/xml/manual/parallel_mode.xml: General revision and documentation of new compile-time options for sorting. From-SVN: r135327 --- libstdc++-v3/ChangeLog | 6 + libstdc++-v3/doc/xml/manual/parallel_mode.xml | 116 +++++++++++------- 2 files changed, 77 insertions(+), 45 deletions(-) diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog index 2f06c8e92a0..c17263b8d7b 100644 --- a/libstdc++-v3/ChangeLog +++ b/libstdc++-v3/ChangeLog @@ -1,3 +1,9 @@ +2008-05-15 Johannes Singler + + * xml/manual/parallel_mode.xml: + General revision and documentation of new compile-time + options for sorting. + 2008-05-14 Benjamin Kosnik * include/std/mutex (mutex::try_lock): Eat errors. diff --git a/libstdc++-v3/doc/xml/manual/parallel_mode.xml b/libstdc++-v3/doc/xml/manual/parallel_mode.xml index 79577487bd9..fb439062fea 100644 --- a/libstdc++-v3/doc/xml/manual/parallel_mode.xml +++ b/libstdc++-v3/doc/xml/manual/parallel_mode.xml @@ -90,6 +90,8 @@ specific compiler flag. The parallel mode STL algorithms are currently not exception-safe, i.e. user-defined functors must not throw exceptions. +Also, the order of execution is not guaranteed for some functions, of course. +Therefore, user-defined functors should not have any concurrent side effects. Since the current GCC OpenMP implementation does not support @@ -459,34 +461,16 @@ function, if no parallel functions are deemed worthy), based on either compile-time or run-time conditions. 
- Compile-time conditions are referred to as "embarrassingly -parallel," and are denoted with the appropriate dispatch object, i.e., -one of __gnu_parallel::sequential_tag, -__gnu_parallel::parallel_tag, -__gnu_parallel::balanced_tag, -__gnu_parallel::unbalanced_tag, -__gnu_parallel::omp_loop_tag, or -__gnu_parallel::omp_loop_static_tag. - - - Run-time conditions depend on the hardware being used, the number -of threads available, etc., and are denoted by the use of the enum -__gnu_parallel::parallelism. Values of this enum include -__gnu_parallel::sequential, -__gnu_parallel::parallel_unbalanced, -__gnu_parallel::parallel_balanced, -__gnu_parallel::parallel_omp_loop, -__gnu_parallel::parallel_omp_loop_static, or -__gnu_parallel::parallel_taskqueue. - + The available signature options are specific for the different +algorithms/algorithm classes. - Putting all this together, the general view of overloads for the -parallel algorithms look like this: + The general view of overloads for the parallel algorithms looks like this: ISO C++ signature ISO C++ signature + sequential_tag argument - ISO C++ signature + parallelism argument + ISO C++ signature + algorithm-specific tag type + (several signatures) Please note that the implementation may use additional functions @@ -512,8 +496,8 @@ by standard OpenMP function calls. -To specify the number of threads to be used for an algorithm, use the -function omp_set_num_threads. An example: +To specify the number of threads to be used for the algorithms globally, +use the function omp_set_num_threads. An example: @@ -527,12 +511,18 @@ int main() omp_set_dynamic(false); omp_set_num_threads(threads_wanted); - // Do work. + // Call parallel mode algorithms. return 0; } + + Some algorithms allow the number of threads to be set for a particular call, + by augmenting the algorithm variant. + See the next section for further information. 
+ + Other parts of the runtime environment able to be manipulated include nested parallelism (omp_set_nested), schedule kind @@ -549,8 +539,7 @@ documentation for more information. To force an algorithm to execute sequentially, even though parallelism is switched on in general via the macro _GLIBCXX_PARALLEL, add __gnu_parallel::sequential_tag() to the end -of the algorithm's argument list, or explicitly qualify the algorithm -with the __gnu_parallel:: namespace. +of the algorithm's argument list. @@ -562,22 +551,50 @@ std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag()); -or +Some parallel algorithm variants can be excluded from compilation by +preprocessor defines. See the doxygen documentation on +compiletime_settings.h and features.h for details. - -__gnu_serial::sort(v.begin(), v.end()); - + +For some algorithms, the desired variant can be chosen at compile-time by +appending a tag object. The available options are specific to the particular +algorithm (class). + - -In addition, some parallel algorithm variants can be enabled/disabled/selected -at compile-time. + +For the "embarrassingly parallel" algorithms, there is only one "tag object +type", the enum _Parallelism. +It takes one of the following values, +__gnu_parallel::parallel_tag, +__gnu_parallel::balanced_tag, +__gnu_parallel::unbalanced_tag, +__gnu_parallel::omp_loop_tag, +__gnu_parallel::omp_loop_static_tag. +This means that the actual parallelization strategy is chosen at run-time. +(Choosing the variants at compile-time will come soon.) -See compiletime_settings.h and -See features.h for details. +For the sort and stable_sort algorithms, there are +several possible choices, +__gnu_parallel::parallel_tag, +__gnu_parallel::default_parallel_tag, +__gnu_parallel::multiway_mergesort_tag, +__gnu_parallel::multiway_mergesort_exact_tag, +__gnu_parallel::multiway_mergesort_sampling_tag, +__gnu_parallel::quicksort_tag, +__gnu_parallel::balanced_quicksort_tag. 
+Multiway mergesort comes with two splitting strategies for merging, therefore +the extra choice. If none is chosen, the default splitting strategy is selected. +__gnu_parallel::default_parallel_tag chooses the default parallel +sorting algorithm at runtime. __gnu_parallel::parallel_tag +postpones the decision to runtime (see next section). +The quicksort options cannot be used for stable_sort. +For all tags, the number of threads desired for this call can optionally be +passed to the tag's constructor. + @@ -593,19 +610,18 @@ of __gnu_parallel::_Settings member data. First off, the choice of parallelization strategy: serial, parallel, -or implementation-deduced. This corresponds +or heuristically deduced. This corresponds to __gnu_parallel::_Settings::algorithm_strategy and is a value of enum __gnu_parallel::_AlgorithmStrategy type. Choices include: heuristic, force_sequential, -and force_parallel. The default is -implementation-deduced, i.e. heuristic. +and force_parallel. The default is heuristic. -Next, the sub-choices for algorithm implementation. Specific -algorithms like find or sort +Next, the sub-choices for algorithm variant, if not fixed at compile-time. +Specific algorithms like find or sort can be implemented in multiple ways: when this is the case, a __gnu_parallel::_Settings member exists to pick the default strategy. For @@ -626,7 +642,7 @@ active __gnu_parallel::_Settings object. This threshold variable follows the following naming scheme: __gnu_parallel::_Settings::[algorithm]_minimal_n. So, for fill, the threshold variable -is __gnu_parallel::_Settings::fill_minimal_n +is __gnu_parallel::_Settings::fill_minimal_n, @@ -634,10 +650,20 @@ Finally, hardware details like L1/L2 cache size can be hardwired via __gnu_parallel::_Settings::L1_cache_size and friends. + + + All these configuration variables can be changed by the user, if -desired. Please -see settings.h +desired. +There exists one global instance of the class _Settings, +i. e. 
it is a singleton. It can be read and written by calling +__gnu_parallel::_Settings::get and +__gnu_parallel::_Settings::set, respectively. +Please note that the first call returns a const object, so direct manipulation +is forbidden. +See + settings.h for complete details. -- 2.30.2