From 1904bef10a80e25e59105a084501097f67621c3f Mon Sep 17 00:00:00 2001
From: Johannes Singler
Several of the standard algorithms, for instance
- The libstdc++ parallel mode performs parallization of algorithms,
+ The libstdc++ parallel mode performs parallelization of algorithms,
function objects, classes, and functions in the C++ Standard.std::search
, are made parallel using OpenMP
+std::sort
, are made parallel using OpenMP
annotations. These parallel mode constructs and can be invoked by
explicit source declaration or by compiling existing sources with a
specific compiler flag.
@@ -43,7 +43,7 @@ specific compiler flag.
The libstdc++ parallel mode
-Using the libstdc++ parallel mode
@@ -53,7 +53,7 @@ function objects, classes, and functions in the C++ Standard.
libgomp
, the GNU OpenMP implementation,
whose presence is mandatory. In addition, hardware capable of atomic
- operations is de rigueur. Actually activating these atomic
+ operations is mandatory. Actually activating these atomic
operations may require explicit compiler flags on some targets
(like sparc and x86), such as -march=i686
,
-march=native
or -mcpu=v9
.
@@ -113,6 +113,13 @@ function objects, classes, and functions in the C++ Standard.
std::unique_copy
The following library components in the includes
+<set>
and <map>
are included in the parallel mode:
std::(multi_)map/set<T>::(multi_)map/set(Iterator begin, Iterator end)
(bulk construction)std::(multi_)map/set<T>::insert(Iterator begin, Iterator end)
(bulk insertion)Something about exception safety, interaction with threads, -etc. Goal is to have the usual constraints of the STL with respect to -exception safety and threads, but add in support for parallel -computing.
- Something about compile-time settings and configuration, ie using
-__gnu_parallel::Settings
. XXX Up in the air.
The parallel mode STL algorithms are currently not exception-safe, +i. e. user-defined functors must not throw exceptions. +
+ +Since the current GCC OpenMP implementation does not support +OpenMP parallel regions in concurrent threads, +it is not possible to call parallel STL algorithm in +concurrent threads, either. +It might work with other compilers, though.
+ + + Some algorithm variants can be enabled/disabled/selected at compile-time.
+See
+<compiletime_settings.h>
and
+See
+<features.h>
for details.
+
+To specify the number of threads to be used for an algorithm,
+use omp_set_num_threads
.
+To force a function to execute sequentially,
+even though parallelism is switched on in general,
+add __gnu_parallel::sequential_tag()
+to the end of the argument list.
+
+Parallelism always incurs some overhead. Thus, it is not
+helpful to parallelize operations on very small sets of data.
+There are measures to avoid parallelizing stuff that is not worth it.
+For each algorithm, a minimum problem size can be stated,
+usually using the variable
+__gnu_parallel::Settings::[algorithm]_minimal_n
.
+Please see
+<settings.h>
for details.
std::transform
from
<algorithm> has a parallel counterpart in
std::__parallel::transform
from
<parallel/algorithm>. In addition, these parallel
-implementatations are injected into namespace
+implementations are injected into namespace
__gnu_parallel
with using declarations.
@@ -526,6 +567,16 @@ testsuite's Makefile.
+References / Further Reading
+
+
+Johannes Singler, Peter Sanders, Felix Putze. The Multi-Core Standard Template Library. Euro-Par 2007: Parallel Processing. (LNCS 4641)
+
+
+
+Leonor Frias, Johannes Singler: Parallelization of Bulk Operations for STL Dictionaries. Workshop on Highly Parallel Processing on a Chip (HPPC) 2007. (LNCS)
+
+
diff --git a/libstdc++-v3/include/parallel/balanced_quicksort.h b/libstdc++-v3/include/parallel/balanced_quicksort.h
index 881099cb37f..c28277039c5 100644
--- a/libstdc++-v3/include/parallel/balanced_quicksort.h
+++ b/libstdc++-v3/include/parallel/balanced_quicksort.h
@@ -32,6 +32,14 @@
* @brief Implementation of a dynamically load-balanced parallel quicksort.
*
* It works in-place and needs only logarithmic extra memory.
+ * The algorithm is similar to the one proposed in
+ *
+ * P. Tsigas and Y. Zhang.
+ * A simple, fast parallel implementation of quicksort and
+ * its performance evaluation on SUN enterprise 10000.
+ * In 11th Euromicro Conference on Parallel, Distributed and
+ * Network-Based Processing, page 372, 2003.
+ *
* This file is a GNU parallel extension to the Standard C++ Library.
*/
diff --git a/libstdc++-v3/include/parallel/multiseq_selection.h b/libstdc++-v3/include/parallel/multiseq_selection.h
index 5b34173cff2..208f4098f56 100644
--- a/libstdc++-v3/include/parallel/multiseq_selection.h
+++ b/libstdc++-v3/include/parallel/multiseq_selection.h
@@ -32,6 +32,13 @@
* @brief Functions to find elements of a certain global rank in
* multiple sorted sequences. Also serves for splitting such
* sequence sets.
+ *
+ * The algorithm description can be found in
+ *
+ * P. J. Varman, S. D. Scheufler, B. R. Iyer, and G. R. Ricard.
+ * Merging Multiple Lists on Hierarchical-Memory Multiprocessors.
+ * Journal of Parallel and Distributed Computing, 12(2):171â177, 1991.
+ *
* This file is a GNU parallel extension to the Standard C++ Library.
*/
diff --git a/libstdc++-v3/include/parallel/multiway_merge.h b/libstdc++-v3/include/parallel/multiway_merge.h
index 2a6c38a5a4f..c940e4578dc 100644
--- a/libstdc++-v3/include/parallel/multiway_merge.h
+++ b/libstdc++-v3/include/parallel/multiway_merge.h
@@ -30,6 +30,13 @@
/** @file parallel/multiway_merge.h
* @brief Implementation of sequential and parallel multiway merge.
+ *
+ * Explanations on the high-speed merging routines in the appendix of
+ *
+ * P. Sanders.
+ * Fast priority queues for cached memory.
+ * ACM Journal of Experimental Algorithmics, 5, 2000.
+ *
* This file is a GNU parallel extension to the Standard C++ Library.
*/
diff --git a/libstdc++-v3/include/parallel/tree.h b/libstdc++-v3/include/parallel/tree.h
index 9b2efc46dae..5b7b41c6ef9 100644
--- a/libstdc++-v3/include/parallel/tree.h
+++ b/libstdc++-v3/include/parallel/tree.h
@@ -30,6 +30,13 @@
/** @file parallel/tree.h
* @brief Parallel red-black tree operations.
+ *
+ * This implementation is described in
+ *
+ * Leonor Frias, Johannes Singler.
+ * Parallelization of Bulk Operations for STL Dictionaries.
+ * Workshop on Highly Parallel Processing on a Chip (HPPC) 2007.
+ *
* This file is a GNU parallel extension to the Standard C++ Library.
*/
diff --git a/libstdc++-v3/include/parallel/workstealing.h b/libstdc++-v3/include/parallel/workstealing.h
index cc8f37e8d09..0b2102c45ed 100644
--- a/libstdc++-v3/include/parallel/workstealing.h
+++ b/libstdc++-v3/include/parallel/workstealing.h
@@ -31,6 +31,13 @@
/** @file parallel/workstealing.h
* @brief Parallelization of embarrassingly parallel execution by
* means of work-stealing.
+ *
+ * Work stealing is described in
+ *
+ * R. D. Blumofe and C. E. Leiserson.
+ * Scheduling multithreaded computations by work stealing.
+ * Journal of the ACM, 46(5):720â748, 1999.
+ *
* This file is a GNU parallel extension to the Standard C++ Library.
*/
--
2.30.2