Shared sub-task to sort a sequence using a parallel mergesort-like algorithm for shared memory contexts.
|
| SM_ParallelMergeSortSubTask (std::shared_ptr< SharedTaskSequencer > ch, size_t const tIdx, size_t const numThreads, size_t const minElements, int const maxDepth, RandomAccessIterator begin, RandomAccessIterator end, Comparator comparator) |
| Main constructor for shared context parallel merge sort sub-task.
|
|
void | run () override |
| Implementation of the sort method itself.
|
|
| SharedSubTask (std::shared_ptr< SharedSubTaskCompletionHandler > ch) |
| Default constructor for shared sub-task.
|
|
virtual void | operator() () |
| The functor that will be called by any thread. It calls the SharedSubTask::run method to solve/compute the sub-task. Once the task has been computed, it delegates to the task completion handler.
|
|
virtual void | postProcess () |
| Post-processing to be applied after shared sub-task has finished. By default it is a void function which does nothing, but it can be overridden.
|
|
virtual size_t | getKey () |
| Obtain the key of the shared sub-task inside the shared task sequencer context.
|
|
virtual void | setKey (size_t const key) |
| Set the key of the shared sub-task inside the shared task sequencer context.
|
|
virtual std::shared_ptr< boost::thread > | getThread () |
| Get the thread associated with the shared sub-task.
|
|
virtual void | setThread (std::shared_ptr< boost::thread > thread) |
| Set the thread associated with the shared sub-task.
|
|
template<typename RandomAccessIterator, typename Comparator>
class helios::hpc::SM_ParallelMergeSortSubTask< RandomAccessIterator, Comparator >
Shared sub-task to sort a sequence using a parallel mergesort-like algorithm for shared memory contexts.
- Author
- Alberto M. Esmoris Pena
- Version
- 1.0
The parallel merge sort implemented here spawns threads following a binary tree. Let \(n\) be the number of threads, so that \(T_x\) denotes the \(x\)-th thread for all \(x \in [0, n-1]\). Let \(d\) be the current depth and \(d^*\) the max depth, so \(d \in [0, d^*]\), where:
\[ d^* = \left\lfloor{\log_2{(n)}}\right\rfloor \]
For any thread \(T_x\) it is possible to calculate its initial depth \(d_*\), that is, the depth at which the thread was spawned:
\[ d_* = \left\lceil\log_2{(x+1)}\right\rceil \]
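Both depth formulas can be evaluated with integer arithmetic only. The following is a minimal sketch, not the library's code: the helper names `floorLog2`, `maxDepth` and `initialDepth` are hypothetical, and it relies on the identity \(\lceil\log_2(x+1)\rceil = \lfloor\log_2 x\rfloor + 1\) for \(x \geq 1\).

```cpp
#include <cstddef>

// Hypothetical helper (not part of the documented API):
// floor(log2(v)) for v >= 1, computed by counting right shifts.
constexpr int floorLog2(std::size_t v) {
    int r = 0;
    while (v > 1) {
        v >>= 1;
        ++r;
    }
    return r;
}

// Max depth d^* = floor(log2(n)), assuming n >= 1 threads.
constexpr int maxDepth(std::size_t numThreads) {
    return floorLog2(numThreads);
}

// Initial depth d_* = ceil(log2(x + 1)) of the x-th thread.
// For x >= 1 this equals floor(log2(x)) + 1; for x = 0 it is 0.
constexpr int initialDepth(std::size_t x) {
    return (x == 0) ? 0 : floorLog2(x) + 1;
}

// Compile-time checks against the 8-thread tree in this page.
static_assert(maxDepth(8) == 3, "8 threads give max depth 3");
static_assert(initialDepth(0) == 0, "T_0 is the root");
static_assert(initialDepth(1) == 1, "T_1 is spawned at depth 1");
static_assert(initialDepth(3) == 2, "T_3 is spawned at depth 2");
static_assert(initialDepth(4) == 3, "T_4 is spawned at depth 3");
```

Avoiding floating-point `log2` sidesteps rounding issues near exact powers of two.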
At each sub-task, the thread handles the left partition and delegates the right one to a new thread if possible. To do so, let \(k=1+d-d_*\) for the \(x\)-th thread. The thread \(T_y\) that handles the right partition is then given by:
\[ \begin{split} y(x, d) = \;& x + (2^k-1)x + 2^{k-1} \\ = \;& 2^k x + 2^{k-1} \end{split} \]
For instance, the \(x=3\) thread at depth \(d=4\) has \(d_*=2\) and \(k=3\), so its right partition would correspond to thread \(y(3, 4) = 2^3 \cdot 3 + 2^2 = 28\). Thus, if \(y(3, 4) = 28 < n\) is satisfied, a new thread will be spawned to handle that workload. Otherwise, the \(x=3\) thread will have to sort the entire workload itself.
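The child index formula maps directly onto bit shifts. The sketch below is illustrative only: `initialDepth`, `rightChildThread` and `canSpawn` are hypothetical names, and the code assumes \(d \geq d_*\) so that \(k \geq 1\).

```cpp
#include <cstddef>

// Initial depth d_* = ceil(log2(x + 1)), which equals the bit length of x.
constexpr int initialDepth(std::size_t x) {
    int d = 0;
    while (x > 0) {
        x >>= 1;
        ++d;
    }
    return d;
}

// y(x, d) = 2^k * x + 2^(k-1), with k = 1 + d - d_*.
// Assumes d >= initialDepth(x), so k >= 1 and the shifts are valid.
constexpr std::size_t rightChildThread(std::size_t x, int d) {
    const int k = 1 + d - initialDepth(x);
    return (x << k) + (std::size_t{1} << (k - 1));
}

// A new thread can be spawned only if its index is below the thread count n.
constexpr bool canSpawn(std::size_t x, int d, std::size_t numThreads) {
    return rightChildThread(x, d) < numThreads;
}

// Compile-time checks, including the y(3, 4) = 28 example from the text.
static_assert(rightChildThread(3, 4) == 28, "example from the text");
static_assert(rightChildThread(0, 0) == 1, "T_0 spawns T_1 first");
static_assert(rightChildThread(1, 2) == 6, "T_1 spawns T_6 at depth 2");
static_assert(!canSpawn(3, 4, 8), "28 >= 8, so T_3 sorts the whole range");
```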
As an example, a tree with \(8\) threads would be:
\[ \left\{\begin{array}{lll} d=0 &:& \left\{ T_0 \right\} \\ d=1 &:& \left\{ T_0, T_1 \right\} \\ d=2 &:& \left\{ T_0, T_2, T_1, T_3 \right\} \\ d=3 &:& \left\{ T_0, T_4, T_2, T_5, T_1, T_6, T_3, T_7 \right\} \end{array}\right. \]
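The tree above can be reproduced by applying \(y(x, d)\) level by level. This sketch is not taken from the library: `threadTreeLevels` and the helpers are hypothetical names, and it assumes that threads are only spawned at depths \(d < d^*\), so the listing has \(d^* + 1\) levels.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// d_* = ceil(log2(x + 1)), computed as the bit length of x.
int initialDepth(std::size_t x) {
    int d = 0;
    while (x > 0) { x >>= 1; ++d; }
    return d;
}

// d^* = floor(log2(n)), assuming n >= 1.
int maxDepth(std::size_t n) {
    int d = 0;
    while (n > 1) { n >>= 1; ++d; }
    return d;
}

// y(x, d) = 2^k * x + 2^(k-1), with k = 1 + d - d_*.
std::size_t rightChildThread(std::size_t x, int d) {
    const int k = 1 + d - initialDepth(x);
    return (x << k) + (std::size_t{1} << (k - 1));
}

// Builds the thread indices active at each depth, in left-to-right order.
// Each thread keeps the left partition and is followed by the thread that
// takes the right partition, if that index is below n.
std::vector<std::vector<std::size_t>> threadTreeLevels(std::size_t n) {
    std::vector<std::vector<std::size_t>> levels;
    levels.push_back(std::vector<std::size_t>{std::size_t{0}});  // d = 0: T_0
    for (int d = 0; d < maxDepth(n); ++d) {
        std::vector<std::size_t> next;
        for (std::size_t x : levels.back()) {
            next.push_back(x);
            const std::size_t y = rightChildThread(x, d);
            if (y < n) next.push_back(y);
        }
        levels.push_back(std::move(next));
    }
    return levels;
}
```

Calling `threadTreeLevels(8)` yields four levels, the last one being `{0, 4, 2, 5, 1, 6, 3, 7}`, which matches the \(d=3\) row.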
Once a thread has finished sorting, it waits for its immediate child to do the same. Next, the parent merges its partition with the child's, and so on recursively until the root node is reached. The merge could have been made faster by using both parent and child threads, but that would require a buffer of similar size to the sequence being sorted. This has been avoided by delegating the entire merge to each parent, because the algorithm is meant to work with big sequences that might otherwise lead to out-of-memory scenarios.
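The sort, wait and merge sequence can be sketched as a recursive function. This is a simplified illustration under several assumptions, not the class's actual `run` implementation: it uses `std::thread` instead of the `boost::thread` and task sequencer that appear in the member list, the function name `sketchParallelMergeSort` is hypothetical, and `minElements` is assumed to be the size below which a range is sorted sequentially. `std::inplace_merge` is used only for brevity; standard library implementations may try to allocate a temporary buffer for it, so it does not necessarily reflect the memory behaviour described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <thread>

// d_* = ceil(log2(x + 1)), computed as the bit length of x.
inline int initialDepth(std::size_t x) {
    int d = 0;
    while (x > 0) { x >>= 1; ++d; }
    return d;
}

// y(x, d) = 2^k * x + 2^(k-1), with k = 1 + d - d_*.
inline std::size_t rightChildThread(std::size_t x, int d) {
    const int k = 1 + d - initialDepth(x);
    return (x << k) + (std::size_t{1} << (k - 1));
}

// Hypothetical sketch: thread x at depth d sorts [begin, end).
template <typename RandomAccessIterator, typename Comparator>
void sketchParallelMergeSort(
    RandomAccessIterator begin, RandomAccessIterator end,
    Comparator comparator, std::size_t x, int d,
    std::size_t numThreads, int maxDepth, std::size_t minElements)
{
    const auto len = std::distance(begin, end);
    const std::size_t y = rightChildThread(x, d);

    // No thread available (or range too small): sort the whole workload here.
    if (d >= maxDepth || y >= numThreads || len < 2 ||
        static_cast<std::size_t>(len) < minElements) {
        std::sort(begin, end, comparator);
        return;
    }

    RandomAccessIterator mid = begin + len / 2;

    // Delegate the right partition to a new thread T_y.
    std::thread child([=]() {
        sketchParallelMergeSort(mid, end, comparator, y, d + 1,
                                numThreads, maxDepth, minElements);
    });

    // The current thread keeps the left partition.
    sketchParallelMergeSort(begin, mid, comparator, x, d + 1,
                            numThreads, maxDepth, minElements);

    // Wait for the immediate child, then the parent alone performs the merge.
    child.join();
    std::inplace_merge(begin, mid, end, comparator);
}
```

A call such as `sketchParallelMergeSort(v.begin(), v.end(), std::less<int>(), 0, 0, 4, 2, 2)` would start at the root thread \(T_0\) with four threads and a max depth of 2.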
- See also
- SharedSubTask
- helios::hpc::SM_ParallelMergeSort