Utopia  2
Framework for studying models of complex & adaptive systems.
Multithreading

Facilities for parallel execution based on runtime configuration.

Modules

 STL Algorithm Overloads
 Overloads for selecting execution policies at runtime.
 

Classes

class  Utopia::ParallelExecution
 Static information on the status of parallel execution.
 

Enumerations

enum  Utopia::ExecPolicy { Utopia::seq , Utopia::unseq , Utopia::par , Utopia::par_unseq }
 Runtime execution policies.
 

Functions

template<class Func , class... Args>
auto Utopia::exec_parallel (MAYBE_UNUSED const Utopia::ExecPolicy policy, Func &&f, Args &&... args)
 Call a function with an STL execution policy and arguments.
 

Detailed Description

Facilities for parallel execution based on runtime configuration.

Overview

C++17 introduced execution policies for running STL algorithms in parallel, on multiple threads and/or with vectorized instructions. Whether and how an algorithm is executed in parallel is thereby decided at compile time, because this optimizes performance and because, depending on the particular policy chosen, parallel execution places restrictions on the algorithm and the underlying data in order to avoid data races. However, most major STL implementations still do not support parallel execution policies. Utopia therefore relies on optional third-party software to implement multithreading. The Intel oneAPI toolkit is a complete software suite for developing high-performance software on heterogeneous, parallel computer architectures. Depending on their operating system, users of Utopia can install either the complete oneAPI toolkit or only its Threading Building Blocks (TBB) component to enable multithreading support in Utopia. Please refer to the Utopia installation instructions for further information.

The goal of the parallel facilities in Utopia is to shift the compile-time decision to a runtime decision. Developers only indicate that their algorithms can be run in parallel and take all necessary precautions for that. The core ingredients for this are overloads of STL algorithms where the ExecutionPolicy template parameter is replaced by a Utopia::ExecPolicy runtime parameter.

Enabling Multithreading Facilities

Parallel execution must first be enabled through the build system. In CMake, developers can call the function enable_parallel() on a target to allow parallel execution for it. This has no effect if the prerequisites for parallel execution are not fulfilled on the system.

For models added to the build system via add_model(), parallel features are enabled by default. Developers may pass the DISABLE_PARALLEL option to this function to globally disable any parallel features, as if enable_parallel() was not called on the respective model targets.

# Define a target and enable parallel features
add_executable(my_target source.cc)
enable_parallel(my_target)
# Define a model and disable parallel features
add_model(my_model my_model.cc DISABLE_PARALLEL)

Writing Parallel Code

The basic rule of thumb is that developers who want their code to be executed in parallel using the Utopia facilities simply write the code as if it were always run in parallel, and then replace the STL execution policies with Utopia::ExecPolicy. In particular, this means that data races have to be avoided. This can be achieved with guards that enforce mutual exclusion (mutex) or with atomic data types, which are inherently safe when accessed by multiple threads simultaneously. See the notes on execution policies and data races for additional information.
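
As a minimal sketch that is independent of Utopia, the following function lets several threads update shared state, once through an atomic counter and once through a plain variable guarded by a mutex. The name count_in_parallel and its parameters are illustrative assumptions and not part of the Utopia API.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Utopia): increments two shared counters
// from several threads and returns {atomic result, mutex-guarded result}.
std::pair<long, long> count_in_parallel(const std::size_t num_threads,
                                        const long increments_per_thread)
{
    std::atomic<long> atomic_counter{0}; // safe to modify without a lock
    long guarded_counter = 0;            // plain variable, needs the mutex
    std::mutex m;

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (long i = 0; i < increments_per_thread; ++i) {
                // Atomic read-modify-write: no data race
                atomic_counter.fetch_add(1, std::memory_order_relaxed);

                // Only one thread at a time holds this guard
                std::lock_guard<std::mutex> guard(m);
                ++guarded_counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return {atomic_counter.load(), guarded_counter};
}
```

Both counters end up at num_threads * increments_per_thread. The atomic variant avoids locking altogether, whereas the mutex serializes the threads for the duration of the guarded block.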

Parallel execution at runtime is controlled through Utopia::ParallelExecution and is disabled by default. Utopia::PseudoParent will disable parallel execution if the respective parameter space setting is not available. In the base configuration, parallel execution is also disabled by default. The static method Utopia::ParallelExecution::set can be used at any time in a program to enable or disable parallel execution. If disabled, all algorithms simply fall back to their sequential version.

#include <vector>
using namespace Utopia;
// Enable parallel if requirements are met
ParallelExecution::set(ParallelExecution::Setting::enabled);
// Copy in parallel, if enabled
std::vector<double> in(1E6, 0.0), out(1E6);
std::copy(ExecPolicy::par_unseq, begin(in), end(in), begin(out));
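
How a runtime switch can make algorithms fall back to their sequential version may be illustrated with a small stand-alone sketch. The names Policy, Switch, and effective_policy are hypothetical and only mimic the roles of Utopia::ExecPolicy and Utopia::ParallelExecution; the actual implementation in parallel.hh may differ.

```cpp
// Hypothetical stand-in for Utopia::ExecPolicy
enum class Policy { seq, unseq, par, par_unseq };

// Hypothetical stand-in for Utopia::ParallelExecution: a process-wide switch
class Switch
{
public:
    enum class Setting { enabled, disabled };

    // Choose the setting at runtime
    static void set(const Setting value)
    {
        _enabled = (value == Setting::enabled);
    }

    static bool is_enabled() { return _enabled; }

private:
    inline static bool _enabled = false; // disabled by default (C++17)
};

// Resolve the policy requested by the developer against the runtime switch:
// if parallel execution is disabled, every request falls back to 'seq'.
Policy effective_policy(const Policy requested)
{
    return Switch::is_enabled() ? requested : Policy::seq;
}
```

With such a design, call sites always state the most permissive policy their code can tolerate, and the switch alone decides whether that policy takes effect.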

Applying Rules in Parallel

In addition to the STL algorithms, apply_rule() may be called with Utopia::ExecPolicy as an argument. This execution policy is then used for applying the rule function in parallel, where all considerations for data races apply. Additionally, apply_rule() will automatically parallelize some internal operations, like copying values, if parallel execution is enabled. To properly parallelize the application of a rule, additional rule arguments are a key ingredient. Depending on the complexity of a rule, it might be feasible to compute values beforehand to avoid locking different threads against each other, and then pass these values as additional rule arguments:

#include <algorithm>
#include <mutex>
#include <vector>
SomeRandomNumberGenerator rng;
std::mutex m; // Guards access to the RNG
// Option 1: Multithreading with guard
Utopia::apply_rule<Utopia::Update::sync>(
    Utopia::ExecPolicy::par,
    // Rule creates random number and returns its value
    [&](auto&){
        // Lock thread to access RNG safely
        std::lock_guard<std::mutex> guard(m);
        return rng();
    },
    my_cells
);
// Option 2: Apply RNG first, then use multithreading and vectorization
std::vector<double> numbers(my_cells.size());
std::generate(begin(numbers), end(numbers), rng);
Utopia::apply_rule<Utopia::Update::sync>(
    Utopia::ExecPolicy::par_unseq,
    // Rule fetches value from 'numbers' and returns it
    [&](auto&, auto val){ return val; },
    my_cells,
    numbers
);

Notes on Parallel Implementation

There can be various configurations for the ParallelSTL setup. The parallel.hh header therefore requires information on the setup passed through several pre-processor macros, which, if defined, declare the following:

Macro Defined         Meaning and Use
USE_INTERNAL_PSTL     The setup should rely on the standard library parallel policy definitions. This currently implies that the TBB package is installed and available.
HAVE_ONEDPL           Intel oneDPL library has been found and is used.
ENABLE_PARALLEL_STL   Parallel features of Utopia are enabled. Does nothing if neither USE_INTERNAL_PSTL nor HAVE_ONEDPL are set.
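
The interplay of these macros, as described in the table, can be sketched with a few preprocessor lines. The macro SKETCH_PARALLEL_ACTIVE and the function parallel_compiled_in() are hypothetical names introduced for illustration; how parallel.hh actually combines the macros is not shown here and may differ.

```cpp
// Sketch only: SKETCH_PARALLEL_ACTIVE is a hypothetical macro that is not
// defined by Utopia. It is 1 only if the parallel features are enabled AND
// at least one of the two backends has been announced by the build system.
#if defined(ENABLE_PARALLEL_STL) && \
    (defined(USE_INTERNAL_PSTL) || defined(HAVE_ONEDPL))
    #define SKETCH_PARALLEL_ACTIVE 1
#else
    #define SKETCH_PARALLEL_ACTIVE 0
#endif

// Hypothetical query function: true only if parallel code paths would be
// compiled into this translation unit.
constexpr bool parallel_compiled_in()
{
    return SKETCH_PARALLEL_ACTIVE == 1;
}
```

Compiled without any of the three macros, parallel_compiled_in() yields false, which corresponds to all algorithms using their sequential version.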

Enumeration Type Documentation

◆ ExecPolicy

Runtime execution policies.

These policies directly relate to the C++ standard policies with the adjustment that they can be set at runtime and only apply if enabled by the build system.

Warning
Depending on the nature of the parallelized operation, data races may occur when executing algorithms in parallel. Users themselves are responsible for avoiding data races!
Enumerator
seq 

Sequential (i.e., regular) execution.

If parallel features are disabled at runtime or compile-time, all parallel algorithms behave as if they are called with this policy.

unseq 

SIMD execution on single thread.

par 

Parallel/multithreaded execution.

par_unseq 

SIMD execution on multiple threads.

Function Documentation

◆ exec_parallel()

template<class Func , class... Args>
auto Utopia::exec_parallel (MAYBE_UNUSED const Utopia::ExecPolicy policy, Func &&f, Args &&... args)

Call a function with an STL execution policy and arguments.

This function takes a set of STL algorithm arguments args, wraps them into a tuple, and calls the function object f with it. Depending on the Utopia::ExecPolicy passed in, an STL execution policy may be added to the tuple. This procedure makes the call site of exec_parallel() agnostic to the number of arguments actually passed inside the tuple, allowing for a convenient handling of systems where the <execution> header is not available and the namespace std::execution does not exist.

If parallel execution was enabled at compile-time, this function actually instantiates the STL algorithm four times, once for each possible execution policy. This works because, by definition of the STL, all of these overloads have the same return type.

See https://stackoverflow.com/questions/52975114/different-execution-policies-at-runtime for the inspiration behind this implementation.

Parameters
policy   Utopia execution policy
f        Function (object) which executes an STL algorithm and takes the arguments for this algorithm as tuple of argument values.
args     The arguments for the STL algorithm without execution policy.
Returns
Return value of function f called with an execution policy and the given arguments wrapped into a tuple.
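
The tuple-based dispatch described above can be sketched without the <execution> header. In the following stand-alone example, the tag types seq_tag and par_tag take the place of the std::execution policy objects, and Policy, sum_with, dispatch, and sum_at_runtime are hypothetical names. It is an illustration of the technique under these assumptions, not the implementation of exec_parallel().

```cpp
#include <numeric>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the std::execution policy objects
struct seq_tag {};
struct par_tag {};

// Hypothetical runtime policy, mimicking the role of Utopia::ExecPolicy
enum class Policy { seq, par };

// Toy "algorithm" overloaded on the compile-time tag. Both overloads share
// one return type, which is what allows switching between them at runtime.
template<class It>
std::string sum_with(seq_tag, It first, It last)
{
    return "seq:" + std::to_string(std::accumulate(first, last, 0));
}

template<class It>
std::string sum_with(par_tag, It first, It last)
{
    // A real parallel algorithm would go here; the sketch only records
    // which overload was selected.
    return "par:" + std::to_string(std::accumulate(first, last, 0));
}

// Prepend the tag chosen at runtime to the argument tuple and hand the
// tuple to f. Every branch is instantiated at compile time.
template<class Func, class... Args>
auto dispatch(const Policy policy, Func&& f, Args&&... args)
{
    auto arg_tuple = std::forward_as_tuple(args...); // tuple of references
    switch (policy) {
        case Policy::par:
            return f(std::tuple_cat(std::make_tuple(par_tag{}), arg_tuple));
        default:
            return f(std::tuple_cat(std::make_tuple(seq_tag{}), arg_tuple));
    }
}

// Call site: agnostic to which tag (if any) is stored inside the tuple
std::string sum_at_runtime(const Policy policy, const std::vector<int>& v)
{
    return dispatch(
        policy,
        [](auto&& tuple) {
            return std::apply(
                [](auto&&... a) {
                    return sum_with(std::forward<decltype(a)>(a)...);
                },
                std::forward<decltype(tuple)>(tuple));
        },
        v.begin(), v.end());
}
```

Because the function object only unpacks whatever tuple it receives, the same call site would also work if dispatch() passed the arguments without any tag, provided a matching overload of the algorithm exists.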