Facilities for parallel execution based on runtime configuration.

Modules
	STL Algorithm Overloads
	Overloads for selecting execution policies at runtime.

Classes
class	Utopia::ParallelExecution
	Static information on the status of parallel execution.

Enumerations
enum	Utopia::ExecPolicy { Utopia::seq , Utopia::unseq , Utopia::par , Utopia::par_unseq }
	Runtime execution policies.

Functions
template<class Func , class... Args>
auto	Utopia::exec_parallel (MAYBE_UNUSED const Utopia::ExecPolicy policy, Func &&f, Args &&... args)
	Call a function with an STL execution policy and arguments.

Detailed Description

Facilities for parallel execution based on runtime configuration.

Overview

C++17 introduced execution policies for running STL algorithms in parallel, on multiple threads and/or with vectorized instructions. The policy is selected at compile time because this allows the implementation to optimize for it and because, depending on the particular policy chosen, it imposes restrictions on the algorithm and the underlying data to avoid data races. However, many standard library implementations still do not fully support parallel execution policies. Utopia therefore relies on optional third-party software to implement multithreading. The Intel oneAPI toolkit is a complete software suite for developing high-performance software on heterogeneous, parallel computer architectures. Depending on the operating system, users of Utopia can either install the complete oneAPI toolkit or only its Threading Building Blocks (TBB) component to enable multithreading support in Utopia. Please refer to the Utopia installation instructions for further information.

The goal of the parallel facilities in Utopia is to shift the compile-time decision to a runtime decision. Developers only indicate that their algorithms can be run in parallel and take all necessary precautions for that. The core ingredients for this are overloads of STL algorithms where the ExecutionPolicy template parameter is replaced by a Utopia::ExecPolicy runtime parameter.

Enabling Multithreading Facilities

Parallel execution must first be enabled through the build system. In CMake, developers can use the function enable_parallel() on a target to enable possibly parallel execution. This will have no effect if the prerequisites for parallel execution are not fulfilled on the system.

For models added to the build system via add_model(), parallel features are enabled by default. Developers may pass the DISABLE_PARALLEL option to this function to disable all parallel features for that model, as if enable_parallel() had not been called on the respective model targets.

# Define a target and enable parallel features
add_executable(my_target source.cc)
enable_parallel(my_target)
 
# Define a model and disable parallel features
add_model(my_model my_model.cc DISABLE_PARALLEL)

Writing Parallel Code

The basic rule of thumb is that developers who want their code to be executed in parallel using the Utopia facilities write the code as if it were always run in parallel, and then replace the STL execution policies with Utopia::ExecPolicy. In particular, data races have to be avoided. This can be achieved with mutual exclusion locks (mutexes) or with atomic data types, which are inherently safe when accessed by multiple threads simultaneously. See the notes on execution policies and data races for additional information.
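The two approaches to avoiding data races can be illustrated with a minimal sketch. It uses plain std::thread rather than the Utopia facilities, and the function names count_with_atomic and count_with_mutex are hypothetical, chosen only for this example.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Variant 1: an atomic counter. Each increment is indivisible, so no
// lock is needed even though all threads write to the same variable.
long count_with_atomic(int num_threads, int increments)
{
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i)
                counter.fetch_add(1);
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}

// Variant 2: a plain counter guarded by a mutex. Only one thread at a
// time can hold the lock and modify the counter.
long count_with_mutex(int num_threads, int increments)
{
    long counter = 0;
    std::mutex m;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<std::mutex> guard(m);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Both functions return num_threads * increments. Without the atomic type or the lock, concurrent increments of a plain long would be a data race and the result would be undefined.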

Parallel execution at runtime is controlled through Utopia::ParallelExecution and is disabled by default. Utopia::PseudoParent will disable parallel execution if the respective parameter space setting is not available. In the base configuration, parallel execution is also disabled by default. The static method Utopia::ParallelExecution::set can be used at any time in a program to enable or disable parallel execution. If disabled, all algorithms simply fall back to their sequential version.
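The behavior of such a static runtime switch can be sketched as follows. RuntimeSwitch and would_run_parallel are hypothetical names used for illustration only; this is an assumption about the general pattern, not the actual implementation of Utopia::ParallelExecution.

```cpp
// Hypothetical stand-in for a static runtime switch; this is NOT the
// actual Utopia::ParallelExecution class.
class RuntimeSwitch
{
public:
    enum class Setting { enabled, disabled };

    // May be called at any point in the program
    static void set(Setting s) { setting_ = s; }

    static bool is_enabled() { return setting_ == Setting::enabled; }

private:
    // Disabled by default
    inline static Setting setting_ = Setting::disabled;
};

// An algorithm wrapper would consult the switch on every call: a
// parallel request is only honored while the switch is enabled,
// otherwise the sequential version is used.
bool would_run_parallel(bool parallel_requested)
{
    return parallel_requested && RuntimeSwitch::is_enabled();
}
```

Because the setting is static, toggling it affects all subsequent algorithm calls in the program, which matches the described ability to enable or disable parallel execution at any time.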

#include <vector>
#include <utopia/core/parallel.hh>
 
using namespace Utopia;
 
// Enable parallel if requirements are met
ParallelExecution::set(ParallelExecution::Setting::enabled);
 
std::vector<double> in(1E6, 0.0), out(1E6);
std::copy(ExecPolicy::par_unseq, begin(in), end(in), begin(out));

Applying Rules in Parallel

In addition to the STL algorithms, apply_rule() may be called with a Utopia::ExecPolicy as an argument. This execution policy is then used for applying the rule function in parallel, where all considerations for data races apply. Additionally, apply_rule() will automatically parallelize some internal operations, like copying values, if parallel execution is enabled. To properly parallelize the application of a rule, additional rule arguments are a key ingredient. Depending on the complexity of a rule, it might be feasible to compute values beforehand to avoid locking different threads against each other, and then pass these values as additional rule arguments:

#include <mutex>
#include <utopia/core/apply.hh>
 
SomeRandomNumberGenerator rng;
std::mutex m;  // Guards access to the shared RNG
 
// Option 1: Multithreading with guard
Utopia::apply_rule<Utopia::Update::sync>(
    Utopia::ExecPolicy::par,
    // Rule creates random number and returns its value
    [&](auto&){
        // Lock thread to access RNG safely
        std::lock_guard<std::mutex> guard(m);
        return rng();
    },
    my_cells
);
 
// Option 2: Apply RNG first, then use multithreading and vectorization
std::vector<double> numbers(my_cells.size());
std::generate(begin(numbers), end(numbers), rng);
Utopia::apply_rule<Utopia::Update::sync>(
    Utopia::ExecPolicy::par_unseq,
    // Rule fetches value from 'numbers' and returns it
    [&](auto&, auto val){ return val; },
    my_cells,
    numbers
);
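The idea behind the second option can be reproduced without Utopia in a small, self-contained sketch. The Cell type and the functions draw_numbers and apply_precomputed are hypothetical; std::transform stands in for apply_rule() and runs sequentially here.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical cell type carrying a single state value
struct Cell { double state = 0.0; };

// Step 1: draw all random numbers on a single thread, so the RNG is
// never shared between threads.
std::vector<double> draw_numbers(std::size_t n, unsigned seed)
{
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> numbers(n);
    std::generate(numbers.begin(), numbers.end(),
                  [&] { return dist(rng); });
    return numbers;
}

// Step 2: apply the rule element-wise. Each invocation reads one entry
// of 'numbers' and writes one cell, so the rule has no shared mutable
// state and needs no lock.
void apply_precomputed(std::vector<Cell>& cells,
                       const std::vector<double>& numbers)
{
    std::transform(cells.begin(), cells.end(), numbers.begin(),
                   cells.begin(),
                   [](Cell cell, double val) {
                       cell.state = val;
                       return cell;
                   });
}
```

Since the rule in step 2 is free of side effects on shared data, it is the kind of operation that could be handed to a parallel or vectorized policy, whereas the RNG access in step 1 stays sequential.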

Notes on Parallel Implementation

There can be various configurations for the ParallelSTL setup. The parallel.hh header therefore requires information on the setup passed through several pre-processor macros, which, if defined, declare the following:

Macro Defined	Meaning and Use
`USE_INTERNAL_PSTL`	The setup should rely on the standard library parallel policy definitions. This currently implies that the TBB package is installed and available.
`HAVE_ONEDPL`	Intel oneDPL library has been found and is used.
`ENABLE_PARALLEL_STL`	Parallel features of Utopia are enabled. Does nothing if neither `USE_INTERNAL_PSTL` nor `HAVE_ONEDPL` are set.
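One way to read the table is as a pre-processor condition. The following sketch combines the three macros accordingly; SKETCH_PARALLEL_ACTIVE and sketch_parallel_active() are hypothetical names, and the actual logic in parallel.hh may differ.

```cpp
// Sketch only: the three tested macro names come from the table above.
// SKETCH_PARALLEL_ACTIVE is a hypothetical name for this example.
#if defined(ENABLE_PARALLEL_STL) \
    && (defined(USE_INTERNAL_PSTL) || defined(HAVE_ONEDPL))
    // Parallel features requested and at least one backend available
    #define SKETCH_PARALLEL_ACTIVE 1
#else
    // Either not requested, or no backend: the request has no effect
    #define SKETCH_PARALLEL_ACTIVE 0
#endif

// Expose the result as a compile-time constant
constexpr bool sketch_parallel_active()
{
    return SKETCH_PARALLEL_ACTIVE != 0;
}
```

Compiled without any of the three macros defined, sketch_parallel_active() yields false. Defining ENABLE_PARALLEL_STL alone does not change this, which mirrors the last table row.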

Enumeration Type Documentation

◆ ExecPolicy

enum Utopia::ExecPolicy

Runtime execution policies.

These policies directly relate to the C++ standard policies with the adjustment that they can be set at runtime and only apply if enabled by the build system.

Warning: Depending on the nature of the parallelized operation, data races may occur when executing algorithms in parallel. Users themselves are responsible for avoiding data races!

Enumerator
seq	Sequential (i.e., regular) execution. If parallel features are disabled at runtime or compile-time, all parallel algorithms behave as if they are called with this policy.
unseq	SIMD execution on single thread.
par	Parallel/multithreaded execution.
par_unseq	SIMD execution on multiple threads.

{
    seq,
    unseq,
    par,
    par_unseq
};
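The fallback described for seq can be written down as a small function. This is a sketch under the assumption that compile-time and runtime availability are both simple boolean conditions; Policy and effective_policy are hypothetical names and not part of Utopia.

```cpp
// Hypothetical mirror of the four enumerators; not the Utopia type
enum class Policy { seq, unseq, par, par_unseq };

// The requested policy only takes effect if parallel features were
// compiled in AND are enabled at runtime; otherwise every request
// behaves like Policy::seq.
constexpr Policy effective_policy(Policy requested,
                                  bool compiled_in,
                                  bool enabled_at_runtime)
{
    return (compiled_in && enabled_at_runtime) ? requested : Policy::seq;
}

static_assert(effective_policy(Policy::par, true, true) == Policy::par);
static_assert(effective_policy(Policy::par_unseq, true, false) == Policy::seq);
static_assert(effective_policy(Policy::unseq, false, true) == Policy::seq);
```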

Function Documentation

◆ exec_parallel()

template<class Func , class... Args>

auto Utopia::exec_parallel	(	MAYBE_UNUSED const Utopia::ExecPolicy	policy,
		Func &&	f,
		Args &&...	args
	)

Call a function with an STL execution policy and arguments.

This function takes a set of STL algorithm arguments args, wraps them into a tuple, and calls the function object f with it. Depending on the Utopia::ExecPolicy passed in, an STL execution policy may be prepended to the tuple. This makes the call site of exec_parallel() agnostic to the number of arguments actually passed inside the tuple, which allows convenient handling of systems where the <execution> header is not available and the namespace std::execution does not exist.

If parallel execution was enabled at compile-time, this function actually compiles four different STL algorithms, one for each possible execution policy. This works because by definition of the STL they all have the same return type.

See https://stackoverflow.com/questions/52975114/different-execution-policies-at-runtime for the inspiration behind this implementation.

Parameters

policy	Utopia execution policy
f	Function (object) which executes an STL algorithm and takes the arguments for this algorithm as tuple of argument values.
args	The arguments for the STL algorithm without execution policy.

Returns: Return value of function f called with an execution policy and the given arguments wrapped into a tuple.

{
#ifdef UTOPIA_PARALLEL
    if (Utopia::ParallelExecution::is_enabled())
    {
        if (policy == Utopia::ExecPolicy::unseq)
            return f(std::forward_as_tuple(std::execution::unseq, args...));
        else if (policy == Utopia::ExecPolicy::par)
            return f(std::forward_as_tuple(std::execution::par, args...));
        else if (policy == Utopia::ExecPolicy::par_unseq)
            return f(std::forward_as_tuple(std::execution::par_unseq, args...));
    }
#endif
 
#ifdef HAVE_PSTL
    return f(std::forward_as_tuple(std::execution::seq, args...));
#else
    return f(std::forward_as_tuple(args...));
#endif
}
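How a call site can stay agnostic to the tuple size is easiest to see in a self-contained sketch. To keep it independent of the <execution> header, it uses a hypothetical tag type par_tag instead of a real STL policy, and the names Policy, copy_impl, dispatch, and copy_with are likewise assumptions made for this example only.

```cpp
#include <algorithm>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names belong to Utopia or the STL
enum class Policy { seq, par };
struct par_tag {};
int tagged_calls = 0;  // counts calls that received the policy tag

// Algorithm overload without a policy argument
template<class In, class Out>
Out copy_impl(In first, In last, Out dest)
{
    return std::copy(first, last, dest);
}

// Overload with a leading policy tag (still copies sequentially here)
template<class In, class Out>
Out copy_impl(par_tag, In first, In last, Out dest)
{
    ++tagged_calls;
    return std::copy(first, last, dest);
}

// Same structure as exec_parallel(): optionally prepend a policy to
// the argument tuple, then hand the tuple to f.
template<class Func, class... Args>
auto dispatch(Policy policy, bool enabled, Func&& f, Args&&... args)
{
    if (enabled && policy == Policy::par)
        return f(std::forward_as_tuple(par_tag{}, args...));
    return f(std::forward_as_tuple(args...));
}

// Call site: std::apply unpacks the tuple, whatever its size, and
// overload resolution picks the matching copy_impl.
template<class In, class Out>
Out copy_with(Policy policy, bool enabled, In first, In last, Out dest)
{
    return dispatch(policy, enabled,
        [](auto&& tuple) {
            return std::apply(
                [](auto&&... a) {
                    return copy_impl(std::forward<decltype(a)>(a)...);
                },
                std::forward<decltype(tuple)>(tuple));
        },
        first, last, dest);
}
```

The generic lambda is instantiated once per tuple type, so both branches of dispatch compile separately. The auto return type of dispatch can only be deduced because both overloads of copy_impl return the same type, which is the same requirement the documentation states for the STL algorithms.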
