Multiverse Base Configuration
The following is the base configuration of the utopya.multiverse.Multiverse class.
It provides all the defaults needed to run a simulation, but is subsequently updated by other configuration layers to form the meta configuration.
The base configuration is meant to be self-documenting, making it easy to see which parameters are available.
Note
The parameter_space key is extended with the default model configuration of the chosen model. This will lead to the default model configuration being available at parameter_space.<CurrentlyChosenModelName>.
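For illustration, the higher configuration layers (e.g. a run configuration) update these defaults recursively by mirroring the same key nesting. A minimal sketch, using keys that appear in the base configuration below:

worker_manager:
  num_workers: 4       # use four parallel workers instead of 'auto'
parameter_space:
  num_steps: 1000      # simulate more steps than the default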
# This file provides the basic configuration for the utopya Multiverse
#
# It is read in by the Multiverse during initialization and is subsequently
# updated by other configuration files to generate the meta configuration of
# the Multiverse, which determines all details of how a run is performed.
#
# The top-level keys here are used to configure different parts of the Multiverse:
# - properties of the Multiverse itself: `paths`, `perform_sweep`
# - properties of attributes: `worker_manager`, `run_kwargs`, ...
# - and the parameter space that is passed on to the model instance
#
# NOTE that this configuration file documents some features in the comments.
# This cannot be exhaustive. Check the docstrings of the functions for
# further information.
# Also, this file is used for deployment of the user configuration (with
# this header section removed).
---
# Multiverse configuration ....................................................
# Output paths
# These are passed to Multiverse._create_run_dir
paths:
# base output directory
out_dir: ~/utopia_output
# model note is added to the output directory path
model_note: ~
# From the two above, the run directory will be created at:
# <out_dir>/<model_name>/<timestamp>-<model_note>/
# With subfolders: config, eval, universes
# Control of the backup of files that belong to a simulation
backups:
# Whether to save all involved config files granularly, i.e. one by one.
# If false, only the resulting meta_cfg is saved to the config subdirectory.
backup_cfg_files: true
# Whether to save the executable
backup_executable: false
# Control of the model executable
executable_control:
# Whether to copy the binary to a temporary directory at the initialization
# of the Multiverse and execute it from there. This way, accidental changes
# to the executable _during_ a simulation are prevented.
run_from_tmpdir: true
# Whether to perform a parameter sweep
# Is evaluated by the Multiverse.run method
perform_sweep: false
# NOTE This will be ignored if run_single or run_sweep is called directly.
# Also, the `parameter_space` key (see below) needs to span a volume of at
# least 1 in order to be sweepable.
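# For illustration: a parameter space with two sweep dimensions of 2 and 3
# values spans a volume of 2 * 3 = 6, i.e. six universes. Sweep dimensions are
# defined with the `!sweep` tag of the paramspace package, e.g.:
#
#   parameter_space:
#     seed: !sweep
#       default: 42
#       values: [23, 42]
#     some_param: !sweep        # hypothetical model parameter
#       default: 0.1
#       values: [0.1, 0.2, 0.3]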
# Whether to perform parameter validation
# For large sweeps, validation can take quite some time. For such scenarios, it
# might make sense to disable parameter validation by setting this to false.
perform_validation: true
# Parameters that are to be validated
# This is a mapping of key sequence -> Parameter object
parameters_to_validate: {}
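# In practice, validation requirements are typically not written here by hand
# but attached to parameters in a model's default configuration via YAML tags,
# from which this mapping is assembled. An illustrative sketch, assuming
# utopya's `!is-probability` and `!param` tags (parameter names hypothetical):
#
#   some_probability: !is-probability 0.3
#   some_rate: !param
#     default: 1.5
#     limits: [0, ~]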
# Reporter ....................................................................
# The Multiverse owns a Reporter object to report on the progress of the
# WorkerManager. Part of its configuration happens using its init kwargs, which
# are defined in the following.
# The rest of the configuration happens on the WorkerManager-side (see there).
reporter:
# Define report formats, which are accessible, e.g. from the WorkerManager
report_formats:
progress_bar: # Name of the report format specification
parser: progress_bar # The parser to use
write_to: stdout_noreturn # The writer to use
min_report_intv: 0.5 # Required time (in s) between writes
# -- All further kwargs on this level are passed to the parser
# Terminal width for the progress bar
# Can also be `adaptive` (poll each time), or `fixed` (poll once)
num_cols: adaptive
# The format string to use for progress information
# Available keys:
# - `total_progress` (in %)
# - `active_progress` (mean progress of _active_ simulations, in %)
# - `cnt` (dict of counters: `total`, `finished`, `active`)
info_fstr: "{total_progress:>5.1f}% "
# Example of how to access counters in format string:
# info_fstr: "finished {cnt[finished]}/{cnt[total]} "
# Whether to show time information alongside the progress bar
show_times: true
# How to display time information.
# Available keys: `elapsed`, `est_left`, `est_end`, `start`, `now`
# (see `times` parser for more information)
times_fstr: "| {elapsed:>7s} elapsed | ~{est_left:>7s} left "
times_fstr_final: "| finished in {elapsed:} "
times_kwargs:
# How to compute the estimated time left to finish the work session
# Available modes:
# - `from_start`: extrapolates from progress made since start
# - `from_buffer`: uses a buffer to store recent progress
# information and uses the oldest value for
# making the estimate; see `progress_buffer_size`
mode: from_buffer
# Number of records kept for computing ETA in `from_buffer` mode.
# This is in units of parser invocations, so it goes back *at least* a
# time interval of `min_report_intv * progress_buffer_size`.
# If the reporter is called less frequently (e.g. because of a larger
# model-side `monitor_emit_interval`), this interval will be longer.
progress_buffer_size: 90
# Creates a report file containing runtime statistics
report_file:
parser: report
write_to:
file:
path: _report.txt
min_report_intv: 10 # don't update this one too often
min_num: 4 # min. number of universes for statistics
show_individual_runtimes: true # disable for large numbers of universes
task_label_singular: universe
task_label_plural: universes
# Creates a parameter sweep information file
sweep_info:
parser: pspace_info
write_to:
file:
path: _sweep_info.txt
skip_if_empty: true
log:
lvl: 18
skip_if_empty: true
fstr: "Sweeping over the following parameter space:\n\n{sweep_info:}"
# Can define a default format to use
# default_format: ~
# Worker Manager ..............................................................
# Initialization arguments for the WorkerManager
worker_manager:
# Specify how many processes work in parallel
num_workers: auto
# can be: an int or 'auto' (== #CPUs); values <= 0 are interpreted relative
# to the number of CPUs, e.g. -1 uses all CPUs but one
# Delay between polls [seconds]
poll_delay: 0.05
# NOTE: If this value is too low, the main thread becomes very busy.
# If this value is too high, the log output from simulations is not
# read from the line buffer frequently enough.
# Maximum number of lines to read from each task's stream per poll cycle.
# Choosing a value that is too large may affect poll performance in cases
# where the task generates many lines of output.
# Set to -1 to read *all* available lines from the stream upon each poll.
lines_per_poll: 20
# Periodic task callback (in units of poll events). Set None to deactivate.
periodic_task_callback: ~
# How to react upon a simulation exiting with non-zero exit code
nonzero_exit_handling: raise
# can be: ignore, warn, warn_all, raise
# warn_all will also warn if the simulation was terminated by the frontend
# raise will lead to a SystemExit with the error code of the simulation
# How to handle keyboard interrupts
interrupt_params:
# Which signal to send to the workers
send_signal: SIGINT # can be any valid signal name
# NOTE that only SIGINT and SIGTERM lead to a graceful shutdown on the C++ side
# How long to wait for workers to shut down before calling SIGKILL on them
grace_period: 5.
# WARNING Choosing a grace period that is shorter than the duration of one
# iteration step of your model might lead to corrupted HDF5 data!
# Whether to exit after working; exit code will be 128 + abs(signum)
exit: false
# In which events to save streams *during* the work session
# May be: `monitor_updated`, `periodic_callback`
save_streams_on: [monitor_updated]
# Reporters to invoke at different points of the WorkerManager's operation.
# Keys refer to events, values are lists of report format names, which can be
# defined via the WorkerManagerReporter (see `reporter.report_formats` above)
rf_spec:
before_working: [sweep_info]
while_working: [progress_bar]
task_spawned: [progress_bar]
monitor_updated: [progress_bar]
task_finished: [progress_bar, report_file]
after_work: [progress_bar, report_file]
after_abort: [progress_bar, report_file]
# Configuration for the WorkerManager.start_working method
run_kwargs:
# Total timeout (in s) of a run; to ignore, set to ~
timeout: ~
# A list of StopCondition objects to check during the run _for each worker_.
# The entries of the following list are OR-connected, i.e. it suffices that
# one is fulfilled for the corresponding worker to be stopped
stop_conditions: ~
# See docs for how to set these up:
# https://docs.utopia-project.org/html/usage/run/stop-conditions.html
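# An illustrative sketch of a stop condition using the `!stop-condition` YAML
# tag (see the linked docs for the exact syntax; the entry name is
# hypothetical):
#
#   stop_conditions:
#     - !stop-condition
#       description: stop once a monitored density saturates
#       func: check_monitor_entry
#       entry_name: MyModel.densities.some_density
#       operator: ">="
#       value: 0.95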
# The defaults for the worker_kwargs
# These are passed to the setup function of each WorkerTask before spawning
worker_kwargs:
# Whether to save the streams of each Universe to a log file
save_streams: true
# This file is saved only after the WorkerTask has finished in order to
# reduce I/O operations on files
# Whether to forward the streams to stdout
forward_streams: in_single_run
# can be: true, false, or 'in_single_run' (print only in single runs)
# Whether to forward the raw stream output or only those lines that were not
# parsable as YAML, i.e. only the lines that did _not_ come from the monitor
forward_raw: true
# The log level at which the streams should be forwarded to stdout
streams_log_lvl: ~ # if None, uses print instead of the logging module
# Arguments to subprocess.Popen
popen_kwargs:
# The encoding of the streams (STDOUT, STDERR) coming from the simulation.
# NOTE If your locale is set to some other encoding, or the simulation uses
# a custom one, overwrite this value accordingly via the user config.
encoding: utf8
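# Example of overwriting the encoding from the user configuration by
# mirroring the nesting of these keys (the encoding value is illustrative):
#
#   worker_kwargs:
#     popen_kwargs:
#       encoding: ISO-8859-1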
# Cluster mode configuration
# Whether cluster mode is enabled
cluster_mode: false
# Parameters to configure the cluster mode
cluster_params:
# Specify the workload manager to use.
# The names of environment variables are chosen accordingly.
manager: slurm # available: slurm
# The environment to look for parameters in. If not given, uses os.environ
env: ~
# Specify the name of environment variables for each supported manager
# The resolved values are available at the top level of the dict that is
# returned by Multiverse.resolved_cluster_params
env_var_names:
slurm:
# --- Required variables ---
# ID of the job
job_id: SLURM_JOB_ID
# Number of available nodes
num_nodes: SLURM_JOB_NUM_NODES
# List of node names
node_list: SLURM_JOB_NODELIST
# Name of the current node
node_name: SLURMD_NODENAME # sic!
# This is used for the name of the run
timestamp: RUN_TIMESTAMP
# --- Optional values ---
# Name of the job
job_name: SLURM_JOB_NAME
# Account from which the job is run
job_account: SLURM_JOB_ACCOUNT
# Number of processes on current node
num_procs: SLURM_CPUS_ON_NODE
# Cluster name
cluster_name: SLURM_CLUSTER_NAME
# Custom output directory
custom_out_dir: UTOPIA_CLUSTER_MODE_OUT_DIR
# Could have more managers here, e.g.: docker
# Which parser to use to extract node names from node list
node_list_parser_params:
slurm: condensed # e.g.: node[002,004-011,016]
# Which additional info to include in the name of the run directory, i.e.
# after the timestamp and before the model directory. All information that is
# extracted from the environment variables is available as keyword arguments
# to the format strings. Should be a sequence of format strings.
additional_run_dir_fstrs: [ "job{job_id:}" ]
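# For example, with SLURM_JOB_ID=12345 the entry above evaluates to "job12345".
# Other resolved values can be included in the same way, e.g. (illustrative):
#
#   additional_run_dir_fstrs: ["job{job_id:}", "{job_name:}"]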
# Data Manager ................................................................
# The DataManager takes care of loading the data into a tree-like structure
# after the simulations are finished.
# It is based on the DataManager class from the dantro package. See there for
# full documentation.
data_manager:
# Where to create the output directory for this DataManager, relative to
# the run directory of the Multiverse.
out_dir: eval/{timestamp:}
# The {timestamp:} placeholder is replaced by the current timestamp such that
# future DataManager instances that operate on the same data directory do
# not create collisions.
# Directories are created recursively, if they do not exist.
# Define the structure of the data tree beforehand; this allows specifying
# the types of groups before content is loaded into them.
# NOTE The strings given to the Cls argument are mapped to a type using a
# class variable of the DataManager
create_groups:
- path: multiverse
Cls: MultiverseGroup
# Where the default tree cache file is located relative to the data
# directory. This is used when calling DataManager.dump and .restore without
# any arguments, as done e.g. in the Utopia CLI.
default_tree_cache_path: data/.tree_cache.d3
# Supply a default load configuration for the DataManager
# This can then be invoked using the dm.load_from_cfg() method.
load_cfg:
# Load the frontend configuration files from the config/ directory
# Each file refers to a level of the configuration that is supplied to
# the Multiverse: base <- user <- model <- run <- update
cfg:
loader: yaml # The loader function to use
glob_str: 'config/*.yml' # Which files to load
ignore: # Which files to ignore
- config/parameter_space.yml
- config/parameter_space_info.yml
- config/full_parameter_space.yml
- config/full_parameter_space_info.yml
required: true # Whether these files are required
path_regex: config/(\w+)_cfg.yml # Extract info from the file path
target_path: cfg/{match:} # ...and use in target path
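# For example, the file config/update_cfg.yml matches the regex with the
# group "update" and is therefore placed at cfg/update in the data tree.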
# Load the parameter space object into the MultiverseGroup attributes
pspace:
loader: yaml_to_object # Load into ObjectContainer
glob_str: config/parameter_space.yml
required: true
load_as_attr: true
unpack_data: true # ... and store as ParamSpace obj.
target_path: multiverse
# Load the configuration files that are generated for _each_ simulation
# These hold all information that is available to a single simulation and
# are in an explicit, human-readable form.
uni_cfg:
loader: yaml
glob_str: data/uni*/config.yml
required: true
path_regex: data/uni(\d+)/config.yml
target_path: multiverse/{match:}/cfg
parallel:
enabled: true
min_files: 1000
min_total_size: 1048576 # 1 MiB
# Load the binary output data from each simulation.
data:
loader: hdf5_proxy
glob_str: data/uni*/data.h5
required: true
path_regex: data/uni(\d+)/data.h5
target_path: multiverse/{match:}/data
enable_mapping: true # see DataManager for content -> type mapping
# Options for loading data in parallel (speeds up CPU-limited loading)
parallel:
enabled: false
# Number of processes to use; negative is deduced from os.cpu_count()
processes: ~
# Threshold values for parallel loading; if the number of files or their
# total size is below these values, loading will *not* happen in parallel.
min_files: 5
min_total_size: 104857600 # 100 MiB
# The resulting data tree is then:
# └┬ cfg
# └┬ base
# ├ meta
# ├ model
# ├ run
# └ update
# └ multiverse
# └┬ 0
# └┬ cfg
# └ data
# └─ ...
# ├ 1
# ...
# Plot Manager ................................................................
# The PlotManager, also from the dantro package, supplies plotting capabilities
# using the data in the DataManager.
plot_manager:
# Save the plots to the same directory as that of the data manager
out_dir: ""
# Whether to raise exceptions for plotting errors. false: only log them
raise_exc: false
# How to handle already existing plot configuration files
cfg_exists_action: raise
# NOTE If in cluster mode, this value is set to 'skip' by the Multiverse
# Save all plot configurations alongside the plots
save_plot_cfg: true
# Enable auto-detection of plot creators, e.g. via the is_plot_func decorator
auto_detect_creator: true
# Can set creator initialization arguments here
creator_init_kwargs:
universe:
style: &default_style
# Choose a slightly wider figure (16:10 instead of 4:3)
figure.figsize: [8., 5.]
multiverse:
style:
<<: *default_style
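# Further matplotlib rcParams can be set alongside the merged defaults, e.g.
# (illustrative):
#
#   multiverse:
#     style:
#       <<: *default_style
#       font.size: 10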
# Parameter Space .............................................................
# Only entries below this one will be available to the Utopia model binaries.
#
# The content of the `parameter_space` level is parsed by the frontend and then
# dumped to a file, the path to which is passed to the binary as positional
# argument.
#
# IMPORTANT: In order to remain general, neither the base nor the user config
# should add any model-specific content here
#
parameter_space:
# NOTE: These settings only apply if appropriate dependencies are detected
parallel_execution:
# Enable parallel features of Utopia
enabled: false
# Set a default PRNG seed
seed: 42
# Number of steps to perform
num_steps: 3
# At which step the write_data method should be invoked for the first time
write_start: 0
# Starting from write_start, how frequently write_data should be called
write_every: 1
# NOTE `write_start` and `write_every` are passed along to sub-models. Every
# sub-model can overwrite these entries by adding them at its own model
# configuration level (analogous to `log_levels`).
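# For example (illustrative), to write data only every 10th step starting at
# step 100, a run configuration could set within `parameter_space`:
#
#   write_start: 100
#   write_every: 10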
# By default, emit monitor data to the terminal every two seconds:
monitor_emit_interval: 2.
# The default logging levels and pattern
log_levels:
# level for backend internal operations
core: warning
# level for general I/O operations
data_io: warning
# level for the DataManager in WriteMode::managed
data_mngr: warning
# level for all models, if not specified otherwise in the run cfg
# NOTE The level is propagated hierarchically, with models 'inheriting' the
# log level of their parents if they don't receive a custom level.
model: info
log_pattern: "[%T.%e] [%^%l%$] [%n] %v"
# The path to the config file to load
# output_path: /abs/path/to/uni<#>/cfg.yml
# NOTE This entry is always added by the frontend. The <#> is set depending
# on which universe is to be simulated.
# In which mode the output file is to be created, can be: w, r+, a, x
output_file_mode: w
# Below here, the model configuration starts, i.e. the config that is used by
# a model instance. To add it:
# <model_name>: !model
# model_name: <model_name> # which model's default config to add here
# ... will be added here
# NOTE This entry is added to the parameter space if no run configuration is
# made available to the Multiverse.
Note
The parameter_space key is by default (!) assumed to be a paramspace.paramspace.ParamSpace object. Defining sweep dimensions therein thus does not require marking it with the !pspace YAML tag.
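For example, a run configuration can define a sweep dimension directly beneath parameter_space using the !sweep tag of the paramspace package, without any !pspace tag. A minimal sketch:

parameter_space:
  seed: !sweep
    default: 42
    range: [10]

Together with perform_sweep: true, Multiverse.run would then sweep over ten universes, using seeds 0 through 9.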