Default Batch Configuration
The file included below provides the default values for setting up the batch framework, i.e. the utopya.batch.BatchTaskManager.
It plays a similar role as utopya_base_cfg, which is the basis of the Multiverse meta-configuration.
As with the meta-configuration, these batch configuration defaults are subsequently updated as described here; a sketch of such an update is shown below the listing.
# The default configuration for the BatchTaskManager
#
# Mainly takes care to set up the WorkerManager in a reasonable shape.
---
# .. BatchTaskManager Options .................................................
# In debug mode, a failing batch task will lead to all other tasks being
# stopped and the main process exiting with a non-zero exit status.
debug: false

# Paths configuration
paths:
  # Where to store batch run configurations and metadata
  out_dir: ~/utopya_output/_batch

  # A note to append to the batch output directory
  note: ~

# At which level to perform parallelization
parallelization_level: batch
# Two options:
#   batch: parallelization is done on the level of the batch tasks, i.e.
#          multiple tasks are worked on in parallel according to the
#          configuration in `worker_manager.num_workers`.
#   task:  each individual task may work in parallel, thus requiring
#          more than one CPU core. Subsequently, num_workers is set to 1.


# .. Task defaults ............................................................
task_defaults:
  run: {}  # Not implemented yet!

  eval:
    # The data output directory for evaluation tasks.
    # May be a format string, allowed keys:
    #   model_name, task_name, timestamp, batch_name (== {timestamp}_{note})
    # Relative paths are evaluated relative to the batch output directory.
    out_dir: "{task_name:}"

    # Whether to create symlinks that connect the batch run directory with
    # related directories and files elsewhere:
    #   - Adds a symlink from the task configuration to the batch run directory
    #   - Adds a symlink from the *simulation output directory* (where the
    #     simulation data resides) to this evaluation output directory.
    create_symlinks: true

    # Task priority; tasks with a lower value are worked on first.
    priority: ~

    # Any further arguments here are used within the task to set up the
    # meta-configuration of the FrozenMultiverse.
    data_manager:
      out_dir_kwargs:
        # By default, make sure that the data directory does *not* yet exist
        exist_ok: false

    plot_manager:
      # Raise exceptions, such that plots do not fail silently within a task
      raise_exc: true

      # The following options may be useful if desiring to allow that new
      # plots overwrite existing plots ...
      # cfg_exists_action: overwrite_nowarn
      # creator_init_kwargs:
      #   universe:
      #     exist_ok: true
      #   multiverse:
      #     exist_ok: true

tasks:
  run: {}   # Not implemented yet. Adding a task here will raise an error.
  eval: {}

# .. WorkerManager and Reporter ...............................................
worker_manager:
  # Specify how many processes work in parallel
  num_workers: auto
  # can be: an int, 'auto' (== #CPUs). For values <= 0: #CPUs + num_workers
  # NOTE: This value will be set to 1 if `parallelization_level = 'task'`,
  #       because in those cases the parallelization should not occur in this
  #       WorkerManager but within the individual tasks.

  # Delay between polls [seconds]
  poll_delay: 0.05
  # NOTE: If this value is too low, the main thread becomes very busy.
  #       If this value is too high, the log output from simulations is not
  #       read from the line buffer frequently enough.

  # Maximum number of lines to read from each task's stream per poll cycle.
  # Choosing a value that is too large may affect poll performance in cases
  # where the task generates many lines of output.
  # Set to -1 to read *all* available lines from the stream upon each poll.
  lines_per_poll: 20

  # Periodic task callback (in units of poll events). Set None to deactivate.
  periodic_task_callback: 20

  # How to react upon a simulation exiting with non-zero exit code
  nonzero_exit_handling: warn_all
  # can be: ignore, warn, warn_all, raise
  #   warn_all will also warn if the simulation was terminated by the frontend
  #   raise will lead to a SystemExit with the error code of the simulation

  # How to handle keyboard interrupts
  interrupt_params:
    # Which signal to send to the workers
    send_signal: SIGTERM  # should be SIGTERM for graceful shutdown

    # How long to wait for workers to shut down before calling SIGKILL on them
    grace_period: 5.
    # WARNING Choosing the grace period too short may corrupt the output that
    #         is written at the time of the signal.

    # Whether to exit after working; exit code will be 128 + abs(signum)
    exit: false

  # In which events to save streams *during* the work session
  # May be: `monitor_updated`, `periodic_callback`
  save_streams_on: [periodic_callback]

  # Report format specifications at different points of the WM's operation
  # These report formats were defined in the reporter and can be referred to
  # by name here. They can also be lists, if multiple report formats should
  # be invoked.
  rf_spec:
    before_working: []
    while_working: [progress_bar]
    task_spawned: [progress_bar]
    monitor_updated: [progress_bar]
    task_finished: [progress_bar, report_file]
    after_work: [progress_bar, report_file]
    after_abort: [progress_bar, report_file]

# The defaults for the worker_kwargs
# These are passed to the setup function of each MPProcessTask before spawning
worker_kwargs:
  # Whether to save the streams of each individual batch process to a log file
  save_streams: true
  # The log file is saved only after the MPProcessTask has finished in order
  # to reduce I/O operations on files

  # Whether to save streams in raw format
  save_raw: true

  # Whether to remove ANSI escape characters (e.g. from color logging) when
  # saving the stream
  remove_ansi: true

  # Whether to forward the streams to stdout. Output may be garbled!
  forward_streams: false

  # Whether to forward the raw stream output or only those lines that were not
  # parsable to yaml, i.e.: only the lines that came _not_ from the monitor
  forward_raw: true

  # The log level at which the streams should be forwarded to stdout
  streams_log_lvl: ~  # if None, uses print instead of the logging module

  # Arguments to utopya.task.PopenMPProcess
  popen_kwargs: {}

# Reporter configuration
reporter:
  # Define report formats, which are accessible, e.g. from the WorkerManager
  report_formats:
    progress_bar:                # Name of the report format specification
      parser: progress_bar       # The parser to use
      write_to: stdout_noreturn  # The writer to use
      min_report_intv: 0.5       # Required time (in s) between writes

      # -- All further kwargs on this level are passed to the parser
      # Terminal width for the progress bar
      # Can also be `adaptive` (poll each time), or `fixed` (poll once)
      num_cols: adaptive

      # The format string to use for progress information
      # Available keys:
      #   - `total_progress` (in %)
      #   - `active_progress` (mean progress of _active_ simulations, in %)
      #   - `cnt` (dict of counters: `total`, `finished`, `active`)
      info_fstr: "{total_progress:>5.1f}% ({cnt[finished]} / {cnt[total]})"
      # Example of how to access counters in format string:
      #   info_fstr: "finished {cnt[finished]}/{cnt[total]} "

      # Whether to show time information alongside the progress bar
      show_times: true

      # How to display time information.
      # Available keys: `elapsed`, `est_left`, `est_end`, `start`, `now`
      # (see `times` parser for more information)
      times_fstr: "| {elapsed} elapsed"
      times_fstr_final: "| finished in {elapsed:} "
      times_kwargs:
        # How to compute the estimated time left to finish the work session
        # Available modes:
        #   - `from_start`:  extrapolates from progress made since start
        #   - `from_buffer`: uses a buffer to store recent progress
        #                    information and use the oldest value for
        #                    making the estimate; see `progress_buffer_size`
        mode: from_start

    # Creates a report file containing runtime statistics
    report_file:
      parser: report
      write_to:
        file:
          path: _report.txt
      min_report_intv: 10             # don't update this one too often
      min_num: 4                      # min. number of universes for statistics
      show_individual_runtimes: true  # for large number of universes, disable
      task_label_singular: task
      task_label_plural: tasks

run_kwargs:
  # Total timeout (in s) of a batch run; to ignore, set to ~
  timeout: ~


# .. Cluster Support ..........................................................
# NOTE Not implemented!
cluster_mode: false
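Because these defaults are updated recursively, a user-side batch configuration only needs to contain the entries that should deviate from them. The following is a minimal sketch of such an update: all keys are taken from the defaults listed above, but the file name is hypothetical, the way the file is passed to the BatchTaskManager depends on your setup, and a complete batch configuration would additionally define entries under tasks.eval, whose schema is not covered by this listing.

```yaml
# my_batch_cfg.yml (hypothetical file name)
# Contains only the entries that should differ from the defaults above;
# everything else is filled in via recursive update.
---
paths:
  note: resolution-sweep        # appended to the batch output directory name

# Parallelize within the individual tasks; num_workers is then set to 1
parallelization_level: task

task_defaults:
  eval:
    # Group evaluation output by model name and timestamp
    out_dir: "{model_name:}/{timestamp:}_{task_name:}"

worker_manager:
  # Raise instead of only warning when a task exits with a non-zero exit code
  nonzero_exit_handling: raise
```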
Hint: This default configuration is meant to be self-documenting, allowing you to see which parameters are available. If in doubt, refer to the individual docstrings, e.g. of the WorkerManager or the WorkerManagerReporter.