Debugging DAG computations#

As you saw throughout the plotting tutorial, the data transformation framework can be a very powerful tool to prepare data for plotting.

But what about the case where this goes wrong? How can the DAG be debugged?

This page presents some approaches on how to address errors in DAG computations:

Read the error log#

This is the first step towards understanding what’s going on. The data transformation framework aims to make error messages as understandable and helpful as possible.

Let’s look at some examples.

Invalid operation name#

    path: densities
    transform:
      - .sel: [!dag_prev , {kind: susceptible}]
      - square: [!dag_prev ]  # does not exist

Here, we are trying to select some data from the SEIRD model but have used an operation name (square) that does not exist. Creating a plot with this configuration will fail with an error message like this:

BadOperationName: Could not find an operation or meta-operation named 'square'!

No operation 'square' registered! Did you mean: squared, sqrt ?
Available operations:
  !=                                       .
  .()                                      .T
  .all                                     .any
  .argmax                                  .argmin
  .argpartition                            .argsort
  .assign                                  .assign_attrs

  . . .

From the error message, it’s quite clear what’s going on: we need to choose the correct operation name. We also get a list of available operations and even suggestions for similar names, which we can simply follow: using squared instead of square solves the problem.
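
Suggestions like the ones in the message above can be produced by fuzzy string matching. The following is a minimal sketch of that idea using Python's standard difflib module; it is not dantro's actual implementation. The helper suggest_operations is hypothetical, and the list of names is only a small subset taken from the output above.

```python
from difflib import get_close_matches

# Small subset of operation names that appear on this page; for illustration only
AVAILABLE_OPERATIONS = [
    "!=", ".T", ".all", ".any", ".argmax", ".argmin", ".argsort",
    ".assign", ".sel", "sqrt", "squared", "xr.Dataset",
]


def suggest_operations(name, available=AVAILABLE_OPERATIONS, n=3):
    """Return up to ``n`` names similar to ``name``, best match first.

    Hypothetical helper, not part of dantro or utopya.
    """
    return get_close_matches(name, available, n=n)


suggestions = suggest_operations("square")
print(suggestions)  # the closest match, 'squared', is listed first
```

A higher cutoff argument to get_close_matches would make the suggestions stricter; the default is used here.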

Failing operation#

The case of an operation failing is a bit trickier, as it depends on what the operation does in particular. Let’s look at an example where we pass a wrong argument to an operation:

    path: densities
    transform:
      - .data   # to resolve the utopya XarrayDC into an xr.DataArray
      - .sel: [!dag_prev , {kind: SuSCePTIble}]

The error output from that will be something like the following:

DataOperationFailed: Operation '.sel' failed with a KeyError, see below!
It was called with the following arguments:
     0:  <xarray.DataArray 'densities' (time: 151, kind: 8)>
array([[0.5968, 0.4032, 0.    , ..., 0.    , 0.    , 0.    ],
       [0.5968, 0.4032, 0.    , ..., 0.    , 0.    , 0.    ],
       [0.5968, 0.4016, 0.0012, ..., 0.    , 0.    , 0.    ],
       [0.5968, 0.    , 0.    , ..., 0.    , 0.    , 0.    ],
       [0.5968, 0.    , 0.    , ..., 0.    , 0.    , 0.    ],
       [0.5968, 0.    , 0.    , ..., 0.    , 0.    , 0.    ]])
  * time  (time) int64 0 1 2 3 4 5 6 7 8 ... 143 144 145 146 147 148 149 150
  * kind  (kind) <U11 'empty' 'susceptible' 'exposed' ... 'source' 'inert'

     1:  {'kind': 'SuSCePTIble'}

  kwargs:  {}

KeyError: 'SuSCePTIble'

What can we learn from that message?

  • Operation .sel failed, so we know where the error occurred.

  • We got a KeyError for the given key SuSCePTIble.

  • We see the arguments that were passed to .sel, marked as positional arguments 0 and 1, and can tell that the given xarray.DataArray has no label SuSCePTIble in the kind coordinate!

From that we can deduce: The key actually has to be susceptible.

Now this was comparatively straightforward, but you get the idea.
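
When a KeyError stems from a coordinate label, comparing the failing key against the available labels usually reveals the problem. The following is a minimal sketch in plain Python; find_label is a hypothetical helper, and the labels are only those visible in the (truncated) error output above.

```python
# Labels of the `kind` coordinate that are visible in the error output above;
# that output is truncated, so this list is incomplete.
KIND_LABELS = ["empty", "susceptible", "exposed", "source", "inert"]


def find_label(key, labels):
    """Return all labels equal to ``key`` when ignoring case and
    surrounding whitespace.

    Hypothetical debugging helper, not part of utopya or dantro.
    """
    normalized = key.strip().lower()
    return [label for label in labels if label.lower() == normalized]


key = "SuSCePTIble"
print(key in KIND_LABELS)            # exact lookup fails: False
print(find_label(key, KIND_LABELS))  # ['susceptible']
```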


In the transformation above, we added the .data operation, which resolves the previous object from a utopya.eval.containers.XarrayDC into a regular xarray.DataArray object. This makes debugging much easier because the error message then shows the actual content of the array.

Look at the DAG visualization#

What about a case where it’s harder to locate where an error comes from, e.g. if there are multiple operations?

    path: densities

  - .sel: [!dag_tag kind, {kind: susceptible}]
    kwargs: {drop: true}
    tag: susceptible
  - .sel: [!dag_tag kind, {kind: exposed}]
    kwargs: {drop: true}
    tag: exposed
  - .sel: [!dag_tag kind, {kind: infected}]
    kwargs: {drop: true, bAd_ArGuMeNT: i should not be here! }
    tag: infected
  - .sel: [!dag_tag kind, {kind: recovered}]
    kwargs: {drop: true}
    tag: recovered

  - xr.Dataset:
    - susceptible: !dag_tag susceptible
      exposed: !dag_tag exposed
      infected: !dag_tag infected
      recovered: !dag_tag recovered
    tag: data

Again, one of the operations is obviously wrong here, but because there are many .sel operations, it’s not immediately clear from the error message which one failed.

Let’s have a look at the terminal log again. You may have noticed earlier that something like the following is printed alongside the error:

NOTE     base              Creating DAG visualization (scenario: 'compute_error') ...
NOTE     dag               Generating DAG representation for 7 tags ...

. . .

CAUTION  base              Created DAG visualization for scenario 'compute_error'. For debugging, inspecting the generated plot and the traceback information may be helpful: ... dag_compute_error.pdf
ERROR    plot_mngr         An error occurred during plotting with UniversePlotCreator ...

Here, the plotting framework automatically created a visualization of the DAG to help with debugging. It calls this scenario a compute_error because that’s what happened: the DAG computation failed, which is why the visualization was created. The log also tells you where the file was saved; typically, it ends up right next to where the plot would have been created.

Let’s look at the generated DAG visualization:

A DAG visualization in a failed scenario

This tells us a lot:

  • The light red node is where the operation failed, while computing the infected tag.

  • Subsequently, the xr.Dataset operation in the end cannot be carried out.

  • The remaining node colors show which operations succeeded and which ones were only prepared for computation but not actually carried out.

The DAG visualization is a powerful way of understanding what is going on and whether the DAG structure is actually the way you expected it to be.
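
To make the node states easier to interpret, the following sketch mimics how a failure interrupts the computation of a dependency graph: dependencies are resolved first, and the computation stops at the first failing node, leaving everything else merely prepared. This is a simplified illustration in plain Python, not dantro's implementation; the function compute, the status names, and the evaluation order are assumptions made for this example. The graph mirrors the tags from the example above.

```python
def compute(dag, target, failing):
    """Resolve ``target`` in a dependency graph and track node states.

    ``dag`` maps each node to the list of nodes it depends on;
    ``failing`` is the set of nodes whose operation raises an error.
    Hypothetical helper; the status names are made up for this sketch.
    """
    status = {node: "prepared" for node in dag}

    def resolve(node):
        if status[node] == "computed":
            return
        for dep in dag[node]:  # dependencies need to be computed first
            resolve(dep)
        if node in failing:
            status[node] = "failed"
            raise RuntimeError(f"Operation for tag '{node}' failed")
        status[node] = "computed"

    try:
        resolve(target)
    except RuntimeError:
        pass  # the computation stops at the first failure
    return status


# Tags from the example above; `data` combines the four selections
dag = {
    "kind": [],
    "susceptible": ["kind"],
    "exposed": ["kind"],
    "infected": ["kind"],
    "recovered": ["kind"],
    "data": ["susceptible", "exposed", "infected", "recovered"],
}

status = compute(dag, target="data", failing={"infected"})
for tag, state in status.items():
    print(f"{tag:12s} {state}")
```

In this sketch, infected ends up as failed, while recovered and data remain prepared because the computation never reached them.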

The visualization feature is controlled via the dag_visualization entry of your plot configuration. Read more about this feature and available parameters in the dantro docs.

How to always create a DAG visualization

By default, DAG visualizations are only created if the DAG computation failed for whatever reason.

To always create a DAG visualization, regardless of that, inherit the .dag.vis.always base configuration.

    # ...
    - .dag.vis.always
    # ...

DAG visualization with multiverse plots

For multiverse plots, DAG visualization may not generate a usable figure, given the potentially large number of nodes. In such cases it makes sense to temporarily restrict the plot to a subspace:

      some_dim: [0, 1]
      another_dim: [foo, bar]

Ideally, use the same dimensionality as in the case you want to debug.
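
The effect of such a restriction can be estimated by counting parameter combinations, assuming that each combination corresponds to one universe with its own set of DAG nodes. In this sketch, the dimension names are taken from the example above, while the values of the full parameter space are made up for illustration.

```python
from math import prod

# Hypothetical full parameter space; the values are invented for this sketch
full_space = {
    "some_dim": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    "another_dim": ["foo", "bar", "baz", "spam"],
}

# Subspace selection as in the configuration above
subspace = {
    "some_dim": [0, 1],
    "another_dim": ["foo", "bar"],
}


def num_combinations(space):
    """Number of points in a parameter space given as {dimension: values}."""
    return prod(len(values) for values in space.values())


print(num_combinations(full_space))  # 10 * 4 = 40
print(num_combinations(subspace))    # 2 * 2 = 4
```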


Node positioning drastically improves with pygraphviz installed in the utopia-env.


DAG visualization is only available after the DAG has been fully constructed. If you make errors during construction, like setting a tag multiple times, the visualization will not be able to help you.

Further approaches#

If the above does not help in isolating the error, there are a bunch of other things you can try:

  • Check how the failing operation is defined in the operations database.

  • Have a look at the dantro DAG troubleshooting section.

  • In case the error appears not during computation but in the plot function, check the format (dimensionality, shape, coordinate labels etc.) in which the plot function expects to receive data.

  • If none of that works out, you can try to create a toy example in an interactive Python session to find out how the objects behave.

Open an issue and ask for help#

If none of the above approaches succeeded, we are more than happy to assist. Feel free to open an issue in the Utopia GitLab project.

For bug reports or suggestions to improve the DAG framework, we welcome your feedback in the dantro GitLab project.