Histograms

Summary

On this page, you will see how to

  • use .plot.facet_grid.hist to plot histograms

  • use .plot.multiplot to plot panels of histograms

Histograms are among the most commonly used plots for visualising distributions. Let us run the SEIRD model several times with a fixed configuration but different seeds, and plot a histogram of the peak density of infected agents reached in each run.

To plot a histogram, base your plot on .plot.facet_grid.hist. In the data transformation step, the np.max operation reduces each run to its maximum infected density:

histogram:
  based_on:
    - .creator.multiverse
    - .plot.facet_grid.hist

  # Select only the infected population
  select_and_combine:
    fields:
      infected:
        path: densities
        transform:
          - .sel: [!dag_prev , { kind: infected }]

  # Get the maximum value
  transform:
    - np.max: [!dag_tag infected]
      kwargs:
        axis: 3
      tag: data

  # Helpers
  helpers:
    set_title:
      title: Maximum density of infected agents
    set_labels:
      x: Peak height
      y: ' '
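
To see what the np.max step does to the data, here is a minimal NumPy sketch. It assumes a simplified layout with one row per seed and one column per time step; the array, its shape, and all variable names are invented for illustration. The axis index (1 here, 3 in the configuration above) depends on how the combined data is actually laid out.

```python
import numpy as np

# Hypothetical stand-in for the combined multiverse data: one row per seed,
# one column per time step, holding the infected density of that run.
rng = np.random.default_rng(seed=0)
n_seeds, n_steps = 20, 100
time = np.linspace(0.0, 1.0, n_steps)

# Each run is a single bump whose height and position vary with the seed
heights = rng.uniform(0.2, 0.6, size=(n_seeds, 1))
centres = rng.uniform(0.3, 0.7, size=(n_seeds, 1))
infected = heights * np.exp(-((time - centres) ** 2) / 0.01)

# Reduce along the time axis: one peak value per run
peaks = np.max(infected, axis=1)

print(peaks.shape)  # (20,)
```

The resulting one-value-per-run array is the kind of sample the histogram is drawn from.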

Any additional entries in the plot configuration are passed as keyword arguments to the low-level plotting function, in this case matplotlib.pyplot.hist. For example, we can set the number of bins and the color by adding

histogram:

  # all the previous entries ...

  color: mediumseagreen
  bins: 50

The output will look something like this:

[Figure: Single histogram]
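
To make the effect of the bins argument concrete, the following sketch bins a sample with numpy.histogram, which matplotlib.pyplot.hist uses internally for the binning. The sample itself is randomly generated and purely hypothetical; it is not output of the SEIRD model.

```python
import numpy as np

# Hypothetical sample of peak heights, one value per run
rng = np.random.default_rng(seed=1)
peaks = rng.normal(loc=0.4, scale=0.05, size=500)

# An integer `bins` splits the data range into that many equal-width bins
counts, edges = np.histogram(peaks, bins=50)

print(len(counts), len(edges))  # 50 51
```

With an integer value, the bins span the range from the smallest to the largest sample, so every run falls into exactly one bin.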

Plotting facetted histograms

For facetted histograms, use the multiplot functionality. Here, the top panel shows the distribution of the infection peak, and the bottom panel shows the distribution of the minimum density of susceptible agents:

double_histogram:
  based_on:
    - .creator.multiverse
    - .plot.multiplot

  # Select the infected and susceptible populations
  select_and_combine:
    fields:
      infected:
        path: densities
        transform:
          - .sel: [ !dag_prev , { kind: infected } ]
      susceptible:
        path: densities
        transform:
          - .sel: [ !dag_prev , { kind: susceptible } ]

  # Get the maximum value for the infected and the minimum for the susceptible population
  transform:
    - np.max: [ !dag_tag infected ]
      kwargs:
        axis: 3
    - .values: [!dag_prev ]
    - .flatten: [!dag_prev ]
      tag: data_infected
    - np.min: [!dag_tag susceptible ]
      kwargs:
        axis: 3
    - .values: [ !dag_prev ]
    - .flatten: [ !dag_prev ]
      tag: data_susceptible

  bins: 50
  color: '#006666'

  # Use seaborn.histplot with kde to show densities
  to_plot:
    [0, 0]:
      - function: sns.histplot
        args:
          - !dag_result data_infected
        kde: true
    [0, 1]:
      - function: sns.histplot
        args:
          - !dag_result data_susceptible
        kde: true

  helpers:
    setup_figure:
      nrows: 2
      sharex: true

This will produce a plot like this:

[Figure: Double histogram]
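
The reduction steps in the double_histogram transform block can be sketched in plain NumPy as follows. This is an illustration under assumed conditions: the arrays are random, the dimension order (parameter value, seed, time) is hypothetical, and since the data here is already a NumPy array there is no counterpart to the .values step, which in the configuration extracts the underlying array from the labelled data.

```python
import numpy as np

# Hypothetical combined data with dimensions (parameter value, seed, time)
rng = np.random.default_rng(seed=2)
infected = rng.uniform(0.0, 1.0, size=(3, 10, 200))
susceptible = rng.uniform(0.0, 1.0, size=(3, 10, 200))

# Reduce along the time axis: one value per run
peak_infected = np.max(infected, axis=-1)
min_susceptible = np.min(susceptible, axis=-1)

# Flatten the remaining dimensions into a single 1D sample per panel
data_infected = peak_infected.flatten()
data_susceptible = min_susceptible.flatten()

print(peak_infected.shape, data_infected.shape)  # (3, 10) (30,)
```

Flattening discards the information about which run a value belongs to, which is acceptable here because each panel only shows the distribution of the values.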