Distributions
This module provides plotting functions to visualize distributions.
- utopya.plot_funcs.distribution.histogram(dm: utopya.datamanager.DataManager, *, uni: utopya.datagroup.UniverseGroup, hlpr: utopya.plotting.PlotHelper, model_name: str, path_to_data: str, histogram_kwargs: Optional[dict] = None, use_unique: bool = False, preprocess: Optional[Tuple[Union[dict, str]]] = None, postprocess: Optional[Tuple[Union[dict, str]]] = None, mask_repeated: bool = False, show_histogram_info: bool = True, transformations_log_level: int = 10, pyplot_func_name: str = 'bar', **pyplot_func_kwargs)
Calculates a histogram from the data and plots it.
This function is very versatile. Its capabilities range from a plain old histogram (only required arguments set) to the plot of a complementary cumulative probability distribution function.
Don’t despair. The documentation of arguments below should give a good idea of what each parameter does.
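To make the upper end of that range concrete, here is a minimal sketch of how a complementary cumulative distribution can be derived from histogram counts. It uses plain NumPy rather than utopya's transformation mechanism; the helper name `ccdf_from_histogram` and the choice of left bin edges as x values are assumptions made for illustration only.

```python
import numpy as np


def ccdf_from_histogram(data, bins=10):
    """Return left bin edges and the complementary cumulative distribution.

    Hypothetical helper for illustration; not part of utopya.
    """
    counts, edges = np.histogram(data, bins=bins)
    # Normalise the counts so that they sum to one
    pmf = counts / counts.sum()
    # For bin i, sum the probabilities of bin i and all bins to its right
    ccdf = np.cumsum(pmf[::-1])[::-1]
    return edges[:-1], ccdf


data = np.array([1, 1, 2, 3, 3, 3, 4, 5])
left_edges, ccdf = ccdf_from_histogram(data, bins=4)
```

In the plot function itself, an operation of this kind would presumably be expressed through the postprocess argument, since it acts on the computed counts rather than on the raw data.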
- Parameters
dm (DataManager) – The data manager from which to retrieve the data
uni (UniverseGroup) – The selected universe data
hlpr (PlotHelper) – The PlotHelper that instantiates the figure and takes care of plot aesthetics (labels, title, …) and saving
model_name (str) – The model name that the data resides in
path_to_data (str) – The path to the data relative to the model data output
histogram_kwargs (dict, optional) – Passed to np.histogram. This can be used to adjust the number of bins or to set the range over which the bins are spread. The range may be given as a 2-tuple containing None, in which case the None entry is resolved to data.min() or data.max(), respectively. See the np.histogram documentation for other arguments.
use_unique (bool, optional) – If set, counts the unique values in the data instead of computing a regular histogram.
preprocess (Tuple[Union[dict, str]], optional) – Apply pre-processing transformations to the selected data. With the parameters specified here, multiple transformations can be applied. This can be used for dimensionality reduction of the data, but also for other operations, e.g. to select a slice. The operations are carried out before calculating the histogram. For available parameters, see utopya.dataprocessing.transform().
postprocess (Tuple[Union[dict, str]], optional) – Same as preprocess, but applied after the histogram was computed.
mask_repeated (bool, optional) – In use_unique mode, will mask the counts such that repeated values are not shown.
show_histogram_info (bool, optional) – Whether to show an info box in the top right-hand corner
transformations_log_level (int, optional) – The log level at which the transformations are performed. Useful for debugging.
pyplot_func_name (str, optional) – The name of the matplotlib.pyplot function to use for plotting. By default, a bar plot is created. For unique data, a line or scatter plot might make more sense. Note that for the bar plot, the bar widths are automatically passed to the plot call and cannot be adjusted.
**pyplot_func_kwargs – The kwargs passed on to the pyplot function chosen via the pyplot_func_name argument.
- Raises
ValueError – When trying to make a bar plot with use_unique option enabled.
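The interplay of histogram_kwargs and use_unique described in the parameter list can be sketched with plain NumPy. This is an illustration under stated assumptions, not utopya's implementation: the helper names `resolve_range` and `compute_counts` are hypothetical, and returning left bin edges as x values is an arbitrary choice made for this example.

```python
import numpy as np


def resolve_range(data, range_spec):
    """Replace None entries of a (low, high) pair by data.min() / data.max().

    Hypothetical helper for illustration; not part of utopya.
    """
    low, high = range_spec
    if low is None:
        low = data.min()
    if high is None:
        high = data.max()
    return (low, high)


def compute_counts(data, *, use_unique=False, **histogram_kwargs):
    """Return (x values, counts), either per unique value or per bin."""
    data = np.asarray(data).ravel()

    if use_unique:
        # Count how often each distinct value occurs; no binning involved
        values, counts = np.unique(data, return_counts=True)
        return values, counts

    # Resolve a partially specified range before handing it to np.histogram
    if histogram_kwargs.get("range") is not None:
        histogram_kwargs["range"] = resolve_range(data, histogram_kwargs["range"])

    counts, edges = np.histogram(data, **histogram_kwargs)
    return edges[:-1], counts


data = np.array([0, 1, 1, 2, 2, 2, 7])

# Unique mode: one count per distinct value
values, unique_counts = compute_counts(data, use_unique=True)

# Regular mode: lower bound taken from the data, upper bound fixed at 4
left_edges, binned_counts = compute_counts(data, bins=4, range=(None, 4))
```

In the binned call, the value 7 lies outside the resolved range (0, 4) and is therefore not counted, which is the standard behaviour of np.histogram when a range is given.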