The BioSimulators standards are a set of formats and conventions for (a) describing the specifications of simulation tools (e.g., supported modeling frameworks, simulation algorithms, and modeling formats) and (b) defining the syntax and semantics of the inputs and outputs of biosimulation software tools. Collectively, the standards ensure that (a) investigators can communicate the specifications of simulation tools, (b) investigators can use this information to find simulation tools that have specific capabilities, (c) investigators can use the same syntax to execute simulations with multiple simulation tools (e.g., same paths to the same files in the same formats), and (d) simulation tools produce consistent outputs (reports and plots at consistent paths in consistent formats).

The BioSimulators standards encompass multiple standards:

  • Format for describing the specifications of simulation tools: This format enables investigators to describe the modeling frameworks, simulation algorithms, and modeling formats supported by each simulation tool. The format also enables investigators to describe the parameters of each algorithm, their data types, and their default values.
  • Conventions for command-line applications and APIs for simulation tools: These conventions ensure that (a) simulation tools support consistent syntax for executing simulations and (b) simulation tools produce outputs at consistent locations in consistent formats.
  • Conventions for Docker images for simulation tools: These conventions ensure that the entry points of containerized simulation tools support consistent syntax for executing simulations and that containerized simulation tools provide consistent metadata.
  • Conventions for simulation experiments with SED-ML: These conventions ensure that the community consistently encodes simulation experiments into SED-ML. They include conventions for targets for implicit elements of models, which are not directly defined in models (e.g., reduced costs of FBA reactions, shadow prices of FBA species). They also delineate how to encode the values of model attribute changes and algorithm parameters into SED-ML, including enumerated values, lists, dictionaries, and other complex data structures.
  • Format for data for reports (sedml:report) and plots (sedml:plot2D, sedml:plot3D) of simulation results: This format ensures that simulation tools produce data for reports and plots in a consistent format (e.g., HDF5) with consistent shapes (e.g., rows: data set (sedml:dataSet), columns: time) that can be consistently visualized and interpreted.
  • Guidelines for data visualizations with Vega: Vega is a format for describing interactive, two-dimensional data visualizations. Vega makes visualizations reusable by capturing the visual marks of a visualization separately from how they should be painted with data. These guidelines recommend how to use Vega to visualize the results of simulation experiments captured by SED-ML reports.
  • Guidelines for using the OMEX Metadata format to annotate COMBINE/OMEX archives: BioSimulators recommends using the OMEX Metadata RDF-XML format to annotate the meaning, provenance, and credibility of COMBINE/OMEX archives and their contents. These guidelines recommend specific predicates and objects.

  • Format for logs of the execution of COMBINE/OMEX archives: This format enables simulation tools to communicate their progress in the execution of COMBINE/OMEX archives, such as which tasks have been executed, which tasks are queued for execution, and which tasks will be skipped because, for example, they require features of SED-ML that the simulation tool does not support.

    This format enables simulation tools to communicate the following information:

    • The status and outcome of the COMBINE archive and each SED document, task, and output (queued, running, succeeded, skipped, or failed).
    • The simulation function that was used and the arguments that were passed to it.
    • The standard output/error produced by the COMBINE archive and each SED document, task, and output.
    • The duration of the execution of the COMBINE archive and each SED document, task, and output.
    • The reason each SED document, task, or output was skipped.
    • The reason each SED document, task, or output failed.
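
The information that a log communicates can be pictured as a nested structure: one entry for the COMBINE/OMEX archive, containing entries for each SED document, which in turn contain entries for each task and output. The Python sketch below is illustrative only. The key names (`sedDocuments`, `tasks`, `outputs`, `skipReason`, `simulatorDetails`), the file name, the function name, and all values are assumptions made for this example and should not be read as the normative log schema. The status values are those listed above.

```python
from collections import Counter

# Status values named in the description of the log format.
STATUSES = {"QUEUED", "RUNNING", "SUCCEEDED", "SKIPPED", "FAILED"}

# Hypothetical log of a partially executed archive.
# All key names and values are illustrative assumptions.
log = {
    "status": "RUNNING",
    "duration": None,  # not yet known while the archive is running
    "output": "Executing experiment.sedml ...",
    "sedDocuments": [
        {
            "location": "experiment.sedml",
            "status": "RUNNING",
            "duration": None,
            "output": "",
            "tasks": [
                {
                    "id": "task_1",
                    "status": "SUCCEEDED",
                    "duration": 3.1,  # seconds
                    "output": "",
                    # Simulation function and arguments (hypothetical names).
                    "simulatorDetails": {
                        "function": "hypothetical_solver",
                        "arguments": {"relative_tolerance": 1e-6},
                    },
                },
                {
                    "id": "task_2",
                    "status": "SKIPPED",
                    "duration": None,
                    "output": None,
                    "skipReason": "Requires a SED-ML feature that the tool does not support.",
                },
            ],
            "outputs": [
                {"id": "report_1", "status": "QUEUED", "duration": None, "output": None},
            ],
        }
    ],
}


def iter_elements(log):
    """Yield (label, entry) for the archive and each SED document, task, and output."""
    yield "archive", log
    for doc in log.get("sedDocuments", []):
        yield f"document {doc['location']}", doc
        for task in doc.get("tasks", []):
            yield f"task {task['id']}", task
        for output in doc.get("outputs", []):
            yield f"output {output['id']}", output


def summarize(log):
    """Count entries by status and collect the reason for each skipped entry."""
    counts = Counter()
    skipped = {}
    for label, entry in iter_elements(log):
        status = entry["status"]
        if status not in STATUSES:
            raise ValueError(f"Unknown status {status!r} for {label}")
        counts[status] += 1
        if status == "SKIPPED":
            skipped[label] = entry.get("skipReason")
    return counts, skipped


counts, skipped = summarize(log)
print(dict(counts))
print(skipped)
```

Because every level carries the same kinds of fields (status, duration, output, and a reason when skipped or failed), a client can walk the structure uniformly to report progress while an archive is still executing.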

Together, these standards support several uses:

  • Sharing modeling studies: The standards make it easier for investigators to execute the diverse simulations involved in collaborations with different colleagues and to analyze their outputs.
  • Reusing modeling studies: Similarly, the standards make it easier for investigators to reuse published modeling studies.
  • Peer reviewing modeling studies: Similarly, the standards make it easier for investigators to review modeling studies before they are published in journal articles.
  • Comparing simulation tools: By making it easier to execute multiple simulation tools, the standards make it easier to compare the outputs of multiple simulation tools. The ability to easily compare simulation tools, in turn, makes it easier to find errors in simulation tools, debug simulation tools, and drive standardization in the implementation of simulation algorithms across tools.