Published January 16, 2024 | Version v1
Dataset Open

Write performance with different numbers of OSTs for BeeGFS in PlaFRIM

  • 1. Université de Bordeaux
  • 2. Centre de Recherche Inria Bordeaux - Sud-Ouest

Contributors

Project member:

  • 1. Centre de Recherche Inria Bordeaux - Sud-Ouest

Description

This file contains performance measurements obtained with the IOR benchmarking tool when writing to the BeeGFS parallel file system, following different strategies and using different numbers of OSTs. This data set was used for the experiments reported in [1].


All experiments were conducted on the Bora cluster of the PlaFRIM platform (https://www.plafrim.fr/) between June and December 2023. Its I/O infrastructure is described in [2] (the higher-speed network was used for these experiments).


IOR version 4.1.0+dev was used, with the POSIX API (IOR option: -a POSIX).


The file is in plain-text CSV format. The relevant columns are:

  • "nodes" is the number of compute nodes and "procs" is the total number of processes. Hence, procs/nodes gives the number of processes per node.
  • "filestrategy" is either shared-file (where a single file is accessed by all processes) or file-per-proc (where each process has its own file, obtained by adding the -F IOR option). For shared-file, "spatiality" may be contig (each process has a contiguous portion of the file) or strided (1D-strided access pattern, obtained using the -s IOR option).
  • "reqsize" is the size of each request (IOR option: -t). K and M correspond to KiB and MiB, respectively.
  • "totaldata" is the total amount of data accessed in the experiment. The amount accessed per process (IOR option: -b) is therefore totaldata/procs.
  • "ost_number" is the number of BeeGFS OSTs used. This was configured on a per-directory basis by the system administrators.
  • Each configuration was executed multiple times. The "repetition" column only serves to tell these runs apart: they were executed in random order, so the actual number in "repetition" carries no meaning.
  • "time" is reported in seconds and corresponds to the total time (including open and close) reported by IOR.
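The column semantics above can be sketched in Python as follows. This is only an illustrative sketch, not the processing used by the authors: the helper names (load_rows, size_to_bytes, procs_per_node, bytes_per_proc, median_time_per_config) are hypothetical, the "G" suffix is an assumption, and "totaldata" is assumed to be written either as a plain byte count or with the same K/M suffixes as "reqsize". The column names are taken from the description and should be checked against the actual CSV header.

```python
import csv
import statistics
from collections import defaultdict

# K and M mean KiB and MiB according to the description.
# G (GiB) is an assumption, added in case larger sizes appear in the file.
_UNITS = {"K": 2**10, "M": 2**20, "G": 2**30}

# Columns that identify one configuration; "repetition" is deliberately left out.
CONFIG_COLUMNS = ("nodes", "procs", "filestrategy", "spatiality",
                  "reqsize", "totaldata", "ost_number")


def load_rows(path):
    """Read the CSV file into a list of dictionaries keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def size_to_bytes(text):
    """Convert a size such as '64K' or '1M' to bytes; bare numbers are taken as bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return int(float(text[:-1]) * _UNITS[suffix])
    return int(float(text))


def procs_per_node(row):
    """Number of processes per compute node (procs/nodes)."""
    return int(row["procs"]) // int(row["nodes"])


def bytes_per_proc(row):
    """Amount of data accessed by each process (totaldata/procs), in bytes."""
    return size_to_bytes(row["totaldata"]) // int(row["procs"])


def median_time_per_config(rows):
    """Group repetitions of the same configuration and return the median time (s)."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(column, "") for column in CONFIG_COLUMNS)
        groups[key].append(float(row["time"]))
    return {key: statistics.median(times) for key, times in groups.items()}
```

With the file downloaded locally, median_time_per_config(load_rows("ost_experiments_results_done.csv")) would give one median time per configuration. The median is used here only as one reasonable way to summarize repetitions; the description does not prescribe an aggregation method.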

[1] Alexis Bandet, Francieli Boito, Guillaume Pallez. Scheduling distributed I/O resources in HPC systems. 2024. https://inria.hal.science/hal-04394004

[2] Francieli Boito, Guillaume Pallez, Luan Teylo. The role of storage target allocation in applications' I/O performance with BeeGFS. CLUSTER 2022 - IEEE International Conference on Cluster Computing, Sep 2022, Heidelberg, Germany. https://inria.hal.science/hal-03753813 

Files

ost_experiments_results_done.csv

Size: 2.3 MB
md5:4a565cefb4606fff1fa2b850b2fc8792
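
After downloading the file, the MD5 checksum listed above can be used to check that the copy is intact. The following is a minimal sketch using Python's standard hashlib module; the function name md5_of_file and the local path are hypothetical.

```python
import hashlib

# Checksum listed for ost_experiments_results_done.csv on this record.
EXPECTED_MD5 = "4a565cefb4606fff1fa2b850b2fc8792"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, md5_of_file("ost_experiments_results_done.csv") == EXPECTED_MD5 should be True for an uncorrupted download, assuming the file was saved under that name in the current directory.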