Evaluation of HPC simulation tools for efficient and cost-effective resource provisioning
Description
At the beginning of my internship, I was introduced to some of the tools used at CERN, including the HPCBatch cluster, on which the project would be developed.
The objective of the project is to study a new methodology for evaluating HPC clusters through simulation tools. Simulation makes it possible to estimate the cost-effectiveness and efficiency of a cluster configuration in advance, with precise results. The methodology follows the workflow below:
1. Obtain HPC application traces: by instrumenting applications with standard tools such as Score-P or TAU, we can generate detailed traces for each application and build up a repository of HPC application traces. These traces can be generated by experts in the field (a minimal sketch follows this list).
2. Create simulation scenarios: by changing component parameters, one can model faster CPUs, memory, or storage according to the characteristics offered by vendors (see the second sketch below).
3. Explore cost-effective configurations: the application traces serve as inputs to the simulator for evaluating different hypothetical scenarios, for instance one where we invest in a better network versus one where we invest in faster memory or CPUs (see the third sketch below). Since the simulator has a graphical interface, this step (and the previous one) can be carried out by service managers, or indeed by anybody in charge of resource procurement.
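To make step 1 concrete, the sketch below shows a small mpi4py workload of the kind one might instrument. The program and its parameters are hypothetical; the Score-P invocation in the comments assumes the Score-P Python bindings are installed and that tracing is enabled through the usual environment variable.

```python
# Hypothetical MPI workload used to illustrate trace generation.
# Assuming the Score-P Python bindings are installed, it could be run as
#   mpirun -n 4 python -m scorep --mpp=mpi trace_demo.py
# with SCOREP_ENABLE_TRACING=true, producing OTF2 traces that can be
# archived in the application-trace repository.
import numpy as np
from mpi4py import MPI

def main():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Local computation phase: each rank reduces its own block of data...
    local = np.random.default_rng(seed=rank).random(1_000_000)
    partial = float(local.sum())

    # ...then a communication phase, so the trace captures both.
    total = comm.allreduce(partial, op=MPI.SUM)

    if rank == 0:
        print(f"global sum across {comm.Get_size()} ranks: {total:.2f}")

if __name__ == "__main__":
    main()
```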
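For step 2, one way to encode the "what-if" hardware scenarios is to generate platform description files programmatically, varying CPU speed and network characteristics. The sketch below assumes a SimGrid-style platform format; the scenario names and parameter values are illustrative, not taken from the report.

```python
# Hypothetical scenario generator: emits SimGrid-style platform files in
# which CPU speed and network bandwidth/latency are the tunable knobs.
# File names and parameter values are illustrative assumptions.
from pathlib import Path

TEMPLATE = """<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
<platform version="4.1">
  <cluster id="cluster0" prefix="node-" suffix="" radical="0-{last}"
           speed="{gflops}Gf" bw="{bw}Gbps" lat="{lat}us"/>
</platform>
"""

# One scenario per hardware option a vendor might offer.
SCENARIOS = {
    "baseline":       dict(gflops=1.5, bw=10,  lat=50),
    "faster_cpu":     dict(gflops=3.0, bw=10,  lat=50),
    "faster_network": dict(gflops=1.5, bw=100, lat=5),
}

def write_scenarios(n_hosts: int = 64, outdir: str = "scenarios") -> None:
    Path(outdir).mkdir(exist_ok=True)
    for name, params in SCENARIOS.items():
        xml = TEMPLATE.format(last=n_hosts - 1, **params)
        Path(outdir, f"{name}.xml").write_text(xml)

if __name__ == "__main__":
    write_scenarios()
```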
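Step 3 then amounts to replaying a trace in each scenario and ranking the outcomes by cost. In the sketch below, simulate_makespan is a placeholder for the real trace-driven simulator call, and the per-node prices are made-up illustrative figures.

```python
# Hypothetical cost-effectiveness sweep over the scenarios from step 2.
# simulate_makespan() stands in for the real simulator, which would
# replay an application trace on each platform; a toy analytic model is
# used here so that the sketch is self-contained.

# Assumed per-node prices (arbitrary illustrative numbers).
PRICE = {"baseline": 4_000, "faster_cpu": 6_500, "faster_network": 5_000}

SCENARIOS = {
    "baseline":       dict(gflops=1.5, bw=10),
    "faster_cpu":     dict(gflops=3.0, bw=10),
    "faster_network": dict(gflops=1.5, bw=100),
}

def simulate_makespan(gflops: float, bw: float,
                      work_gflop: float = 50_000, data_gb: float = 200) -> float:
    """Placeholder for a trace-driven simulation; returns seconds.

    Toy model: compute time plus communication time (factor 8
    converts gigabytes to gigabits for the bandwidth term).
    """
    return work_gflop / gflops + 8 * data_gb / bw

def rank_by_cost_effectiveness(n_nodes: int = 64):
    results = []
    for name, hw in SCENARIOS.items():
        seconds = simulate_makespan(**hw)
        cost = PRICE[name] * n_nodes
        # Lower cost x runtime means more science per unit of money.
        results.append((cost * seconds, name, seconds, cost))
    return sorted(results)

if __name__ == "__main__":
    for score, name, seconds, cost in rank_by_cost_effectiveness():
        print(f"{name:15s} makespan={seconds:8.0f}s cost={cost:7d} score={score:.3g}")
```

Ranking by cost times makespan is only one possible figure of merit; a procurement team might instead fix a budget and minimize makespan, or fix a deadline and minimize cost.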
Since HPC clusters are expensive, knowing in advance where the bottlenecks will be saves a great deal of effort when optimizing applications. Simulation is also useful for testing which applications are best suited to the existing HPC clusters, with the advantage of not saturating the cluster and delaying other jobs running at that moment. This applies equally to cloud environments and bare-metal clusters.
Files
CERN_openlab_SUM_report_Miguel_Perez.pdf (331.7 kB, md5:7f47d85f11fa98d81584b9313127e7e4)