MSP430FR5969 Basic Block Worst Case Energy Consumption (WCEC) and Worst Case Execution Time (WCET) dataset
Description
This dataset contains around 30 000 basic blocks whose energy consumption and execution time have been measured in isolation on the MSP430FR5969 microcontroller, running at 1 MHz. Basic blocks were executed in a worst-case scenario with respect to the MSP430 FRAM cache and CPU pipeline. The dataset creation process is described thoroughly in [1].
Folder structure
This dataset is composed of the following files:
- basic_blocks.tar.xz: contains all basic blocks (BB) used in the dataset, in a custom JSON format
- data.csv / data.xlsx: contain the measured energy consumption and execution time for each basic block
We first detail how the basic_blocks.tar.xz archive is organized, and then present the CSV/XLSX spreadsheet format.
Basic Blocks
We extracted the basic blocks from a subset of programs of the AnghaBench benchmark suite [2]. The basic_blocks.tar.xz archive consists of the extracted basic blocks organized as JSON files. Each JSON file corresponds to a C source file from AnghaBench and is given a unique identifier. An example (137.json) is shown below; the # comments are annotations and are not part of the file:
{
"extr_pfctl_altq.c_pfctl_altq_init": [
# Basic block 1
[
# Instruction 1 of BB1
[
"MOV.W",
"#queue_map",
"R13"
],
# Instruction 2 of BB1
[
"MOV.B",
"#0",
"R14"
],
# Instruction 3 of BB1
[
"CALL",
"#hcreate_r",
null
]
],
# Basic block 2
[
...
]
]
}
The JSON file contains a dict with a single key pointing to an array of basic blocks. This key is the name of the original C source file in AnghaBench from which the basic blocks were extracted (here extr_pfctl_altq.c_pfctl_altq_init.c). The array contains several basic blocks; each basic block is represented as an array of instructions, and each instruction as an array [OPCODE, OPERAND1, OPERAND2].
Each basic block can then be uniquely identified by two IDs: its file ID and its offset in the file (id=<file id>_<offset>). In our example, basic block 1 is identified by the JSON file ID (137) and its offset in the file (0), giving the ID 137_0. This ID is used to map a basic block to its energy consumption/execution time in the data.csv/data.xlsx spreadsheet.
Energy Consumption and Execution Time
Energy consumption and execution time data are stored in the data.csv file. Below is the extract of the CSV file corresponding to basic block 137_0. The spreadsheet format is described afterwards.
bb_id;nb_inst;max_energy;max_time;avg_time;avg_energy;energy_per_inst;nb_samples;unroll_factor
137_0;3;8.77;7.08;7.04;8.21;2.92;40;50
Spreadsheet format:
- bb_id: the unique identifier of a basic block (cf. Basic Blocks)
- nb_inst: the number of instructions in the basic block
- max_energy: the maximum energy consumption (in nJ) measured during the experiment
- max_time: the maximum execution time (in µs) measured during the experiment
- avg_time: the average execution time (in µs) measured during the experiment
- avg_energy: the average energy consumption (in nJ) measured during the experiment
- energy_per_inst: the average energy consumption per instruction (corresponds to avg_energy/nb_inst)
- nb_samples: how many times the basic block's energy consumption/execution time was measured
- unroll_factor: how many times the basic block was unrolled (cf. Basic Block Unrolling)
Basic Block Unrolling
To measure the energy consumption and execution time on the MSP430, we need to handle the scale difference between the measurement tool's resolution and a single basic block's execution time. This is achieved by duplicating the basic block multiple times while making sure to keep the worst-case memory layout, as explained in the paper. The number of times the basic block has been duplicated is called the unroll_factor.
Values of energy and time are always given per basic block, so they have already been divided by the unroll factor.
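To illustrate the bookkeeping (the raw total below is hypothetical, chosen to be consistent with the avg_energy of the sample row; it is not a value from the dataset): if one run of the unrolled sequence for 137_0 (unroll_factor = 50) drew a total of 410.5 nJ, the per-block figure stored in the spreadsheet is that total divided by 50:

```python
def per_block_energy(total_energy_nj, unroll_factor):
    """Energy of one basic block, given the energy of the whole unrolled sequence."""
    return total_energy_nj / unroll_factor

# Hypothetical raw reading: 410.5 nJ over 50 copies of the block.
per_block_energy(410.5, 50)  # 8.21 nJ per basic block
```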
Dataset description
Features
The selected features after PCA analysis for both energy and time model are listed here: MOV.W_Rn_Rn, MOV.W_X(Rn)_X(Rn), CALL, MOV.B_#N_Rn, ADD.W_Rn_Rn, MOV.W_@Rn_Rn, MOV.W_X(Rn)_Rn, ADD.W_#N_Rn, PUSHM.W_#N_Rn, MOV.W_X(Rn)_ADDR, CMP.W_#N_Rn, MOV.W_&ADDR_X(Rn), MOV.W_Rn_X(Rn), BIS.W_Rn_Rn, RLAM.W_#N_Rn, SUB.W_#N_Rn, MOV.W_&ADDR_Rn, MOV.W_#N_X(Rn), CMP.W_Rn_Rn, BIT.W_ADDR_Rn, MOV.W_@Rn_X(Rn), ADD.W_#N_X(Rn), MOV.W_#N_Rn, AND.W_Rn_Rn, MOV.W_Rn_ADDR, SUB.W_Rn_Rn, MOV.W_ADDR_Rn, MOV.W_X(Rn)_&ADDR, MOV.W_ADDR_ADDR, JMP, ADD_#N_Rn, BIS.W_Rn_X(Rn), SUB_Rn_Rn, MOV.W_ADDR_X(Rn), ADDC_#N_X(Rn), MOV.B_Rn_Rn, CMP.W_X(Rn)_X(Rn), ADD_Rn_Rn, nb_inst, INV.W_Rn_, NOP__, ADD.W_X(Rn)_X(Rn), ADD.W_Rn_X(Rn), MOV.B_@Rn_Rn, BIS.W_X(Rn)_X(Rn), MOV.B_#N_X(Rn), MOV.W_#N_ADDR, AND.W_#N_ADDR, SUBC_X(Rn)_X(Rn), BIS.W_#N_X(Rn), SUB.W_X(Rn)_X(Rn), AND.B_#N_Rn, ADD_X(Rn)_X(Rn), MOV.W_@Rn_ADDR, MOV.W_&ADDR_ADDR, ADDC_Rn_Rn, AND.W_#N_X(Rn), SUB_#N_Rn, RRUM.W_#N_Rn, AND_ADDR_Rn, CMP.W_X(Rn)_ADDR, MOV.B_#N_ADDR, ADD.W_#N_ADDR, CMP.B_#N_Rn, SXT_Rn_, XOR.W_Rn_Rn, CMP.W_@Rn_Rn, ADD.W_@Rn_Rn, ADD.W_X(Rn)_Rn, AND.W_Rn_X(Rn), CMP.B_Rn_Rn, AND.W_X(Rn)_X(Rn), BIC.W_#N_Rn, BIS.W_#N_Rn, AND.B_#N_X(Rn), MOV.B_X(Rn)_X(Rn), AND.W_@Rn_Rn, MOV.W_#N_&ADDR, BIS.W_Rn_ADDR, SUB.W_X(Rn)_Rn, SUB.W_Rn_X(Rn), SUB_X(Rn)_X(Rn), MOV.B_@Rn_X(Rn), CMP.W_@Rn_X(Rn), ADD.W_X(Rn)_ADDR, CMP.W_Rn_X(Rn), BIS.W_@Rn_X(Rn), CMP.B_X(Rn)_X(Rn), RRC.W_Rn_, MOV.W_@Rn_&ADDR, CMP.W_#N_X(Rn), ADDC_X(Rn)_Rn, CMP.W_X(Rn)_Rn, BIS.W_X(Rn)_Rn, SUB_X(Rn)_Rn, MOV.B_X(Rn)_Rn, MOV.W_ADDR_&ADDR, AND.W_#N_Rn, RLA.W_Rn_, INV.W_X(Rn)_, XOR.W_#N_Rn, SUB.W_Rn_ADDR, BIC.W_#N_X(Rn), MOV.B_X(Rn)_ADDR, ADD_#N_X(Rn), SUB_Rn_X(Rn), MOV.B_&ADDR_Rn, MOV.W_Rn_&ADDR, ADD_X(Rn)_Rn, AND.W_X(Rn)_Rn, PUSHM.A_#N_Rn, RRAM.W_#N_Rn, AND.W_@Rn_X(Rn), BIS.B_Rn_X(Rn), SUB.W_@Rn_Rn, CLRC__, CMP.W_#N_ADDR, XOR.W_Rn_X(Rn), MOV.B_Rn_ADDR, CMP.B_X(Rn)_Rn, BIS.B_Rn_Rn, BIS.W_X(Rn)_ADDR, CMP.B_#N_X(Rn), CMP.W_Rn_ADDR, XOR.W_X(Rn)_Rn, MOV.B_Rn_X(Rn), ADD.B_#N_Rn
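These feature names follow an OPCODE_OPERAND1_OPERAND2 pattern (with #N for immediates, Rn for registers, @Rn for register-indirect, &ADDR for absolute addresses). A minimal sketch of turning a basic block into such counts follows; the operand-classification rules here are simplified guesses for illustration, not the exact rules used to build the dataset:

```python
from collections import Counter

def classify(operand):
    """Rough operand abstraction; the dataset's exact rules may differ."""
    if operand is None:
        return ""
    if operand.startswith("#"):
        return "#N"
    if operand.startswith("@"):
        return "@Rn"
    if operand.startswith("&"):
        return "&ADDR"
    if operand.startswith("R") and operand[1:].isdigit():
        return "Rn"
    return "ADDR"

def feature_counts(basic_block):
    """Count OPCODE_OP1_OP2 patterns over a block's [opcode, op1, op2] triples."""
    return Counter(
        f"{op}_{classify(a)}_{classify(b)}" for op, a, b in basic_block
    )
```

Applied to basic block 137_0 from the JSON example, this yields one MOV.W_#N_Rn and one MOV.B_#N_Rn, both of which appear in the feature list above.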
Code
The trained machine learning model, tests, and local explanation code can be found here: WORTEX Machine learning code
Acknowledgment
This work has received French government support granted to the Labex CominLabs excellence laboratory and managed by the National Research Agency in the “Investing for the Future” program under reference ANR-10-LABX-07-01.
Licensing
Copyright 2024 Hector Chabot
Copyright 2024 Abderaouf Nassim Amalou
Copyright 2024 Hugo Reymond
Copyright 2024 Isabelle Puaut
Licensed under the Creative Commons Attribution 4.0 International License
References
[1] Reymond, H., Amalou, A. N., Puaut, I. “WORTEX: Worst-Case Execution Time and Energy Estimation in Low-Power Microprocessors using Explainable ML.” Not yet published.
[2] Da Silva, Anderson Faustino, et al. “Anghabench: A suite with one million compilable C benchmarks for code-size reduction.” 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). IEEE, 2021.
Files
data.csv (5.7 MB)
Additional details
Related works
- Has part: Dataset 10.1109/CGO51591.2021.9370322 (DOI)
Funding
- Labex CominLabs/"Investing for the Future" Program ANR-10-LABX-07-01
- Agence Nationale de la Recherche