Dealing with Hardware Faults in Energy-Efficient Static Schedules of Multi-Variant Programs on Heterogeneous Platforms
We investigate the energy-efficient execution of programs that consist of a sequence of program parts, where each part can be executed by multiple variants on different execution units. We study their behaviour in the presence of crash faults on a computing platform with heterogeneous execution units such as multicore CPUs, GPUs, and FPGAs. To this end, we extend a static scheduling algorithm that computes the sequence of variants with minimum runtime, minimum energy consumption, or a minimum weighted sum of both. The extension covers cases where one or more program variants can no longer be used from some point of execution onward because the underlying execution unit(s) have failed. It retains the benefits of static scheduling known from the fault-free case while avoiding the overhead of re-scheduling when a fault occurs. We evaluate our algorithm with synthetically generated program task graphs. The results indicate that, compared to computing a new schedule for each fault case, our algorithm needs only 55% of the scheduling time for 8 variants.
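To make the setting concrete, the following is a minimal sketch of variant selection over a sequence of program parts, not the algorithm evaluated in this paper. It assumes a simple cost model that the abstract does not specify: each variant has a fixed runtime and energy, the objective is `w * time + (1 - w) * energy`, and a hypothetical constant `switch` cost is charged when consecutive parts run on different units. The names `Variant`, `cost`, and `schedule` are illustrative only. A crash fault is modelled by passing the set of failed units and the index of the first part not yet executed.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    unit: str      # execution unit, e.g. "cpu", "gpu", "fpga"
    time: float    # assumed runtime of this part on that unit
    energy: float  # assumed energy of this part on that unit


def cost(v, w):
    # Weighted sum: w = 1 optimises runtime only, w = 0 energy only.
    return w * v.time + (1.0 - w) * v.energy


def schedule(parts, w, switch=0.0, failed=frozenset(), start=0, prev_unit=None):
    """Backward dynamic program over parts[start:].

    Variants on units in `failed` are skipped. Returns (total_cost, units),
    where units[i] is the unit chosen for part start + i.
    """
    nxt = {}  # unit -> (cost, units) for the suffix beginning at the next part
    for i in range(len(parts) - 1, start - 1, -1):
        cur = {}
        for v in parts[i]:
            if v.unit in failed:
                continue
            # Cheapest continuation, paying `switch` when the unit changes.
            tail_cost, tail = min(
                ((c + (switch if u != v.unit else 0.0), t)
                 for u, (c, t) in nxt.items()),
                key=lambda x: x[0],
                default=(0.0, []),
            )
            total = cost(v, w) + tail_cost
            if v.unit not in cur or total < cur[v.unit][0]:
                cur[v.unit] = (total, [v.unit] + tail)
        if not cur:
            raise ValueError(f"part {i} has no variant on a surviving unit")
        nxt = cur
    # Account for a switch away from the unit used before `start`, if any.
    return min(
        ((c + (switch if prev_unit not in (None, u) else 0.0), t)
         for u, (c, t) in nxt.items()),
        key=lambda x: x[0],
        default=(0.0, []),
    )


# Made-up example data: three parts with two variants each.
parts = [
    [Variant("cpu", 4, 2), Variant("gpu", 1, 5)],
    [Variant("cpu", 3, 2), Variant("gpu", 1, 4)],
    [Variant("cpu", 2, 1), Variant("fpga", 3, 0.5)],
]
fault_free = schedule(parts, w=1.0)
# GPU crashes after part 0 has finished: re-plan parts 1.. without it.
after_gpu_fault = schedule(parts, w=1.0, failed={"gpu"}, start=1, prev_unit="gpu")
```

In this sketch a fault case is handled by a fresh call on the remaining suffix, which corresponds to the baseline of computing a new schedule per fault case. The approach described above instead aims to avoid that repeated work.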