Silent Data Corruptions: Microarchitectural Perspectives
Description
Today more than ever, academia, manufacturers, and hyperscalers acknowledge the major challenge of silent data corruptions (SDCs) and are pursuing solutions that minimize their impact by avoiding, detecting, and mitigating them. Recent studies of large-scale datacenters conducted by Meta and Google report an unexpected rate of silent data corruption incidents attributed to modern microprocessor generations. Despite the acknowledged severity of the phenomenon, particularly at datacenter scale, there is no in-depth analysis of the microarchitectural locations in a complex microprocessor that are most likely to generate an SDC at the program outputs. In this paper, we present a detailed analysis of how faulty behavior in many critical microarchitectural structures of a modern out-of-order microprocessor leads to silent data corruptions. Our analysis unveils several observations, including: (i) the magnitude of silent data corruptions attributed to different hardware structures, (ii) the instruction-related parameters that are more likely to result in a silent data corruption, (iii) the extent to which the operating system affects the occurrence of silent data corruptions, and (iv) the byte positions of a word that are more likely to result in silent data corruptions. Collectively, these findings can inform the design of hardware and software schemes that reduce the likelihood of silent data corruptions.
Files
Name | Size |
---|---|
IEEE_TC_SDCs.pdf (md5:4580a11844e6dd6b40b62773caaecd70) | 508.8 kB |
Additional details
Funding
- NEUROPULS – NEUROmorphic energy-efficient secure accelerators based on Phase change materials aUgmented siLicon photonicS 101070238 (European Commission)
- REBECCA – Reconfigurable Heterogeneous Highly Parallel Processing Platform for safe and secure AI 101097224 (European Commission)
- Vitamin-V – Virtual Environment and Tool-boxing for Trustworthy Development of RISC-V based Cloud Services 101093062 (European Commission)