Published June 13, 2023 | Version v1
Journal article Open

Silent Data Corruptions: Microarchitectural Perspectives

  • 1. University of Athens

Description

Today more than ever before, academia, manufacturers, and hyperscalers acknowledge the major challenge of silent data corruptions (SDCs) and seek solutions that minimize their impact by avoiding, detecting, and mitigating them. Recent studies of large-scale datacenters conducted by Meta and Google report an unexpectedly high rate of silent data corruption incidents attributed to modern microprocessor generations. Despite the acknowledged severity of the phenomenon, particularly at datacenter scale, there is no in-depth analysis of which microarchitectural locations in a complex microprocessor are more likely to generate an SDC at the program outputs. In this paper, we present a detailed analysis of the faulty behavior of many critical microarchitectural structures of a modern out-of-order microprocessor that generate silent data corruptions. Our analysis unveils several observations, including: (i) the magnitude of silent data corruptions attributed to different hardware structures, (ii) the instruction-related parameters that are more likely to result in a silent data corruption, (iii) the extent to which the operating system affects the occurrence of silent data corruptions, and (iv) the byte positions of a word that are more likely to result in silent data corruptions. Collectively, these findings can inform decisions on hardware and software schemes that reduce the likelihood of generating silent data corruptions.

Files

IEEE_TC_SDCs.pdf (508.8 kB)
md5:4580a11844e6dd6b40b62773caaecd70

Additional details

Funding

  • NEUROPULS – NEUROmorphic energy-efficient secure accelerators based on Phase change materials aUgmented siLicon photonicS (European Commission, grant 101070238)
  • REBECCA – Reconfigurable Heterogeneous Highly Parallel Processing Platform for safe and secure AI (European Commission, grant 101097224)
  • Vitamin-V – Virtual Environment and Tool-boxing for Trustworthy Development of RISC-V based Cloud Services (European Commission, grant 101093062)