Published March 4, 2022 | Version v0.1
Dataset Open

A Study of Real-world Data Races in Golang (Artifact)

  • 1. Uber Technologies

Description

The concurrent programming literature is rich with tools and techniques for data race detection. Less is known, however, about deploying race detection at industry scale and about the experience and insights gained from doing so. Golang (Go for short) is a modern programming language that makes concurrency a first-class citizen. Go offers both message passing and shared memory for communication among concurrent threads. Go is gaining popularity in modern microservice-based systems, and data races stand in the way of that growing adoption.

In this paper, using our industrial codebase as an example, we demonstrate that Go developers embrace concurrency and show how the abundance of concurrency, together with language idioms and nuances, makes Go programs highly susceptible to data races.
Google's Go distribution ships with a built-in dynamic data race detector based on ThreadSanitizer. Dynamic race detectors pose scalability and flakiness challenges; we discuss the software engineering trade-offs involved in deploying this detector effectively at scale.

We have deployed this detector in our 50-million-line Go codebase hosting 2100 distinct microservices, found over 2000 data races, and fixed over 1000 of them; the fixes span 790 distinct code patches submitted by 210 unique developers over a six-month period. Based on a detailed investigation of these data race patterns in Go, we make seven high-level observations relating to the complex interplay between the Go language paradigm and data races.

Files

Files (11.1 kB)

md5:3f1d59e3a2ffda3795f790510b214683 (11.1 kB)