Journal article Open Access
NoSQL databases are increasingly used for storing and managing data in business-critical Big Data systems. Software defects (i.e., bugs) in these databases can have severe consequences for the NoSQL services being offered, such as data loss or service unavailability. Thus, it is essential to understand the types of defects that frequently affect these databases, allowing developers to take action in an informed manner (e.g., redirect testing efforts). In this paper, we use Orthogonal Defect Classification (ODC) to classify a total of 4096 software defects from three of the most popular NoSQL databases: MongoDB, Cassandra, and HBase. The results show great similarity in the defects across the three NoSQL systems and, at the same time, differ from the results of research carried out in other domains and types of applications, emphasizing the need for such information. Our results expose the defect distributions in NoSQL databases, provide a foundation for selecting representative defects for NoSQL systems, and, overall, can help developers verify and build more reliable NoSQL database systems.