Bug Causes

We classified the examined bugs into categories based on their root cause. To do so, we studied the fix of each bug and identified which specific compiler procedure was buggy. From our manual inspection, we derived five categories that include bugs sharing common root causes:

  • Type-related Bugs

  • Semantic Analysis Bugs

  • Resolution Bugs

  • AST Transformation Bugs

  • Bugs Related to Error Handling & Reporting

In the following, we provide descriptions and examples for every category.

Semantic Analysis Bugs

Semantic analysis occupies an important space in the design and implementation of compiler front-ends. A compiler traverses the whole program and analyzes each program node individually (i.e., declaration, statement, and expression) to type it and verify whether it is well-formed based on the corresponding semantics. A semantic analysis bug is a bug where the compiler yields wrong analysis results for a certain program node. A semantic analysis bug occurs due to one of the following reasons:

  • Missing validation checks

  • Incorrect analysis mechanics

Missing Validation Checks

This sub-category includes cases where the compiler fails to perform a validation check while analyzing a particular node. This mainly leads to unexpected runtime behaviors, because the missing check makes the compiler accept a semantically invalid program. In addition to these false negatives, missing checks may affect later compiler phases. For example, assertion failures can arise when subsequent phases (e.g., the back-end) make assumptions about program properties that were supposedly validated by previous stages. Some indicative examples of validation checks include validating that a class does not inherit two methods with the same signature, that a non-abstract class does not contain abstract members, that a pattern match is exhaustive, and that a variable is initialized before use.

Example:

Scala2-5878

This example demonstrates a semantic analysis bug related to a missing validation check. The program defines two value classes A and B with a circular dependency: the parameter of A refers to B, and the parameter of B refers to A. However, scalac does not detect this dependency problem when checking the validity of these declarations. As a result, scalac crashes at a later stage, when it tries to unbox these value classes based on the type of their parameter. The developers of scalac fixed this bug by adding a rule for detecting circular dependencies in value classes.

case class A(x: B) extends AnyVal
case class B(x: A) extends AnyVal
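
The kind of rule described above can be pictured as a cycle check over the relation between a value class and the type of its single parameter. The following Java sketch is only an illustration under stated assumptions: the class ValueClassCycleCheck, the method hasCycle, and the map-based representation of declarations are hypothetical and do not reflect scalac's actual implementation.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: each value class name maps to the type name of its
// single parameter (its "underlying" type).
class ValueClassCycleCheck {

    // Returns true if following underlying types from `start` revisits a
    // value class, i.e., the declarations depend on each other circularly.
    static boolean hasCycle(String start, Map<String, String> underlying) {
        Set<String> seen = new HashSet<>();
        String current = start;
        // Stop as soon as we reach a type that is not a value class (e.g., Int).
        while (underlying.containsKey(current)) {
            if (!seen.add(current)) {
                return true; // visited twice: circular dependency
            }
            current = underlying.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        // Models: case class A(x: B) extends AnyVal; case class B(x: A) extends AnyVal
        Map<String, String> cyclic = Map.of("A", "B", "B", "A");
        // Models: case class C(x: Int) extends AnyVal
        Map<String, String> acyclic = Map.of("C", "Int");
        System.out.println(hasCycle("A", cyclic));
        System.out.println(hasCycle("C", acyclic));
    }
}
```

A check of this shape, run while the declarations are validated, would reject the program above with a regular error message before any later phase tries to unbox A or B.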

Incorrect Analysis Mechanics

A common issue related to semantic analysis bugs is incorrect analysis mechanics. This sub-category contains bugs whose root causes lie in the analysis mechanics and design rather than in the implementation of type-related operations, i.e., these bugs are specific to the compiler steps used for analyzing and typing certain language constructs. Incorrect analysis mechanics mostly cause compiler crashes and unexpected compile-time errors.

Example:

Dotty-4487

In this bug, the compiler crashes when it types class A extends (Int => 1), because Dotty incorrectly treats Int => 1 as a term (i.e., a function expression) instead of a type (i.e., a function type). Specifically, Dotty invokes the method for typing Int => 1 as a function expression. However, this method crashes because the given node does not have the expected format. Dotty developers fixed this bug by typing Int => 1 as a type.

object i0 {
  def main(i1: Array[String]): Unit = {
    class i2
  }
  class i3(i4: => String) extends (i1 => (this i9)): Option[String, Int] => 1
}

Resolution Bugs

One of a compiler’s core data structures is the one that represents scope. Scope is mainly used for associating identifier names with their definitions. When a compiler encounters an identifier, it examines the current scope and applies a set of rules to determine which definition corresponds to the given name. In OO languages, where features such as nested scopes, overloading, or access modifiers are prevalent, name resolution is a complex and error-prone task. A resolution bug is a bug where the compiler is either unable to resolve an identifier name, or the retrieved definition is not the right one. A resolution bug is caused by one of the following scenarios:

  • there are correctness issues in the implementation of resolution algorithms

  • the compiler performs a wrong query

  • the scope is in an incorrect state (e.g., there are missing entries)

The symptoms of resolution bugs are mainly unexpected compile-time errors (when the compiler cannot resolve a given name or considers it ambiguous) or unexpected runtime behaviors (when resolution yields wrong definitions).
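
To make the notion of scope-based resolution more concrete, here is a minimal Java sketch of a nested scope with innermost-first lookup. It is a simplification for illustration only: the class Scope and its methods declare and resolve are hypothetical names, definitions are modeled as plain strings, and real compilers additionally handle overloading, access modifiers, and imports.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical scope: maps names to definitions and links to the enclosing scope.
class Scope {
    private final Map<String, String> entries = new HashMap<>();
    private final Scope parent; // null for the outermost scope

    Scope(Scope parent) {
        this.parent = parent;
    }

    void declare(String name, String definition) {
        entries.put(name, definition);
    }

    // Innermost-first lookup: a definition in a nested scope shadows an outer one.
    Optional<String> resolve(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String definition = s.entries.get(name);
            if (definition != null) {
                return Optional.of(definition);
            }
        }
        return Optional.empty(); // a compiler would report an unresolved name here
    }

    public static void main(String[] args) {
        Scope outer = new Scope(null);
        outer.declare("x", "field x");
        Scope inner = new Scope(outer);
        inner.declare("x", "local x");
        System.out.println(inner.resolve("x").orElse("<unresolved>"));
        System.out.println(outer.resolve("x").orElse("<unresolved>"));
        System.out.println(inner.resolve("y").orElse("<unresolved>"));
    }
}
```

In terms of this sketch, the three scenarios listed above correspond to a faulty resolve loop (e.g., walking the scopes in the wrong order), a call to resolve with the wrong name or on the wrong scope, and a missing declare call that leaves the scope without an entry.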

Example:

JDK-7042566

In this example, for the method call at line 4, javac finds that there are two applicable methods (see lines 6 and 7). When more than one method is applicable to a given call, javac chooses the most specific one according to the rules of the JLS. In our example, the method error defined at line 7 is the most specific one, as its signature is less generic than the signature of error defined at line 6: the second parameter of error at line 7 (Throwable) is more specific than the second parameter of error at line 6 (Object). However, a bug in the way javac applies this check to methods that take a variable number of arguments (e.g., Object...) makes the compiler treat these methods as ambiguous and ultimately reject the code.

class Test {
  void test() {
    Exception ex = null;
    error("error", ex);
  }
  void error(Object o, Object... p) { }
  void error(Object o, Throwable t, Object... p) { }
}
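
To observe which overload is selected, the following self-contained Java sketch mirrors the two overloads of the example as static methods that return a label. The class name OverloadDemo, the helper method call, and the returned labels are assumptions introduced here for illustration. On a javac version that includes the fix, the call is expected to compile and to select the overload with the Throwable parameter.

```java
class OverloadDemo {
    // Mirrors error(Object, Object...) from the example.
    static String error(Object o, Object... p) {
        return "Object...";
    }

    // Mirrors error(Object, Throwable, Object...) from the example.
    static String error(Object o, Throwable t, Object... p) {
        return "Throwable";
    }

    // Same call shape as in the bug report: a String and an Exception argument.
    static String call() {
        Exception ex = null;
        return error("error", ex);
    }

    public static void main(String[] args) {
        // Prints the label of the overload chosen at compile time.
        System.out.println(call());
    }
}
```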

AST Transformation Bugs

The semantic analyses of a compiler work on a program’s abstract syntax tree (AST). Before or after typing, a compiler applies diverse transformations so that the program is expressed in terms of simpler constructs. For example, javac applies a transformation that converts a foreach loop over a list of integers for (Integer x: list) into a loop of the form for (Iterator<Integer> it = list.iterator(); it.hasNext();) whose body first binds Integer x = it.next(). An AST transformation bug is a bug where the compiler generates a transformed program that is not equivalent to the original one, something that invalidates subsequent analyses.
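
The equivalence that such a transformation must preserve can be illustrated with a small Java sketch that writes the foreach loop and a hand-expanded iterator loop side by side. The class ForEachDesugaring and its method names are hypothetical, and the expanded form is written by hand here; it approximates, rather than reproduces, the tree that javac generates.

```java
import java.util.Iterator;
import java.util.List;

class ForEachDesugaring {
    // Source-level form: enhanced for loop over a list of integers.
    static int sumSugared(List<Integer> list) {
        int sum = 0;
        for (Integer x : list) {
            sum += x;
        }
        return sum;
    }

    // Hand-written expansion: explicit iterator, with x bound at the
    // start of each iteration.
    static int sumDesugared(List<Integer> list) {
        int sum = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            Integer x = it.next();
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = List.of(1, 2, 3);
        // Both forms must compute the same result for the transformation to be correct.
        System.out.println(sumSugared(list) == sumDesugared(list));
    }
}
```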

Example:

Scala2-6714

This Scala 2 program defines a class B with two special methods named apply and update (lines 2–5). The apply method allows developers to treat an object as a function. For example, a variable x pointing to an object of class B can be used like x(10), which is equivalent to x.apply(10). Furthermore, the update method is used for updating the contents of an object. For example, a variable x of type B can be used in map-like assignment expressions of the form x(10) = 5, which is equivalent to calling x.update(10, 5). Notice that in our example, the apply method takes an implicit parameter of type A. This means that when calling this method, this argument may be omitted, letting the compiler pass it automatically by looking into the current scope for implicit definitions of type A. Before scalac types the expression on line 9, it “desugars” this assignment and expresses it in terms of method calls. Specifically, b(3) += 4 should become b.update(3, b.apply(3)(a) + 4). However, due to a bug, scalac ignores the implicit parameter list of apply, and therefore expands the assignment of line 9 as b.update(3, b.apply(3) + 4). Consequently, the expanded method call does not type check, and scalac rejects the program.

class A
class B {
  def apply(x: Int)(implicit a: A) = 1
  def update(x: Int, y: Int) { }
}
object Test {
  implicit val a = new A()
  val b = new B()
  b(3) += 4 // compile-time error here
}
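
The difference between the intended and the buggy expansion can be sketched with a toy desugaring function written in Java. This is a deliberately simplified, string-based model: the class UpdateDesugaring, the method desugar, and the implicitArgs parameter are hypothetical and only stand in for the tree-based transformation that scalac performs.

```java
import java.util.List;

class UpdateDesugaring {
    // Expands `recv(index) op= rhs` into update/apply calls.
    // `implicitArgs` models the arguments resolved for apply's implicit
    // parameter list; an empty list models dropping that parameter list.
    static String desugar(String recv, String index, String op, String rhs,
                          List<String> implicitArgs) {
        String applyCall = recv + ".apply(" + index + ")";
        if (!implicitArgs.isEmpty()) {
            applyCall += "(" + String.join(", ", implicitArgs) + ")";
        }
        return recv + ".update(" + index + ", " + applyCall + " " + op + " " + rhs + ")";
    }

    public static void main(String[] args) {
        // Intended expansion: the implicit argument a is threaded through.
        System.out.println(desugar("b", "3", "+", "4", List.of("a")));
        // Buggy expansion: the implicit parameter list is ignored.
        System.out.println(desugar("b", "3", "+", "4", List.of()));
    }
}
```

The second expansion leaves apply without its implicit argument, which is why the expanded call no longer type checks in the example.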