Practical fault detection in Puppet programs

Puppet is a popular computer system configuration management tool. By providing abstractions that model system resources it allows administrators to set up computer systems in a reliable, predictable, and documented fashion. Its use suffers from two potential pitfalls. First, if ordering constraints are not correctly specified whenever a Puppet resource depends on another, the non-deterministic application of resources can lead to race conditions and consequent failures. Second, if a service is not tied to its resources (through the notification construct), the system may operate in a stale state whenever a resource gets modified. Such faults can degrade a computing infrastructure's availability and functionality. We have developed an approach that identifies these issues through the analysis of a Puppet program and its system call trace. Specifically, a formal model for traces allows us to capture the interactions of Puppet resources with the file system. By analyzing these interactions we identify (1) resources that are related to each other (e.g., operate on the same file), and (2) resources that should act as notifiers so that changes are correctly propagated. We then check the relationships from the trace's analysis against the program's dependency graph: a representation containing all the ordering constraints and notifications declared in the program. If a mismatch is detected, our system reports a potential fault. We have evaluated our method on a large set of popular Puppet modules, and discovered 92 previously unknown issues in 33 modules. Performance benchmarking shows that our approach can analyze, within seconds, real-world configurations comprising thousands of lines of code and millions of system calls.


INTRODUCTION
The prevalence of cloud computing and the advent of microservices have made the management of multiple deployment and testing environments a challenging and time-consuming task [8,23,25,34]. Infrastructure as Code (IaC) methods and tools automate the setup and provisioning of these environments, promoting reliability, documentation, and reuse [34]. Specifically, IaC (1) boosts the reliability of an infrastructure, because it minimizes human intervention, which is both laborious and error-prone; (2) ensures the predictability and consistency of the final product, because it eases the repetition of the steps followed to produce a specific outcome; and (3) allows the documentation and reuse of a system's configuration, because it associates the system's configuration with modular code [15,23,34,38,40].
Puppet [21] is one of the most popular system configuration tools used to manage infrastructures [28,32]. It abstracts the state of different system entities such as files, users, software packages, or running processes, in a declarative manner using built-in primitives called resources. A Puppet program consists of a collection of resources that the underlying execution engine applies one-by-one so that the system eventually reaches the desired state.
By default, any execution sequence of resources is valid, unless there are specific ordering constraints imposed by their interdependencies, e.g., an Apache service should be run only after the installation of the corresponding package. Developers need to declare these ordering constraints in the program to avoid erroneous execution sequences, such as trying to start a service before the installation of its package. Conceptually, Puppet captures all the ordering relationships defined in a program through a directed acyclic graph and applies each resource in topological order. In this context, all unrelated resources are processed in a non-deterministic order. Furthermore, Puppet allows programmers to apply certain resources whenever specific events take place via a feature called notification. Notifications propagate changes to related resources, ensuring that their state is up-to-date. For instance, when a configuration file changes, the corresponding service has to be notified so that it runs with the new settings.
Tracking all the required ordering constraints and notifications is a complicated task though, mostly because developers are not always aware of the actual interactions of Puppet with the underlying operating system. Notably, such errors can have a negative impact on the reliability of an organization's infrastructure, leading to inconsistencies [32] and outages [10]. For example, GitHub's services became unavailable when a missing notifier in their Puppet codebase caused a chain of failures, such as DNS timeouts [10].
Approaches that automatically detect these mistakes in production code [13,32] have significant room for improvement, facing limitations that prevent them from being practical. Rehearsal [32] employs static code verification to detect faults in Puppet programs. Nevertheless, it cannot manage realistic cases because it is unable to handle Puppet resources that abstract arbitrary shell commands. Notably, such resources (e.g., exec) are highly pervasive, as they appear in 56% of the top-1000 most widely used Puppet modules found in the Forge API [27]. Additionally, the model-based testing approach adopted by Citac [13] imposes a significant overhead (analyzing around 100 modules with Citac takes roughly 9 days) and restrictions on the supported Puppet programs under test (they must be able to run in Docker containers). It also requires extra instrumentation to be added to the execution engine of Puppet. Finally, none of those tools addresses missing notification faults.
We have developed a practical and effective approach to identify faults involving ordering violations and notifiers in Puppet programs. To do so, we record the system call trace produced by a single Puppet execution. Then, we operate as follows. First, we model the system call trace of a Puppet execution in a representation (namely, FStrace) that allows us to precisely capture the interactions of higher-level programming constructs (Puppet resources) with the file system. By examining their interplay, we infer the set of the expected relationships of every Puppet resource with each other. These relationships correspond to either notifications (e.g., resource x should notify resource y) or ordering constraints (e.g., x should run before y). Then, for the given Puppet program, we statically build the dependency graph that reflects all the ordering relationships and notifications that have been specified by the developer. Finally, we verify whether the expected relationships (as specified from the analysis of traces) hold with respect to the dependency graph. Unlike previous tools [13,32], our approach (1) can reason about which system resources are affected by the program's execution and how, and (2) requires only a single Puppet run for discovering issues.

Contributions. Our work makes the following contributions:

• We introduce FStrace, a representation for system call traces that models the intricate semantics of system calls and their effects on the file system. Building upon FStrace, we propose a novel trace analysis that allows us to infer the inter-relationships among Puppet resources (Section 3).

OVERVIEW
We provide a brief overview of Puppet, two motivating examples that demonstrate the types of errors our approach detects, and how our approach is structured.
Puppet. Puppet allows developers to describe the desired state of a system through a declarative specification language. For example:

1 $service_name = "apache2"
2 $conf_file = "/etc/apache2/apache2.conf"
3 package {$service_name: ensure => "installed"}
4 file {$conf_file: ensure => "file"}
5 service {$service_name: ensure => "running"}

The code above indicates that the apache2 package should be installed in the host, the file apache2.conf should exist in the /etc/apache2/ path, and that the Apache server should be running. There are diverse types for abstracting system resources, including file, package, service, and exec. Also, the Puppet language provides variable declarations that begin with the "$" symbol (e.g., $service_name, $conf_file), conditionals, loops, and, for reusability, the creation of custom resources and classes.
Puppet code is stored in files called manifests. Puppet compiles manifests into executables called catalogs. Catalogs are JSON documents that specify all the resources that Puppet needs to apply in a particular system to reach the desired state [19]. The following JSON snippet shows a part of the catalog derived from the previous Puppet code:

1 { "resources": [ {
2     "title": "/etc/apache2/apache2.conf",
3     "type": "File",
4     "parameters": { "ensure": "file" }
5   }, { /* another resource... */ } ] }

The field resources contains the Puppet resources declared in the initial program along with their parameters. During catalog compilation, every variable defined in the manifests resolves to its value, e.g., the variable $conf_file resolves to /etc/apache2/apache2.conf (line 2 of the earlier Puppet code).
Puppet evaluates the compiled catalogs and applies potential changes if the system is not in the appropriate state. For example, if a file does not exist at a certain location, Puppet will create it.
Motivating Examples. We present two examples of faulty programs that demonstrate the issues that our approach addresses.
Missing Ordering Relationships (MOR) occur when a developer fails to define a happens-before relation between two Puppet resources that depend on each other. This can lead to unstable code that behaves correctly in some circumstances, but breaks in others depending on the order in which Puppet processes resources.
1 package {["libssl", "apache2"]: ensure => "latest" } 2 file {"/etc/apache2/apache2.conf": File["/etc/apache2/apache2.conf"] ] 10 } Figure 2: A program missing a notifier. whose contents are specified by the content parameter at line 4. Note that Puppet evaluates the file resource after package. This is expressed through the require property at line 5. In lines 7 to 10, the program declares the execution of a shell script (mysqld --initialize) that prepares the database according to the settings specified in the /etc/mysql/my.cnf file. Although the shell command needs to be invoked only after the file /etc/mysql/my.cnf is configured (lines 2-6), the require parameter at line 9 omits this dependency. Therefore, applying the exec resource before file makes Puppet set up the database with incorrect settings. A static analyzer (such as Rehearsal) cannot extract this dependency because it is unable to infer that the underlying shell command (mysqld --initialize) consumes the database configuration file.
Missing Notifiers (MN). Notifiers are necessary for services. An update to a resource (e.g., a configuration file) can directly affect the state of a service. To ensure that all services run in a fresh environment, Puppet triggers the restart of a service whenever there is a change to one of the service's dependent resources, via notifications declared by the programmer.
[Figure 2: A program missing a notifier.]

A missing notifier issue is illustrated in Figure 2. The code first installs the latest version of the libssl and apache2 packages (line 1), configures the file located at /etc/apache2/apache2.conf (lines 2-5), and then boots the Apache server (lines 6-10). The subscribe primitive (line 8) creates a notifier that restarts Apache whenever there is a change to the corresponding configuration file or an update to the apache2 package (e.g., a newer version is installed in the system). However, the code lacks a notifier from the libssl package to the Apache service. A change to libssl requires the restart of Apache so that the server maps the updated version of the library to its memory (i.e., Apache maps the /usr/lib/libssl.so file created during the installation of the libssl package). Failing to do so makes the server miss the latest updates or security patches of the library. Once again, a static analyzer is not capable of inferring such dependencies (i.e., the fact that Apache depends on /usr/lib/libssl.so), because they are hidden from the corresponding manifests.

Framework. To address these issues, we propose a framework, illustrated in Figure 3, that consists of the following components. The executor applies the Puppet manifests given as input by invoking the actual Puppet execution engine. Also, the executor intercepts Puppet as follows. First, it stores the compiled catalog of the program before it is applied to the system, and second, it monitors the system calls of the Puppet process and its descendants, generating the system call trace that stems from the catalog application.
The analyzer (Section 3) operates on the system call trace produced by the executor, and performs two steps. In the first step (parser), the analyzer splits system calls into different blocks corresponding to the execution of every Puppet resource defined in the initial program. Then, it parses the system call trace and transforms it into an FStrace representation, which is used to model the semantics of every system call and its effects on the file system. In the next phase (interpreter), the analyzer evaluates the resulting FStrace program and infers the set of relationships (i.e., ordering constraints and notifications) between the declared Puppet resources. To do so, it inspects their interactions with the file system in terms of the files consumed, produced, or expunged by their execution.
Finally, the fault detector (Section 4) first builds the dependency graph by examining the parameters of every Puppet resource specified in the compiled catalog given as input. The dependency graph is a directed acyclic graph that contains all the actual ordering relationships and notifications declared in the original Puppet program by the programmer. Then, the fault detector compares the generated graph against the expected relationships inferred from the output of the analyzer. If a mismatch is identified, the fault detector reports a potential fault.
Trace Example. To generate traces, the executor employs a system call tracing program [22,30], namely strace. Figure 4 shows a fragment of the trace collected when applying the program of Figure 1. Each line denotes an invocation of a system call along with the process (pid) that triggered it. For example, the entry 103 close(7) = 0 states that the process with id 103 invoked close with 7 as an argument, and that the system call returned 0. By further inspecting Figure 4, we observe that Puppet initially processes the Exec[Initialize MySQL DB] resource (lines 1-6), and then the File[/etc/mysql/my.cnf] resource (lines 7-12). Observe the calls of write at lines 1, 6-7, and 12, which correspond to messages printed to the standard output by the Puppet engine for debugging purposes. These messages indicate the points where the application of each resource starts and ends respectively. The analyzer exploits these points to classify system calls according to the Puppet resource they come from (Section 3.2).

ANALYZING SYSTEM CALL TRACES
To tackle the complexity and interactions of realistic system call traces (Section 3.1), we design our trace analysis as an interpretation over a trace language used to abstract each system call and map it to the Puppet resource it comes from (Section 3.2). The output of the analysis is the set of the expected relationships (ordering constraints and notifications) among declared resources. To produce this output, the analysis examines the file system interactions stemming from Puppet execution in terms of the files that are consumed, produced, or expunged (Section 3.3). The analysis output is later used by our fault detector (Section 4).

Motivation of Design Choices
Previous work on trace analysis has focused on extracting dependencies from build scripts (e.g., Make for C/C++ programs) for testing and refactoring purposes [11,20], or on identifying license inconsistencies in software projects [37]. These approaches cannot be applied to the domain of Puppet for the following two reasons.
Low Fidelity. Typically, the system call traces that come from the execution of Puppet manifests are far more complex than those of build scripts. Indeed, Puppet traces involve the application of diverse entities-such as the execution of arbitrary scripts, the configuration and management of different services, the installation of packages, and more-that, apart from simply reading and writing files, perform many operations on transient OS structures (e.g., the file descriptor table, the process table, etc.).
For a precise trace analysis (e.g., correctly resolving the absolute file paths that a system call works on), we need to accurately track all the operations performed on these structures. This requires a careful design to deal with the complexities and semantics of the system calls as well as the underlying structure of the file system. For instance, many processes spawned by Puppet share their file descriptor table or working directory with their child processes. Hence, any update to any of those entities performed by one process also affects the other one. An analysis that ignores this behavior ends up with stale information about the file descriptor table and working directory of each process.
As another example, consider the following trace:

1 100 open("/usr/lib/perl5", O_RDONLY) = 3
2 100 rename("/usr/lib/perl5", "/usr/lib/perl") = 0
3 100 openat(3, "5.26/Socket6.pm", O_RDONLY) = 4

To determine the absolute path that openat operates on (/usr/lib/perl/5.26/Socket6.pm), we have to interpret the path 5.26/Socket6.pm relative to the directory referred to by file descriptor 3. To do so, we need to consider that in a Unix-like file system, every file is associated with an inode rather than a path. Thus, the file-related OS structures (e.g., the file descriptor table) have to refer to inodes instead of path names. Although the correct resolution of absolute paths is an important issue, existing approaches ignore the organization of the file system. For example, they either cannot resolve the absolute path of the openat system call [37] (they do not support file descriptors), or they describe files through their paths [20]. The latter makes the corresponding file descriptor table hold the stale entry (3, /usr/lib/perl5) after the rename at line 2.
Granularity. Given the system call trace of a build script such as Make, previous approaches assume that every build operation is performed through a separate process [20,37]. In this context, the analysis estimates the dependencies among source files (e.g., the object file foo.o depends on the header file bar.h) by computing the files read and written by each process, and taking into consideration that each output file depends on every input file. To verify the inferred dependencies, the existing work [20] triggers incremental builds by touching the input files, and then checks whether the output files are indeed changed in response to the updates of their dependencies.
However, the granularity of processes is not effective for domains such as Puppet, where the same system process may involve different operations (Puppet resources). In such cases, the existing work [20] is not able to distinguish which file system resources are affected by which operation (Puppet resource)-and therefore, it is unable to infer the precise inter-relationships among execution steps (Puppet resources).
We design an approach to analyzing system call traces that overcomes the limitations of the previous approaches as follows.
To tackle low fidelity, we introduce FStrace, a representation that enables us to: (1) translate every system call into primitives that conquer the large number and complexity of POSIX (Unix/Linux) system calls by decomposing them into simpler building blocks (Section 3.2), and (2) formally model the effects of system calls on the file system and the transient OS structures (Section 3.3.1).
To address granularity, we split the main process that governs the execution (Puppet process) into different blocks that indicate the boundaries of every operation (Puppet resource). This allows us to infer the inter-relationships between all Puppet resources by examining their interactions with the underlying operating system (Section 3.3.2). In turn, this enables us to combine (as explained in Section 4) the trace analysis output (low-level analysis) with the program's relationships inferred statically by analyzing compiled Puppet catalogs (high-level analysis). This makes our approach efficient, as we are able to detect faults by monitoring only one Puppet execution, i.e., we do not need to apply Puppet manifests multiple times (as in the case of incremental builds) to verify the relationships extracted by the trace analysis. Note that our treatment of system call traces is generic and can be applied to other domains such as Make or Java Maven. In this case, the boundaries of every execution phase correspond to the application of every build rule.

Modeling System Call Traces
The first step of our analyzer is to parse a given system call trace and transform it into an FStrace representation. FStrace primitives are designed to model system calls that operate on file system resources. Some of the constructs are generic enough to represent a whole family of system calls. Complex system calls are represented with a number of FStrace primitives, which decouples their intricacies from one another. We group system calls into execution blocks, and we assign a unique id to each block.
The syntax of FStrace is shown in Figure 5. It consists of file names, paths (which are sequences of file names), and file descriptors represented by either an integer or the at_fdcwd construct. We also include: (1) the constructs fd and cwd that indicate what kind of entities a spawned process shares with its parent, (2) primitives (consumed, produced, expunged) that stand for the types of the effect that a system call has on a file, and (3) an infinite set of unique identifiers for execution blocks. A trace is a sequence of blocks. A block has the following syntax: begin b (z, s)* end, where b is its id and (z, s)* is a sequence of trace entries. Each trace entry (z, s) consists of a process id (pid) z, which is a positive integer, and a system call s.
FStrace models every system call s ∈ Sys using eleven constructs. We have setcwd that changes the working directory of the current process, and three primitives for performing operations on file descriptors: (1) newfd creates a new file descriptor and relates it to the given path p, (2) delfd deletes the provided file descriptor from the corresponding table of the process, and (3) dupfd copies a given file descriptor and is used to model a number of system calls, such as dup-like system calls or fcntl(fd, F_DUPFD). FStrace supports hard and symbolic links through the link and symlink constructs respectively, while it offers newproc for spawning new processes. FStrace models operations on file paths explicitly through the hpath primitive. hpath d p m captures the effect m that a system call has on the path p. We consider p relative to the file descriptor d, when p is not an absolute path. When d is at_fdcwd, we interpret p as relative to the current working directory. hpath models the system calls that work on file paths. For instance, we represent the system call mkdir("foo/bar")-which creates a new directory at path foo/bar-as hpath at_fdcwd ("foo", "bar") produced. The hpathsym primitive operates in a way similar to hpath. In hpathsym though, if the given path is a symbolic link, we do not dereference it. Through hpathsym we express system calls that do not follow symbolic links, such as lstat, lchown, and lgetxattr. The rename primitive arranges for an existing path to be accessed through a new file path. Finally, FStrace has a dedicated construct (nop) to model all system calls (e.g., write, sync) that we do not need to take into account.
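To make this encoding concrete, the following OCaml sketch shows one possible way to represent the FStrace syntax as algebraic data types. The constructor and field names are illustrative only; they do not correspond to our actual implementation.

type filename = string
type path = filename list                  (* a path is a sequence of file names *)
type fd = Fd of int | At_fdcwd             (* file descriptors *)
type eff = Consumed | Produced | Expunged  (* effect of a system call on a file *)
type shared = ShareFd | ShareCwd           (* what a spawned process shares with its parent *)
type block_id = string                     (* unique identifier of an execution block *)
type pid = int

(* The eleven constructs used to model a system call. *)
type syscall =
  | Setcwd of path                         (* change the working directory *)
  | Newfd of fd * path * int               (* bind a path to a fresh descriptor *)
  | Delfd of int                           (* remove a descriptor from the fd table *)
  | Dupfd of int * int                     (* duplicate a descriptor (dup, fcntl F_DUPFD) *)
  | Link of path * path                    (* hard link *)
  | Symlink of path * path                 (* symbolic link *)
  | Newproc of shared list * pid           (* spawn a process, possibly sharing fd table / cwd *)
  | Hpath of fd * path * eff               (* effect on a path, dereferencing symbolic links *)
  | Hpathsym of fd * path * eff            (* effect on a path, not dereferencing symbolic links *)
  | Rename of path * path                  (* an existing path becomes accessible via a new one *)
  | Nop                                    (* system calls we ignore (write, sync, ...) *)

type entry = pid * syscall                 (* a trace entry: the process and the call it made *)
type block = { id : block_id; entries : entry list }
type trace = block list

(* mkdir("foo/bar") is modeled as a single hpath entry: *)
let _mkdir_example : entry = (100, Hpath (At_fdcwd, ["foo"; "bar"], Produced))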
To leverage FStrace, the analyzer classifies system calls according to the applied Puppet resource that triggered them. Our analyzer presumes that an execution block begins or ends when the evaluation of the corresponding resource starts or terminates, because Puppet processes every resource atomically. In this context, the name of the execution block corresponds to the name of the Puppet resource. It is easy to identify the points where the evaluation of a Puppet resource starts or finishes by decoding Puppet's debug messages. Recall from Figure 4 that those messages appear in the execution traces as writes to the standard output. During trace parsing, the analyzer detects those debug messages and marks them as the entry and exit points of execution blocks.
For example, consider again the trace in Figure 4. We can model the trace entry at line 1 as the entry point of an execution block whose name is "Exec[Initialize MySQL DB]", whereas the system call at line 6 signals the ending of that execution block. Hence, all system calls that appear between lines 1 and 6 are included in this block. Figure 6 shows the complete FStrace representation of the trace shown in Figure 4. Notice that some system calls are represented through a number of FStrace primitives. For instance, we model open("/etc", O_RDONLY)=3 with hpath to indicate that we consume the file /etc, and newfd to associate the provided path with the file descriptor 3 returned by open.
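The listing below sketches, again in OCaml and reusing the types of the previous listing, how decoded strace entries could be mapped to FStrace primitives and how block boundaries could be recognized. The "start:"/"end:" markers are placeholders for Puppet's actual debug messages, whose exact wording we omit here; the parser itself is a simplified sketch, not the tool's code.

let split_path s =                            (* "/etc/apache2" -> ["etc"; "apache2"] *)
  String.split_on_char '/' s |> List.filter (fun c -> c <> "")

(* One decoded strace entry: system call name, arguments, return value. A single
   call may expand to several primitives, e.g., open = hpath + newfd. *)
let model_syscall (name : string) (args : string list) (ret : int option) : syscall list =
  match name, args, ret with
  | "open", [ p; _flags ], Some fd ->
      [ Hpath (At_fdcwd, split_path p, Consumed);   (* the path is consumed ...    *)
        Newfd (At_fdcwd, split_path p, fd) ]        (* ... and bound to the new fd *)
  | "mkdir", p :: _, _ -> [ Hpath (At_fdcwd, split_path p, Produced) ]
  | "chdir", [ p ], _ -> [ Setcwd (split_path p) ]
  | "close", [ fd ], _ -> [ Delfd (int_of_string fd) ]
  | _ -> [ Nop ]                                    (* write, sync, and everything else *)

let is_prefix pre s =
  String.length s >= String.length pre && String.sub s 0 (String.length pre) = pre

(* Recognize the debug writes that open or close an execution block;
   the marker strings are hypothetical. *)
let block_boundary (msg : string) =
  let tail pre = String.sub msg (String.length pre) (String.length msg - String.length pre) in
  if is_prefix "start: " msg then Some (`Start (tail "start: "))
  else if is_prefix "end: " msg then Some (`End (tail "end: "))
  else None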

Interpreting FStrace Programs
To infer the ordering and notification relationships among Puppet resources, the analyzer enumerates the set of files consumed, produced or expunged in every execution block. This is done by interpreting the FStrace program produced by the parser (Section 3.2).

Domains and Semantics.
We define a semantics for FStrace that we use as a base for our interpretation. Figure 7 illustrates the semantic domains of FStrace. The FStrace state consists of six components. An inode table τ ∈ InodeT maps a pair, consisting of an inode and a file name, to another inode. The first element of the pair is the inode of the directory where the file name exists. An inode is a positive integer that acts as the identifier for a certain file system resource. Note that we also keep the special inode ιr which corresponds to the inode of the root directory "/". A file descriptor table π ∈ FdT maps an identifier and a file descriptor to an inode. We use this component to map the open file descriptors of a process to the resource they handle. The CwdT element maps a unique identifier to an inode. That inode stands for the current working directory of a process.
Each process points to a pair of unique identifiers (see the domain ProcT). The first element of the pair is the identifier corresponding to the file descriptor table of the process. The second element of the pair reflects the identifier that stands for the current working directory of the process. This part of the state allows us to model the case where two different processes might share the same file descriptor table or working directory. For example, in the following entries: [z1 → (α1, α2), z2 → (α1, α3)], the processes z1 and z2 point to the same file descriptor table because the first elements of their pairs are identical (α1). Similarly, since their second identifiers do not match (α2 ≠ α3), we presume that they do not share the same working directory; thus, any change imposed by one process does not affect the other one.
We use the table κ ∈ SymT to store symbolic links. Each symbolic link is an inode that points to a file path. The last component of the state (ρ ∈ FSAcc) maps path names to an element of the power set of blocks and effects. Specifically, this component tracks where and how each path is accessed. For example, the entry /foo → {(produced, b1), (consumed, b2)} indicates that the path /foo is produced in the block b1 and consumed in b2. We exploit this component later on to extract the ordering and notification relationships of every block with each other. The state ⟨τ, π, ϕ, ν, κ, ρ⟩ is a tuple consisting of the six components described above.
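The following OCaml sketch encodes these semantic domains, building on the syntax types introduced earlier. Plain functional maps are used for readability; the field names mirror the components of the state and are illustrative only.

(* Reuses filename, path, eff, and block_id from the syntax sketch above. *)
type inode = int                                (* identifier of a file system resource *)
let root_inode : inode = 0                      (* corresponds to ιr, the inode of "/" *)

type ident = int                                (* α: names shared fd tables and cwds *)

module InodeKey = struct
  type t = inode * filename                     (* directory inode × file name *)
  let compare = compare
end
module InodeT = Map.Make (InodeKey)             (* the inode table τ *)
module IntMap = Map.Make (Int)
module StrMap = Map.Make (String)

type state = {
  tau   : inode InodeT.t;                       (* τ: inode table *)
  pi    : inode IntMap.t IntMap.t;              (* π: fd-table id -> (fd number -> inode) *)
  phi   : inode IntMap.t;                       (* ϕ: cwd id -> inode *)
  nu    : (ident * ident) IntMap.t;             (* ν: pid -> (fd-table id, cwd id) *)
  kappa : path IntMap.t;                        (* κ: symlink inode -> path *)
  rho   : (eff * block_id) list StrMap.t;       (* ρ: absolute path name -> accesses *)
}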

The relation ⟨τ, π, ϕ, ν, κ, ρ⟩ →(b,e) ⟨τ′, π′, ϕ′, ν′, κ′, ρ′⟩ indicates that, given a trace entry e (a pair of a pid and a system call) in execution block b, the initial state ⟨τ, π, ϕ, ν, κ, ρ⟩ transitions to the new state ⟨τ′, π′, ϕ′, ν′, κ′, ρ′⟩. The binary operator :: denotes the addition of an element to a set, while ↓i denotes the projection of the i-th element. The function Ab(d, p, ...) gives the absolute path of the path p relative to the given file descriptor d. The function I(p, τ) computes the inode that the path p points to, based on the inode table τ. For example, the [hpath] rule records the effect m that a system call has in the current execution block b by updating the ρ component of the state.
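As an illustration of how such a rule can be evaluated, the OCaml sketch below (reusing the state type of the previous listing) resolves the base directory of an hpath entry and records its effect in the ρ component. The path-rendering helper is a simplified stand-in for the Ab function, and lookup_inode plays the role of I(p, τ); neither is the tool's actual implementation.

(* The I(p, τ) function: follow the inode table component by component. *)
let rec lookup_inode tau (dir : inode) (p : path) : inode option =
  match p with
  | [] -> Some dir
  | name :: rest ->
      (match InodeT.find_opt (dir, name) tau with
       | Some i -> lookup_inode tau i rest
       | None -> None)

(* Simplified stand-in for Ab(d, p, ...): render an absolute path name. The real
   analyzer reconstructs the textual prefix from a reverse inode table. *)
let absolute_path (base : inode) (p : path) : string =
  Printf.sprintf "<ino:%d>/%s" base (String.concat "/" p)

(* [hpath]: given entry (z, hpath d p m) in block b, record the access in ρ. *)
let eval_hpath (st : state) (b : block_id) (z : pid) (d : fd) (p : path) (m : eff) : state =
  let fd_id, cwd_id = IntMap.find z st.nu in
  let base =
    match d with
    | At_fdcwd -> IntMap.find cwd_id st.phi              (* relative to the working directory *)
    | Fd n -> IntMap.find n (IntMap.find fd_id st.pi)    (* relative to an open directory *)
  in
  let path_name = absolute_path base p in
  let prev = Option.value ~default:[] (StrMap.find_opt path_name st.rho) in
  { st with rho = StrMap.add path_name ((m, b) :: prev) st.rho }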

Inferring Relationships.
Based on the state derived from the interpretation of an FStrace program, we now formally define the ordering and notification relationships between two Puppet resources. To achieve this, we exploit the computed file accesses performed in every block, as defined in the resulting state (ρ ∈ FSAcc). Recall that the component ρ shows the set of files that are consumed, produced, and expunged inside every execution block.
Ordering Relationship. For ordering relationships we consider that a block b1 producing a certain file p must precede a block b2 that consumes or expunges the same file p. If this is not the case, and assuming that b2 does not create p, there is an ordering violation. Formally, given the ρ instance found in the FStrace state, we define the ordering relationship ≺ρ, which states that the block b1 comes before b2, as follows:

Definition 1 (Ordering relationship). b1 ≺ρ b2 ⟺ b1 ≠ b2 ∧ ∃p: (produced, b1) ∈ ρ(p) ∧ ((consumed, b2) ∈ ρ(p) ∨ (expunged, b2) ∈ ρ(p)) ∧ (produced, b2) ∉ ρ(p).

Consider again the program of Figure 6. After interpreting it, the analyzer outputs the relationship file ≺ρ exec, where exec stands for the block Exec[Initialize MySQL DB] and file for the block File[/etc/mysql/my.cnf]. Since exec consumes the configuration file that file produces, according to Definition 1 the creation of the file must be processed first, so the analyzer presumes that the block file must precede exec.

Notification Relationship. In order to define the notification relationship we first need to identify pairs of execution blocks where the application of the first element should trigger the application of the second one. In the context of Puppet, such relationships involve service resources. Specifically, we look for blocks that produce a particular resource p. These blocks must have notification relationships with service-oriented blocks consuming the resource p. Formally, based on the component ρ ∈ FSAcc of the state, we introduce the notification relationship ⇝ρ, which represents that the block b1 notifies b2:

Definition 2 (Notification relationship). b1 ⇝ρ b2 ⟺ b1 ≠ b2 ∧ b2 corresponds to a service resource ∧ ∃p: (produced, b1) ∈ ρ(p) ∧ (consumed, b2) ∈ ρ(p).
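Both relationships can be computed directly from the ρ component, as the following OCaml sketch (based on the earlier type definitions) shows. The is_service predicate is an assumption standing for the check that a block corresponds to a service resource.

(* Reuses eff, block_id, and StrMap from the previous sketches. *)
let has_access b m (accs : (eff * block_id) list) =
  List.exists (fun (m', b') -> m' = m && b' = b) accs

(* b1 ≺ρ b2: some path is produced by b1 and consumed or expunged,
   but not created, by b2 (Definition 1). *)
let ordered rho b1 b2 =
  b1 <> b2
  && StrMap.exists
       (fun _path accs ->
          has_access b1 Produced accs
          && (has_access b2 Consumed accs || has_access b2 Expunged accs)
          && not (has_access b2 Produced accs))
       rho

(* b1 ⇝ρ b2: b1 produces a path consumed by a service-oriented block b2 (Definition 2). *)
let notifies ~is_service rho b1 b2 =
  b1 <> b2 && is_service b2
  && StrMap.exists
       (fun _path accs -> has_access b1 Produced accs && has_access b2 Consumed accs)
       rho

(* For the trace of Figure 4 one would expect, e.g.,
   ordered rho "File[/etc/mysql/my.cnf]" "Exec[Initialize MySQL DB]" to hold. *)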

DETECTING FAULTS
Having introduced our approach for analyzing traces, we locate faults in Puppet manifests by combining the analysis output with the compiled catalog of a program. Our fault detection method performs two tasks: (1) given a catalog, it builds the dependency graph, a directed acyclic graph that captures all the ordering relationships and notifications declared by the developer (Section 4.1), and then, (2) it verifies that the relationships inferred by the trace analysis step appear in the dependency graph (Section 4.2).

Dependency Graph Construction
We define the dependency graph as DG = (V , E), where V is the set of Puppet resources, E ⊆ V × V × L is the set of edges, and L = {before, notify}, a set of labels that we assign to every edge.
An edge labeled before from a source node s to a target node t indicates that Puppet applies s before t. An edge labeled notify denotes that, apart from preceding t, s also notifies the target whenever there is an update to s itself. When a node t is reachable from s, we presume that the application of s happens before that of t. On the other hand, a node notifies a target when they are connected by a path in which all edges are labeled notify. This is explained by the fact that notify edges transitively trigger updates to all the intermediate nodes between the source and the target resource.
To construct the graph, we parse a program's catalog, and we examine the parameters of every Puppet resource (recall Section 2). Specifically, given the parameters of a resource p, we create edges as follows.
• p has the parameter "before":v. This indicates that the resource p is applied before every resource included in the value of "before". In this case, we add a before edge from p to every element of the list v.
• p has the parameter "require":v. This indicates that Puppet processes p after every element included in the value of "require". Thus, we create a before edge from every element of v to p.
• p has the parameter "notify":v. The same as the "before" parameter, but this time, we create notify edges.
• p has the parameter "subscribe":v. The same as the "require" parameter, but this time, we create notify edges (a sketch of the construction is given below).

[Figure 9: The dependency graph of the program of Figure 1.]

Figure 9 depicts the dependency graph of the program of Figure 1. We observe that the configuration file (i.e., the node conf) has neither an ordering nor a notification relationship with the service, because the corresponding nodes are not connected to each other.
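The following OCaml sketch outlines this construction. We assume the catalog has already been deserialized into the resource records below; the field names mirror the four parameters discussed above and are illustrative rather than our tool's actual data model.

type label = Before | Notify
type resource = {
  title : string;                          (* e.g., "File[/etc/apache2/apache2.conf]" *)
  before : string list;                    (* value of the "before" parameter *)
  require : string list;                   (* value of the "require" parameter *)
  notify : string list;                    (* value of the "notify" parameter *)
  subscribe : string list;                 (* value of the "subscribe" parameter *)
}

type edge = string * string * label        (* source, target, label *)
type dep_graph = { nodes : string list; edges : edge list }

let edges_of (r : resource) : edge list =
  List.map (fun t -> (r.title, t, Before)) r.before          (* p before every t in v *)
  @ List.map (fun s -> (s, r.title, Before)) r.require       (* every s in v before p *)
  @ List.map (fun t -> (r.title, t, Notify)) r.notify        (* p notifies every t in v *)
  @ List.map (fun s -> (s, r.title, Notify)) r.subscribe     (* every s in v notifies p *)

let build_graph (resources : resource list) : dep_graph =
  { nodes = List.map (fun r -> r.title) resources;
    edges = List.concat_map edges_of resources }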

Fault Detection
Algorithm 1 Detecting Faults
Require: ρ ∈ FSAcc, g ∈ DG
1: for all (b1, b2) ∈ V × V do
2:    if b1 ≺ρ b2 ∧ ¬hasPath(g, b1, b2) then
3:        report missing ordering relationship: b1 must be applied before b2
4:    end if
5:    if b1 ⇝ρ b2 ∧ ¬hasNotificationPath(g, b1, b2) then
6:        report missing notifier: b1 must notify b2
7:    end if
8: end for

Algorithm 1 summarizes our fault detection approach. The algorithm expects as input the relationships of every Puppet resource as specified by the analysis of traces, and a dependency graph g ∈ DG generated by the previous step. For every ordering relationship between two resources (b1 ≺ρ b2), the algorithm consults the dependency graph to determine whether there is a path from b1 to b2, i.e., it checks whether this ordering relationship actually appears in the program. If this is not the case, the algorithm reports a missing ordering relationship, that is, b1 must be applied before b2. To identify missing notifiers, the algorithm operates in a similar manner. In this case though, the algorithm is interested in paths that contain only notify edges (i.e., the function hasNotificationPath).
As an example, recall that the analyzer yielded the file ≺ρ exec relationship when it examined the trace file of Figure 4. The algorithm verifies this relationship by consulting the dependency graph of Figure 9. It then reports a missing ordering relationship because there is no path from file to exec.
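The reachability checks used by Algorithm 1 can be sketched as follows in OCaml, on top of the dependency graph and relationship functions of the earlier listings. hasPath follows edges of any label, whereas hasNotificationPath follows only notify edges; the code is a simplified illustration, not the tool's implementation.

let successors g ~only_notify src =
  List.filter_map
    (fun (s, t, l) ->
       if s = src && (not only_notify || l = Notify) then Some t else None)
    g.edges

(* Depth-first reachability over the dependency graph. *)
let reachable g ~only_notify src dst =
  let rec visit seen = function
    | [] -> false
    | n :: rest ->
        if n = dst then true
        else if List.mem n seen then visit seen rest
        else visit (n :: seen) (successors g ~only_notify n @ rest)
  in
  visit [] [src]

let has_path g src dst = reachable g ~only_notify:false src dst
let has_notification_path g src dst = reachable g ~only_notify:true src dst

(* Report every relationship inferred from the trace that the graph does not satisfy. *)
let detect_faults g rho ~is_service blocks =
  List.iter
    (fun b1 ->
       List.iter
         (fun b2 ->
            if ordered rho b1 b2 && not (has_path g b1 b2) then
              Printf.printf "Missing ordering: %s must be applied before %s\n" b1 b2;
            if notifies ~is_service rho b1 b2 && not (has_notification_path g b1 b2) then
              Printf.printf "Missing notifier: %s should notify %s\n" b1 b2)
         blocks)
    blocks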
Remark: Observe that our approach does not make any assumption about the execution order imposed by Puppet, i.e., it does not require the resources with missing ordering relationships to be applied in the right order so that it infers their inter-dependencies. It is clear from the example trace of Figure 4 (where Puppet executes resources in the erroneous order) that our method is still able to observe that exec depends on file and eventually report the fault.

IMPLEMENTATION
We have developed a prototype that implements our approach in OCaml. The tool provides a command-line interface and takes as input the path where the main Puppet manifest is located. It then stores the compiled catalog of the program, and executes it using strace to collect the system call trace. In turn, the tool employs the trace analyzer and fault detector as described in Sections 3 and 4.
We have implemented our method with efficiency in mind. Our tool is able to handle GB-sized traces with reasonable time and space requirements (see Section 6.4). This was made possible through a number of optimizations, such as the use of streams to process and analyze traces, a reversed inode table to look up paths based on their inodes, and function memoization.
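As an illustration of the second optimization, the snippet below sketches a reversed inode table: a hash table that maps inodes back to a path name, so that rendering the path of an inode does not require searching the inode table τ. The structure shown is a simplification of what the tool actually maintains.

(* inode -> one of its absolute path names *)
let reverse_index : (int, string) Hashtbl.t = Hashtbl.create 1024

let record_path (inode : int) (p : string) = Hashtbl.replace reverse_index inode p

(* Memoized lookup: avoids re-deriving a path from the inode table on every access. *)
let path_of_inode (inode : int) : string =
  match Hashtbl.find_opt reverse_index inode with
  | Some p -> p
  | None -> Printf.sprintf "<unknown inode %d>" inode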
Currently, our tool has only been tested on Linux distributions. Applying our approach to other platforms (i.e., by modeling the corresponding system calls in FStrace) is straightforward. Also, as we discuss in Section 6, our tool may produce false positives when two Puppet resources operate on the same file but are commutative, i.e., their application order does not matter. Even though commutative pairs are not so common (see Section 6.5), we plan to address this issue in future work by examining Puppet catalogs to identify such pairs. For instance, we could identify resources whose execution is conditional and depends on the presence of conflicting resources.

EVALUATION
We evaluate our framework by examining a large number of Puppet modules in order to answer the following research questions.
RQ1 Is the proposed approach effective for finding faults in Puppet manifests? (Section 6.2)
RQ2 What are the main patterns of the detected faults? (Section 6.3)
RQ3 What is the performance of our approach? (Section 6.4)

Experimental Setup
We collected a large number of Puppet modules taken from the Forge API and GitHub. We were particularly interested in non-deprecated modules that support Debian Stretch, because Debian is one of the most popular Linux distributions [1]. We inspected the top-1000 modules returned by the Forge API that satisfied our search criteria. We used Docker to spawn a clean Debian environment efficiently. Then, we automatically ran every module separately through the include <module-name> statement. We monitored the Puppet process and collected the system call trace of every program via strace. Through this process, we successfully applied 354 Puppet modules in total. The remaining modules failed because they required extra arguments or further setup before their invocation. For example, many of the failing modules required multiple preinstalled packages, or in other cases we needed to infer specific values for the modules' arguments, including IPs of DNS servers and URLs of specific upstream directories. Note that the failing modules and the successful ones were pretty similar in terms of code size, popularity, and age in Puppet Forge. Also, the list of the successfully analyzed modules contains well-established ones, including modules developed by popular organizations, such as Puppet Inc. and Vox Pupuli. Finally, for every Puppet module that succeeded, we ran each step of our approach (trace analysis and fault detection) and logged the reports generated by our tool.
To compute the performance of our approach we ran the trace analysis and fault detection steps ten times to get reliable measurements. By examining the standard deviation, we observed that the running times did not vary significantly among different executions. All the experiments were run on a machine with an Intel i7 2.2GHz processor with 12 logical cores and 16GB of RAM.

Fault Detection Results
Our framework detected 92 previously unknown faults in 33 Puppet modules. Table 1 presents the analysis results for each module. To the best of our knowledge, this is the first study that led to the disclosure of such a large number of faults in Puppet repositories. Our tool marks 70 out of 92 faults as missing ordering relationships (column MOR). The remaining faults are related to missing notifiers (column MN). Remarkably, our tool found faults in modules that are widely used by the Puppet community, e.g., puppetlabs-apache (> 9000k downloads) and deric-zookeeper (> 4500k downloads).
Based on the reports of our tool, we manually verified that each reported fault can lead to a problematic situation by reproducing each case. Specifically, we repeatedly applied every manifest in a clean container until Puppet applied the resources in the wrong order, leading to a failure or an inconsistent state. Only a few trials (1-3) were needed for that. In turn, we submitted fixes to the developers. The development teams of 25 projects confirmed and fixed 63/92 issues in total. The developers welcomed our patches. Notably, in some projects (e.g., deric-zookeeper, Slashbunny-phpfpm, cirrax-dovecot, and more), the developers provided instant bugfix releases after the approval of our patches. This indicates that our tool detects faults that are meaningful to developers.

1 $package_path = "/tmp/al-agent"
2 exec {"download":
3   command => "/usr/bin/wget -O ${package_path} ${pkg_url}",
4 }
5 package {"al-agent":
6   ensure => "installed",
7   provider => "dpkg",
8   source => $package_path,
9 }
Figure 10: An ordering violation between package and exec.

Fault Patterns
Below, we categorize and discuss some of the faults identified by our tool. Most represent fault patterns that were previously unknown to us and which we learned through our tool. Notably, these detected faults can lead to crashes, inconsistent states, and data loss.

Missing Ordering Relationships.
We have observed three types of missing ordering relationship issues.
Generate-Use Violation. The use of a resource must always succeed its creation. Many modules fail to preserve that ordering relationship. Consequently, the execution of Puppet may complete with failures, when resources are applied in an erroneous order. We observed this violation in 16 Puppet modules such as alertlogic-al_agents, hardening-os_hardening, and more. Figure 10 shows a fragment from alertlogic-al_agents [17]. The code first fetches a .deb package (a Debian archive) using the wget command (lines 2-4). The package is stored in the path specified by the $package_path variable whose value is /tmp/al-agent. Then, the code installs the Debian archive on the system (lines 5-9) through the dpkg package management system. It is easy to see that the package depends on the exec, because it requires $package_path (the .deb file) to exist in the system (line 8) so that it can install the package successfully. Otherwise, when Puppet processes package before exec the application of the catalog fails with the following error: "dpkg: error: cannot access archive "/tmp/al-agent": No such file or directory".
Configure-Use Violation. The configuration of a file must precede its use. For example, when a service starts, all the files consumed by that service have to be properly configured. This category differs from the previous one because, when a Puppet resource attempts to use the file, the latter exists in the system. However, it is not in the expected state (e.g., the file does not have the right contents, permissions, etc.). This error pattern appears in five modules, including saz-locales, istlab-stereo, and vpgrp-influxdbrelay. Figure 1 illustrates a program with an issue related to this category. When the shell script is invoked, the configuration file is guaranteed to be there because package creates it during installation. However, it is possible that exec does not read the desired contents of the /etc/mysql/my.cnf file specified by content => "user db settings.." (line 4), because there is a missing ordering relationship between file and exec. Note that this category, unlike the previous one, may lead to errors that are difficult to debug as the application of the catalog does not produce any error messages.
API Misuse. Many Puppet modules expose an API that other modules rely on to build their functionality. These APIs may establish some constraints that the dependent modules need to respect to achieve the intended functionality. As with traditional software [2,18], failing to do so can have a negative impact on the reliability of applications. In particular, in Puppet, API misuses can lead to missing dependencies and race conditions. Eight modules (such as puppet-proxysql) do not properly use the API of their dependencies, causing the ordering violations reported by our tool.
For example, the puppetlabs-apt module provides an interface for managing apt sources and keys. The API of this module includes, among other things, the apt::source resource and the apt::update class. The former is used for adding new repositories to the list of apt sources, while the latter retrieves all the essential information about the newly-added repositories by executing the apt update command. The puppet-proxysql module employs the apt::source resource to add the http://repo.proxysql.com/ repository from which it installs proxysql (via the package resource). The documentation of puppetlabs-apt's API [26] states: "If you are adding a new source and trying to install packages from the new source on the same Puppet run, your package resource should depend on Class['apt::update'], as well as depending on the Apt::Source resource". However, the developers of puppet-proxysql consider only the Apt::Source dependency in their code, i.e., they omit the Class[apt::update] dependency. As a result, the code may crash with an "Unable to locate package proxysql" message, when Puppet tries to install proxysql before invoking the apt update command. The developers of puppet-proxysql immediately confirmed and fixed this fault.

Missing Notifiers.
We have identified three different categories of issues related to notifiers.
Configuration Files. A configuration file must always send notifications to a service, so that any change to that file triggers the restart of the corresponding service. Although this is a standard pattern, we observed that in six modules (shown in rows 9, 11, 12, 19, 20, 32 of Table 1) this is not the case.
Log Files. Typically, services log various events in dedicated files. For instance, the log file of an Apache server records every incoming HTTP request. Log files are essential for debugging and monitoring purposes [35, Item 56]. When a service starts, it opens a corresponding log file, which remains open while the service is up, to record any events that take place.
We discovered issues related to logging in two popular Puppet modules (puppetlabs-apache and deric-zookeeper). These modules declare the log files for the apache and zookeeper services in their manifests. However, the log files do not have a notifier for their associated services. This may lead to data loss. Consider the case where the log file of a service is removed or renamed. When we remove or rename an open file, the underlying system call (unlink or rename) only changes the directory entry, not the inode. This means that although the filename disappears from the file system and Puppet creates a new one, the service still uses a file descriptor that points to the inode of the original file. The issue is that after removal, the inode typically becomes an orphan (i.e., it is no longer linked to any file name), which means that it is no longer accessible through a file path. Therefore, in the case of a missing notifier, the log history of upcoming events is lost because the service writes to an orphan inode. To fix that issue, the log file should notify the service so that the service opens the newly-created log file. The developers of both projects confirmed this kind of fault.
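The underlying behavior can be reproduced with a few lines of OCaml (illustrative only): writes to an unlinked but still open log file succeed, yet the data ends up in an orphan inode that is no longer reachable through any path.

let () =
  let log = "/tmp/demo.log" in
  let oc = open_out log in                (* the "service" opens its log file *)
  output_string oc "event 1\n";
  Sys.remove log;                         (* the log file is removed (unlink) *)
  output_string oc "event 2\n";           (* still succeeds: the inode is now an orphan *)
  flush oc;
  Printf.printf "log file still present: %b\n" (Sys.file_exists log);   (* prints false *)
  close_out oc                            (* "event 2" is never visible on the file system *)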
Packages. When Puppet applies a package resource, the service that depends on that package should restart. This ensures that a service gets all the necessary updates, including security patches, new features, and more. Our tool identified this kind of issue in twelve modules, including example42-apache and puppet-telegraf. Specifically, the package resources that are responsible for installing Apache and telegraf do not notify the running instances whenever there is a new version of those packages.

Performance
[Figure 11: The trace analysis and fault detection time as a function of the trace size. Each spot shows the average time spent on both the trace analysis and fault detection phases for a given trace obtained by the execution of a module.]

Figure 11 shows the running times (in seconds) of the trace analyzer and fault detector relative to the size of the provided traces (in MB). We observe that the correlation between the trace size and analysis time is almost linear. Notably, our framework is able to handle a large volume of traces (more than 1.2GB) in a reasonable amount of time (< 40 seconds). The average trace size and analysis time of the inspected modules is 72MB and 2 seconds respectively.

There are 4 cases out of 354 where the execution times were relatively high compared to the remaining modules. However, they all remain within acceptable limits. By examining the characteristics of the traces coming from these modules, we observed that they contain more unlink system calls than the rest of the modules. Such calls involve more expensive operations on the analysis state.
For collecting traces using strace, our tool imposes a 1.68 times slowdown on Puppet, on average. Unlike existing work [13], though, our approach requires only a single Puppet execution to locate faults. Overall, we argue that the overhead of our tool is relatively small, and our approach can be used as part of the testing process for Puppet manifests.

False Positives
Beyond the actual faults listed in Table 1, we have identified 17 false positives in 9 out of 354 Puppet modules. Fourteen false alarms are related to commutative resources reported as missing ordering relationships. For example, in the claranet-varnish module [5], the developers use two different resources to partially configure a certain file. On the one hand, they use file to set the permissions and ownership of the file, and on the other, they use exec to initialize its contents. In this case, the order in which Puppet processes the resources does not matter. Specifically, Puppet can first use exec to create the file with the desired contents and then apply file to set the appropriate attributes, or vice versa. However, as observed in the inspected modules, configuring a file through a combination of resources is not particularly common.
Only three false positives are associated with missing notifiers. The developers of bodgit-dbus [7] use a custom command (expressed via exec) to reload the configuration of the service. Consequently, the configuration files notify the exec resource instead of the service. We did not observe this case elsewhere, because Puppet programmers typically employ the restart parameter of the service type to define a custom restart command in the following manner: "service { restart => "/custom/cmd", ... }".

RELATED WORK
Our work is related to three research areas, namely quality in IaC, trace analysis, and modeling of file systems operations.
Quality in IaC. With the proliferation of the IaC process, there have been numerous attempts to identify defects and quality concerns in configuration code.
A number of studies focus on maintainability issues. Sharma et al. [33] design and implement a code-smell detection scheme for Puppet, which searches for issues related to naming conventions, code design, indentation, etc. Their findings suggest that such anti-patterns, as in traditional programs, exist in many IaC repositories. Van der Bent et al. [36] introduce a quality model for Puppet programs, which is empirically evaluated by interviewing practitioners from industry. Schwarz et al. [31] do similar work focusing on Chef recipes. Endeavors have recently moved to the identification of security issues. Rahman et al. [28] define and classify security smells into seven categories (such as hard-coded passwords, use of weak cryptographic algorithms), and then build a tool for statically detecting these smells in Puppet repositories.
Other studies attempt to extract error patterns and source code properties from the analysis of defective IaC programs. Rahman et al. [29] employ machine learning and text processing techniques to identify properties that faulty Puppet programs hold. Then, they build a prediction model for asserting whether IaC scripts manifest faults or not. Chen et al. [4] identify error patterns in Puppet manifests by following a different approach. First, they inspect the code changes from repositories' commits. Second, they construct an unsupervised learning model to detect error patterns based on the clustering of the proposed fixes. Their approach is based on the assumption that similar faults are fixed with similar patches [12].
There are few automated techniques proposed for improving the reliability of configuration management programs. Rehearsal [32] statically verifies that a given Puppet configuration is deterministic and idempotent. Rehearsal models a given Puppet manifest in a small language called FS and then constructs logical formulas based on the semantics of each language primitive. Then, an SMT solver decides whether the initial program is non-deterministic or not. Compared to our approach, Rehearsal is less effective and practical. Specifically, Rehearsal employs a form of static analysis that can only handle a subset of Puppet programs. For example, the analysis does not support exec resources because it is unable to reason about the file system resources that shell commands process. Unlike Rehearsal, our approach works by reasoning about actual system calls rather than Puppet manifests; thus, it can effectively determine which files are affected by a Puppet run and how.
Other advances [13,16] adopt a model-based testing approach for checking whether configuration scripts meet certain properties. Hummer et al. [16] focus on testing the idempotence of Chef scripts. Their proposed framework generates multiple test cases that explore different task schedules. By tracking the changes in the system triggered by a Chef script, they determine if idempotence holds for the given program. Hanappi et al. [13] extend the work of Hummer et al. and introduce Citac, a framework that can be applied to Puppet manifests to examine the convergence of programs. Convergence states that the system reaches a desired state even in the presence of failed Puppet resources. They formally express the properties of idempotence and convergence, and through test case generation, they verify whether the provided manifests violate those properties. Contrary to Citac, we adopt a more lightweight approach, applying manifests only once. Finally, neither Rehearsal nor Citac detects issues involving missing notifiers.
Unfortunately, a side-by-side comparison between our tool and the state-of-the-art was neither feasible nor fair. Citac works on an instrumented version of Puppet 3.7. However, Puppet 3.x is not used any longer, and the vast majority of modules no longer support Puppet 3.x. Furthermore, as already discussed, Rehearsal is unable to analyze modules that use the exec resource. More than half of the inspected modules contain exec, let alone their dependencies. Also, Rehearsal cannot statically analyze modules whose source code consists of multiple files. Finally, in most cases, Rehearsal's analysis aborts prematurely due to some bugs, e.g., its custom parser fails to parse complex Puppet constructs. For these reasons, Rehearsal cannot analyze any of the modules listed in Table 1.
Tortoise [39] employs a semi-automated approach for repairing faulty Puppet scripts. In particular, it monitors the fixes made by administrators via the command line, and follows a program-synthesis-based technique to generate a patch for the corresponding Puppet manifests. This patch reflects all the file system changes made by administrators through the execution of shell commands. Extending our approach to automatically generate fixes for the reported bugs is an interesting direction, and we leave it as future work. Based on the detected issues, we could identify where to apply the fix by inspecting the Abstract Syntax Tree (AST) of manifests to add missing dependencies and notifiers.
Trace Analysis. Analysis of system call traces has been widely used in the past, especially in the context of dependency inference for build scripts. Many existing approaches are used for refactoring and testing [11], boosting the performance of builds [3,6], or detecting license inconsistencies in open-source projects [37]. As discussed in Section 3.1, most of them do not deal with the intricacies of system calls. This degrades the precision and effectiveness of the analysis. Also, unlike existing work, our trace analysis enables us to compute relationships at a higher level of abstraction, i.e., among Puppet resources.
Modeling File System Operations. Several researchers have designed specifications for the POSIX file system [9,14,24]. The specifications mainly focus on program reasoning and verification. Shambaugh et al. [32] have introduced FS, a small language used to model the effects of Puppet resources on the file system. In this work, we model system calls rather than Puppet resources.

CONCLUSION
We have introduced an effective and practical approach for identifying missing dependencies and notifiers in Puppet manifests. Our method collects the system calls invoked by a Puppet program and models them in FStrace. Through FStrace, we design a trace analysis that captures how higher-level programming constructs, such as Puppet resources, interact with the operating system. This enables us to infer their inter-relationships, and check these relationships against the program's dependency graph for potential mismatches.
The effectiveness of our approach is exemplified by the uncovering of 92 previously unknown issues in 33 projects, including well-established ones, such as puppetlabs-apache. Notably, 63 of them were confirmed and fixed by the developers. We have further shown that our tool can handle realistic traces in a couple of seconds. Our results indicate that our tool can be used as part of the testing process for Puppet programs.
FStrace is a generic model that can be applied to other domains with partially ordered constructs. Consequently, future studies can build on our work to detect concurrency faults in many other areas.