Reducing DNN Properties to Enable Falsification with Adversarial Attacks

Deep Neural Networks (DNN) are increasingly being deployed in safety-critical domains, from autonomous vehicles to medical devices, where the consequences of errors demand techniques that can provide stronger guarantees about behavior than just high test accuracy. This paper explores broadening the application of existing adversarial attack techniques for the falsification of DNN safety properties. We contend and later show that such attacks provide a powerful repertoire of scalable algorithms for property falsification. To enable the broad application of falsification, we introduce a semantics-preserving reduction of multiple safety property types, which subsume prior work, into a set of equivalid correctness problems amenable to adversarial attacks. We evaluate our reduction approach as an enabler of falsification on a range of DNN correctness problems and show its cost-effectiveness and scalability.


I. Introduction
As the performance and applicability of Deep Neural Networks (DNNs) continue to increase, their deployment has been explored in safety-critical domains, such as autonomous vehicles [9], [15], [35] and medicine [17], [23], [56]. As a result, checking the correctness of DNNs has emerged as a key challenge to ensuring the safety of these systems. For example, for the DroNet DNN that predicts a steering angle and a probability of collision for an autonomous quadrotor system [35], one such correctness property may specify that if the probability of collision is low, then the steering angle should not be extremely large.
A complementary approach to verification is falsification, which attempts to find violations of a property. While verifiers can show that a property is true, falsifiers can often find violations more quickly than verifiers when the property is false. Falsification is an active area of research spanning property types and application domains [1], [4], [16], [18]. In the context of DNNs, one can view adversarial attacks [26], [32], [36], [37], [49], [50] as falsifiers for DNN local robustness properties. While these techniques often scale to real-world DNNs, they are currently limited in the range of properties they can falsify. For the DroNet example, adversarial attacks support the scale and complex structure of the DNN, but are designed only to find violations of robustness properties and not safety properties like those relating speed and probability of collision. As we discuss in §II-A, there is a broad range of properties that could benefit from the speed and applicability of these falsifiers.
Driven in part by the cost of DNN verification, and in part by the limited property support for DNN falsification, we identify the key insight that many valuable property types can be reduced to the more commonly supported form of local robustness properties. We build on this insight to develop an approach for reducing properties in an expressive, general form to an equivalid set of local robustness properties, which have wide support among both falsifiers and verifiers. Such translation has the potential to bring existing techniques to bear on falsifying DNN properties, leaving verifiers to focus on proving that a property holds.
Our community has exploited property reduction for verification and program analysis for decades. Perhaps best known is the reduction from stateful safety properties to reachability properties. For example, partial order reductions were broadened in applicability by reducing stateful properties to a form of deadlock [25], and both data flow analyses [40], [48] and SAT solving [8] were applied to verify stateful properties by formulating the reachability of error states. A second use of reductions is to enable more efficient algorithmic methods to be employed. For instance, the -safety option of the SPIN model checker permits it to use a significantly faster reachability algorithm [27]. Such reductions are now considered standard in verification and program analysis. In the new domain of DNN verification and falsification, however, the lessons of such reductions have not yet taken root.
In this paper, we introduce an approach for reducing a DNN and an associated safety property, which we refer to as a correctness problem, into an equivalid set of correctness problems formulated with robustness properties that can be processed by existing adversarial techniques. Figure 1 provides an overview of the approach. By preserving validity, the translation supports both falsification approaches, such as adversarial attack algorithms, and existing verification techniques. The approach is fully automated, which allows developers to specify properties in a convenient form while leveraging the complementary strengths of falsification and verification algorithms.
Fig. 1: Proposed approach reduces a DNN and its safety property into an equivalid set of correctness problems formulated with robustness properties that can be processed by falsifiers.
The primary contributions of this work are: (1) an automated approach for the reduction of DNN correctness problems to an equivalid set of robustness problems; (2) an implementation of our approach that employs a portfolio of falsifiers; and (3) a study demonstrating that property reduction yields cost-effective detection of violations of general DNN correctness problems.

II. Background
This section presents prior work on DNN property specification and approaches for their falsification.

A. Properties of DNNs
Given a DNN, $\mathcal{N} : \mathbb{R}^n \rightarrow \mathbb{R}^m$, a property, $\phi$, defines a set of constraints over the inputs, $\phi_X$ (the pre-condition), and a set of constraints over the outputs, $\phi_Y$ (the post-condition). Checking property $\phi$ attempts to prove or falsify: $\forall x \in \mathbb{R}^n : \phi_X(x) \rightarrow \phi_Y(\mathcal{N}(x))$.
A survey on verification of neural networks [33] identifies different representations for the input and output constraints used by verification techniques; two of these representations are particularly useful in this work. A hyperrectangle is an $n$-dimensional rectangle where constraints are formulated as $(x_i \geq lb_i) \wedge (x_i \leq ub_i)$, where $lb_i, ub_i \in \mathbb{R}$ and $0 \leq i < n$ define the lower and upper bounds on the value of each dimension of $x$, respectively. A special case of hyperrectangles used in our approach is the unit hypercube, which is a hyperrectangle where $\forall i.(lb_i = 0) \wedge (ub_i = 1)$; an $n$-dimensional unit hypercube is denoted $[0,1]^n$. A halfspace-polytope is a polytope which can be represented as a set of linear inequality constraints, $Ax \leq b$, where $A \in \mathbb{R}^{k \times n}$, $b \in \mathbb{R}^k$, $k$ is the number of constraints, and $n$ is the dimension of $x$.
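To make the two representations concrete, the following minimal Python sketch checks membership in a hyperrectangle and in a halfspace-polytope, and rewrites a hyperrectangle as an equivalent set of $2n$ linear inequalities. The function names are hypothetical and chosen for illustration only; they are not taken from any tool discussed in this paper.

```python
def in_hyperrectangle(x, lb, ub):
    """True if lb_i <= x_i <= ub_i holds in every dimension."""
    return all(l <= xi <= u for xi, l, u in zip(x, lb, ub))


def in_halfspace_polytope(x, A, b):
    """True if every row of A x <= b holds."""
    return all(
        sum(a * xi for a, xi in zip(row, x)) <= bk
        for row, bk in zip(A, b)
    )


def hyperrectangle_to_polytope(lb, ub):
    """Encode lb <= x <= ub as A x <= b using 2n rows (illustrative helper)."""
    n = len(lb)
    A, b = [], []
    for i in range(n):
        upper = [0.0] * n
        upper[i] = 1.0   # x_i <= ub_i
        lower = [0.0] * n
        lower[i] = -1.0  # -x_i <= -lb_i
        A += [upper, lower]
        b += [ub[i], -lb[i]]
    return A, b
```

Every hyperrectangle is therefore also a halfspace-polytope, which is why the more general form can serve as the common representation.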
Using such encodings, researchers have specified a range of desirable properties of DNNs. Here we distinguish three broad categories: robustness, reachability, and differential properties.
Robustness properties originated with the study of adversarial examples [50], [62]. Robustness properties apply to classification models and specify that inputs from a specific region of the input space must all produce the same output class. Robustness properties can be further classified as either local or global robustness; the former asserts robustness in a local region of the input domain and the latter over the entire input domain. Detecting violations of robustness properties has been widely studied, and they are a common type of property for evaluating verifiers [21], [46], [47], [51], [54].
Reachability properties define the post-condition using constraints over output values rather than output classes, and are thus not limited to classification models. Such properties have been used to evaluate DNN verifiers [29], [54]. Reachability properties specify that inputs from a given region of the input space must produce outputs that lie in a given region of the output space. For example, a DNN model controlling the velocity of an autonomous vehicle may have a safety property specifying that the model never produces a desired velocity value greater than the vehicle's maximum physical speed for any input in the input domain. Similarly to robustness, reachability properties can be further classified as local or global. For example, a global halfspace-polytope reachability (GHPR) property would specify a halfspace-polytope constraint on network output values that must hold for all inputs.
Differential properties are the most recently introduced DNN property type [42]. These properties specify a difference (or lack thereof) between outputs of multiple DNNs. One type of differential property is equivalence, which states that for every input, two DNN models produce the same output. Such a property can be used to check that DNN semantics are preserved after some modification, such as quantization or pruning. Differential properties can be supported by combining multiple DNNs into a single network and expressing properties over their combined input and output domains.
In addition to these three categories, and as alluded to earlier, properties can also be classified by the form of their input pre-condition. Global properties have the most permissive pre-condition, enforcing the post-condition for any input in the input domain of the DNN. For example, a DNN that operates on images may accept values in $[0,1]^n$. The pre-condition of a global property would not restrict this domain any further. Local properties only enforce the post-condition for inputs within a designated region of the input domain. For example, a local property for an image processing network may have the pre-condition that inputs are within distance $\epsilon$ of some given input $x$. This is especially common in robustness properties.
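As a small illustration of a local pre-condition, the sketch below computes the hyperrectangle bounds for all inputs within distance $\epsilon$ of a given input, clipped to the image domain $[0,1]^n$. It assumes the distance is measured with the $L_\infty$ norm, which the text above does not fix, and the helper name is hypothetical.

```python
def linf_ball_bounds(x, eps, domain_lb=0.0, domain_ub=1.0):
    """Bounds of {x' : |x'_i - x_i| <= eps} intersected with the input domain.

    Assumes an L-infinity distance; other norms do not yield a hyperrectangle.
    """
    lb = [max(domain_lb, xi - eps) for xi in x]
    ub = [min(domain_ub, xi + eps) for xi in x]
    return lb, ub
```

A global property corresponds to the case in which the bounds are simply those of the full input domain.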

B. Adversarial Attacks and Fuzzing
One approach to checking properties of DNNs is through the use of algorithms that seek to find examples that violate a given specification for a given model. Two categories of techniques have been developed for DNNs that can be used to falsify DNN property specifications.
Adversarial attacks are methods that are optimized to detect violations of robustness properties [2], [62]. In general, adversarial attacks take in a DNN model and an initial input, and attempt to produce a perturbation that, when applied to the input, will change the class predicted by the given model. These perturbations are often also subject to some constraints, such as remaining within a given distance of some original input. A perturbed input, commonly known as an adversarial example, is a violation of a local robustness property. To our knowledge, adversarial attacks support the falsification of robustness properties only. Adversarial attacks can be classified based on characteristics of the attack, such as whether they are white-box [26], [32], [36], [37], [50] or black-box [49]; targeted [26], [50] or untargeted [32], [37]; iterative [32], [36], [37] or one-shot [26], [50]; or by their perturbation constraint (e.g., $L_0$ [49], $L_2$ [14], or $L_\infty$ [26], [50]). A more exhaustive taxonomy and description of existing adversarial attacks is available in the literature [2], [62].
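The following sketch illustrates the idea of a one-shot, white-box, untargeted attack under an $L_\infty$ constraint, in the style of FGSM [26]. To stay self-contained it attacks a toy linear classifier whose gradient can be written by hand; the model, the function names, and the clipping to $[0,1]$ are assumptions made for illustration, and this is not the cleverhans implementation.

```python
import math


def logits(W, b, x):
    """Outputs of a toy linear classifier: W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]


def predict(W, b, x):
    """Index of the largest logit."""
    z = logits(W, b, x)
    return max(range(len(z)), key=z.__getitem__)


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def fgsm(W, b, x, label, eps):
    """Take one step of size eps along the sign of the loss gradient."""
    p = softmax(logits(W, b, x))
    # Gradient of cross-entropy w.r.t. logit k is p_k - 1[k == label].
    dz = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    # Chain rule through the linear layer: grad_x = W^T dz.
    grad = [sum(W[k][i] * dz[k] for k in range(len(W))) for i in range(len(x))]
    sign = [(g > 0) - (g < 0) for g in grad]
    # Clip to the assumed input domain [0, 1].
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, sign)]
```

For example, with `W = [[1.0, -1.0], [-1.0, 1.0]]`, zero biases, and `x = [0.6, 0.4]`, a step with `eps = 0.2` moves the input across the decision boundary, so the perturbed input is an adversarial example for the local robustness property around `x`.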
Fuzzing involves randomly generating inputs within a given input region (often the full input space), and checking whether the outputs they produce violate a specified post-condition. Fuzzing is more general than adversarial attacks, in that it can support the falsification of more than robustness properties, but requires specifying input mutation functions and objective functions (essentially an output oracle) for every type of property that needs support. Examples of existing fuzzing techniques include TensorFuzz [39] and DeepHunter [60].
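A minimal sketch of this idea follows. It uses plain uniform random sampling within a hyperrectangle and a user-supplied post-condition as the oracle; real fuzzers such as TensorFuzz add coverage guidance and input mutation, which are omitted here. All names are hypothetical.

```python
import random


def fuzz(network, lb, ub, postcondition, trials=1000, seed=0):
    """Return an input whose output violates the post-condition, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Sample a candidate input uniformly from the input region.
        x = [rng.uniform(l, u) for l, u in zip(lb, ub)]
        # The oracle: a violation is any output failing the post-condition.
        if not postcondition(network(x)):
            return x
    return None
```

Returning `None` only means that no violation was found within the sampling budget; it says nothing about whether the property holds.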

III. Approach
The primary goal of our approach is to amplify the power of falsifiers, such as adversarial attacks, by increasing their applicability. Our approach takes in a correctness problem comprised of a DNN and a property, and encodes it as an equivalid set of robustness problems, which then enables us to run a portfolio of methods that are applicable to this restricted problem class to uncover general property violations.

A. Defining Property Reduction
A correctness problem is a pair, $\psi = \langle \mathcal{N}, \phi \rangle$, of a DNN, $\mathcal{N}$, and a property specification, $\phi$, formed to determine whether $\mathcal{N} \models \phi$ is valid or invalid.
Reduction, $reduce : \Psi \rightarrow P(\Psi)$, aims to transform a correctness problem, $\langle \mathcal{N}, \phi \rangle = \psi \in \Psi$, to an equivalid form, a set of correctness problems $reduce(\psi) \in P(\Psi)$. As we demonstrate in §IV, reduction enables the application of a broad array of efficient DNN analysis techniques to compute problem validity and/or invalidity.
As defined, reduction has two key properties. The first property is that the set of resulting problems is equivalid with the original correctness problem (a proof of this theorem is included in Appendix A).
The second property is that the resulting set of problems all use the same property type, i.e., robustness; they all assert that $\mathcal{N}(x)_0$ is the output class for all inputs. Applying reduction enables verifiers or falsifiers to support a large set of correctness problems by implementing support for this single property type. We chose to reduce to robustness properties due to their broad support among existing falsifiers and verifiers.

B. Overview
To illustrate, consider a property for DroNet [35], a DNN for controlling an autonomous quadrotor. Inputs to this network are 200 by 200 pixel grayscale images with pixel values between 0 and 1. For each image, DroNet predicts a steering angle and a probability that the drone is about to collide with an object. The property states that for all inputs, if the probability of collision is no greater than 0.1, then the steering angle is capped at ±5 degrees, and is specified as: $\forall x \in [0,1]^{200 \times 200} : (\mathcal{N}(x)_P \leq 0.1) \rightarrow (-5^\circ \leq \mathcal{N}(x)_S \leq 5^\circ)$. Adversarial attacks cannot be used off the shelf to falsify this property, since it is not a robustness property.
To enable the application of adversarial attacks, we reduce the property to a set of correctness problems with robustness properties, such as the one shown in Figure 2. This particular example is reduced to two correctness problems with robustness properties. Each of the problems pairs a robustness property (shown at the bottom of Figure 2) with a modified version of the original DNN. The new DNN is created through two key transformations. The first incorporates a prefix network (shown in green in Figure 2) to reduce the input domain to a unit hypercube. This modification ensures that the properties for reduced problems can all use the same pre-condition. The second incorporates a suffix network (shown in blue in Figure 2) that takes in the inputs and outputs of the original DNN and classifies whether they constitute a violation of the original property. This suffix transforms the network into a classifier for which violations of a robustness property correspond to violations of the original property.

C. Reduction Transformation
We rely on three assumptions to transform a correctness problem into a reduced form. First, the constraints on the network inputs must be represented as a union of convex polytopes. Second, the constraints on the outputs of the network must likewise be represented as a union of convex polytopes. Third, we assume that each convex polytope is represented as a conjunction of linear inequalities. Complying with these assumptions still enables properties to retain a high degree of expressiveness, as unions of polytopes are extremely general and subsume other geometric representations, such as intervals and zonotopes. §IV-A shows that these assumptions are sufficient to support existing DNN correctness problems. Algorithm 1 defines the reduction transformation at a high level. We present each step of the algorithm and describe its application to the DroNet example described above.
1) Reformat the Property: Reduction first negates the original property specification and converts it to disjunctive normal form (DNF) (line 2). Negating the specification means that a satisfying model falsifies the original property. The DNF representation allows us to construct a property for each disjunct, such that if any of these properties is violated, the negated specification is satisfied and thus the original specification is falsified. For each of these disjuncts the approach defines a new robustness problem, as described below.
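As a worked illustration of this step, consider the DroNet property stated in prose above (collision probability $\mathcal{N}(x)_P$ no greater than 0.1 implies steering angle $\mathcal{N}(x)_S$ within ±5 degrees). The derivation below is a sketch that writes $n$ for the input dimension; negating the implication and distributing yields two disjuncts, matching the two reduced correctness problems mentioned in the overview.

```latex
\begin{align*}
\neg\phi \;&\equiv\; \exists x \in [0,1]^{n} :\;
  (\mathcal{N}(x)_P \leq 0.1) \wedge \neg(-5^\circ \leq \mathcal{N}(x)_S \leq 5^\circ) \\
&\equiv\; \exists x \in [0,1]^{n} :\;
  \big((\mathcal{N}(x)_P \leq 0.1) \wedge (\mathcal{N}(x)_S < -5^\circ)\big)
  \;\vee\;
  \big((\mathcal{N}(x)_P \leq 0.1) \wedge (\mathcal{N}(x)_S > 5^\circ)\big)
\end{align*}
```

Any input satisfying either disjunct is a counterexample to the original property.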
2) Transform into halfspace-polytopes: Constraints in each disjunct are converted to halfspace-polytope constraints, defined over the concatenation of the input and output domains (disjunct_to_hpolytope() on line 5). This conversion is described in Algorithm 2. A halfspace-polytope can be represented in the form $Ax \leq b$, where $A$ is a matrix with $k$ rows, one per constraint, and $d$ columns, one per variable. In this case, $d$ is equal to $m + n$, the size of the output space plus the size of the input space. This representation facilitates the transformation of constraints into network operations. To build the matrix $A$ and vector $b$, we first transform all inequalities in the conjunction to $\leq$ inequalities with variables on the left-hand side and constants on the right-hand side. The transformation first converts $\geq$ to $\leq$ and $>$ to $<$ (lines 4-7 of Algorithm 2). Then, all variables are moved to the left-hand side and all constants to the right-hand side (line 8). Next, $<$ constraints are converted to $\leq$ constraints by decrementing the constant value on the right-hand side (lines 9-10). This transformation assumes that there exists a greatest representable number that is less than the right-hand side. Finally, each inequality is converted to a row of $A$ and a value in $b$ (lines 11-12).
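The normalization above can be sketched in a few lines of Python. This is an illustrative approximation of the steps described for Algorithm 2, not its actual listing: it assumes each inequality already has its variables on the left and its constant on the right, given as a hypothetical `(coefficients, operator, constant)` triple, and it uses `math.nextafter` (Python 3.9+) to obtain the greatest representable float below the right-hand side.

```python
import math


def to_hpolytope(constraints):
    """Convert (coeffs, op, const) triples into rows of A v <= b."""
    A, b = [], []
    for coeffs, op, const in constraints:
        if op in (">=", ">"):
            # Negate both sides: a.v >= c becomes (-a).v <= -c.
            coeffs = [-c for c in coeffs]
            const = -const
            op = "<=" if op == ">=" else "<"
        if op == "<":
            # Strict to non-strict: step down to the next representable float.
            const = math.nextafter(const, -math.inf)
            op = "<="
        if op != "<=":
            raise ValueError(f"unsupported operator: {op}")
        A.append(list(coeffs))
        b.append(const)
    return A, b
```

For the first DroNet disjunct, with variables ordered as (steering angle, collision probability) and the forwarded inputs omitted for brevity, `to_hpolytope([([0.0, 1.0], "<=", 0.1), ([1.0, 0.0], "<", -5.0)])` yields one row per conjunct.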

3) Prefix Construction: Using the constructed halfspace-polytope, Algorithm 1 next constructs a prefix to the original network to ensure the input domain of the resulting network is $[0,1]^n$, where $n$ is the input dimensionality of the original network (construct_prefix() on line 6). The algorithm to construct the prefix is shown in Algorithm 3. The prefix is constructed by first extracting lower and upper bounds for every input variable (lines 2-8). This extracts the minimal axis-aligned bounding hyperrectangle. The lower and upper bounds can then be used to construct the prefix network, which is a single $n$-dimensional fully-connected layer, with no activation function, which has a diagonal weight matrix with values equal to the ranges of the input variables, and biases equal to the lower bounds of each input. The prefix operates on unit hypercubes, reducing the input space to the correctness problems. The prefix also encodes any interval constraints over the original input space, allowing them to be removed before suffix construction, which simplifies the suffix networks. For the DroNet example, the diagonal of this matrix is a vector of ones, while the biases are all 0.
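The prefix layer can be sketched as follows. The sketch assumes the lower and upper bounds of the bounding hyperrectangle have already been extracted, and it stores only the diagonal of the weight matrix; the function name mirrors construct_prefix() from the text, but the body is an illustrative assumption rather than the listing of Algorithm 3.

```python
def construct_prefix(lb, ub):
    """Build an affine layer mapping [0,1]^n onto the box [lb, ub]."""
    weights = [u - l for l, u in zip(lb, ub)]  # diagonal of the weight matrix
    biases = list(lb)

    def prefix(z):
        # No activation function: x_i = range_i * z_i + lb_i.
        return [w * zi + bi for w, zi, bi in zip(weights, z, biases)]

    return weights, biases, prefix
```

When the original input domain is already the unit hypercube, as in the DroNet example, the weights are all ones and the biases all zeros, so the prefix is the identity.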
Next, the original input values are forwarded to the end of the original network and concatenated with the original output layer (line 7). Because constraints will be encoded as a network suffix that classifies whether inputs are property violations, this step is necessary to enable the encoding of constraints over the inputs.
4) Suffix Construction: The suffix subnetwork classifies whether inputs satisfy the specification -construct_suffix() on line 8. The algorithm for constructing the suffix from the halfspace-polytope constraints is shown in Algorithm 4. The constructed suffix has two layers, a hidden fully-connected layer with ReLU activations, and dimension equal to the number of constraints in the halfspace-polytope defined by the current disjunct, and a final output layer of size 2.
The hidden layer of the suffix has a weight matrix equal to the constraint matrix, $A$, of the halfspace-polytope representation, and a bias equal to $-b$ (line 2). With this construction, each neuron will only have a value greater than 0 if the corresponding constraint is not satisfied; otherwise it will have a value less than or equal to 0, which becomes equal to 0 after the ReLU activation is applied. In the DroNet problem, for example, one of the constraints for a disjunct is $(\mathcal{N}(x)_S < -5^\circ)$. For this conjunct we define one of the neurons to have a weight of 1 from $\mathcal{N}(x)_S$, a weight of 0 from $\mathcal{N}(x)_P$, and a bias of $5^\circ$.
The output layer of the suffix has 2 neurons, each with no activation function. The first of these neurons is the sum of all neurons in the previous layer, and has a bias value of 0. Because the neurons in the previous layer each represent a constraint, and each of these neurons is 0 only when the constraint is satisfied, if the sum of all these neurons is 0, then the conjunction of the constraints is satisfied, indicating that a violation has been found. The second of these neurons has a constant value of 0 (all incoming weights and the bias are 0). The resulting network will predict class 1 if the input satisfies the corresponding disjunct and class 0 otherwise.
5) Correctness Problem Construction: Lines 9-11 of Algorithm 1 define the reduced subproblem comprised of the network that we have constructed and a robustness property. The robustness property specification is always the same and states that the network should classify all inputs in the $d$-dimensional hypercube as class 0 (no violations). If a violation of this property is found, then, according to Theorem 2, the original property is violated by the unreduced input that violated the robustness property. In the end, we have generated a set of correctness problems such that, if any of the problems is violated, then the original problem is also violated. This comes from our construction of a property for each disjunct in the DNF of the negation of the original property.
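The two-layer suffix described in step 4 can be sketched as below. The sketch operates directly on the vector of constrained values and is an illustrative assumption, not the listing of Algorithm 4; in particular, it treats a tie between the two output neurons as class 1, which is what makes an exactly-zero sum register as a violation.

```python
def construct_suffix(A, b):
    """Build the suffix: hidden = ReLU(A v - b), outputs = [sum(hidden), 0]."""

    def suffix(v):
        hidden = [
            max(0.0, sum(a * vi for a, vi in zip(row, v)) - bk)
            for row, bk in zip(A, b)
        ]
        # First output sums the constraint neurons; second is constantly 0.
        return [sum(hidden), 0.0]

    return suffix


def is_violation(outputs):
    """Class 1 is predicted when output 0 does not exceed output 1."""
    return outputs[0] <= outputs[1]
```

Using the DroNet disjunct with values ordered as (steering angle, collision probability), `A = [[0.0, 1.0], [1.0, 0.0]]` and `b = [0.1, -5.0]` (the strict bound is relaxed to a non-strict one here for readability), the value vector `[-10.0, 0.05]` drives both hidden neurons to 0 and is flagged as a violation, whereas `[0.0, 0.05]` is not.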

D. Properties Over Multiple Networks
While Algorithm 1 is defined over properties with a single network, it can easily be applied to properties over multiple networks by combining those networks into a single large network. This can be done by concatenating their input and output vectors, which results in a single large network with a computation path for each network. The transformation algorithm can then be applied as before. This is especially relevant for checking equivalence properties.
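The combination can be sketched as a function that splits a concatenated input among the component networks and concatenates their outputs. Networks are modeled here as plain Python callables on lists, and the helper name is hypothetical.

```python
def combine_networks(networks, input_sizes):
    """Return one network with a separate computation path per component."""

    def combined(x):
        outputs, start = [], 0
        for net, size in zip(networks, input_sizes):
            # Each component sees only its own slice of the joint input.
            outputs.extend(net(x[start:start + size]))
            start += size
        return outputs

    return combined
```

An equivalence property between two networks can then be phrased over the joint input and output vectors, for example by constraining both input slices to be equal and bounding the difference of the two output slices.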

E. Implementation
We implemented our approach in a system named DNNF¹, which accepts a DNN property specification and corresponding DNN as input, and returns whether a violation is found. Whereas the reduction algorithm in §III applies to properties with unions of polytopes as input constraints, the current implementation works on unions of hyperrectangles in the input space. This was a convenience choice to simplify the implementation while still accommodating most properties in the verification literature, as demonstrated in §IV-A.

IV. Empirical Evaluation
We now assess the cost-effectiveness of reducing DNN properties for falsification by applying it to a range of DNN property benchmarks that provide diversity in terms of property types and DNN complexity. Our evaluation will attempt to answer the following research questions:
• RQ1: How expressive are the properties supported by property reduction?
• RQ2: How cost-effective is falsification at finding property violations?
• RQ3: How scalable is falsification?

A. RQ1: On the Expressiveness of Reduction
We first evaluate whether the assumptions about the property specification required by reduction, namely that the original property is specified as a logical formula of linear inequalities, are expressive enough to support DNN correctness properties that have been proposed in existing work.
1) Setup: To evaluate the expressiveness of properties supported by our reduction, we analyze and catalog the benchmarks used by the five verifiers used in our later study, as well as the benchmarks of a recent DNN verifier competition, VNN-Comp [34]. Additionally, we surveyed published papers on DNN verification in the past two years, identifying 4 additional works [6], [22], [52], [53]. Finally, we include the 2 new benchmarks introduced in this work.
2) Results: We summarize the results in Table I, which lists the benchmarks used in each work, the type and number of properties in the benchmark, and whether the properties are supported by Algorithm 1 and our current implementation. The property types use abbreviated names with the following encoding: the first symbol indicates whether the property is global (G) or local (L); the second symbol indicates whether the input constraint can be represented as a hyper-rectangle (□) or not (K); the third symbol indicates whether the property is a robustness (r) property, a reachability (R) property, or a differential (D) property. Each cell under a property type indicates the number of properties in the corresponding benchmark of that type. The bolded benchmarks are used later in the study for the evaluation of RQ2 and RQ3. We describe these benchmarks in more detail below.
¹ https://github.com/dlshriver/DNNF
The next benchmarks are from the evaluation of the Planet verifier. First is the Collision Avoidance benchmark, which consists of 500 safety properties that check the robustness of a network that classifies whether 2 simulated vehicles will collide, given their current state. All 500 properties are L□r properties, and all are fully supported. Second is a set of 7 properties on an MNIST network. The first 4 of these are G□R properties, while the next 2 are L□r properties, and the final property is an LKr property. In addition to restricting the amount of noise that can be added to each pixel in the input image, the final property constrains the difference in the noise between neighboring pixels. DNNF currently supports 6 of these properties, while the final one is supported by Algorithm 1.
The Neurify verifier was evaluated on the ACAS Xu benchmark and on properties of 4 MNIST networks, 3 Android app malware classification networks, and 1 self-driving car network. The evaluation on MNIST used 500 L□r properties across 4 networks, all of which we support. Neurify was also evaluated on 3 networks trained on the Drebin dataset [5] to classify apps as benign or malicious. This benchmark also includes 500 L□r properties, which are fully supported. Finally, Neurify was evaluated on local reachability properties for a modified version of the DAVE self-driving car network [10]. This benchmark consists of 200 local reachability properties, with 4 different types of input constraints (50 properties of each type). The first type of input constraint is an $L_\infty$ constraint, which is equivalent to a hyper-rectangle constraint. The second type of input constraint is an $L_1$ constraint, which can be written as a halfspace-polytope constraint. The third and fourth types of input constraint are image brightness and contrast constraints, which can also be written as halfspace-polytope constraints. DNNF currently supports the first 50 of these properties and the remainder are supported by Algorithm 1.
The DeepZono abstract domain of the ERAN verifier used in our study [46] was evaluated on 3300 L□r properties applied to a set of 24 MNIST networks and 13 CIFAR10 networks. All of the properties in this benchmark are fully supported by our approach.
The ReluDiff verifier was designed to support differential properties in order to show equivalence between two networks [42]. The verifier was evaluated on L□D properties. Each property was defined over a network, $\mathcal{N}$, and a modified version of the same network with quantized weights, $\mathcal{N}'$. The property checked whether $|\mathcal{N}(x) - \mathcal{N}'(x)| < \epsilon$ held in a local region of the input space. Of these properties, 14 were verified over networks from the ACAS Xu benchmark [29], 200 on networks trained with the MNIST dataset, and 100 on a network trained for Human Activity Recognition [3]. All 314 differential properties are fully supported by our approach.
The recent VNN-Comp competition used 3 benchmarks. The first is a benchmark with properties applied to networks with piecewise linear activation functions. This benchmark consists of the ACAS Xu benchmark [29] with 4 L□r properties and 6 L□R properties, as well as a set of 50 local robustness properties with hyper-rectangle input constraints applied to 3 MNIST networks. All of these properties are supported by our approach. The second is a set of 300 local robustness properties with hyper-rectangle input constraints applied to convolutional neural networks trained on MNIST and CIFAR10. All of these properties are supported by our approach. The final benchmark is a set of 32 local robustness properties with hyper-rectangle input constraints applied to neural networks with non-linear activation functions (sigmoid and tanh) trained on MNIST. All of these properties are supported by our approach.

Several DNN verifiers have been introduced recently. The nnenum verifier [6] and an abstraction-refinement approach for DNN verification [22] were evaluated on the ACAS Xu benchmark. The reachability set representation of ImageStars [52] was evaluated on two benchmarks of local robustness properties applied to MNIST and ImageNet networks. The benchmark on the MNIST networks used 900 properties of a version of local robustness in which pixels could be independently darkened, enabling input constraints to be represented as hyper-rectangles. The benchmark on the ImageNet networks uses 6 properties created from an original image and a corresponding adversarial example. The properties specify that, for a given region along the line between the original image and the adversarial example, all inputs along the segment are classified as the correct class. While the MNIST benchmark is supported by our current reduction implementation, the ImageNet benchmark requires polytope constraints in the input space and is therefore supported only by Algorithm 1. The NNV verifier [53] also introduced a benchmark with an adaptive cruise control (ACC) system. It checks a temporal property not currently supported by Algorithm 1, but we see the potential to support such properties through unrolling in future work.
Overall, we find that the property specifications accepted by Algorithm 1 are rich enough to express 7 of the 8 property types found in the explored benchmarks.
When considering the listed benchmarks, Algorithm 1 fully supports 16 of the 17 benchmarks. Our current implementation completely supports the properties from 13 of the 17 benchmarks, and supports a subset of the properties in 2 additional benchmarks. Our results also show that the current space of DNN properties has limited diversity, with most benchmarks consisting primarily of local robustness properties. This points to the value added by the new benchmarks we introduce. It is also expected, as has happened in the verification community in the past, that as verification and falsification techniques improve, developers will want to apply them to reason about a broader range of correctness properties. The proposed algorithm will enable them to do that, even if verifiers and falsifiers do not directly support them.
TABLE I: Property types of existing benchmarks and their support by reduction. The property type names use the following encoding: the first symbol indicates a global (G) or local (L) property; the second symbol indicates whether the input constraint can be represented as a hyper-rectangle (□) or not (K); the third symbol indicates the property class as robustness (r), reachability (R), or differential (D). Bolded benchmarks are used later in the study to evaluate RQ2 and RQ3.

B. RQ2: On the Cost-Effectiveness of Reduction-Enabled Falsification
To evaluate the cost-effectiveness of falsification enabled by the proposed reduction, we identify a set of falsifiers and verifiers to compare their complementary performance, problem benchmarks, and metrics that constitute the basis for the studies around RQ2 and RQ3.
1) Setup: Falsifiers. As falsification methods, we use several common adversarial techniques, as well as a DNN fuzzing tool. For adversarial attacks, we choose a subset of the methods from two surveys [2], [62]. We select the methods common to both surveys with $L_\infty$ input constraints (which matches our implementation) and with implementations available in the cleverhans tool [41]. The chosen adversarial attacks are LBFGS [50], FGSM [26], Basic Iterative Method (BIM) [32], and DeepFool [37]. Of these attacks, none use random initialization, and thus each will produce the same result over multiple runs. In order to observe the potential benefits of random initialization, we also include Projected Gradient Descent (PGD) [36], which was only included in one of the surveys. Therefore, we run each attack, except PGD, once, and if no adversarial example is found, we return an unknown result. For PGD, if no adversarial example is found, we try again until one is found or the given time limit is reached. We use the default parameters for each attack, as specified by cleverhans. For DNN fuzzing, we use TensorFuzz [39] for its easily accessible implementation [38]. TensorFuzz requires the definition of an oracle for recognizing property violations. We provide a version of TensorFuzz with an oracle that identifies violations by checking whether $\mathcal{N}(x)_0 \leq \mathcal{N}(x)_1$.²
Verifiers. For comparison to verification, we select four verifiers: Reluplex [29], Planet [21], ERAN using the DeepZono abstract domain [46], and Neurify [54]. Neurify and ERAN have been shown to be the fastest and most accurate in recent studies [61], and all four verifiers are supported by DNNV, which makes them easy to run and allows us to use a common property specification for all verifiers and falsifiers. For differential properties we also consider ReluDiff [42], since it is currently the only verifier built to handle such properties.

Portfolios. In addition to the individual falsifiers and verifiers, we simulate portfolios of these methods, which run analyses in parallel and return the first result. We use 3 portfolios: All Falsifiers, which includes the 6 falsifiers described above; All Verifiers, which includes all verifiers run on each benchmark; and Total, which includes all methods used in this study. To simulate running each portfolio, we take the union of the violations found by each method in the portfolio, and consider the time to find each violation to be the fastest time among the methods in the portfolio that found that violation.

Problem Benchmarks. We evaluate our approach on two common and representative benchmarks from the verification literature, and two created for this work to provide a range of networks and property types. Our selection criteria were meant to achieve two objectives. First, we wanted to select enough benchmarks to explore all property types with hyper-rectangle input constraints. Second, we wanted to select benchmarks with networks that varied in both size and structure, since these factors have been shown to affect verifier performance [61].

² https://github.com/dlshriver/tensorfuzz
From the verification literature, we select ACAS Xu, the most commonly used benchmark, and a slightly modified version of the Neurify-DAVE benchmark. For Neurify-DAVE, we select the 50 local reachability properties with hyper-rectangle input constraints that are supported by our current implementation, and we augment the benchmark with an additional network. The new network is the original DAVE DNN on which the smaller network in the benchmark was based. While the small DNN has 10277 neurons, the original DAVE network that we add has 82669 neurons, which will allow us to explore the scalability of reduction and falsification. The two networks in this benchmark are convolutional networks and are much larger than the networks in the ACAS Xu benchmark.
We developed 2 new benchmarks to cover property types that are not yet covered by existing benchmarks. The GHPR benchmark is a new DNN property benchmark that contains global reachability properties with hyper-rectangle input constraints, applied to several network architectures of varying size and structure. It consists of 30 correctness problems, 20 of which are 10 GHPR properties applied to 2 MNIST networks, and 10 of which are GHPR properties applied to the DroNet DNN described previously. The DroNet DNN is one of the largest in our study, with more than 475,000 neurons. The MNIST properties are of the form: for all inputs, the output values for classes a and b are closer to one another than either is to the output value of class c. The DroNet properties are of the form: for all inputs, if the probability of collision is between $p_{min}$ and $p_{max}$, then the steering angle is within d degrees of 0. These properties are described in more detail in the supplementary material.³
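The two GHPR property forms above can be written as simple predicates over a single network output, which is how a candidate counterexample would be checked. The sketch below is illustrative only: the function names, the argument layout, the strictness of each comparison, and the use of degrees for the steering angle are assumptions that the text does not specify.

```python
def mnist_property_holds(outputs, a, b, c):
    """Classes a and b are closer to one another than either is to class c."""
    gap_ab = abs(outputs[a] - outputs[b])
    gap_ac = abs(outputs[a] - outputs[c])
    gap_bc = abs(outputs[b] - outputs[c])
    # Strict comparison is an assumption made for this sketch.
    return gap_ab < gap_ac and gap_ab < gap_bc


def dronet_property_holds(p_collision, steering_deg, p_min, p_max, d):
    """If p_min <= p_collision <= p_max, the steering angle is within d degrees of 0."""
    if p_min <= p_collision <= p_max:
        return abs(steering_deg) <= d
    return True  # precondition not met, so the implication holds vacuously
```

An input whose output makes either predicate return `False` is a violation of the corresponding property. Returning `True` for one input says nothing about the global property, which quantifies over all inputs.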
The CIFAR-EQ benchmark is a new DNN property benchmark with differential properties applied to large networks with complex structures. It contains a mix of both global and local equivalence properties. It is the only benchmark to contain global differential properties with hyper-rectangle input constraints, which were absent from the property benchmarks that we found. It consists of 291 properties:  [31]. The first network is a large convolutional network with 62,464 neurons and the second is a ResNet-18 network with over 588,000 neurons. These properties are described in more detail in the supplementary material. Because the verifiers do not support the multiple computation path structure formed during network composition, we do not run them on this benchmark.

Metrics. For each verification and falsification approach, we measure the number of property violations found and the total time to find each violation. The total time to find a violation includes both the time to transform the property and the time to run the falsifier on the resulting properties.

Computing resources. Experiments were run on compute nodes with Intel Xeon Silver 4214 processors at 2.20 GHz and 512 GB of memory. Jobs were allowed to use up to 8 processor cores and 64 GB of memory, with a time limit of 1 hour.
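The portfolio simulation described in the setup (take the union of the violations found by the member methods, and assign each violation the fastest time among the methods that found it) can be sketched as below. The function name, the dictionary layout, and the timing values in `example_results` are made up for illustration and are not results from the study.

```python
def simulate_portfolio(results, members):
    """Combine per-method results into a simulated parallel portfolio.

    results: {method_name: {problem_id: seconds_to_find_violation}}
    members: names of the methods included in the portfolio
    Returns {problem_id: fastest_seconds} over the union of violations found.
    """
    combined = {}
    for method in members:
        for problem, seconds in results.get(method, {}).items():
            if problem not in combined or seconds < combined[problem]:
                combined[problem] = seconds
    return combined


# Hypothetical timings, for illustration only.
example_results = {
    "PGD": {"p1": 12.0, "p2": 30.0},
    "BIM": {"p1": 9.5},
    "Neurify": {"p3": 100.0},
}
all_falsifiers = simulate_portfolio(example_results, ["PGD", "BIM"])
total = simulate_portfolio(example_results, ["PGD", "BIM", "Neurify"])
```

Because the simulation only takes a minimum over recorded times, it ignores any contention between methods that share the same cores, which a real parallel run would have to account for.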
2) Results: Figure 3 shows the number of violations found by each verifier and falsifier on the five benchmarks. The y-axis is the proportion of non-verified properties for which the techniques could find violations. We eliminated correctness problems that were known to be unfalsifiable. For ACAS Xu, this leaves 37 correctness problems; no problems were removed from the other benchmarks. The number above each bar in the plots indicates the number of violations found. An exclamation point indicates that the verifier could not be run due to the architecture of the networks being verified.
The ACAS Xu benchmark, with its simple and small DNN models that are often used in verifier evaluation, showcases where verifiers perform best today. However, even in this benchmark we notice that falsification can complement verification, finding an additional 3 violations.
On the Neurify-DAVE benchmark, the verifiers find only 33 violations from the 100 DNN correctness problems, while the falsifiers find 82 violations, subsuming the 33 violations from the verifiers. The best performing falsification method on this benchmark was BIM, with 74 violations found. PGD and FGSM follow closely with 73 and 69 violations found, respectively. TensorFuzz, the top performing falsification method on the ACAS Xu benchmark, does not find any violations. We conjecture that this is due to the much larger input space. While the ACAS Xu networks have an input dimension of 5, the DAVE networks have an input dimension of 30000, which is more difficult to cover by random fuzzing.

On the GHPR MNIST benchmark, the verifiers can find 17 violations for the 20 properties, while 3 falsifiers, BIM, PGD, and TensorFuzz, can find violations for every property.
On the GHPR DroNet benchmark, the verifiers cannot find any violations, due to not supporting the residual block structures present in the DroNet network. Many of the falsifiers also struggle on these properties, except for PGD and BIM, which can find violations to all 10 properties.
Finally, on the CIFAR-EQ benchmark, the verifiers did not find any violations because they could not be run. Reluplex, Planet, ERAN, and Neurify do not support properties over multiple networks or networks with multiple computation paths, while ReluDiff is limited to networks with only fully connected layers. Additionally, while PGD finds the most violations, it is complemented by the other falsification approaches, with BIM, DeepFool, and TensorFuzz each finding violations for at least 1 unique property. We conjecture that much of PGD's success is due to its random initialization, which allows it to be run multiple times with different results, increasing the chance of finding a violation.
Note that the Planet and ERAN verifiers find no violations for any benchmark. For these benchmarks, ERAN cannot find violations because its algorithmic approach focuses on proving that a property holds, which suggests its complementarity with falsification methods. Planet fails to find violations due to the complexity of the problem and to internal tool errors that cause Planet to crash on almost 20% of the correctness problems. We also see that the Reluplex verifier only finds violations for the ACAS Xu benchmark. It cannot find violations on the other benchmarks, since it does not support the architectures of the networks in those benchmarks.
Overall, we find that falsifiers can detect many property violations, usually complementing those found by verification, that applying them in a parallel portfolio can leverage their unique strengths, and that they successfully scale to more complex benchmarks.
Box plots of the distributions of time to find violations for each method are shown in Figure 4. Figure 4a shows that the verifiers can be effective on the ACAS Xu benchmark, with Neurify often outperforming the falsifiers. This is likely due to the extremely small size of the ACAS Xu networks, which enables verification to run efficiently. These plots also show that the verifiers become less efficient as the networks get larger. For example, on the Neurify-DAVE benchmark, even when the verifiers can find a property violation, the falsifiers can find violations an order of magnitude faster. For more complex benchmarks, the verifiers cannot find violations within the timeout, so we only report the time for the falsifiers.
We find that falsification can efficiently find property violations even for the most complex benchmarks, with a median time to find a violation across all benchmarks and falsifiers of 16 seconds.
Figures 3 and 4 also reveal that no single falsifier always outperforms the others. While PGD performs well for the benchmarks studied here, we can still increase the number of violations by running multiple falsifiers. Additionally, the falsifiers that find the most violations do not necessarily find them the fastest. Based on these two observations, we recommend using a portfolio approach, running many falsifiers in parallel and stopping as soon as a violation is found by any technique, such as the All Falsifiers method shown in the previous figures. This approach finds all the violations found by the verifiers as quickly as the fastest falsifier. We also recommend using falsifiers in conjunction with verifiers, since while falsifiers can often quickly find violations, they cannot prove when a property holds.

2) Results: We present the results of checking the properties in Figures 5a and 5b, as well as the box plots of the times to find violations for each method in Figures 5c and 5d.
On the smaller DAVE network, the verifiers struggle to verify the properties. Reluplex does not run at all, due to its lack of support for convolutional layers, while Planet does run, but reaches the timeout for all properties. The ERAN verifier does not time out on the small network, but cannot verify any of the properties. Neurify was the only verifier that returned accurate results on the small network, successfully falsifying 33 of the 50 properties, and reaching the time limit on the other 17. While only a single verifier was able to falsify any properties, 4 of the 6 falsification approaches were able to falsify properties, all of them finding more violations than Neurify. The falsifiers were also faster, finding violations almost an order of magnitude faster than Neurify on the small DAVE network.
While one verifier was able to find violations on the smaller network, none of the verifiers were able to find violations on the larger DAVE network, which has more than 8 times as many neurons. As with the small network, Reluplex does not support the network structure, while Planet reaches the time limit for all properties. ERAN and Neurify, however, behave differently than they did on the small network. While ERAN was previously able to finish its analysis on the small network, it reaches the time limit for 34 properties on the large network, indicating that it could not scale to the larger network size. Similarly, while Neurify previously found property violations for the small network, it reaches the memory limit on the large network before any violations are found. The falsifiers, on the other hand, still perform well, with 3 of the 6 falsifiers finding property violations. Surprisingly, the DeepFool falsifier goes from 44 violations found on the small network to 0 violations on the large DAVE network. We conjecture that this may be due to the use of the default parameters for DeepFool, and that adjusting these parameters may yield better results. Additionally, the falsifiers show only a minor increase in the time needed to find a violation, from a median time of 20.2 seconds to 20.7 seconds.
Overall, we find that, on the benchmarks explored here, DNN property reduction scales well to larger networks, and enables the application of scalable falsification approaches such as adversarial example generation.

V. Conclusion
In this work we present an approach for reducing DNN correctness problems to facilitate the application of falsifiers, in particular adversarial attacks, for finding DNN property violations. We implement our approach and apply it to a range of correctness problem benchmarks and find that 1) the reduction approach covers a rich set of properties, 2) reducing problems enables falsifiers to find property violations, and 3) since falsifiers tend to have different strengths, a portfolio approach can increase the violation finding ability. In future work we plan to extend DNNF to support the full range of properties supported by our algorithmic approach, perform a systematic evaluation of what factors may influence falsifier performance, and explore how reduction can also broaden the applicability of verifiers.

Data Availability
We make DNNF available at https://github.com/dlshriver/DNNF, and we provide an artifact containing the tool, as well as the data and scripts required to replicate our study, at https://doi.org/10.5281/zenodo.4439219.
Proof. Let $f(x) = \sum_{i=1}^{n} a_i x_i$. We first show that every linear inequality in the conjunction can be reformulated to the form $f(x) \le b$. It is trivial to show that the inequality can be manipulated to have variables on the left-hand side and a constant on the right-hand side, that $\ge$ can be manipulated to an equivalent form with $\le$, and that $>$ can be manipulated to become $<$. The $<$ comparison can be changed to a $\le$ comparison by decrementing the right-hand side constant from $b$ to $b'$, where $b'$ is the largest representable number less than $b$. We prove by contradiction that an inequality with $<$ can be reformulated to use $\le$. Assume either $b' < f(x) < b$, a contradiction, since $f(x)$ cannot be both larger than the largest representable number less than $b$ and also less than $b$;⁴ or $b \le f(x) \le b'$, a contradiction, since $b' < b$ by definition.
Given a conjunction of linear inequalities in the form $f(x) \le b$, Alg. 2 constructs $A$ and $b$ with a row in $A$ and a value in $b$ corresponding to each conjunct. There are two cases: $(Ax \le b) \wedge (x \not\models \phi)$ and $(x \models \phi) \wedge (Ax \not\le b)$.

We prove case 1 by contradiction. Assume $(Ax \le b)$ and $(x \not\models \phi)$. By construction of $H$ in Alg. 2, each conjunct of $\phi$ is exactly 1 constraint in $H$. If $Ax \le b$, then all constraints in $H$ must be satisfied, and thus all conjuncts in $\phi$ must be satisfied and $x \models \phi$, a contradiction.
We prove case 2 by contradiction. Assume $(x \models \phi)$ and $(Ax \not\le b)$. By construction of $H$ in Alg. 2, each conjunct of $\phi$ is exactly 1 constraint in $H$. If $x \models \phi$, then all conjuncts in $\phi$ must be satisfied, and thus all constraints in $H$ must be satisfied and $Ax \le b$, a contradiction.

If input $x$ satisfies the constraint, then the neuron value will be at most 0; otherwise it will be greater than 0. After the ReLU, each neuron will be equal to 0 if the corresponding constraint is satisfied by $x$ and greater than 0 otherwise. The first output neuron sums all neurons in the hidden layer, while the second has a constant value of 0. If $x \in H$, then all neurons in the hidden layer after activation must have a value of 0, since all constraints are satisfied. However, if all neurons have value 0, then their sum must also be 0, and therefore $N_s(x)_0 = N_s(x)_1$, a contradiction.

We prove case 2 by contradiction. Assume $N_s(x)_0 \le N_s(x)_1$ and $x \notin H$. If $x \notin H$, at least one neuron in the hidden layer must have a value greater than 0 after the ReLU, since at least one constraint is not satisfied. Because some neuron has a value greater than 0, their sum must also be greater than 0, and therefore $N_s(x)_0 > N_s(x)_1$, a contradiction.

Lemma 3. Let H = (A,b) where W = diag(ublb) and b = lb. This function is

⁴ We discuss the assumption that such a number exists in Appendix B.
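The two-output construction used in the preceding proof (one ReLU neuron per constraint, a first output that sums the hidden layer, and a second output fixed at 0) can be sketched in plain Python. This is an illustration under assumptions rather than the construction from the paper: the function names are hypothetical, and the pre-activation is taken to be `a . x - b` for each row, chosen so that a satisfied constraint gives a value of at most 0, as the proof describes.

```python
def suffix_network(A, b, x):
    """Return (N_s(x)_0, N_s(x)_1) for the polytope H = {x : A x <= b}."""
    hidden = []
    for row, bound in zip(A, b):
        # Assumed pre-activation: at most 0 exactly when this constraint holds.
        pre_activation = sum(a_i * x_i for a_i, x_i in zip(row, x)) - bound
        hidden.append(max(0.0, pre_activation))  # ReLU
    first_output = sum(hidden)   # sums all hidden neurons
    second_output = 0.0          # constant output
    return (first_output, second_output)


def in_polytope(A, b, x):
    """Direct membership test, used only to compare against the network."""
    return all(
        sum(a_i * x_i for a_i, x_i in zip(row, x)) <= bound
        for row, bound in zip(A, b)
    )


# Unit square 0 <= x_0 <= 1, 0 <= x_1 <= 1 written as A x <= b.
A_square = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_square = [1.0, 0.0, 1.0, 0.0]
```

With these definitions, the first output is at most the second exactly when the point lies in the polytope, so a membership question becomes a comparison between two output neurons.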

B. On Existence of a Bounded Largest Representable Number
Our proof that property reduction generates a set of robustness problems equivalid to an arbitrary problem relies on the assumption that strict inequalities can be converted to non-strict inequalities. To do so, we rely on the existence of a largest representable number that is less than some given value. While this is not necessarily true for all sets of numbers (e.g., $\mathbb{R}$), it is true for most numeric representations used in computation (e.g., IEEE 754 floating point arithmetic).
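For IEEE 754 doubles, the largest representable number below a given value can be obtained directly, which makes the strict-to-non-strict conversion concrete. The sketch below uses Python's `math.nextafter` (available from Python 3.9); the helper name `strict_to_non_strict` is hypothetical, and the example assumes a finite bound `b`.

```python
import math


def strict_to_non_strict(b):
    """Return b' so that, over doubles, f(x) < b holds exactly when f(x) <= b'."""
    # nextafter(b, -inf) is the largest double strictly less than b.
    return math.nextafter(b, -math.inf)


b = 1.0
b_prime = strict_to_non_strict(b)

# No double lies strictly between b' and b, so the two comparisons agree.
samples = [0.0, b_prime, b, 2.0]
agreement = [(value < b) == (value <= b_prime) for value in samples]
```

The same reasoning fails over the real numbers, where a value can always be found strictly between b' and b, which is why the assumption is stated for representable numbers only.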