Dear reviewer,

This README file is organized as follows:

- In part (i), we explain in detail the different files and directories of the artifact.
- In part (ii), we explain how to configure and compile the Marabou framework, which is the ``under the hood" SMT solver used in our experiments. We also give precise instructions on how to run verification queries.
- In part (iii), we explain how to read our summary of the original verification queries, used for assessing both mutual errors (among pairs of networks) and robustness (among full ensembles).
- In part (iv), we provide information on how to run our gradient attacks.

We note that the whole artifact (including the original DNN models and parsed results summarizing our verification queries) is publicly and permanently available online.

################################################################################################################################################
(i) Files and Directories in the Artifact
################################################################################################################################################

Below is a summary of the contents of the artifact (in alphabetical order):

- experiments.zip: A compressed directory with the original trained DNNs, as well as CSV files summarizing our raw verification results and the results obtained by competing methods (gradient attacks). The directory includes the following subdirectories:

  (*) MNIST - with experiments for the MNIST image classification dataset.
  (*) Fashion_MNIST - with experiments for the Fashion-MNIST image classification dataset.

  Each of the aforementioned subdirectories has the following structure:

  (**) models - a directory with the original 10 models. We note that throughout the artifact the models are labeled alphabetically from (A) to (J), corresponding to models 1-10 (in the MNIST case) and 11-20 (in the Fashion-MNIST case). We also note that each model is named ``small_X" (for X = A, ..., J). The models are saved in TensorFlow 2 protobuf file format.
  (**) results - a directory with the following files:
       (#) a single CSV file - summarizing the verification results used for calculating the mutual errors between DNN pairs.
       (#) ensemble_robustness - a directory with CSV files encoding the queries used to assess the robustness of various ensembles, before and after conducting multiple swaps.

- gradient_attacks.zip: A compressed directory with the code & results of the three gradient attacks checked (see appendix D of the original paper). This directory includes the following files and subdirectories:

  (**) attack_mnist.py - a Python script for generating our gradient-based attacks.
  (**) computed_uniqueness_scores - a subdirectory with XLSX files summarizing the uniqueness scores calculated for various ensembles, based on all three gradient attacks, as well as the uniqueness scores calculated based on our verification-driven method.
  (**) MNIST_attacks / Fashion_MNIST_attacks - two subdirectories, each with 3 CSV files, each of which summarizes the adversarial inputs found via a different gradient attack on the fixed agreement points of the specific dataset (MNIST or Fashion-MNIST).

- License.txt - a license for the use of our framework code.

- Marabou.zip: A compressed directory including the code of the Marabou verification engine. For an updated version, please clone the cpp modules from the following repository: https://github.com/NeuralNetworkVerification/Marabou.
  The building process takes place by running files from within this subdirectory, which also contains a Python interface for loading DNNs and a cpp interface for running verification queries. We also note that this subdirectory includes a Marabou README file (Marabou_README.md).

- README.txt - this file.

################################################################################################################################################
(ii) Compiling Marabou & Running the Experiments
################################################################################################################################################

In this part, we explain the two steps required to run the experiments described in our paper:

1. Compiling Marabou
2. Feeding Marabou the verification queries (after generating them with the supplied code)

We note that, for ease of explanation, quotation marks (``...") are used in the descriptions below to denote code, files, and subdirectories.

* Step 1: Compiling Marabou

As mentioned, Marabou is the under-the-hood SMT solver we used for the experiments (other SMT solvers may be used as well). On Linux, run the following commands from the main artifact directory (after unzipping the ``Marabou" directory):

************************************************************************************************************************************************
cd Marabou
mkdir build
cd build
cmake ..
cmake --build . -j 8
************************************************************************************************************************************************

This will create a ``build" subdirectory, and build and compile Marabou. We note that the compilation process itself may take time, as it also runs multiple tests covering different aspects of the code.

* Step 2: Generating the verification queries, and feeding them to the SMT solver

Next, we can run the Marabou SMT solver on each of the verification queries. We note that the original verification queries are not supplied, as they require a considerable amount of memory (dozens to hundreds of GBs). Instead, we refer the user to the supplied subdirectory containing the following two Python scripts:

1. a script which encodes a MarabouObject comprising two DNNs.
2. a script which encodes a MarabouObject comprising an ensemble of DNNs.

After creating these objects, the Marabou framework allows encoding additional constraints (as elaborated in the paper) in order to construct the relevant ``input query" text files. These text files can later be fed to the verification engine, which outputs one of the following results:

(*) UNSAT, indicating that no counterexample exists for the encoded query.
(*) MEMORY-OUT, in case the query was run on a machine with modest resources.
(*) SAT, in which case Marabou will also return a satisfying assignment for the input.

We note that receiving SAT means that epsilon-robustness does not hold, and the satisfying input serves as a counterexample, i.e., an input causing multiple DNNs to err simultaneously (for example, a mutual error in the case of encoding a pair of DNNs).
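For illustration purposes only, the following minimal Python sketch shows the general Maraboupy flow of loading a network, bounding its inputs within an epsilon-ball, and saving the resulting ``input query" to a text file. It is NOT one of the artifact's encoding scripts: the pair/ensemble composition and the output constraints from the paper's appendix are omitted, the file paths are hypothetical placeholders, and the exact API calls (e.g., ``read_tf", ``saveQuery") may differ between Marabou versions.

************************************************************************************************************************************************
# Minimal Maraboupy sketch (NOT one of the artifact's scripts): bound the
# inputs of a single TF2 savedModel within an epsilon-ball around a sample
# and save the resulting input query to a text file.
# Assumptions: a Maraboupy build whose read_tf supports "savedModel_v2";
# the model/sample paths below are hypothetical placeholders.
import numpy as np
from maraboupy import Marabou

EPSILON = 0.01
MODEL_PATH = "experiments/MNIST/models/small_A"    # hypothetical path
sample = np.load("agreement_point.npy").flatten()  # hypothetical sample file

network = Marabou.read_tf(MODEL_PATH, modelType="savedModel_v2")
input_vars = network.inputVars[0].flatten()

# Constrain every input pixel to an epsilon-ball around the sample,
# clipped to the valid [0, 1] pixel range.
for var, val in zip(input_vars, sample):
    network.setLowerBound(var, max(0.0, float(val) - EPSILON))
    network.setUpperBound(var, min(1.0, float(val) + EPSILON))

# The additional output constraints (e.g., forcing a runner-up label to score
# at least as high as the true label, per constituent DNN) would be added
# here, following the encoding described in the appendix of the paper.

network.saveQuery("PATH_TO_QUERY")  # text file fed to Marabou in the next step
************************************************************************************************************************************************

The saved text file can then be passed to the compiled Marabou binary, as shown next.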
In order to run Marabou on a specific generated input query, run the SMT solver via the cpp interface in the following way, from the main artifact directory (after building and compiling Marabou):

************************************************************************************************************************************************
cd Marabou/build
./Marabou --input-query=PATH_TO_QUERY
************************************************************************************************************************************************

Here, ``PATH_TO_QUERY" must point to a query text file generated with the Python scripts, after encoding the specific constraints based on the methods described in the appendix of our paper.

################################################################################################################################################
(iii) Verification Experiments - Summary & Results
################################################################################################################################################

The raw summaries of our verification experiments can be found in the directory obtained by unzipping ``experiments.zip", under its ``MNIST" and ``Fashion_MNIST" subdirectories, each of which includes:

(*) ``models" - a subdirectory with all original trained DNNs (in TensorFlow 2 savedModel format).
(*) ``results" - a subdirectory with the following two items:

    (***) a summarizing CSV file - with 54,000 experimental results: 45 (= 10 choose 2) DNN pairs * 6 epsilons * 200 agreement points. The CSV columns include:

          (###) ``test sample" -> the index of the relevant sample in the test set of MNIST/Fashion-MNIST.
          (###) ``true label" -> either 0 (for MNIST) or 4 (for Fashion-MNIST).
          (###) ``epsilon" -> the perturbation size: 0.01, 0.02, 0.03, 0.04, 0.05, 0.06.
          (###) ``X disagrees" columns (for X ranging from `A' to `J') -> in each row, exactly 2 of these columns contain a value, indicating the 2 DNNs encoded as disagreeing in the experiment of that row. For example, to analyze the results of the experiments encoding a disagreement between DNNs (1) and (4), filter the CSV for rows with a value in both ``A disagrees" and ``D disagrees" (see also the loading sketch at the end of this part).
          (###) ``ipq result" -> a column indicating the result of the verification query (SAT/UNSAT/T.O./M.O.).

    (***) the ``ensemble_robustness" subdirectory - which includes various CSV files summarizing the robustness analysis (Figures 4 & 5 in the paper were plotted based on this analysis). The title of each CSV file includes the original/new constituent members of the ensemble. Each CSV includes the test sample index (in the MNIST/Fashion_MNIST dataset), the epsilon perturbation, and the result of the input query (SAT/UNSAT/T.O./M.O.).

In order to recreate the verification queries, please run the maraboupy extension for encoding a single network comprising the pair/full ensemble of constituent members, with the matching epsilon and runner-up label (based on the method elaborated in the appendix), and feed the generated verification query to the Marabou verification engine, as explained in the previous part.
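For convenience, the following minimal Python sketch (not part of the artifact) shows one way to load a summarizing CSV and count the outcomes for a specific DNN pair and epsilon. The CSV file name and the exact column names are assumptions based on the description above, and should be checked against the actual file header.

************************************************************************************************************************************************
# Minimal sketch (not part of the artifact): load a summarizing results CSV
# and tally the outcomes for the pair (A, D) under epsilon = 0.01.
# The file name and column names below are assumptions; check the actual
# CSV header inside experiments.zip before running.
import pandas as pd

df = pd.read_csv("results_summary.csv")  # hypothetical file name

# Keep only rows encoding a disagreement between DNNs (A) and (D),
# with perturbation size epsilon = 0.01.
pair_rows = df[df["A disagrees"].notna()
               & df["D disagrees"].notna()
               & (df["epsilon"] == 0.01)]

# Count SAT/UNSAT/T.O./M.O. results for this pair and epsilon.
print(pair_rows["ipq result"].value_counts())
************************************************************************************************************************************************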
################################################################################################################################################
(iv) Running Gradient Attacks
################################################################################################################################################

All the files in this part can be found in the directory obtained by unzipping ``gradient_attacks.zip".

In order to run any of our three gradient attacks (which are targeted/untargeted variations of the popular FGSM and I-FGSM attacks), we refer the user to the supplied ``attack_mnist.py" Python script, which (based on user-chosen values) conducts the relevant attack. Due to the incompleteness of gradient attacks, each result is either SAT (a counterexample was found) or UNKNOWN.

The results for all three attacks are available in three CSV files per dataset, which can be found in the ``MNIST_attacks" and ``Fashion_MNIST_attacks" subdirectories. The relevant attack can be identified based on the following keywords in the file names:

1. `vanilla_FGSM' - a CSV summarizing gradient attack # 1.
2. `FGSM_constant_runnerup' - a CSV summarizing gradient attack # 2.
3. `FGSM_dynamic_runnerup' - a CSV summarizing gradient attack # 3.

Each row in these CSV files represents a different experiment, indicating the true sample and the pair of ensemble members for which the attack was optimized (these are the members with a value in the matching columns). Finally, each of the CSV files includes 3 columns indicating possible criteria for counting (possible) SAT results, depending on the user's chosen level of conservativeness:

(a) ``any_error" - includes cases in which the correct label is not predicted for the perturbed input.
(b) ``RU_geq_label" - includes cases in which the runner-up label (the one with the 2nd highest value on the original, correctly classified, input) receives a higher score than the true label on the perturbed image.
(c) ``RU_predicted" - includes cases in which the runner-up label receives a higher score than ALL competing labels.

Based on the nature of the above properties, we note that for each experiment (meaning each row): [(c) is SAT -> (b) is SAT] and [(b) is SAT -> (a) is SAT].

In our experiments, in order to conduct a fair comparison with the verification queries (which encode that the runner-up label receives a higher score than all other labels), we calculated the SAT results of the gradient-based methods based on criterion (c). Based on this analysis, we computed for each of the three attacks (and in accordance with criterion (c)) the uniqueness scores of the DNNs comprising the constituent members of various ensembles. These results can be observed in the ``computed_uniqueness_scores" subdirectory, which has 4 files: 3 of them are ``uniqueness_score_ga_[i]_dynamic_iterative_criteria_3_both_datasets.xlsx" for [i] = 1/2/3 (corresponding to the relevant attack). The 4th and last file is named ``uniqueness_score_marabou_verification_both_datasets.xlsx" and includes the uniqueness scores calculated based on the results from our verification-driven approach; these results are also summarized and referred to in part (iii) of this artifact.

------------------------------------------------------------------------------------------------------------------------------------------------

Thank you for your time!

Guy Amir
Tom Zelazny
Guy Katz
Michael Schapira