Software Artifact for the SIGCOMM'20 Paper Titled "GRoot: Proactive Verification of DNS Configurations"
- 1. University of California, Los Angeles
- 2. Microsoft Research
- 3. University of California, Los Angeles & Intentionet
Description
Abstract:
The Domain Name System (DNS) plays a vital role in today’s Internet but relies on complex distributed management of host records. DNS misconfigurations are responsible for many outages that have rendered popular services such as GitHub, Twitter, HBO, LinkedIn, Yelp, and Azure inaccessible for extended periods of time. This paper introduces GRoot, the first verifier that performs static analysis of DNS configuration files, enabling proactive and exhaustive checking for common DNS bugs; by contrast, existing solutions are reactive and incomplete. GRoot uses a new, fast verification algorithm based on generating and enumerating DNS query equivalence classes. GRoot symbolically executes the set of queries in each equivalence class to efficiently find (or prove the absence of) any bugs such as rewrite loops or no response. To prove the correctness of our approach, we develop a formal semantic model of DNS resolution. Applied to a set of configuration files obtained from a large campus network with over a hundred thousand records, GRoot revealed 109 bugs, analyzing the network in seconds. When applied to internal zone files consisting of over 3.5 million records from a large CDN provider, GRoot revealed around 160k issues of blackholing, which initiated a cleanup. Finally, on a synthetic dataset created from over 65 million real records, we find that GRoot can scale to networks with tens of millions of records.
- - - - - - - -
The Artifact:
This software artifact contains the created census dataset; working code for generating the equivalence classes, constructing the interpretation graphs, and checking various properties over the constructed interpretation graphs; and scripts to reproduce the empirical claims made in the paper:
- Census Dataset statistics as shown in Figure 7(b)
- Performance claims made in §7.3 on DNS Census dataset and the corresponding plot shown in Figure 8
- - - - - - - -
Instructions:
- Census Dataset Organization
  - Unzip the `census.zip` dataset and place the resulting `census` folder in a folder named `data`.
  - The compressed dataset is ~3 GB and consists of ~8.1 M files.
  - 🚨 Linux: When decompressed, the folder consumes ~38 GB on Linux due to the default 4 KB block size on ext4. One of the plot generation scripts also generates ~1.3 M files. Linux may report `No space left on device` while decompressing even if there is plenty of disk space; this happens when the filesystem runs out of inodes. The Census dataset and the plot generation scripts together require at least ~45 GB of unused disk space and ~10.7 M free inodes (which can be checked using `df -ih`).
  - ⚠️ Windows: When decompressed, the folder consumes only ~4 GB on Windows since the majority of the files are less than 1 KB.
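The Linux requirements above can also be checked programmatically before unzipping. The following is a minimal sketch, not part of the artifact: the function names are hypothetical, the thresholds are the ~45 GB and ~10.7 M figures from the text (taking 1 GB as 10^9 bytes, which is an assumption), and `os.statvfs` is POSIX-only.

```python
import os

# Thresholds taken from the requirements stated above (1 GB = 10**9 bytes is an assumption).
REQUIRED_BYTES = 45 * 10**9      # ~45 GB of unused disk space
REQUIRED_INODES = 10_700_000     # ~10.7 M free inodes


def has_capacity(free_bytes, free_inodes):
    """Return (enough_space, enough_inodes) for the given free amounts."""
    return free_bytes >= REQUIRED_BYTES, free_inodes >= REQUIRED_INODES


def check_capacity(path="."):
    """Query the filesystem holding `path` and compare it to the thresholds."""
    st = os.statvfs(path)                    # POSIX only; not available on Windows
    free_bytes = st.f_bavail * st.f_frsize   # space available to unprivileged users
    free_inodes = st.f_favail                # inodes available to unprivileged users
    return has_capacity(free_bytes, free_inodes)


if __name__ == "__main__":
    space_ok, inodes_ok = check_capacity(".")
    print(f"disk space: {'ok' if space_ok else 'insufficient'}")
    print(f"inodes:     {'ok' if inodes_ok else 'insufficient'}")
```

Run it from the directory that will hold the `data` folder, so that the filesystem being checked is the one the dataset will be unzipped onto.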
- Installation
  - Using `docker` (strongly recommended)
    - Pull our Docker image: `docker pull dnsgt/2020_sigcomm_artifact_157`.
    - Alternatively, you could also build the Docker image locally: `docker build -t dnsgt/2020_sigcomm_artifact_157 github.com/dns-groot/2020_sigcomm_artifact_157`.
    - Docker containers are isolated from the host system. Therefore, to run GRoot on zone files residing on the host system, you must first bind mount them while running the container: `docker run -v <absolute path to the above data folder>:/home/groot/groot/shared -it dnsgt/2020_sigcomm_artifact_157`. This gives you a `bash` shell within the `groot` directory.
    - The `data` folder on the host system is then accessible within the container at `~/groot/shared` (with read+write permissions).
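Because the `-v` option expects an absolute host path, it can help to assemble the `docker run` command from a relative folder name. The snippet below is a small illustrative sketch, not part of the artifact: `docker_run_command` is a hypothetical helper, while the image name and the container mount point are copied from the instructions above.

```python
import shlex
from pathlib import Path

IMAGE = "dnsgt/2020_sigcomm_artifact_157"    # image name from the instructions above
CONTAINER_DIR = "/home/groot/groot/shared"   # mount point from the instructions above


def docker_run_command(data_dir):
    """Build the `docker run` argument list with `data_dir` made absolute."""
    host_dir = Path(data_dir).resolve()      # turn a relative folder into an absolute path
    return ["docker", "run", "-v", f"{host_dir}:{CONTAINER_DIR}", "-it", IMAGE]


if __name__ == "__main__":
    # Print the command for a `data` folder in the current directory.
    print(shlex.join(docker_run_command("data")))
```

The printed line can be pasted into a shell; the sketch only builds the command and does not start a container itself.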
- Verification of Claims
  - All commands must be run within the `~/groot/scripts/` directory.
  - Figure 7(b)
    - To generate the plot shown in Figure 7(b), run the script `Figure7.py`: `python3 Figure7.py`
    - Est. Time: 5 min, generates the plot `Figure7.pdf` directly in the `shared` folder.
  - Figure 8
    - To generate the plot shown in Figure 8, run the script `Figure8.py`: `python3 Figure8.py <path_to_the_groot_executable>`
      - If the script is run from a Docker container, it can be run as follows: `python3 Figure8.py ../build/bin/groot`
      - If the script is run on Windows, it can be run as follows: `python3 Figure8.py ..\x64\Release\groot.exe`
    - The script dumps the log for each domain into the `shared/logs/` subdirectory and in the end generates a summary file `Attributes.csv` in the `shared` folder. `Attributes.csv` contains the following information for each domain:
      - Number of resource records (RRs)
      - Number of interpretation graphs built
      - Time taken to parse zone files and build the label graph (Label graph building)
      - Time taken to construct the interpretation graphs and check properties on them (Property checking)
      - Total execution time (T)
      - Label graph size (number of vertices and edges)
      - Statistics across interpretation graphs (mean, median, min and max of vertices and edges)
    - After running GRoot on all the ~1.3 M domains, the script calculates the median T for each distinct value of RRs and plots the median T vs. the RRs.
    - Est. Time: 10 hours, generates the plot `Figure8.pdf` from `Attributes.csv` in the `shared` folder.
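The aggregation behind Figure 8 (the median T for each distinct number of RRs) can be illustrated with a short sketch. This is not the code in `Figure8.py`: the function name is hypothetical, and since the column layout of `Attributes.csv` is not described here, the sketch takes plain `(rr_count, total_time)` pairs instead of reading the file.

```python
from collections import defaultdict
from statistics import median


def median_time_per_rr(rows):
    """Map each distinct RR count to the median total time of its domains.

    `rows` is an iterable of (rr_count, total_time) pairs, one per domain.
    """
    times_by_rr = defaultdict(list)
    for rr_count, total_time in rows:
        times_by_rr[rr_count].append(total_time)
    # Sort by RR count so the result can be plotted left to right.
    return {rr: median(times) for rr, times in sorted(times_by_rr.items())}


if __name__ == "__main__":
    # Made-up timings for illustration only; not measurements from the paper.
    sample = [(10, 0.2), (10, 0.4), (10, 0.3), (250, 1.5), (250, 2.5)]
    for rr, t in median_time_per_rr(sample).items():
        print(rr, t)
```

Grouping by RR count before taking the median gives one point per distinct RR value on the plot.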
Files (3.1 GB)
Name | Size | Checksum |
---|---|---|
census.zip | 3.1 GB | md5:e277845bd804b589aafa81378c5ee16c |
 | 94.7 kB | md5:b7d5b8a7c2748eb3121a2a4fb33f45ed |
Additional details
Related works
- Is previous version of: https://github.com/dns-groot/groot
- Is supplement to: https://github.com/dns-groot/2020_SIGCOMM_Artifact_157