Published June 10, 2021 | Version 1.0
Dataset Open

Large-scale Latency Measurements in the Tor Network

  • 1. Technische Universität Ilmenau

Description

This dataset supplements the following research paper:

[1] Schatz, David; Rossberg, Michael; Schaefer, Guenter: Optimizing Packet Scheduling and Path Selection for Anonymous Voice Calls. Accepted at ARES 2021.

Measurement Method

The basic idea to estimate the latency (one-way delay) between any two Tor relays u, v is as follows [2]:

  1. Use a measurement agent a (a custom Tor client [3]).
  2. Pick an arbitrary entry relay e and build a three-hop circuit (e, u, v), i.e. network packets would normally traverse the path (a, e, u, v) and back.
  3. Agent a asks the relay v to open a connection to localhost, which will be refused with a special error message. The agent a can measure the RTT (round-trip time) of this failed attempt and divide it by 2 to get an estimate of the one-way latency dl of the path (a, e, u, v).
  4. Due to the "leaky pipe" design of Tor, the agent a can do the same with relay u to estimate the one-way latency ds of the path (a, e, u).
  5. The estimate of the one-way delay between relays u and v is then du,v = dl - ds.
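
The arithmetic in steps 3 to 5 can be sketched as follows. This is only an illustration of the calculation, not the measurement code we used; the function names and the example RTT values are hypothetical.

```python
def one_way_from_rtt(rtt_us: float) -> float:
    """Approximate a one-way latency as half of a measured round-trip time."""
    return rtt_us / 2.0


def relay_pair_latency(rtt_long_us: float, rtt_short_us: float) -> float:
    """Estimate the one-way delay between relays u and v.

    rtt_long_us:  RTT of the failed connection attempt at v, path (a, e, u, v)
    rtt_short_us: RTT of the failed connection attempt at u, path (a, e, u)
    """
    d_l = one_way_from_rtt(rtt_long_us)
    d_s = one_way_from_rtt(rtt_short_us)
    return d_l - d_s


# Hypothetical RTTs in microseconds, chosen only to show the arithmetic.
example = relay_pair_latency(240_000.0, 180_000.0)
```

The subtraction assumes that the shared part of the path (a, e, u) contributes the same delay to both measurements, which is why jitter on that part directly affects the estimate.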

Measurement Series

As the agent, we used a PC at the TU Ilmenau in Germany, connected to the DFN network. As entry relays, we always used Tor relays from Germany as well. To later reduce the influence of random jitter on our estimates, we measured both dl and ds for a pair u, v 100 times (once every 0.5 seconds). The measurements took place from 2019-12-19 to 2021-01-06.

Dataset Content

The dataset consists of three files:

  • paths: This file contains the measurement results for dl and ds for each pair u, v we measured. Each line consists of 4 entries (separated by a space): our internal ids for u and v, an indicator of whether the line holds dl (value = 3) or ds (value = 2), and a comma-separated list of the 100 (or fewer, due to potential packet loss) measurements. Each measurement is the one-way latency in microseconds.
  • probedRelays: This file contains the mapping from our internal ids of relays to their Tor fingerprints, one in each line, separated by a space.
  • relayInfo: This file contains a subset of the relay descriptors for all active relays during our measurement period, one per line. Each line contains 4 entries: the fingerprint, the IP address, the consensus weight, and a comma-separated list of flags. Note that we only measured relays which include the flags stable, running, and valid. The file was last updated on 2021-01-06 (relays that were not online on this date show their last known information before that date).
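
A minimal sketch of how lines of the paths and relayInfo files could be parsed, based only on the format description above. It rests on several assumptions: ids are kept as strings, latency values parse as numbers, a paths line may lack its fourth entry if all packets were lost, relayInfo entries are separated by whitespace, and the consensus weight is an integer. The type and function names are hypothetical.

```python
from typing import List, NamedTuple


class PathRecord(NamedTuple):
    u: str                    # internal id of relay u
    v: str                    # internal id of relay v
    hops: int                 # 3 -> measurements of dl, 2 -> measurements of ds
    samples_us: List[float]   # one-way latencies in microseconds


class RelayInfo(NamedTuple):
    fingerprint: str
    ip: str
    consensus_weight: int
    flags: List[str]


def parse_paths_line(line: str) -> PathRecord:
    """Parse one line of the paths file: 'u v kind m1,m2,...'."""
    fields = line.split()
    if len(fields) not in (3, 4):
        raise ValueError(f"unexpected number of entries: {line!r}")
    hops = int(fields[2])
    if hops not in (2, 3):
        raise ValueError(f"unexpected dl/ds indicator: {fields[2]!r}")
    # Assumption: the sample list may be absent if every probe was lost.
    raw = fields[3] if len(fields) == 4 else ""
    samples = [float(x) for x in raw.split(",") if x]
    return PathRecord(fields[0], fields[1], hops, samples)


def parse_relay_info_line(line: str) -> RelayInfo:
    """Parse one line of relayInfo: 'fingerprint ip weight flag1,flag2,...'."""
    fingerprint, ip, weight, flags = line.split()
    return RelayInfo(fingerprint, ip, int(weight), flags.split(","))


# Made-up example lines that follow the described layout.
rec = parse_paths_line("17 42 3 51234,50987,52001")
info = parse_relay_info_line("ABCD 192.0.2.1 1200 Running,Stable,Valid")
```

Since the paths file is large, it is probably best read line by line rather than loaded into memory at once.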

Our Pre-Processing

For our research [1], we used the minimum values for dl and ds (of the 100 each) to get one latency estimate du,v for each pair u, v. We further filtered out 90 duplicates, i.e. 90 internal ids that actually mapped to the same fingerprint as some other id, leaving a total of 4102 probed relays. Further filtering was done as described in [1].
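
The minimum-based estimate and the duplicate check can be sketched as follows. This is an illustration under stated assumptions, not our original pre-processing code: records are taken to be (u, v, kind, samples) tuples as described under Dataset Content, the choice to keep the first id seen for a fingerprint is arbitrary, and the further filtering from [1] is not reproduced. All names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Tuple[str, str, int, List[float]]  # (u, v, kind, samples in microseconds)


def estimate_latencies(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Return du,v = min(dl) - min(ds) for every pair with both measurements."""
    best: Dict[Tuple[str, str, int], float] = {}
    for u, v, kind, samples in records:
        if not samples:
            continue  # all probes lost, nothing to take a minimum of
        key = (u, v, kind)
        smallest = min(samples)
        best[key] = min(smallest, best.get(key, smallest))

    estimates: Dict[Tuple[str, str], float] = {}
    for (u, v, kind), d_l in best.items():
        if kind != 3:
            continue
        d_s = best.get((u, v, 2))
        if d_s is not None:
            estimates[(u, v)] = d_l - d_s  # may be negative; not filtered here
    return estimates


def duplicate_ids(id_to_fingerprint: Dict[str, str]) -> Set[str]:
    """Return ids whose fingerprint was already seen under another id."""
    seen: Set[str] = set()
    duplicates: Set[str] = set()
    for relay_id, fingerprint in id_to_fingerprint.items():
        if fingerprint in seen:
            duplicates.add(relay_id)
        else:
            seen.add(fingerprint)
    return duplicates


# Made-up values that only demonstrate the calculation.
demo = estimate_latencies([
    ("1", "2", 3, [50.0, 48.0, 49.0]),
    ("1", "2", 2, [30.0, 31.0]),
])
```

Taking the minimum rather than the mean discards samples inflated by queueing or transient congestion, which is the purpose of repeating each measurement.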

Limitations

Unfortunately, we did not record the timestamp of measurements. Nevertheless, the paths file lists our measurements in chronological order and the measurement "speed" was constant during the ~2 years of measurements. Furthermore, we did not record the fingerprint of the selected entry relays.
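
Because the order is chronological and the measurement rate was constant, an approximate timestamp can be assigned to each line of the paths file by linear interpolation over the line index. The sketch below is only a rough approximation under assumptions: the start and end instants must be supplied by the user (the dataset only gives dates, not times of day), and any pauses in the measurement are ignored. The function name is hypothetical.

```python
from datetime import datetime


def approx_timestamp(index: int, total: int,
                     start: datetime, end: datetime) -> datetime:
    """Linearly interpolate a timestamp for line `index` out of `total` lines.

    index is 0-based; line 0 maps to `start` and the last line to `end`.
    """
    if total < 1 or not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    if total == 1:
        return start
    return start + (end - start) * (index / (total - 1))


# Example with assumed instants (midnight on two arbitrary dates).
middle = approx_timestamp(1, 3, datetime(2020, 1, 1), datetime(2020, 1, 3))
```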

Further note that the long measurement period of ~2 years implies that the estimated latencies do not capture a "snapshot" of the Tor network. For example, if a relay v got an "upgraded" access link during the measurement period, early estimates involving v will be higher than later estimates.

References

[2] Panchenko, Andriy; Renner, Johannes. Path Selection Metrics for Performance-Improved Onion Routing. SAINT 2009. Pages 114–120.

[3] To build custom Tor circuits, we used the Python library stem, which uses the control port of a local Tor client.

Files (3.6 GB)

md5:9a14e64a1ab93d6cabc806c4364f1076

Additional details

Related works

Is supplement to
Conference paper: 10.1145/3465481.3465768 (DOI)
References
Conference paper: 10.1109/SAINT.2009.26 (DOI)
