Drug trade messages on Tor in Finnish, English and Polish
Description
We compared the Finnish, English and Polish drug trade within the Tor network. To do so, we selected three active onion websites, one per language, and compiled a dataset from each. We publish these three datasets here, freely available to the academic community.
1. A Finnish website within the Tor network offers an anonymous chat messenger and chat rooms for the biggest towns in Finland, where most of the messages are associated with the illicit drug trade. Finnish sample / Finnish.zip: 1,500 messages from the Finnish onion website “Tsatti”. We collected these in November and December of 2022. Anonymous chat Tsatti—Finnish discussions on Tor: http://tsattickdplsh2i2xqzlybvreiuppgoqsicmzkrotuudnk7h665ukgid.onion/
2. The Polish forum website “Cebulka” operates anonymously within Tor and facilitates drug trafficking. Polish sample / Polish.zip: 965 advertisement pages from the “Cebulka” onion website in Polish. We collected these pages in January 2023. Anonymous Forum Cebulka: http://cebulka7uxchnbpvmqapg5pfos4ngaxglsktzvha7a5rigndghvadeyd.onion/
3. Established in 2021, “Nemesis” combines a darknet market and a forum. Its products are displayed publicly, and registration is not required. English sample / English.zip: 1,334 pages from the “Nemesis” Market onion website, which offers illegal products. We collected these pages in January 2023. Anonymous Marketplace Nemesis Market: http://nemesis555nchzn2dogee6mlc7xxgeeshqirmh3yzn4lo5cnd4s5a4yd.onion/
| Language and sellers | Unique sellers in the sample | Selling Advertisements |
| English | 223 | 1334 |
| Polish | >200 | 965 |
| Finnish | >200 | >500 |
| Language and contact methods | Wickr | Session | Telegram | |
| English | 20 | 2 | 0 | 13 |
| Polish | 1864 | 698 | 15 | 91 |
| Finnish | 0 | 588 | 35 | 9 |
The datasets are released under the CC BY 4.0 license: you are free to copy, share, redistribute, remix, transform, and build upon the material for any purpose, even commercially, provided that you give attribution and appropriate credit. The uncompressed data contains a “README.md” file, a JSON file with all of the web pages in a machine-readable format, screenshots from the websites, and a “textpages” folder containing readable text versions of the web pages.
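As a starting point for working with a sample, the following minimal Python sketch extracts one archive and loads its machine-readable parts. It relies only on the layout described above. The JSON file name and its internal schema are not stated here, so the sketch simply parses every `.json` file it finds. The function names and the destination directory are illustrative assumptions; consult the bundled “README.md” for the authoritative structure.

```python
import json
import zipfile
from pathlib import Path


def extract_sample(zip_path, dest_dir):
    """Extract a sample archive and return the paths of all extracted files."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(p for p in dest.rglob("*") if p.is_file())


def load_json_files(paths):
    """Parse every .json file in `paths`, keyed by file name.

    The JSON schema is not assumed; whatever json.load returns is stored.
    File names are assumed to be unique within one sample.
    """
    loaded = {}
    for path in paths:
        if path.suffix.lower() == ".json":
            with open(path, encoding="utf-8") as handle:
                loaded[path.name] = json.load(handle)
    return loaded


def read_text_pages(paths, folder_name="textpages"):
    """Return {file name: text} for files located under `folder_name`."""
    pages = {}
    for path in paths:
        if folder_name in path.parts:
            # Undecodable bytes are replaced rather than raising an error.
            pages[path.name] = path.read_text(encoding="utf-8", errors="replace")
    return pages
```

For example, `files = extract_sample("English.zip", "english_sample")` followed by `load_json_files(files)` and `read_text_pages(files)` would give the parsed JSON content and the plain-text pages of the English sample, assuming the archive has been downloaded to the working directory.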
Data collection and curation: Juha Nurmi
Funding: This work was supported by the European Commission under the Horizon Europe funding programme, as part of the project SafeHorizon (Grant Agreement 101168562). Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them. The authors declare no competing interests.
Dates
- Collected: 2022-12-01 (data collection started)
- Collected: 2023-01-31 (data collection ended)