VeriDark SilkRoad1 Authorship Verification Dataset

Manolache, Andrei; Brad, Florin; Barbalau, Antonio; Ionescu, Radu; Popescu, Marius

doi:10.5281/zenodo.6998371

Published July 7, 2022 | Version 0.1.0

Dataset Restricted

1. Bitdefender
2. University of Bucharest

VeriDark (Authorship Verification in the DarkNet) is a benchmark for evaluating authorship analysis methods in a cybersecurity context. It introduces datasets gathered from DarkNet marketplace forums or from DarkNet-related discussions on Reddit. The benchmark contains three datasets for authorship verification and one dataset for authorship identification.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

Due to ethical concerns regarding the potential misuse of our benchmark, we require the following additional information for granting permission to use our datasets:

The name of the person requesting access, together with their affiliations, job title and an e-mail address. If the person holds an institutional e-mail address, we strongly recommend using it instead of a personal e-mail address.
The intended usage for the dataset.
An acknowledgement that the dataset will be strictly used in an ethical manner. Non-ethical uses of the dataset include, but are not limited to:
- using the datasets for language modeling or for training similar generative algorithms.
- building algorithms that could help criminals evade law enforcement organizations.
- building algorithms that have the aim of unmasking undercover law enforcement agents.
- building algorithms that could interfere with the activity of law enforcement agencies.
- building algorithms that could lead to violating any article of the United Nations Universal Declaration of Human Rights.
- building algorithms with the purpose of exposing the identity of reporters, individuals in the political realms, leakers, whistleblowers, dissidents, or other persons who are seeking to express an opinion about what they perceive is a particular injustice in the world, without regard to what that injustice may be.
- building algorithms that can help entities discriminate, or exacerbate bias, against other persons on the basis of race, color, religion, gender, gender expression, age, national origin, familial status, ancestry, culture, disability, political views, sexual orientation, marital status, military status, social status, or other protected characteristics.

We strongly encourage the inclusion of an ethical statement and discussion in any work based on this dataset.

We do not encourage the distribution of the dataset in its current form to any other parties without our consent.

DISCLAIMER: Any personal information provided when requesting access to the dataset will be used only to decide whether access should be granted. We will not disclose your personal data.

Additional details

Is part of: Preprint: 10.48550/arXiv.2207.03477 (DOI)

	All versions	This version
Views	564	563
Downloads	11	11
Data volume	4.3 GB	4.3 GB