Published June 27, 2024 | Version v1
Dataset Open

Counter DataSet Public | Cloaked Classifiers: Pseudonymization Strategies on Sensitive Classification Tasks (

Description

Cloaked Classifiers: Pseudonymization Strategies on Sensitive Classification Tasks (Counter DataSet)

Official repository of the Counter DataSet, the pseudonymized dataset for Radicalization Detection with Named Entity Recognition annotations. You can read the paper here

Annotated examples for every language are available in the folder 'Examples'.

WARNING: The datasets contain content that is racist, sexist, homophobic, and offensive in many other ways.

The training and test sets are available on request by filling in this form; an email notification will be sent with instructions and details about how to download the data.

Please cite our paper in any published work that uses any of these resources.

@inproceedings{,
  title = {Cloaked Classifiers: Pseudonymization Strategies on Sensitive Classification Tasks},
  author = {Arij Riabi and Menel Mahamdi and Virginie Mouilleron and Djamé Seddah},
  booktitle = {Proceedings of the Fifth Workshop on Privacy in Natural Language Processing},
  year = {2024},
  location = {Bangkok, Thailand},
  }

Contact

If you have any questions, please contact djame dot seddah at inria dot fr or arij dot riabi at inria dot fr.

Maintainers: djame dot seddah at inria dot fr, arijriabi96 at gmail dot com

https://counter-project.eu/

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101021607. The contents of this website are the sole responsibility of the CounteR consortium and can in no way be taken to reflect the views of the European Union.

Files

counter-dataset-public-main.zip

Files (7.1 kB)

counter-dataset-public-main.zip (7.1 kB)
md5:b98f35e0a5d64d5c600433c2d9d49597
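
The MD5 checksum listed above can be used to check that the archive downloaded intact. The following is a minimal sketch, assuming the archive was saved under its listed name in the current working directory. The helper names `md5_of_file` and `matches_expected` are illustrative and not part of the dataset or its repository.

```python
import hashlib
from pathlib import Path

# Checksum and file name as listed in the Files section of this record.
EXPECTED_MD5 = "b98f35e0a5d64d5c600433c2d9d49597"
# Assumption: the archive sits in the current working directory.
ARCHIVE = Path("counter-dataset-public-main.zip")


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest with the expected one, ignoring case."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    if not ARCHIVE.exists():
        print(f"{ARCHIVE} not found; download it first.")
    elif matches_expected(ARCHIVE):
        print("Checksum OK.")
    else:
        print("Checksum mismatch; the download may be corrupted.")
```

An MD5 match only indicates that the file was not corrupted in transit; it is not a cryptographic guarantee of authenticity.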