Published February 22, 2023 | Version v1
Dataset Open

SceneFake

  • Institute of Automation, Chinese Academy of Sciences

Description

Many datasets have been designed to further the development of fake audio detection. However, the fake utterances in previous datasets are mostly generated by altering the timbre, prosody, linguistic content, or channel noise of the original audio. These datasets leave out a scenario in which the acoustic scene of an original audio is replaced with a forged one. Such manipulated audio would pose a serious threat to society if misused for malicious purposes, which motivates us to fill this gap. This paper therefore proposes SceneFake, a dataset for scene fake audio detection in which a manipulated utterance is generated by tampering only with the acoustic scene of a real utterance using speech enhancement technologies. The results show that scene fake utterances cannot be detected reliably by baseline models trained on the ASVspoof 2019 dataset. When the models are trained on the SceneFake training set, they perform well on the seen test set but still perform poorly on the unseen test set.

The SceneFake dataset is publicly available. The source code of the baselines is available on GitHub: https://github.com/ADDchallenge/SceneFake
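As a rough illustration of the scene-manipulation idea described above, the sketch below suppresses the original acoustic scene and mixes in a different scene recording at a chosen signal-to-noise ratio. The enhancement step is a hypothetical placeholder (the dataset uses dedicated speech enhancement technologies), and all file names and the SNR value are assumptions for illustration only.

```python
import numpy as np
import soundfile as sf

def enhance(speech: np.ndarray, sr: int) -> np.ndarray:
    # Placeholder for a speech-enhancement model that suppresses the original
    # acoustic scene; here it is an identity function for illustration only.
    return speech

def mix_at_snr(speech: np.ndarray, scene: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix a scene recording into speech at the given SNR (in dB)."""
    scene = np.resize(scene, speech.shape)           # loop/trim scene to match length
    p_speech = np.mean(speech ** 2) + 1e-12
    p_scene = np.mean(scene ** 2) + 1e-12
    gain = np.sqrt(p_speech / (p_scene * 10 ** (snr_db / 10.0)))
    mixed = speech + gain * scene
    return mixed / max(1.0, float(np.max(np.abs(mixed))))  # avoid clipping

# Illustrative file names (not part of the dataset release).
speech, sr = sf.read("real_utterance.wav")           # original utterance (scene A)
scene_b, _ = sf.read("target_scene.wav")             # forged acoustic scene (scene B)

clean = enhance(speech, sr)                          # remove original scene
fake = mix_at_snr(clean, scene_b, snr_db=10.0)       # impose the forged scene
sf.write("scenefake_utterance.wav", fake, sr)
```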

This dataset is licensed under a CC BY-NC-ND 4.0 license.

Files (5.8 GB)

SceneFake.zip (5.8 GB)
md5:35a44681ef1300d795e417c3269a710a
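
After downloading, the archive can be checked against the MD5 checksum listed above; this is a minimal sketch assuming the archive sits in the current directory.

```python
import hashlib

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "35a44681ef1300d795e417c3269a710a"
assert md5sum("SceneFake.zip") == expected, "checksum mismatch: re-download the archive"
```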