Published June 18, 2019 | Version v1
Dataset Open

SimSceneTVB Perception

  • 1. LS2N, UMR CNRS 6004, Ecole Centrale de Nantes
  • 2. IFSTTAR, CEREMA, UMRAE
  • 3. ETIS, UMR CNRS 8051, University of Paris Seine, University of Cergy-Pontoise, ENSEA

Description

This is a corpus of 100 sound scenes of 45s each representing urban sound environments, including:

  • 6 scenes recorded in Paris,
  • 19 scenes simulated using simScene (https://bitbucket.org/mlagrange/simscene) to replicate recorded scenarios, including the 6 recordings in this corpus,
  • 75 scenes simulated using simScene with diverse new scenarios, containing traffic, human voices and bird sources.

The base audio files used for simulation are obtained from Freesound (https://freesound.org) and LibriSpeech (http://www.openslr.org/12).

This corpus has been evaluated by a panel of participants in a listening experiment, with assessments on the following 0-10 Likert scales:

  1. Pleasantness: Unpleasant - Pleasant,
  2. Liveliness: Inert, amorphous - Lively, eventful,
  3. Overall loudness: Quiet - Noisy,
  4. Interest: Boring, uninteresting - Stimulating, interesting,
  5. Calmness: Agitated, chaotic - Calm, peaceful,
  6. Sound level of passing vehicles: Very low - Very high,
  7. Time of presence of traffic: Never - Continuously,
  8. Time of presence of voices: Never - Continuously,
  9. Time of presence of birds: Never - Continuously.

Assessments from 23 subjects are available for the 6 recorded and 19 replicated scenes, and from 7 to 8 subjects for the 75 scenes simulated with new scenarios.

The contents of this dataset are as follows:

  • assessments: contains evaluations by 23 subjects of perceptual scales on the corpus
    • sXX: Folder corresponding to participant XX
      • Pt_.txt: Contains assessments. Each line corresponds to one scene. Columns give, in order: scene number (see audio_list.txt for the correspondence), pleasantness, liveliness, overall loudness, interest, calmness, sound level of passing vehicles, and time of presence of traffic, of human voices, and of bird sources.
  • audio
    • rec: contains the 6 recorded 45s scenes
    • rep: contains the 19 replicated 45s scenes, with separated tracks for source contributions
    • sim: contains the 75 simulated 45s scenes, with separated tracks for source contributions
  • audio_list.txt: list of audio files in the corpus; line order corresponds to the scene numbers in the first column of the assessment files
  • rep_exp.mat, sim_exp.mat: Additional information about the corpus, with playback Leq and physical estimations of the perceptual time of presence of sources for each scene.
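
The column layout of the assessment files described above can be read with a few lines of Python. This is only a minimal sketch: it assumes the columns are whitespace-separated numbers, which the description does not state, and the column labels and function names (parse_assessments, mean_by_scene) are hypothetical shorthand chosen for this example.

```python
from statistics import mean

# Labels are our own shorthand; only the column ORDER comes from the description.
COLUMNS = [
    "scene", "pleasantness", "liveliness", "overall_loudness", "interest",
    "calmness", "vehicle_level", "traffic_presence", "voice_presence",
    "bird_presence",
]


def parse_assessments(text):
    """Parse the text of one participant file into a list of dicts.

    Assumption: one scene per line, 10 whitespace-separated numeric columns.
    """
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(COLUMNS)} columns, "
                f"got {len(fields)}"
            )
        row = dict(zip(COLUMNS, (float(field) for field in fields)))
        row["scene"] = int(row["scene"])  # scene number indexes audio_list.txt
        rows.append(row)
    return rows


def mean_by_scene(participants, scale):
    """Average one scale over participants, keyed by scene number."""
    per_scene = {}
    for rows in participants:
        for row in rows:
            per_scene.setdefault(row["scene"], []).append(row[scale])
    return {scene: mean(values) for scene, values in per_scene.items()}
```

Since only 7 to 8 subjects rated each of the 75 scenes in sim, grouping by scene number rather than by line position avoids assuming that every participant file covers the same scenes.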

Files

simSceneTVB_perception.zip (785.2 MB)
md5:66cd11cd53829b071bf68aceea138ee4
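
The MD5 checksum listed above can be used to check that the archive downloaded intact. The following is a minimal sketch using Python's standard hashlib module; the function names and the local file path in the usage note are hypothetical.

```python
import hashlib

# Checksum published on this record for simSceneTVB_perception.zip.
EXPECTED_MD5 = "66cd11cd53829b071bf68aceea138ee4"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return file_md5(path) == expected.lower()
```

For example, verify_download("simSceneTVB_perception.zip") should return True for an intact copy saved under that name in the current directory.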