Published December 7, 2023 | Version v1
Dataset Restricted

SLURP-Fr Real

  • 1. Idiap Research Institute

Description

This is the real (human-recorded) test portion of the SLURP-Fr dataset. SLURP-Fr belongs to the data created for the study of interpreter-aided spoken language understanding (SLU) in the paper cited below, which comprises three parts:

  1. SLURP-Fr, an end-to-end SLU dataset based on the French portion of MASSIVE (https://github.com/alexa/massive), containing 16,521 synthetic audio samples created using Google TTS, accompanied by 477 real test samples collected from two French speakers at Idiap.
  2. SLURP-Es, a similar dataset based on the parallel Spanish portion of MASSIVE, containing only synthetic samples.
  3. Spoken Gigaword, a speech summarization dataset generated from Gigaword (https://www.tensorflow.org/datasets/catalog/gigaword), containing 51,385 synthetic audio samples created using Google TTS.

 

Reference

If you use this dataset, please cite the following publication:

He, Mutian, and Philip N. Garner. "The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation." Findings of EMNLP 2023.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

You need to satisfy these conditions in order for this request to be accepted:

We will provide an End-User License Agreement. The use of the dataset is strictly restricted to non-commercial research.

Please provide us with the following information about the authorized signatory (who MUST hold a permanent position):

  • Full name
  • Name of organization
  • Position / job title
  • Academic / email address
  • URL where we can verify the information details

Only valid academic email addresses from the same organization as the signatory are accepted for the online request. All online requests coming from generic email providers such as Gmail will be rejected.

Additional details

Funding

Storytelling and first impressions in face-to-face and algorithm-powered digital interviews (grant no. 197479)
Swiss National Science Foundation