SED-Augmented Adobe Audition Sound Effects dataset (ASFX-SED)

Wu, Yusong; Tsirigotis, Christos; Chen, Ke; Huang, Cheng-Zhi Anna; Courville, Aaron; Nieto, Oriol; Seetharaman, Prem; Salamon, Justin

doi:10.5281/zenodo.15866339

Published July 11, 2025 | Version 1.0.0

1. Université de Montréal
2. Adobe Research
3. Massachusetts Institute of Technology

Overview

This repository contains the SED-Augmented SFX dataset (ASFX-SED) used in the paper [FLAM: Frame-Wise Language-Audio Modeling](https://arxiv.org/abs/2505.05335). The dataset is designed for research on open-set sound event detection and can also be used for event separation and related machine learning tasks.

- **Original Source:** Adobe Audition sound effects dataset (https://www.adobe.com/products/audition/offers/adobeauditiondlcsfx.html). The same source dataset has also been used in other audio research (e.g. https://arxiv.org/abs/2308.09089).
- **Format:** Parquet (tabular metadata) + JSON (per-sample metadata) + WAV (audio files)
- **License:** ADOBE RESEARCH LICENSE (see LICENSE.md)

Dataset Structure

```
├── asfx_sed_metadata.parquet # Metadata (Parquet)
├── asfx_sed/ # Dataset folder
│ ├── 0000000.json # Per-sample metadata (JSON)
│ ├── 0000000_mix.wav # Mixed audio
│ ├── 0000000_event_0.wav # Event audio
│ └── ...
```

All audio files are mono with a 48 kHz sample rate.

Parquet File (`asfx_sed_metadata.parquet`)

Each row corresponds to a single audio sample. The following fields are included:

- `events` (list): List of event dicts before RMS relabeling (see below)
- `background` (dict): Background audio metadata
- `background_caption` (str): Description of the background audio
- `events_loudness` (list): Loudness values for each event (in dB) before RMS relabeling
- `events_caption` (list): Caption for each event
- `events_ucs_category` (list): UCS category for each event (https://universalcategorysystem.com/)
- `events_caption_range` (list): Start and end times for each event occurrence, in seconds
- `events_id` (list): Event IDs
- `id` (str): Unique sample ID for mixture
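For frame-level training targets, the `events_caption` and `events_caption_range` fields can be turned into a binary activity matrix. The sketch below assumes that `events_caption_range[i]` is a `[start, end]` pair in seconds aligned with `events_caption[i]`; the exact nesting in the released files should be checked before relying on it. The function name `frame_labels` and the `frame_hop` parameter are illustrative and not part of the dataset.

```python
import numpy as np

def frame_labels(captions, ranges, duration, frame_hop=0.25):
    """Return (unique_captions, labels); labels[k, t] is 1 when caption k is active in frame t.

    Assumption: ranges[i] is a (start, end) pair in seconds for captions[i].
    """
    n_frames = int(np.ceil(duration / frame_hop))
    unique = sorted(set(captions))
    index = {caption: k for k, caption in enumerate(unique)}
    labels = np.zeros((len(unique), n_frames), dtype=np.int8)
    for caption, (start, end) in zip(captions, ranges):
        first = int(np.floor(start / frame_hop))          # first frame touched by the event
        last = min(int(np.ceil(end / frame_hop)), n_frames)  # one past the last frame
        labels[index[caption], first:last] = 1
    return unique, labels
```

Captions that occur more than once in a mixture share one row, so repeated occurrences of the same sound accumulate in a single activity track.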

RMS relabeling:

During dataset synthesis, we analyze the RMS (root mean square) energy of each event to identify and relabel silent segments as negative examples. As a result, a single original event may be split into two or more events after relabeling. The `events` and `events_loudness` fields describe each event before RMS relabeling, while `events_caption`, `events_ucs_category`, `events_caption_range`, and `events_id` describe each event after relabeling. If an event is split into multiple segments, the lists in the latter four fields are longer than those in the former two.
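To illustrate how an RMS analysis can split one event into several, here is a minimal sketch. It is not the procedure used to build the dataset: the frame length, the non-overlapping framing, and the `threshold_db` value are all assumptions chosen for illustration, as is the function name `active_segments`.

```python
import numpy as np

def active_segments(wave, sample_rate, frame_len=1024, threshold_db=-40.0):
    """Return (start, end) segments in seconds whose frame RMS exceeds threshold_db (dBFS).

    Assumptions: mono float waveform, non-overlapping frames, fixed threshold.
    """
    n_frames = len(wave) // frame_len
    frames = np.asarray(wave[: n_frames * frame_len], dtype=np.float64)
    frames = frames.reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    rms_db = 20.0 * np.log10(np.maximum(rms, 1e-10))  # floor avoids log(0)
    active = rms_db > threshold_db

    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i                                  # a non-silent run begins
        elif not is_active and start is not None:
            segments.append((start * frame_len / sample_rate,
                             i * frame_len / sample_rate))
            start = None                               # the run ended at a silent frame
    if start is not None:
        segments.append((start * frame_len / sample_rate,
                         n_frames * frame_len / sample_rate))
    return segments
```

An event with a silent gap in the middle yields two segments here, which mirrors why the post-relabeling lists can be longer than `events`.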

Example of an `events` entry (list of dicts):

```
[
  {
    "id": "...",
    "sample_rate": 48000,
    "wav": "...wav",
    "duration": 1.23,
    "caption": "...",
    "ucs_category": "...",
    "start_time": 0.0,
    "end_time": 1.23
  },
  ...
]
```

Example of a `background` entry (dict):

```
{
  "id": "...",
  "sample_rate": 48000,
  "wav": "...wav",
  "duration": 90.1,
  "caption": "...",
  "ucs_category": "..."
}
```

JSON Files

Each JSON file in `asfx_sed/` holds the same fields as the corresponding row of the Parquet file. The audio files for that sample (the mixture and the individual events) are in the same folder.
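The files that belong to one sample can be collected from the sample ID. The sketch below assumes the naming pattern shown in the directory listing (`<id>.json`, `<id>_mix.wav`, `<id>_event_<n>.wav`) and that the event suffix is an integer index; `sample_files` is a hypothetical helper, not part of the release.

```python
from pathlib import Path

def sample_files(root, sample_id):
    """Return the JSON path, mixture path, and numerically sorted event paths for one sample."""
    root = Path(root)
    events = sorted(
        root.glob(f"{sample_id}_event_*.wav"),
        # Sort by the integer suffix so that event_10 comes after event_2.
        key=lambda p: int(p.stem.rsplit("_", 1)[1]),
    )
    return {
        "json": root / f"{sample_id}.json",
        "mix": root / f"{sample_id}_mix.wav",
        "events": events,
    }
```

For example, `sample_files('asfx_sed', '0000000')` would return the paths for the first sample in the listing.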

Usage Example

Loading the Parquet Metadata

```python
import pandas as pd
metadata = pd.read_parquet('asfx_sed_metadata.parquet')
print(metadata.head())
```

Accessing Audio and JSON

```python
import json

# Load the per-sample metadata for the first mixture.
with open('asfx_sed/0000000.json', 'r') as f:
    sample = json.load(f)

print(sample['background_caption'])
```
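The snippet above reads only the JSON metadata. To read the matching audio, one option that needs only the standard library and NumPy is sketched below. It assumes the WAV files are 16-bit PCM, which is not stated here; for other encodings a dedicated audio library such as `soundfile` is a better fit. `read_wav_mono` is an illustrative helper.

```python
import wave
import numpy as np

def read_wav_mono(path):
    """Read a 16-bit PCM mono WAV file; return (float32 samples in [-1, 1), sample_rate)."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # Interpret the bytes as little-endian int16 and scale to floats.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
    return samples, sample_rate

# Usage (path taken from the directory listing):
# audio, sr = read_wav_mono('asfx_sed/0000000_mix.wav')
```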

PyTorch DataLoader Example

A simple PyTorch `Dataset` and `DataLoader` for this dataset are provided in `dataloader_example.py`.

Example Usage

```python
from dataloader_example import ASFX_SED_Dataset
from torch.utils.data import DataLoader

dataset = ASFX_SED_Dataset(
    parquet_path='asfx_sed_metadata.parquet',
    audio_dir='asfx_sed/'
)
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)

# Inspect the first batch only.
for batch in dataloader:
    print(batch['id'])
    print(batch['background_caption'])
    # batch['audio'] is a list of numpy arrays (waveforms)
    break
```

See the code and comments in `dataloader_example.py` for details on how to customize loading, audio processing, and batching.
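One common customization is padding waveforms of different lengths into a single array per batch. The sketch below assumes each dataset item is a dict with an `id` string and a 1-D `audio` array, matching the batch keys used in the example above; whether `dataloader_example.py` returns items in exactly this form should be checked. `pad_collate` is a hypothetical name.

```python
import numpy as np

def pad_collate(items):
    """Zero-pad every item's 'audio' to the longest waveform in the batch."""
    lengths = [len(item["audio"]) for item in items]
    audio = np.zeros((len(items), max(lengths)), dtype=np.float32)
    for row, item in enumerate(items):
        audio[row, : lengths[row]] = item["audio"]
    return {
        "id": [item["id"] for item in items],
        "audio": audio,                  # shape: (batch, max_length)
        "lengths": np.array(lengths),    # original lengths, useful for masking
    }
```

A function like this can be passed to `DataLoader` through its `collate_fn` argument; converting the padded array to a tensor is left to the training code.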

Citation

If you use this dataset in your research or find it helpful, please cite the following paper:

```
@inproceedings{wu2025flam,
  title={{FLAM}: Frame-Wise Language-Audio Modeling},
  author={Yusong Wu and Christos Tsirigotis and Ke Chen and Cheng-Zhi Anna Huang and
          Aaron Courville and Oriol Nieto and Prem Seetharaman and Justin Salamon},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
}
```

---

**Contact:** Yusong Wu (wu.yusong@mila.quebec), Justin Salamon (salamon@adobe.com)


Files (15.8 GB)

Name	MD5	Size
asfx_sed.tar.gz	bcd40f74c0d4cd7cb7a62e991d1e31a3	15.8 GB
asfx_sed_metadata.parquet	c35fa08053a526604e21e9eb6cf52fb9	4.3 MB
dataloader_example.py	1950dbb5ffc9cabdd866865f666b4d60	3.3 kB
LICENSE.md	44f444e2d3d55cb3cbc9ec8a8511e278	2.3 kB
README.md	4f5cbb1639c21f2f98ca3a20800b6ad0	5.2 kB
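The MD5 checksums in the file listing can be used to confirm that a download completed without corruption. A minimal sketch using only the standard library follows; the helper names `md5sum` and `is_intact` are illustrative, and the two checksums are copied from the listing.

```python
import hashlib

# Checksums copied from the file listing.
EXPECTED_MD5 = {
    "asfx_sed.tar.gz": "bcd40f74c0d4cd7cb7a62e991d1e31a3",
    "asfx_sed_metadata.parquet": "c35fa08053a526604e21e9eb6cf52fb9",
}

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_intact(path, expected_md5):
    """Return True when the file's MD5 matches the expected hex digest."""
    return md5sum(path) == expected_md5.lower()

# Usage, once the archive has been downloaded to the working directory:
# is_intact("asfx_sed.tar.gz", EXPECTED_MD5["asfx_sed.tar.gz"])
```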

Additional details

arXiv: arXiv:2505.05335

Is described by: Conference paper: arXiv:2505.05335 (arXiv)

Resource type: Dataset
Publisher: Zenodo
Languages: English

License: ADOBE RESEARCH LICENSE

Copyright 2025, Adobe Inc. and its licensors. All rights reserved.

ADOBE RESEARCH LICENSE

Adobe grants any person or entity ("you" or "your") obtaining a copy of these certain research materials that are owned by Adobe ("Licensed Materials") a nonexclusive, worldwide, royalty-free, revocable, fully paid license to (A) reproduce, use, modify, and publicly display the Licensed Materials; and (B) redistribute the Licensed Materials, and modifications or derivative works thereof, provided the following conditions are met:

- The rights granted herein may be exercised for noncommercial research purposes (i.e., academic research and teaching) only. Noncommercial research purposes do not include commercial licensing or distribution, development of commercial products, or any other activity that results in commercial gain.
- You may add your own copyright statement to your modifications and/or provide additional or different license terms for use, reproduction, modification, public display, and redistribution of your modifications and derivative works, provided that such license terms limit the use, reproduction, modification, public display, and redistribution of such modifications and derivative works to noncommercial research purposes only.
- You acknowledge that Adobe and its licensors own all right, title, and interest in the Licensed Materials.
- All copies of the Licensed Materials must include the above copyright notice, this list of conditions, and the disclaimer below.

Failure to meet any of the above conditions will automatically terminate the rights granted herein.

THE LICENSED MATERIALS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. THE ENTIRE RISK AS TO THE USE, RESULTS, AND PERFORMANCE OF THE LICENSED MATERIALS IS ASSUMED BY YOU. ADOBE DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, WITH REGARD TO YOUR USE OF THE LICENSED MATERIALS, INCLUDING, BUT NOT LIMITED TO, NONINFRINGEMENT OF THIRD-PARTY RIGHTS.

IN NO EVENT WILL ADOBE BE LIABLE FOR ANY ACTUAL, INCIDENTAL, SPECIAL OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION, LOSS OF PROFITS OR OTHER COMMERCIAL LOSS, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE LICENSED MATERIALS, EVEN IF ADOBE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

Technical metadata

Created: July 12, 2025
Modified: July 12, 2025
