Published July 8, 2023 | Version v1
Dataset Open

Annotation-free Audio-Visual Segmentation

  • 1. Shanghai Jiao Tong University

Description

## AVS-Synthetic Dataset
**********
### Updated 2023-08-22
1. The paper [`Annotation-free Audio-Visual Segmentation`](https://arxiv.org/abs/2305.11019v3), which introduces this dataset, has been accepted to WACV 2024. The project page is [https://jinxiang-liu.github.io/anno-free-AVS/](https://jinxiang-liu.github.io/anno-free-AVS/).

2. The code is released at [https://github.com/jinxiang-liu/anno-free-AVS](https://github.com/jinxiang-liu/anno-free-AVS).

3. For technical reasons, some audio clips (for training) were missing from the original `audios.zip` file. If you downloaded the dataset before August 22, 2023, please re-download `audios.zip` to replace the original one; otherwise, ignore this message and download the dataset as usual.

4. If you have any problems, feel free to contact `jinxliu#sjtu.edu.cn` (replace `#` with `@`). 

**********
- Note: the dataset corresponds to the arXiv paper https://arxiv.org/abs/2305.11019v3 .


 
- The `images` and `masks` folders provide the image-mask pairs from LVIS and OpenImages.

- The `audios` folder contains 3-second audio clips from VGGSound; please use the center 1-second sub-clip for training and evaluation. The pickle file `category_for_vggsound_audios.pkl` gives the labels of the audio clips. These labels correspond to the `cls_id` column in the `annotations.csv` file used for model training.

- The `annotations.csv` file provides the annotations for the training, validation, and test samples. For the training samples, we do not specify the audio. In practice, in each epoch, randomly sample a VGGSound audio clip with the matching `cls_id` to compose the (image, mask, audio) triplet. For the validation and test sets, we designate a VGGSound audio sample for each image-mask pair.
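
The center-crop and per-epoch sampling steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation (see the released code for that): it assumes the pickle file deserializes to a mapping from audio file name to `cls_id`, which is not confirmed here, and the helper names, file names, and the 16 kHz sample rate are hypothetical.

```python
import random
from collections import defaultdict


def center_crop(waveform, sample_rate, seconds=1.0):
    """Return the centered sub-clip of `seconds` length from a 1-D sample sequence."""
    n = int(round(seconds * sample_rate))
    if len(waveform) <= n:
        return waveform
    start = (len(waveform) - n) // 2
    return waveform[start:start + n]


def group_audios_by_class(audio_labels):
    """Invert an assumed {audio_name: cls_id} mapping into {cls_id: [audio_name, ...]}."""
    by_class = defaultdict(list)
    for name, cls_id in audio_labels.items():
        by_class[cls_id].append(name)
    return by_class


def sample_triplet(image, mask, cls_id, by_class, rng=random):
    """Pair an image-mask sample with a randomly chosen audio of the same class."""
    return image, mask, rng.choice(by_class[cls_id])


# Toy stand-in for the labels; with the real data this might instead come from
# pickle.load(open("category_for_vggsound_audios.pkl", "rb")), assuming that
# the file holds a name -> cls_id mapping (an assumption, not verified).
audio_labels = {"a.wav": 0, "b.wav": 0, "c.wav": 1}
by_class = group_audios_by_class(audio_labels)

# Called once per training sample in each epoch, so the audio varies across epochs.
triplet = sample_triplet("img_001.jpg", "img_001.png", 1, by_class)

# A 3-second clip at an assumed 16 kHz sample rate, cropped to its middle second.
clip = center_crop(list(range(48000)), sample_rate=16000)
```

Re-sampling the audio every epoch means one image-mask pair is seen with many different audio clips of its class, while validation and test samples keep the fixed audio designated in `annotations.csv`.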

Files (18.6 GB)

| MD5 | Size |
| --- | --- |
| 99093c7cc909b95bc130b7c2ce9ddc4a | 5.2 MB |
| 7f68890d66f3f5c84cd9e4f35ebed1ba | 3.4 GB |
| bcc7e78b3ece4c26352050fefd4d5f12 | 884.7 kB |
| 46c929be5e949fcbe0e603ea8fb03ca5 | 15.0 GB |
| fb63ec734c8d7f249d07b46a668eec38 | 258.0 MB |
| 56afbca0e8e1d5bbf15d77ea1c864326 | 1.8 kB |