Annotation-free Audio-Visual Segmentation
- 1. Shanghai Jiao Tong University
## AVS-Synthetic Dataset
**********
### Updated 2023-08-22
1. The paper [`Annotation-free Audio-Visual Segmentation`](https://arxiv.org/abs/2305.11019v3), which introduces this dataset, has been accepted to WACV 2024. The project page is [https://jinxiang-liu.github.io/anno-free-AVS/](https://jinxiang-liu.github.io/anno-free-AVS/).
2. The code is released at [https://github.com/jinxiang-liu/anno-free-AVS](https://github.com/jinxiang-liu/anno-free-AVS).
3. For technical reasons, some audio clips (for training) were missing from the original `audios.zip` file. If you downloaded the dataset before August 22, 2023, please re-download `audios.zip` to replace the original one; otherwise, ignore this message and download the dataset as usual.
4. If you have any problems, feel free to contact `jinxliu#sjtu.edu.cn` (replace `#` with `@`).
**********
- Note: this dataset corresponds to the arXiv paper https://arxiv.org/abs/2305.11019v3.
- The `images` and `masks` folders provide the image-mask pairs from LVIS and OpenImages.
- The `audios` folder contains 3-second audio clips from VGGSound; please use the center 1-second sub-clip for training and evaluation. The pickle file `category_for_vggsound_audios.pkl` gives the labels of the audio clips. These labels correspond to the `cls_id` column in the `annotations.csv` file used for model training.
- The `annotations.csv` file provides the annotations for each training, validation, and test sample. For the training samples, we do not specify the audio clips. In practice, in each epoch, randomly sample a VGGSound audio clip with the matching `cls_id` to compose the (image, mask, audio) triplet. For the validation and test sets, we designate a VGGSound audio sample for each image-mask pair.
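
The center-crop and per-epoch audio sampling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`center_crop`, `group_audios_by_label`, `sample_training_audio`) are hypothetical, and the sketch assumes the labels from `category_for_vggsound_audios.pkl` have already been loaded into a plain `{audio_name: label}` dictionary, whose actual layout is not documented here.

```python
import random
from collections import defaultdict


def center_crop(waveform, sample_rate, crop_seconds=1.0):
    """Return the centered slice of `crop_seconds` from a 1-D waveform."""
    crop_len = int(round(crop_seconds * sample_rate))
    if crop_len >= len(waveform):
        return waveform  # clip is already short enough; nothing to trim
    start = (len(waveform) - crop_len) // 2
    return waveform[start:start + crop_len]


def group_audios_by_label(audio_labels):
    """Invert an {audio_name: label} mapping into {label: [audio_name, ...]}.

    The input layout is an assumption made for this sketch.
    """
    by_label = defaultdict(list)
    for name, label in audio_labels.items():
        by_label[label].append(name)
    return dict(by_label)


def sample_training_audio(cls_id, audios_by_label, rng=random):
    """Pick one audio clip at random among those sharing the sample's cls_id."""
    return rng.choice(audios_by_label[cls_id])
```

For a 3-second clip at a hypothetical 16 kHz sample rate (48,000 samples), `center_crop` keeps samples 16,000 through 31,999. Calling `sample_training_audio` once per training sample in every epoch yields a fresh (image, mask, audio) triplet each time, while validation and test samples would instead use the audio designated in `annotations.csv`.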
Files (18.6 GB)

MD5 | Size |
---|---|
99093c7cc909b95bc130b7c2ce9ddc4a | 5.2 MB |
7f68890d66f3f5c84cd9e4f35ebed1ba | 3.4 GB |
bcc7e78b3ece4c26352050fefd4d5f12 | 884.7 kB |
46c929be5e949fcbe0e603ea8fb03ca5 | 15.0 GB |
fb63ec734c8d7f249d07b46a668eec38 | 258.0 MB |
56afbca0e8e1d5bbf15d77ea1c864326 | 1.8 kB |
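
A downloaded file can be checked against the MD5 checksums in this listing. The sketch below is one possible way to do that with Python's standard `hashlib` module. The function names are hypothetical, and because the listing does not pair checksums with file names, the check only tests whether a file's digest appears among the listed values.

```python
import hashlib

# MD5 checksums copied from the file listing of this record.
LISTED_MD5 = {
    "99093c7cc909b95bc130b7c2ce9ddc4a",
    "7f68890d66f3f5c84cd9e4f35ebed1ba",
    "bcc7e78b3ece4c26352050fefd4d5f12",
    "46c929be5e949fcbe0e603ea8fb03ca5",
    "fb63ec734c8d7f249d07b46a668eec38",
    "56afbca0e8e1d5bbf15d77ea1c864326",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 matches any checksum in the listing."""
    return md5_of_file(path) in LISTED_MD5
```

Reading in chunks keeps memory use flat even for the multi-gigabyte archives. A file for which `is_listed` returns `False` is likely truncated or outdated and should be downloaded again.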