Published September 16, 2025 | Version v2
Dataset Open

CLAP features for Audio Moment Retrieval

Description

This page provides CLAP features for the three datasets used in "Language-based Audio Moment Retrieval" [1]:

  • Clotho-Moment
  • UnAV100-subset
  • TUT Sound Events 2017

Raw wav files are also publicly available here.

[1] H. Munakata, T. Nishimura, S. Nakada, T. Komatsu, "Language-based Audio Moment Retrieval", In Proc. ICASSP, 2024.

How to Use

You can train and evaluate audio moment retrieval models with these features using Lighthouse.
See the Lighthouse instructions for details.

  1. Extract the archives with the following commands
    Clotho-Moment:
    for file in clotho-moment_features.tar.part-*.gz; do gunzip "$file"; done
    cat clotho-moment_features.tar.part-* > clotho-moment_features.tar
    tar -xvf clotho-moment_features.tar

    UnAV100-subset, TUT Sound Events 2017:
    tar -xvf unav100-subset_features.tar.gz
    tar -xvf tut2017_features.tar.gz
  2. Set symbolic links in Lighthouse
    ln -s "$(pwd)/features/{dataset_name}" {lighthouse_dir}/features
  3. Train the model
    python training/train.py --model qd_detr --dataset clotho-moment --feature clap

  4. Evaluate the model
    model=qd_detr
    dataset=unav100-subset
    feature=clap
    model_path={lighthouse_dir}/results/qd_detr/clotho-moment/clap/best.ckpt
    eval_split_name=val
    eval_path=data/unav100-subset/unav100-subset_test_release.jsonl

    python training/evaluate.py \
    --model $model \
    --dataset $dataset \
    --feature $feature \
    --model_path $model_path \
    --eval_split_name $eval_split_name \
    --eval_path $eval_path
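
The Clotho-Moment commands in step 1 can be wrapped in one shell function. This is a minimal sketch, not part of Lighthouse: the function name `reassemble_parts` is hypothetical, and it assumes that the part suffixes sort in the correct order under shell glob expansion (as they do for parts produced by `split`).

```shell
# Hypothetical helper: rebuild <prefix>.tar from <prefix>.tar.part-*.gz, then extract it.
# Assumption: part suffixes sort correctly in glob (lexicographic) order.
reassemble_parts() {
    local prefix="$1"
    local part
    # Decompress each gzipped part in place (part-XX.gz -> part-XX).
    for part in "${prefix}".tar.part-*.gz; do
        [ -e "$part" ] || continue   # no match, or parts already decompressed
        gunzip "$part"
    done
    # Concatenate the decompressed parts into a single tar archive.
    cat "${prefix}".tar.part-* > "${prefix}.tar" || return 1
    # Extract the reassembled archive into the current directory.
    tar -xf "${prefix}.tar"
}
```

For example, `reassemble_parts clotho-moment_features` run in the download directory would perform the same three commands as step 1. Parts that were already decompressed are skipped by the loop and still picked up by `cat`.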




Files

Files (22.8 GB)

Checksum                                Size
md5:f766f2c2650f5f192e0718cc9d56f363    992.2 MB
md5:909a021ff8c430ca0fc0153fdbed64c8    992.2 MB
md5:e4dd353a39e981ebf3dc5d1e0c6189d6    992.2 MB
md5:9c6aeb7fb376f5e1ab3d577f2db6f495    992.2 MB
md5:4045e808008d4f0c4dd59b1f78a9a60f    992.2 MB
md5:dbb343f6b3dce264aa8e3c3d7333e36d    992.2 MB
md5:7bb11a0d6f0916adffe1eb1d120eae88    992.2 MB
md5:1cf9f245d9f0090ce8a774da34737328    992.2 MB
md5:7a87934c0e4dfac83461d66efb4cf5b5    992.2 MB
md5:1bd266ea64b6bc225967f416c313a591    992.2 MB
md5:a34f3127d4e379da16bc20b249e5423c    992.2 MB
md5:00028b81c2293bff21093c2bd20f6d47    992.2 MB
md5:1b54bec9aa984a97bbea6bb2c283beee    992.2 MB
md5:86e99130f54c431c56e4112c689e933c    992.2 MB
md5:fbd020afab983003fe3e357dd65f8a56    992.2 MB
md5:6a2f1ece61478a0b942b233b5545ae02    992.2 MB
md5:976451dfb9c656190679975098844a0a    992.2 MB
md5:403bd6d1a27d84ce166976105121cece    992.2 MB
md5:022ab4011f48345e299acee133a7aa97    992.2 MB
md5:3213346d30a0f6a06003946fc01eeebf    992.2 MB
md5:c9b6902ffd8b48e19ba2881c1e0e0fe9    991.8 MB
md5:021ea1cdc56f10f57f852af0f606e2da    984.6 MB
md5:2c30318b30890b0a97d37d62d170f0a9    923.8 MB
md5:4d9fe3abef550c5f16495080fca172cf    7.7 MB
md5:96f0e63d3571c8145d768c723f949a9c    13.3 MB
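
The MD5 checksums listed above can be used to check each download before extraction. The following is a minimal sketch under stated assumptions: the function name `verify_md5` is hypothetical, and it assumes GNU coreutils `md5sum` is installed (on macOS, `md5 -q` is the usual equivalent). The file names are not shown in the listing above, so pair each checksum with the file name given on the download page.

```shell
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Assumption: GNU coreutils `md5sum` is available on PATH.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

Call it as `verify_md5 <downloaded_file> <checksum>`, passing the checksum without the `md5:` prefix. A non-zero return status indicates a corrupted or incomplete download.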

Additional details

Related works

Is derived from
Dataset: 10.5281/zenodo.3490684 (DOI)