Published September 16, 2024 | Version 1.0
Dataset Open

WaivOps EDM-HSE: Open Audio Resources for Machine Learning in Music

Description

EDM-HSE Dataset

EDM-HSE is an open audio dataset of code-generated drum recordings in the style of modern electronic house music. It includes 8,000 audio loops in uncompressed stereo WAV format, created from custom audio samples and a MIDI drum dataset. Each loop is paired with a JSON file containing MIDI note numbers (pitch) and tempo data, intended for supervised training of generative AI audio models.
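
As a minimal sketch of how the paired files might be collected for training, the snippet below walks a folder and matches each WAV file to a JSON file with the same base name. The shared-base-name convention and the `pair_loops` helper are assumptions for illustration; the archive's actual layout and JSON schema are not described here.

```python
import json
from pathlib import Path


def pair_loops(root):
    """Return (wav_path, labels) pairs for WAV files that have a same-named JSON file.

    Assumes each loop "name.wav" sits next to "name.json" (hypothetical layout).
    """
    pairs = []
    for wav_path in sorted(Path(root).rglob("*.wav")):
        json_path = wav_path.with_suffix(".json")
        if not json_path.is_file():
            continue  # skip loops without a label file
        with json_path.open(encoding="utf-8") as fh:
            pairs.append((wav_path, json.load(fh)))
    return pairs
```

Calling `pair_loops("path/to/extracted/dataset")` would return a list that can be fed to a data loader; loops without a label file are skipped rather than raising an error.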

Overview

The EDM-HSE Dataset was developed using an algorithmic framework that generates likely drum notations in the style commonly used by EDM producers. To support supervised training with labeled data, a variational mixing technique was applied to the rendered audio files. This method systematically includes or excludes drum notes, which is intended to help a trained model recognize patterns and relationships between drum instruments and to improve its generalization.
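
The generation framework itself is not published in this description, so the following is only one possible reading of "systematically includes or excludes drum notes": every non-empty subset of the instruments in a pattern is emitted as its own variant. The `variational_mixes` function, the 16-step grid, and the note numbers in the example are all hypothetical.

```python
from itertools import combinations


def variational_mixes(pattern, min_instruments=1):
    """Yield sub-patterns that keep only a subset of the instruments.

    pattern maps a MIDI note number to the list of grid steps where it plays.
    """
    notes = sorted(pattern)
    for size in range(min_instruments, len(notes) + 1):
        for kept in combinations(notes, size):
            yield {note: pattern[note] for note in kept}


# Hypothetical one-bar pattern on a 16-step grid (illustrative note numbers).
pattern = {36: [0, 4, 8, 12], 38: [4, 12], 42: [2, 6, 10, 14]}
mixes = list(variational_mixes(pattern))
```

With three instruments this produces seven variants, from single-instrument loops up to the full pattern. Whether the dataset excludes whole instruments or individual notes is not stated, so treat this as a sketch rather than the authors' method.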

The primary purpose of this dataset is to provide accessible content for machine learning applications in music and audio. Potential use cases include generative music, feature extraction, tempo detection, audio classification, rhythm analysis, drum synthesis, music information retrieval (MIR), sound design and signal processing.

Specifications

  • 8,000 audio loops (approximately 17 hours)
  • 16-bit WAV format
  • Tempo range: 120–130 BPM
  • Paired label data (WAV + JSON)
  • Variational drum patterns
  • Subgenre styles (big room, electro, minimal, classic)
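
The figures above can be sanity-checked with a little arithmetic and the standard-library `wave` module. The sketch below computes the mean loop length implied by 8,000 loops in roughly 17 hours, compares it with the length of a four-bar 4/4 phrase at the stated tempo limits, and reads the header of a single WAV file. The four-bar assumption is an inference, not something the description states, and the sample rate is not listed, so the helper only reports it.

```python
import wave

TOTAL_LOOPS = 8000
TOTAL_HOURS = 17  # approximate figure from the specifications


def mean_loop_seconds(total_hours=TOTAL_HOURS, loops=TOTAL_LOOPS):
    """Average loop duration implied by the total running time."""
    return total_hours * 3600 / loops


def bars_seconds(bpm, bars=4, beats_per_bar=4):
    """Duration of a phrase of `bars` bars at `bpm` (assumes 4/4 by default)."""
    return bars * beats_per_bar * 60 / bpm


def wav_info(path):
    """Read channel count, bit depth, sample rate and duration from a WAV header."""
    with wave.open(str(path), "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "bit_depth": wf.getsampwidth() * 8,
            "sample_rate": wf.getframerate(),
            "seconds": wf.getnframes() / wf.getframerate(),
        }
```

The implied mean of about 7.65 seconds falls between a four-bar phrase at 130 BPM (about 7.38 s) and at 120 BPM (8.0 s), which is consistent with, but does not prove, four-bar loops. For a file matching the specifications, `wav_info` should report 2 channels and a bit depth of 16.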

A JSON file is provided for referencing and converting MIDI note numbers to text labels. You can update the text labels to suit your preferences.
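
A minimal sketch of using such a key map is shown below. It assumes the file is a flat JSON object whose keys are MIDI note numbers (stored as strings, as JSON requires) and whose values are label strings; the real `key_map_drum_note_labels.json` may be structured differently, and the example contents and helper names are hypothetical.

```python
import json


def load_key_map(text):
    """Parse a JSON object of note-number keys into a dict keyed by int."""
    return {int(note): label for note, label in json.loads(text).items()}


def label_notes(notes, key_map, unknown="unknown"):
    """Translate MIDI note numbers to text labels, with a fallback for gaps."""
    return [key_map.get(note, unknown) for note in notes]


# Hypothetical contents standing in for the real key map file.
example = '{"36": "kick", "38": "snare", "42": "closed_hat"}'
key_map = load_key_map(example)

# Labels can be renamed to suit a project's own vocabulary.
key_map[42] = "hi-hat"
```

Converting the keys to integers up front keeps lookups consistent with note numbers read from the per-loop label files, assuming those are stored as numbers.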

License

This dataset was compiled by WaivOps, a crowdsourced music project managed by the sound label company Patchbanks. All recordings were compiled from verified sources to ensure copyright clearance.

The EDM-HSE dataset is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).

Additional Info

Please note that this dataset has not been fully reviewed and may contain minor notational errors or audio defects.

For audio examples or more information about this dataset, please refer to the GitHub repository.

Files

key_map_drum_note_labels.json

Files (7.6 GB)

  • 245.3 kB (md5:4df7113fbc876e2a6f5268496bca8e8d)
  • 7.6 GB (md5:9605e1217e6c1159f23fcb3a820578e7)
  • 1.2 kB (md5:05fa858c94fc3e7b21e618cdb1ee7e24)
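
The MD5 values listed for each file can be checked after download. The sketch below hashes a file in chunks, so the 7.6 GB archive does not need to fit in memory, and compares the result with a listed value; the helper names and any local file path you pass in are assumptions.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed checksum such as 'md5:4df7...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest fix.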