Published August 24, 2020 | Version 0.0.1
Software
Open
popcornell/DeMask_Surgical_mask_speech_enhancement_v1
- 1. Università Politecnica delle Marche
- 2. Université de Lorraine, CNRS, Inria, LORIA
- 3. Inria and LIRMM, University of Montpellier
Description
This model was trained by popcornell using the asteroid/demask recipe in Asteroid. It was trained on the enhancement
task of the Surgical_mask_speech_enhancement_v1 dataset.
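As a rough illustration of how a model published this way is typically used, the sketch below loads it through Asteroid's `from_pretrained` mechanism. It assumes that the `asteroid` and `torch` packages are installed and that the identifier in this record's title resolves as a pretrained-model ID; neither point is confirmed by this page. The helper names `load_demask` and `enhance` are hypothetical.

```python
# Identifier copied from this record's title; assumed to be a valid model ID.
MODEL_ID = "popcornell/DeMask_Surgical_mask_speech_enhancement_v1"


def load_demask(model_id: str = MODEL_ID):
    """Return the pretrained DeMask model (weights are downloaded on first use)."""
    # Imported lazily so this module can be imported without Asteroid installed.
    from asteroid.models import DeMask

    model = DeMask.from_pretrained(model_id)
    model.eval()  # inference mode
    return model


def enhance(model, wav):
    """Run the model on a float tensor of 16 kHz samples shaped (batch, time)."""
    import torch

    with torch.no_grad():
        return model(wav)
```

Input audio should be sampled at 16 kHz to match the `fs` value in the training config.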
Training config:
- positional arguments:
- filterbank:
  - fb_type: stft
  - n_filters: 512
  - kernel_size: 512
  - stride: 256
- demask_net:
  - input_type: mag
  - output_type: mag
  - hidden_dims: [1024]
  - dropout: 0
  - activation: relu
  - mask_act: relu
  - norm_type: gLN
- data:
  - fs: 16000
  - length: 4
- optim:
  - lr: 0.001
  - weight_decay: 1e-05
- training:
  - epochs: 200
  - batch_size: 8
  - gradient_clipping: 5
  - accumulate_batches: 1
  - save_top_k: 10
  - num_workers: 8
  - patience: 30
  - half_lr: True
  - early_stop: True
  - gaussian_mask_noise_snr_dB: np.random.randint(3, 12)
  - white_noise_dB: np.random.randint(-3, 30)
  - speed_augm: np.random.uniform(0.95, 1.05)
  - gain_augm: np.random.randint(-30, -2)
  - n_taps: 97
- main_args:
  - help: None
  - exp_dir: exp/tmp
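To make the numbers in the config above more concrete, here is a small sketch that derives the STFT frame geometry and draws one set of augmentation parameters. It rests on several assumptions: a one-sided spectrum (`n_filters // 2 + 1` bins), no padding when counting frames, and `length` measured in seconds. The standard-library `random` module stands in for NumPy; `randrange(a, b)` excludes `b`, as `np.random.randint(a, b)` does. The function names are hypothetical and not part of the recipe.

```python
import random

FS = 16000         # data.fs, in Hz
SEGMENT_S = 4      # data.length, assumed to be in seconds
KERNEL_SIZE = 512  # filterbank.kernel_size, in samples
STRIDE = 256       # filterbank.stride, in samples
N_FILTERS = 512    # filterbank.n_filters


def stft_geometry():
    """Derive window, hop, bin and frame counts from the config values."""
    n_samples = FS * SEGMENT_S
    return {
        "window_ms": 1000 * KERNEL_SIZE / FS,
        "hop_ms": 1000 * STRIDE / FS,
        # Assumption: one-sided spectrum of a real-valued signal.
        "freq_bins": N_FILTERS // 2 + 1,
        # Assumption: no padding at the segment edges.
        "frames": (n_samples - KERNEL_SIZE) // STRIDE + 1,
    }


def sample_augmentation(rng):
    """Draw one set of augmentation parameters within the configured ranges."""
    return {
        "gaussian_mask_noise_snr_dB": rng.randrange(3, 12),  # integers 3..11
        "white_noise_dB": rng.randrange(-3, 30),             # integers -3..29
        "speed_augm": rng.uniform(0.95, 1.05),
        "gain_augm": rng.randrange(-30, -2),                 # integers -30..-3
    }


if __name__ == "__main__":
    print(stft_geometry())
    print(sample_augmentation(random.Random(0)))
```

Under these assumptions each analysis window spans 32 ms with a 16 ms hop, and the upper bound of every integer range is never drawn.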
Results:
License notice:
This work "DeMask_Surgical_mask_speech_enhancement_v1_enhancement" is a derivative of the LibriSpeech ASR corpus by Vassil Panayotov, used under CC BY 4.0, and of the Free Universal Sound Separation Dataset by Scott Wisdom, Hakan Erdogan, Dan Ellis, and John R. Hershey, used under Creative Commons Attribution 4.0 International. "DeMask_Surgical_mask_speech_enhancement_v1_enhancement" is licensed under Attribution-ShareAlike 3.0 Unported by popcornell.
Files
Name | Size |
---|---|
md5:165f2b25019469aae15e1a2eef3a1173 | 4.2 MB |
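The MD5 checksum listed above can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the file path is left to the caller because this page does not show the file name.

```python
import hashlib

# Checksum copied from the Files table of this record.
EXPECTED_MD5 = "165f2b25019469aae15e1a2eef3a1173"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """Return True if the file at `path` has the checksum listed in this record."""
    return md5_of_file(path) == EXPECTED_MD5
```

MD5 is adequate here for detecting accidental corruption during download; it is not a defense against deliberate tampering.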