Published August 24, 2020 | Version 0.0.1
Software Open

popcornell/DeMask_Surgical_mask_speech_enhancement_v1

  • 1. Università Politecnica delle Marche
  • 2. Université de Lorraine, CNRS, Inria, LORIA
  • 3. Inria and LIRMM, University of Montpellier

Description

This model was trained by popcornell using the asteroid/demask recipe in Asteroid. It was trained on the enhancement task of the Surgical_mask_speech_enhancement_v1 dataset.
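As a minimal sketch of how such a checkpoint might be used, the snippet below assumes that Asteroid and PyTorch are installed and that the record name shown on this page is accepted as an identifier by Asteroid's `from_pretrained` loader. The helper names `load_demask` and `enhance` are hypothetical and exist only for illustration.

```python
# Assumption: this identifier is the record name from this page; whether it
# resolves depends on the installed Asteroid version and on network access.
MODEL_ID = "popcornell/DeMask_Surgical_mask_speech_enhancement_v1"
SAMPLE_RATE = 16000  # the model was trained on 16 kHz audio (data.fs)


def load_demask(model_id=MODEL_ID):
    """Load the pretrained DeMask model (hypothetical helper)."""
    # Imported lazily so the module can be read without Asteroid installed.
    from asteroid.models import DeMask

    return DeMask.from_pretrained(model_id)


def enhance(model, waveform):
    """Run the model on a (batch, time) float tensor sampled at 16 kHz."""
    import torch

    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        return model(waveform)
```

A typical call would be `enhance(load_demask(), wav)`, where `wav` is a float tensor of shape `(batch, time)`. Audio at another sample rate would presumably need resampling to 16 kHz first.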

 

 

Training config:

  • positional arguments:
  • filterbank:
    • fb_type: stft
    • n_filters: 512
    • kernel_size: 512
    • stride: 256
  • demask_net:
    • input_type: mag
    • output_type: mag
    • hidden_dims: [1024]
    • dropout: 0
    • activation: relu
    • mask_act: relu
    • norm_type: gLN
  • data:
    • fs: 16000
    • length: 4
  • optim:
    • lr: 0.001
    • weight_decay: 1e-05
  • training:
    • epochs: 200
    • batch_size: 8
    • gradient_clipping: 5
    • accumulate_batches: 1
    • save_top_k: 10
    • num_workers: 8
    • patience: 30
    • half_lr: True
    • early_stop: True
    • gaussian_mask_noise_snr_dB: np.random.randint(3, 12)
    • white_noise_dB: np.random.randint(-3, 30)
    • speed_augm: np.random.uniform(0.95, 1.05)
    • gain_augm: np.random.randint(-30, -2)
    • n_taps: 97
  • main_args:
    • help: None
    • exp_dir: exp/tmp
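
To make the numbers in the config above easier to read, the following sketch derives the STFT geometry implied by the filterbank and data settings, and draws one set of augmentation values from the listed ranges. It assumes that `length` is in seconds, that no padding is applied before framing, and that the spectrum is one-sided. The function names are hypothetical, and the standard-library `random` module stands in for the `np.random` calls in the config; it is not the recipe's own code.

```python
import random

# Values copied from the training config above.
FS = 16000         # data.fs, in Hz
SEGMENT_S = 4      # data.length, assumed to be in seconds
N_FILTERS = 512    # filterbank.n_filters
KERNEL_SIZE = 512  # filterbank.kernel_size, in samples
STRIDE = 256       # filterbank.stride, in samples


def stft_geometry(fs=FS, segment_s=SEGMENT_S, n_filters=N_FILTERS,
                  kernel_size=KERNEL_SIZE, stride=STRIDE):
    """Return window/hop durations and spectrogram shape (no padding)."""
    n_samples = fs * segment_s
    return {
        "window_ms": 1000 * kernel_size / fs,
        "hop_ms": 1000 * stride / fs,
        "overlap": 1 - stride / kernel_size,
        # One-sided spectrum of a real signal: n_filters // 2 + 1 bins.
        "n_bins": n_filters // 2 + 1,
        # Number of full windows that fit in the segment.
        "n_frames": 1 + (n_samples - kernel_size) // stride,
    }


def draw_augmentation(rng=random):
    """Draw one value per augmentation from the ranges in the config.

    randrange(a, b) excludes b, like np.random.randint(a, b).
    """
    return {
        "gaussian_mask_noise_snr_dB": rng.randrange(3, 12),
        "white_noise_dB": rng.randrange(-3, 30),
        "speed_augm": rng.uniform(0.95, 1.05),
        "gain_augm": rng.randrange(-30, -2),
    }


if __name__ == "__main__":
    print(stft_geometry())
    print(draw_augmentation())
```

Under these assumptions, each training segment is analysed with 32 ms windows and a 16 ms hop (50% overlap). The actual frame count in Asteroid may differ if its STFT front end pads the signal.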

 

Results:

 

License notice:

This work "DeMask_Surgical_mask_speech_enhancement_v1_enhancement" is a derivative of the LibriSpeech ASR corpus by Vassil Panayotov, used under CC BY 4.0, and of the Free Universal Sound Separation Dataset by Scott Wisdom, Hakan Erdogan, Dan Ellis and John R. Hershey, used under Creative Commons Attribution 4.0 International. "DeMask_Surgical_mask_speech_enhancement_v1_enhancement" is licensed under Attribution-ShareAlike 3.0 Unported by popcornell.

Files (4.2 MB)

md5:165f2b25019469aae15e1a2eef3a1173 (4.2 MB)