Other Open Access
Chenda Li
{ "publisher": "Zenodo", "DOI": "10.5281/zenodo.4498562", "author": [ { "family": "Chenda Li" } ], "issued": { "date-parts": [ [ 2021, 2, 4 ] ] }, "abstract": "<p>This model was trained by Chenda Li using the wsj0_2mix recipe in <a href=\"https://github.com/espnet/espnet/\">espnet</a>.</p>\n\n<p> </p>\n\n<ul>\n\t<li><strong>Python API</strong>\n\n\t<pre><code class=\"language-python\">See https://github.com/espnet/espnet_model_zoo</code></pre>\n\t</li>\n\t<li><strong>Evaluate in the recipe</strong>\n\t<pre><code class=\"language-bash\">git clone https://github.com/espnet/espnet\ncd espnet\ngit checkout a3334220b0352931677946d178fade3313cf82bb\npip install -e .\ncd egs2/wsj0_2mix/enh1\n./run.sh --skip_data_prep false --skip_train true --download_model \"Chenda Li/wsj0_2mix_enh_train_enh_conv_tasnet_raw_valid.si_snr.ave\"</code>\n</pre>\n\t</li>\n\t<li><strong>Results</strong>\n\t<pre><code>\n# RESULTS\n## Environments\n- date: `Thu Feb 4 01:16:18 CST 2021`\n- python version: `3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]`\n- espnet version: `espnet 0.9.7`\n- pytorch version: `pytorch 1.5.0`\n- Git hash: `a3334220b0352931677946d178fade3313cf82bb`\n - Commit date: `Fri Jan 29 23:35:47 2021 +0800`\n\n\n## enh_train_enh_conv_tasnet_raw\n\nconfig: ./conf/tuning/train_enh_conv_tasnet.yaml\n\n|dataset|STOI|SAR|SDR|SIR|\n|---|---|---|---|---|\n|enhanced_cv_min_8k|0.949205|17.3785|16.8028|26.9785|\n|enhanced_tt_min_8k|0.95349|16.6221|15.9494|25.9032|</code></pre>\n\t</li>\n\t<li><strong>ASR config</strong>\n\t<pre><code>config: ./conf/tuning/train_enh_conv_tasnet.yaml\nprint_config: false\nlog_level: INFO\ndry_run: false\niterator_type: chunk\noutput_dir: exp/enh_train_enh_conv_tasnet_raw\nngpu: 1\nseed: 0\nnum_workers: 4\nnum_att_plot: 3\ndist_backend: nccl\ndist_init_method: env://\ndist_world_size: null\ndist_rank: null\nlocal_rank: 0\ndist_master_addr: null\ndist_master_port: null\ndist_launcher: null\nmultiprocessing_distributed: false\ncudnn_enabled: true\ncudnn_benchmark: false\ncudnn_deterministic: true\ncollect_stats: false\nwrite_collected_feats: false\nmax_epoch: 100\npatience: 4\nval_scheduler_criterion:\n- valid\n- loss\nearly_stopping_criterion:\n- valid\n- loss\n- min\nbest_model_criterion:\n- - valid\n - si_snr\n - max\n- - valid\n - loss\n - min\nkeep_nbest_models: 1\ngrad_clip: 5.0\ngrad_clip_type: 2.0\ngrad_noise: false\naccum_grad: 1\nno_forward_run: false\nresume: true\ntrain_dtype: float32\nuse_amp: false\nlog_interval: null\nunused_parameters: false\nuse_tensorboard: true\nuse_wandb: false\nwandb_project: null\nwandb_id: null\npretrain_path: null\ninit_param: []\nfreeze_param: []\nnum_iters_per_epoch: null\nbatch_size: 8\nvalid_batch_size: null\nbatch_bins: 1000000\nvalid_batch_bins: null\ntrain_shape_file:\n- exp/enh_stats_8k/train/speech_mix_shape\n- exp/enh_stats_8k/train/speech_ref1_shape\n- exp/enh_stats_8k/train/speech_ref2_shape\nvalid_shape_file:\n- exp/enh_stats_8k/valid/speech_mix_shape\n- exp/enh_stats_8k/valid/speech_ref1_shape\n- exp/enh_stats_8k/valid/speech_ref2_shape\nbatch_type: folded\nvalid_batch_type: null\nfold_length:\n- 80000\n- 80000\n- 80000\nsort_in_batch: descending\nsort_batch: descending\nmultiple_iterator: false\nchunk_length: 32000\nchunk_shift_ratio: 0.5\nnum_cache_chunks: 1024\ntrain_data_path_and_name_and_type:\n- - dump/raw/tr_min_8k/wav.scp\n - speech_mix\n - sound\n- - dump/raw/tr_min_8k/spk1.scp\n - speech_ref1\n - sound\n- - dump/raw/tr_min_8k/spk2.scp\n - speech_ref2\n - sound\nvalid_data_path_and_name_and_type:\n- - dump/raw/cv_min_8k/wav.scp\n - speech_mix\n - sound\n- - dump/raw/cv_min_8k/spk1.scp\n - speech_ref1\n - sound\n- - dump/raw/cv_min_8k/spk2.scp\n - speech_ref2\n - sound\nallow_variable_data_keys: false\nmax_cache_size: 0.0\nmax_cache_fd: 32\nvalid_max_cache_size: null\noptim: adam\noptim_conf:\n lr: 0.001\n eps: 1.0e-08\n weight_decay: 0\nscheduler: reducelronplateau\nscheduler_conf:\n mode: min\n factor: 0.5\n patience: 1\ninit: xavier_uniform\nmodel_conf:\n loss_type: si_snr\nuse_preprocessor: false\nencoder: conv\nencoder_conf:\n channel: 256\n kernel_size: 20\n stride: 10\nseparator: tcn\nseparator_conf:\n num_spk: 2\n layer: 8\n stack: 4\n bottleneck_dim: 256\n hidden_dim: 512\n kernel: 3\n causal: false\n norm_type: gLN\n nonlinear: relu\ndecoder: conv\ndecoder_conf:\n channel: 256\n kernel_size: 20\n stride: 10\nrequired:\n- output_dir\nversion: 0.9.7\ndistributed: false</code></pre>\n\t</li>\n</ul>", "title": "ESPnet2 pretrained model, Chenda Li/wsj0_2mix_enh_train_enh_conv_tasnet_raw_valid.si_snr.ave, fs=8k, lang=en", "type": "article", "id": "4498562" }
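The record trains with `loss_type: si_snr` and selects the checkpoint by validation SI-SNR (`valid.si_snr.ave`). As a rough illustration of what that metric measures, here is a minimal pure-Python sketch of scale-invariant SNR under its common definition. It is not the ESPnet implementation: the function name `si_snr`, the `eps` default, and the mean-removal step are assumptions made for this example.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Hypothetical helper for illustration only; not taken from ESPnet.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean from both signals (a common convention, assumed here).
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Whatever is left after removing the target component counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise) + eps

    return 10.0 * math.log10(target_energy / noise_energy + eps)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate by a constant leaves the score essentially unchanged, which is why the metric suits separation models whose output gain is arbitrary.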
| | All versions | This version |
---|---|---|
Views | 321 | 317 |
Downloads | 463 | 463 |
Data volume | 16.3 GB | 16.3 GB |
Unique views | 297 | 295 |
Unique downloads | 356 | 356 |