Software Open Access

Transformers: State-of-the-Art Natural Language Processing

Wolf, Thomas; Debut, Lysandre; Sanh, Victor; Chaumond, Julien; Delangue, Clement; Moi, Anthony; Cistac, Perric; Ma, Clara; Jernite, Yacine; Plu, Julien; Xu, Canwen; Le Scao, Teven; Gugger, Sylvain; Drame, Mariama; Lhoest, Quentin; Rush, Alexander M.

New Model additions

WavLM

WavLM was proposed in WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing by Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei.

WavLM sets a new SOTA on the SUPERB benchmark.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=wavlm

Wav2Vec2Phoneme

Wav2Vec2Phoneme was proposed in Simple and Effective Zero-shot Cross-lingual Phoneme Recognition by Qiantong Xu, Alexei Baevski, Michael Auli. Wav2Vec2Phoneme performs phoneme classification as part of automatic speech recognition.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=phoneme-recognition

UniSpeech-SAT

UniSpeech-SAT was proposed in UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training by Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu.

UniSpeech-SAT is especially good at speaker-related tasks.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=unispeech-sat

UniSpeech

UniSpeech was proposed in UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=unispeech

New Tasks

Speaker Diarization and Verification

Wav2Vec2-like architectures now have speaker diarization and speaker verification heads. You can try out the new task here: https://huggingface.co/spaces/microsoft/wavlm-speaker-verification
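As a rough illustration of what speaker verification does once a model has produced one embedding per utterance, the sketch below compares two embeddings by cosine similarity and accepts them as the same speaker when the score reaches a threshold. The embedding values, the function names, and the 0.85 threshold are all hypothetical and chosen only for illustration; in practice the embeddings would come from a verification head on one of the models above, and the threshold would be tuned on held-out data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.85):
    """Accept the pair as one speaker if similarity reaches the threshold.

    The default threshold is an illustrative assumption, not a value
    taken from the release notes.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Hypothetical 3-dimensional embeddings; real ones are much larger.
enrolled = [0.9, 0.1, 0.4]
probe_same = [0.85, 0.15, 0.38]
probe_other = [-0.2, 0.9, 0.1]

print(same_speaker(enrolled, probe_same))
print(same_speaker(enrolled, probe_other))
```

Keeping the decision rule separate from the model makes it easy to adjust the threshold for a stricter or more lenient trade-off between false accepts and false rejects.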

Full Changelog: https://github.com/huggingface/transformers/compare/v4.14.0...v4.15.0

If you use this software, please cite it using these metadata.
Files (9.7 MB)
Name: huggingface/transformers-v4.15.0.zip
Size: 9.7 MB
md5: 00aeb7dde459eec035b9ade9c20805dd