Transformers: State-of-the-Art Natural Language Processing

doi:10.5281/zenodo.5911363

Published October 1, 2020 | Version v4.16.0



New models

Nyströmformer

The Nyströmformer model was proposed in Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, and Vikas Singh.

The Nyströmformer model overcomes the quadratic complexity of self-attention on the input sequence length by adapting the Nyström method to approximate standard self-attention, enabling longer sequences with thousands of tokens as input.

Add Nystromformer by @novice03 in https://github.com/huggingface/transformers/pull/14659

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=nystromformer
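
The Nyström approximation described above can be illustrated with a small NumPy sketch. This is not the library's implementation: the function names, the segment-mean landmarks, and the use of an exact pseudo-inverse are assumptions made for illustration, following the formulation in the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def full_attention(q, k, v):
    # Standard softmax attention: builds an (n, n) matrix, so cost is quadratic in n.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v


def nystrom_attention(q, k, v, num_landmarks):
    # Landmarks are segment means of queries and keys
    # (illustrative assumption: n is divisible by num_landmarks).
    n, d = q.shape
    q_l = q.reshape(num_landmarks, n // num_landmarks, d).mean(axis=1)
    k_l = k.reshape(num_landmarks, n // num_landmarks, d).mean(axis=1)
    scale = np.sqrt(d)
    f = softmax(q @ k_l.T / scale)    # (n, m)
    a = softmax(q_l @ k_l.T / scale)  # (m, m)
    b = softmax(q_l @ k.T / scale)    # (m, n)
    # Exact pseudo-inverse for clarity; an iterative approximation is another option.
    return f @ np.linalg.pinv(a) @ (b @ v)


rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((64, 16)) for _ in range(3))
approx = nystrom_attention(q, k, v, num_landmarks=8)
exact = full_attention(q, k, v)
print("max abs error:", np.abs(approx - exact).max())
```

No (n, n) matrix is ever formed, so for a fixed number of landmarks m the cost grows linearly with the sequence length n. When m equals n, the sketch reproduces full attention up to floating-point error.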

REALM

The REALM model was proposed in REALM: Retrieval-Augmented Language Model Pre-Training by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.

It is a retrieval-augmented language model that first retrieves documents from a textual knowledge corpus and then uses the retrieved documents to answer questions.

Add REALM by @qqaatw in https://github.com/huggingface/transformers/pull/13292
Add FastTokenizer to REALM by @qqaatw in https://github.com/huggingface/transformers/pull/15211

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=realm

ViTMAE

The ViTMAE model was proposed in Masked Autoencoders Are Scalable Vision Learners by Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick.

The paper shows that, by pre-training a Vision Transformer (ViT) to reconstruct pixel values for masked patches, one can get results after fine-tuning that outperform supervised pre-training.

Add MAE by @NielsRogge in https://github.com/huggingface/transformers/pull/15120

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=vit_mae
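
The masked-patch pre-training idea can be sketched in plain Python. The helper names below are hypothetical, and the 75% mask ratio and the 224x224 image with 16x16 patches are assumptions taken from the paper's default setup, not from the library code.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    # Shuffle patch indices and keep the first (1 - mask_ratio) fraction visible.
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_keep = int(num_patches * (1 - mask_ratio))
    visible = sorted(order[:num_keep])
    masked = sorted(order[num_keep:])
    return visible, masked


def masked_mse(target, prediction, masked):
    # Reconstruction loss averaged over the pixels of masked patches only.
    # Each patch is a flat list of pixel values; `masked` must be non-empty.
    total, count = 0.0, 0
    for i in masked:
        for t, p in zip(target[i], prediction[i]):
            total += (t - p) ** 2
            count += 1
    return total / count


# A 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

Only the visible patches would be fed to the encoder; the decoder then predicts pixel values for the masked ones, and the loss ignores patches the encoder already saw.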

ViLT

The ViLT model was proposed in ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision by Wonjae Kim, Bokyung Son, Ildoo Kim.

ViLT incorporates text embeddings into a Vision Transformer (ViT), allowing it to have a minimal design for Vision-and-Language Pre-training (VLP).

Add ViLT by @NielsRogge in https://github.com/huggingface/transformers/pull/14895

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=vilt

Swin Transformer

The Swin Transformer was proposed in Swin Transformer: Hierarchical Vision Transformer using Shifted Windows by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.

The Swin Transformer serves as a general-purpose backbone for computer vision. The shifted windowing scheme brings greater efficiency by limiting self-attention computation to non-overlapping local windows while also allowing for cross-window connection. This hierarchical architecture has the flexibility to model at various scales and has linear computational complexity with respect to image size.

Add Swin Transformer by @novice03 in https://github.com/huggingface/transformers/pull/15085

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=swin
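
The linear-complexity claim can be checked with the cost formulas given in the Swin Transformer paper. The sketch below is only arithmetic: the function names are hypothetical, and the 56x56 token grid, channel width 96, and window size 7 are assumed example values.

```python
def global_msa_flops(h, w, c):
    # Global self-attention over h*w tokens: quadratic in the token count.
    return 4 * h * w * c**2 + 2 * (h * w) ** 2 * c


def window_msa_flops(h, w, c, m):
    # Self-attention restricted to non-overlapping m x m windows:
    # linear in the token count for a fixed window size m.
    return 4 * h * w * c**2 + 2 * m**2 * h * w * c


for side in (56, 112, 224):
    g = global_msa_flops(side, side, 96)
    wnd = window_msa_flops(side, side, 96, 7)
    print(f"{side}x{side} tokens: global={g:.3e} windowed={wnd:.3e}")
```

Doubling the side of the token grid multiplies the windowed cost by exactly four (the number of tokens), whereas the global-attention cost grows faster because of the squared term.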

YOSO

The YOSO model was proposed in You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling by Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh.

YOSO approximates standard softmax self-attention via a Bernoulli sampling scheme based on Locality Sensitive Hashing (LSH). In principle, all the Bernoulli random variables can be sampled with a single hash.

Add YOSO by @novice03 in https://github.com/huggingface/transformers/pull/15091

Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=yoso
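
The remark that all Bernoulli variables can come from a single hash can be illustrated with a pure-Python sketch. It assumes a random-hyperplane hash family and unit-norm vectors, for which the collision probability of a tau-bit code is the standard LSH result (1 - theta/pi)^tau. The function names are hypothetical and this is not the library's implementation.

```python
import math
import random


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def hash_code(v, planes):
    # One sign bit per random hyperplane.
    return tuple(sum(a * b for a, b in zip(p, v)) >= 0 for p in planes)


def sample_bernoulli_matrix(queries, keys, tau, rng):
    # A single draw of tau hyperplanes yields every query-key collision indicator.
    dim = len(queries[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(tau)]
    q_codes = [hash_code(q, planes) for q in queries]
    k_codes = [hash_code(k, planes) for k in keys]
    return [[int(qc == kc) for kc in k_codes] for qc in q_codes]


def collision_probability(q, k, tau):
    # Expected value of each indicator for unit-norm q and k.
    cos = max(-1.0, min(1.0, sum(a * b for a, b in zip(q, k))))
    return (1.0 - math.acos(cos) / math.pi) ** tau


rng = random.Random(0)
queries = [unit([rng.gauss(0, 1) for _ in range(8)]) for _ in range(3)]
keys = [unit([rng.gauss(0, 1) for _ in range(8)]) for _ in range(3)]
tau, num_hashes = 4, 5000

# Average many single-hash samples to compare against the analytic expectation.
totals = [[0] * 3 for _ in range(3)]
for _ in range(num_hashes):
    sample = sample_bernoulli_matrix(queries, keys, tau, rng)
    for i in range(3):
        for j in range(3):
            totals[i][j] += sample[i][j]
estimate = [[t / num_hashes for t in row] for row in totals]
expected = [[collision_probability(q, k, tau) for k in keys] for q in queries]
```

Each call to sample_bernoulli_matrix hashes every query and key once, so its cost is linear in the number of vectors even though it defines an indicator for every pair.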

Add model like

To help contributors add new models to Transformers more easily, a new command clones an existing model and sets up the various hooks in the library, so that you only have to write the tweaks needed in the modeling file. Just run transformers-cli add-new-model-like and fill in the questionnaire!

Add model like by @sgugger in https://github.com/huggingface/transformers/pull/14992

Training scripts

New training scripts were introduced: one for speech seq2seq models and an image pre-training script leveraging the ViTMAE models. An image captioning example in Flax was also added to the library.

Add Speech Seq2Seq Training script by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14792
[ViTMAE] Add image pretraining script by @NielsRogge in https://github.com/huggingface/transformers/pull/15242
Add Flax image captioning example by @ydshieh in https://github.com/huggingface/transformers/pull/14864

Pipelines

This release adds support for long files in the automatic-speech-recognition (ASR) pipeline, as well as support for audio models with a language model (LM), which reduces the word error rate (WER) on many tasks. See the blogpost. Arguments and framework support also continue to become more homogeneous across all pipelines.

Large audio chunking for the existing ASR pipeline by @anton-l in https://github.com/huggingface/transformers/pull/14896
Enabling TF on image-classification pipeline. by @Narsil in https://github.com/huggingface/transformers/pull/15030
Pipeline ASR with LM. by @Narsil in https://github.com/huggingface/transformers/pull/15071
ChunkPipeline: batch_size enabled on zero-cls and qa pipelines. by @Narsil in https://github.com/huggingface/transformers/pull/14225
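
The idea behind chunking long audio files can be sketched as follows. This is an illustrative helper, not the pipeline's code: the function name, the symmetric stride, and the sample-based units are assumptions (the changelog itself only shows that the pipeline takes a chunk_length_s argument).

```python
def chunk_spans(num_samples, chunk_len, stride):
    # Split a long signal into overlapping chunks. Each chunk is processed in
    # full, but only predictions in its "kept" region are retained, so every
    # sample is kept exactly once while interior chunks see extra context.
    if chunk_len <= 2 * stride:
        raise ValueError("chunk_len must exceed twice the stride")
    step = chunk_len - 2 * stride
    spans = []
    start = 0
    while True:
        end = min(start + chunk_len, num_samples)
        left = 0 if start == 0 else stride
        right = 0 if end == num_samples else stride
        # (chunk_start, chunk_end, kept_start, kept_end)
        spans.append((start, end, start + left, end - right))
        if end == num_samples:
            break
        start += step
    return spans


for span in chunk_spans(num_samples=100, chunk_len=40, stride=5):
    print(span)
```

The kept regions tile the whole file without gaps or overlaps, which is what allows the per-chunk outputs to be concatenated into a single transcription.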

PyTorch improvements

The ELECTRA model can now be used as a decoder, enabling an ELECTRA encoder-decoder model.

Add ElectraForCausalLM -> Enable Electra encoder-decoder model by @stancld in https://github.com/huggingface/transformers/pull/14729

TensorFlow improvements

Keras metric callback by @Rocketknight1 and @merveenoyan in https://github.com/huggingface/transformers/pull/14867

The vision encoder decoder model can now be used in TensorFlow.

Add TFVisionEncoderDecoderModel by @ydshieh in https://github.com/huggingface/transformers/pull/14148

CLIP gets ported to TensorFlow.

Add TFCLIPModel by @ydshieh in https://github.com/huggingface/transformers/pull/13967

Flax improvements

RoFormer gets ported to Flax.

Add Flax RoFormer by @stancld in https://github.com/huggingface/transformers/pull/15005

Deprecations

Deprecates AdamW and adds --optim by @manuelciosici in https://github.com/huggingface/transformers/pull/14744

Documentation

The documentation has been fully migrated to Markdown. If you are making a contribution, make sure to read the updated guide on how to write good docstrings.

Convert rst files by @sgugger in https://github.com/huggingface/transformers/pull/14888
Doc styler v2 by @sgugger in https://github.com/huggingface/transformers/pull/14950
Convert last rst file by @sgugger in https://github.com/huggingface/transformers/pull/14952
Doc styler examples by @sgugger in https://github.com/huggingface/transformers/pull/14953
[doc] consistent True/False/None default format by @stas00 in https://github.com/huggingface/transformers/pull/14951
[doc] :obj: hunt by @stas00 in https://github.com/huggingface/transformers/pull/14954
[doc] :class: hunt by @stas00 in https://github.com/huggingface/transformers/pull/14955

Bugfixes and improvements

Fix installation instructions for BART ONNX example by @lewtun in https://github.com/huggingface/transformers/pull/14885
Fix doc examples: ... takes no keyword arguments by @ydshieh in https://github.com/huggingface/transformers/pull/14701
Fix AttributeError from PreTrainedTokenizerFast.decoder by @aphedges in https://github.com/huggingface/transformers/pull/14691
Add 'with torch.no_grad()' to ALBERT integration test forward pass by @henholm in https://github.com/huggingface/transformers/pull/14808
Add ONNX support for MarianMT models by @lewtun in https://github.com/huggingface/transformers/pull/14586
add custom stopping criteria to human eval script by @lvwerra in https://github.com/huggingface/transformers/pull/14897
Set run_name in MLflowCallback by @YangDong2002 in https://github.com/huggingface/transformers/pull/14894
[AutoTokenizer] Fix incorrect from pretrained by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14900
[Tests] Update speech diarization and WavLM tolerances by @anton-l in https://github.com/huggingface/transformers/pull/14902
[doc] post-porting by @stas00 in https://github.com/huggingface/transformers/pull/14890
[Generate] Remove attention_mask and integrate model_main_input_name by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14856
Fix failing GPU trainer tests by @sgugger in https://github.com/huggingface/transformers/pull/14903
Better logic for getting tokenizer config in AutoTokenizer by @sgugger in https://github.com/huggingface/transformers/pull/14906
[doc] install - add link to jax installation by @stas00 in https://github.com/huggingface/transformers/pull/14912
[WavLM] fix wavlm docs by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14910
Fix Perceiver docs by @Sanster in https://github.com/huggingface/transformers/pull/14917
fix to issue #14833 in data_collator - consider no labels by @kleinay in https://github.com/huggingface/transformers/pull/14930
Fix duplicate call to save_checkpoint when using deepspeed by @MihaiBalint in https://github.com/huggingface/transformers/pull/14946
[WavLM] give model more precision tolerance in tests by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14958
[Speech Recognition Examples] Update README.md by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14965
[Tests] Speed up tokenizer tests by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14964
[Wav2Vec2] Rename model's feature extractor to feature encoder by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14959
Replace assertion with exception by @jaketae in https://github.com/huggingface/transformers/pull/14970
remove absl workaround as it's no longer needed by @stas00 in https://github.com/huggingface/transformers/pull/14909
Fixing a pathological case for slow tokenizers by @Narsil in https://github.com/huggingface/transformers/pull/14981
[AutoProcessor] Correct AutoProcessor and automatically add processor… by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14881
[Generate] correct encoder_outputs are passed without attention_mask by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14980
Adding num_return_sequences support for text2text generation. by @Narsil in https://github.com/huggingface/transformers/pull/14988
Enabling tokenizers upgrade. by @Narsil in https://github.com/huggingface/transformers/pull/14941
Allow training to resume even if RNG states are not properly loaded by @sgugger in https://github.com/huggingface/transformers/pull/14994
Map model_type and doc pages names by @sgugger in https://github.com/huggingface/transformers/pull/14944
Fixing t2t pipelines lists outputs. by @Narsil in https://github.com/huggingface/transformers/pull/15008
Improve truncation_side by @Narsil in https://github.com/huggingface/transformers/pull/14947
Fix doc examples: name 'torch' is not defined by @ydshieh in https://github.com/huggingface/transformers/pull/15016
[Tests] Correct Wav2Vec2 & WavLM tests by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15015
[doc] Update parallelism.mdx by @hyunwoongko in https://github.com/huggingface/transformers/pull/15013
Fix Code block speech pretraining example by @flozi00 in https://github.com/huggingface/transformers/pull/14983
Fix a little typo by @milyiyo in https://github.com/huggingface/transformers/pull/15002
Hotfix chunk_length_s instead of _ms. by @Narsil in https://github.com/huggingface/transformers/pull/15029
[doc] Update parallelism.mdx by @hyunwoongko in https://github.com/huggingface/transformers/pull/15018
[megatron convert] PYTHONPATH requirements by @stas00 in https://github.com/huggingface/transformers/pull/14956
Fix doc example: mask_time_indices (numpy) has no attribute 'to' by @ydshieh in https://github.com/huggingface/transformers/pull/15033
Adding QoL for batch_size arg (like others enabled everywhere). by @Narsil in https://github.com/huggingface/transformers/pull/15027
[CLIP] Fix PT test by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15041
[SpeechEncoderDecoder] Fix from pretrained by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15043
[CLIP] Fix TF test by @patil-suraj in https://github.com/huggingface/transformers/pull/15042
Wrap Roberta integration test forward passes with torch.no_grad() by @mattchurgin in https://github.com/huggingface/transformers/pull/15037
Add Detectron2 to Github actions by @NielsRogge in https://github.com/huggingface/transformers/pull/15053
Remove old asserts. by @Narsil in https://github.com/huggingface/transformers/pull/15012
Add 'with torch.no_grad()' to BertGeneration integration test forward passes by @itsTurner in https://github.com/huggingface/transformers/pull/14963
Update run_speech_recognition_seq2seq.py (max_eval_samples instead of train_samples) by @flozi00 in https://github.com/huggingface/transformers/pull/14967
[VisionTextDualEncoder] Fix doc example by @ydshieh in https://github.com/huggingface/transformers/pull/15057
Resubmit changes after rebase to master by @kct22aws in https://github.com/huggingface/transformers/pull/14982
[Fix doc examples] missing from_pretrained by @ydshieh in https://github.com/huggingface/transformers/pull/15044
[VisionTextDualEncoder] Add token_type_ids param by @ydshieh in https://github.com/huggingface/transformers/pull/15073
Fix convert for newer megatron-lm bert model by @yoquankara in https://github.com/huggingface/transformers/pull/14082
[Wav2Vec2 Speech Event] Add speech event v2 by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15083
fix model table cell text alignment by @ydshieh in https://github.com/huggingface/transformers/pull/14999
Update check_repo.py by @kamalkraj in https://github.com/huggingface/transformers/pull/15014
Make OpenAIGPTTokenizer work with SpaCy 2.x and 3.x by @cody-moveworks in https://github.com/huggingface/transformers/pull/15019
Change assignee for tokenizers by @LysandreJik in https://github.com/huggingface/transformers/pull/15088
support the trocr small models by @liminghao1630 in https://github.com/huggingface/transformers/pull/14893
[Fix doc example] RagModel by @ydshieh in https://github.com/huggingface/transformers/pull/15076
Model summary doc page horizontal banners by @mishig25 in https://github.com/huggingface/transformers/pull/15058
Use tqdm.auto in Pipeline docs by @bryant1410 in https://github.com/huggingface/transformers/pull/14920
[doc] normalize HF Transformers string by @stas00 in https://github.com/huggingface/transformers/pull/15023
Happy New Year! by @sgugger in https://github.com/huggingface/transformers/pull/15094
[DOC] fix doc examples for bart-like models by @patil-suraj in https://github.com/huggingface/transformers/pull/15093
[performance doc] Power and Cooling by @stas00 in https://github.com/huggingface/transformers/pull/14935
Add test to check reported training loss by @sgugger in https://github.com/huggingface/transformers/pull/15096
Take gradient accumulation into account when defining samplers by @sgugger in https://github.com/huggingface/transformers/pull/15095
[Fix doc example] Speech2TextForConditionalGeneration by @ydshieh in https://github.com/huggingface/transformers/pull/15092
Fix cookiecutter by @NielsRogge in https://github.com/huggingface/transformers/pull/15100
[Wav2Vec2ProcessorWithLM] improve decoder download by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15040
Adds IBERT to models exportable with ONNX by @MaximovaIrina in https://github.com/huggingface/transformers/pull/14868
change metric_key_prefix in seq2seq_trainer.py by @JejuWayfarer in https://github.com/huggingface/transformers/pull/15099
Print out durations of all scheduled tests by @LysandreJik in https://github.com/huggingface/transformers/pull/15102
Fix failing W2V2 test by @LysandreJik in https://github.com/huggingface/transformers/pull/15104
Doc styler tip by @sgugger in https://github.com/huggingface/transformers/pull/15105
Update ONNX docs by @lewtun in https://github.com/huggingface/transformers/pull/14904
Fix saving FlaubertTokenizer configs by @vmaryasin in https://github.com/huggingface/transformers/pull/14991
Update TF test_step to match train_step by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15111
use block_size instead of max_seq_length in tf run_clm example by @riklopfer in https://github.com/huggingface/transformers/pull/15036
fix: switch from slow to generic tokenizer class by @lvwerra in https://github.com/huggingface/transformers/pull/15122
Fix TFEncoderDecoder labels handling #14357 by @ydshieh in https://github.com/huggingface/transformers/pull/15001
Add ONNX configuration classes to docs by @lewtun in https://github.com/huggingface/transformers/pull/15121
Add with torch.no_grad() to DistilBERT integration test forward pass by @jaketae in https://github.com/huggingface/transformers/pull/14979
mBART support for run_summarization.py by @banda-larga in https://github.com/huggingface/transformers/pull/15125
doc-builder -> doc-build by @LysandreJik in https://github.com/huggingface/transformers/pull/15134
[Fix doc example] - ProphetNetDecoder by @ydshieh in https://github.com/huggingface/transformers/pull/15124
[examples/flax/language-modeling] set loglevel by @stas00 in https://github.com/huggingface/transformers/pull/15129
Update model_sharing.mdx by @carlos-aguayo in https://github.com/huggingface/transformers/pull/15142
Enable AMP for xla:gpu device in trainer class by @ymwangg in https://github.com/huggingface/transformers/pull/15022
[deepspeed tests] fix summarization by @stas00 in https://github.com/huggingface/transformers/pull/15149
Check the repo consistency in model templates test by @sgugger in https://github.com/huggingface/transformers/pull/15141
Add TF glu activation function by @gante in https://github.com/huggingface/transformers/pull/15146
Make sure all submodules are properly registered by @sgugger in https://github.com/huggingface/transformers/pull/15144
[Fix doc example] - OpenAIGPTDoubleHeadsModel by @ydshieh in https://github.com/huggingface/transformers/pull/15143
fix BertTokenizerFast tokenize_chinese_chars arg by @SaulLu in https://github.com/huggingface/transformers/pull/15158
Fix typo in test_configuration_common.py by @novice03 in https://github.com/huggingface/transformers/pull/15160
Add "open in hf spaces" gradio button issue #73 by @AK391 in https://github.com/huggingface/transformers/pull/15106
TF Bert inference - support np.ndarray optional arguments by @gante in https://github.com/huggingface/transformers/pull/15074
Fixing flaky test (hopefully). by @Narsil in https://github.com/huggingface/transformers/pull/15154
Better dummies by @sgugger in https://github.com/huggingface/transformers/pull/15148
Update from keras2onnx to tf2onnx by @gante in https://github.com/huggingface/transformers/pull/15162
[doc] performance: Efficient Software Prebuilds by @stas00 in https://github.com/huggingface/transformers/pull/15147
[Speech models] Disable non-existing chunking in tests by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15163
Added forward pass of test_inference_image_classification_head by @MrinalTyagi in https://github.com/huggingface/transformers/pull/14777
Fix dtype issue in TF BART by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15178
[doc] new MoE paper by @stas00 in https://github.com/huggingface/transformers/pull/15184
Mark bad tokenizers version by @sgugger in https://github.com/huggingface/transformers/pull/15188
[Fix doc example] UniSpeechSatForPreTraining by @ydshieh in https://github.com/huggingface/transformers/pull/15152
is_ctc needs to be updated to `self.type == "ctc". by @Narsil in https://github.com/huggingface/transformers/pull/15194
[Fix doc example] TFRagModel by @ydshieh in https://github.com/huggingface/transformers/pull/15187
Error when code examples are improperly closed by @sgugger in https://github.com/huggingface/transformers/pull/15186
Fix deprecation warnings for int div by @sgugger in https://github.com/huggingface/transformers/pull/15180
Copies and docstring styling by @sgugger in https://github.com/huggingface/transformers/pull/15202
[ASR pipeline] correct with lm pipeline by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15200
Remove dependency to quiet Dependabot by @sgugger in https://github.com/huggingface/transformers/pull/15205
Ignore empty subfolders when identifying submodules by @sgugger in https://github.com/huggingface/transformers/pull/15204
[MBartTokenizer] remove dep on xlm-roberta tokenizer by @patil-suraj in https://github.com/huggingface/transformers/pull/15201
fix: #14486 do not use BertPooler in DPR by @PaulLerner in https://github.com/huggingface/transformers/pull/15068
[Fix doc example] Wrong checkpoint name by @ydshieh in https://github.com/huggingface/transformers/pull/15079
[Robust Speech Event] Add guides by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15155
Enable tqdm toggling by @jaketae in https://github.com/huggingface/transformers/pull/15167
[FLAX] glue training example refactor by @kamalkraj in https://github.com/huggingface/transformers/pull/13815
Rename compute_loss in TF models by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15207
Build dev documentation by @LysandreJik in https://github.com/huggingface/transformers/pull/15210
[Fix doc example] TFFunnelTokenizer' is not defined by @ydshieh in https://github.com/huggingface/transformers/pull/15225
Correct Speech Event Readme by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15226
[ViTMAE] Various fixes by @NielsRogge in https://github.com/huggingface/transformers/pull/15221
[Speech Event] Fix speech event readme by @patil-suraj in https://github.com/huggingface/transformers/pull/15227
Fix typo in BERT tokenization file by @qqaatw in https://github.com/huggingface/transformers/pull/15228
Fix PR number by @LysandreJik in https://github.com/huggingface/transformers/pull/15231
Adapt Common Voice Talk Title and Abstract by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15233
Update Trainer code example by @NielsRogge in https://github.com/huggingface/transformers/pull/15070
Make chuking smartly (long files) work on asr ctc_with_lm. by @Narsil in https://github.com/huggingface/transformers/pull/15219
Fix usage of additional kwargs in from_encoder_decoder_pretrained in encoder-decoder models by @jsnfly in https://github.com/huggingface/transformers/pull/15056
Update README.md by @anton-l in https://github.com/huggingface/transformers/pull/15239
Update README.md by @anton-l in https://github.com/huggingface/transformers/pull/15246
Update pipelines.mdx by @kamalkraj in https://github.com/huggingface/transformers/pull/15243
[Fix doc example] missing import by @ydshieh in https://github.com/huggingface/transformers/pull/15240
Fixes tf_default_data_collator sometimes guessing the wrong dtype for labels by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15234
Make sure to raise NotImplementedError with correct method name by @kumapo in https://github.com/huggingface/transformers/pull/15253
Fix crash when logs are empty because Keras has wiped them out of spite by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15258
Tentative workflow improvement by @LysandreJik in https://github.com/huggingface/transformers/pull/15255
Fix code examples by @NielsRogge in https://github.com/huggingface/transformers/pull/15257
Adds missing module_specs for usages of _LazyModule by @jkuball in https://github.com/huggingface/transformers/pull/15230
Prepare ONNX export for torch v1.11 by @lewtun in https://github.com/huggingface/transformers/pull/15270
Fix by @novice03 in https://github.com/huggingface/transformers/pull/15276
Move BART + ONNX example to research_projects by @lewtun in https://github.com/huggingface/transformers/pull/15271
Specify providers explicitly in ORT session initialization by @wangyems in https://github.com/huggingface/transformers/pull/15235
Fixes Benchmark example link by @evandrosks in https://github.com/huggingface/transformers/pull/15278
[Robust Speech Challenge] Add timeline by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15274
[Fix doc example] TFLayoutLMForTokenClassification: missing import tf by @ydshieh in https://github.com/huggingface/transformers/pull/15268
[Wav2Vec2ProcessorWithLM] improve multi processing by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15247
Refine errors for pretrained objects by @sgugger in https://github.com/huggingface/transformers/pull/15261
[PyTorch-nightly-test] Fix Wav2Vec2 LM & Phoneme tests by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15272
Update eval.py by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15310
Update CONTRIBUTING.md by @kamalkraj in https://github.com/huggingface/transformers/pull/15290
Fix a typo in tag addition by @sgugger in https://github.com/huggingface/transformers/pull/15286
Remove old debug code leftover. by @Narsil in https://github.com/huggingface/transformers/pull/15306
[Fix doc example] fix missing import jnp by @ydshieh in https://github.com/huggingface/transformers/pull/15291
[LayoutLMV2 Tests] Make sure input is on GPU by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15314
Replace NystromformerTokenizer with AutoTokenizer by @novice03 in https://github.com/huggingface/transformers/pull/15312
[Beam Search] Correct returned beam scores by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14654
[Examples] Correct run ner label2id for fine-tuned models by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15017
Avoid using get_list_of_files by @sgugger in https://github.com/huggingface/transformers/pull/15287
[Tests] Fix test by @NielsRogge in https://github.com/huggingface/transformers/pull/15324
Add 🤗 Accelerate tutorial by @stevhliu in https://github.com/huggingface/transformers/pull/15263
Added missing code in exemplary notebook - custom datasets fine-tuning by @Pawloch247 in https://github.com/huggingface/transformers/pull/15300
Fix encoder-decoder models when labels is passed by @ydshieh in https://github.com/huggingface/transformers/pull/15172
Fix table formatting in SegFormer docs by @deppen8 in https://github.com/huggingface/transformers/pull/15337
Fix deepspeed docs by @ngoquanghuy99 in https://github.com/huggingface/transformers/pull/15346
Fix 'eval_split_name' described as defaulting to 'train' by @FremyCompany in https://github.com/huggingface/transformers/pull/15348
Update doc writing guide by @sgugger in https://github.com/huggingface/transformers/pull/15350
Add YOSO by @novice03 in https://github.com/huggingface/transformers/pull/15091
[docs] post-PR merge fix by @stas00 in https://github.com/huggingface/transformers/pull/15355
Fix YosoConfig doc by @sgugger in https://github.com/huggingface/transformers/pull/15353
[DocTests Speech] Add doc tests for all speech models by @patrickvonplaten in https://github.com/huggingface/transformers/pull/15031
Push to hub save by @sgugger in https://github.com/huggingface/transformers/pull/15327
Fix KerasMetricCallback prediction with generate() and inference of column names by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15351
Add a device argument to the eval script by @anton-l in https://github.com/huggingface/transformers/pull/15371
improve saving strategy of sentencepiece tokenizer by @SaulLu in https://github.com/huggingface/transformers/pull/15328
Implement fixes for TrainingArguments doc by @sgugger in https://github.com/huggingface/transformers/pull/15370
Super-small fix stops us confusing Keras console logging by modifying… by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15373
Add proper documentation for Keras callbacks by @sgugger in https://github.com/huggingface/transformers/pull/15374
Example script for PushToHubCallback by @Rocketknight1 in https://github.com/huggingface/transformers/pull/15375

Impressive community contributors

The community contributors below have significantly contributed to the v4.16.0 release. Thank you!

@novice03, for contributing Nyströmformer, Swin Transformer and YOSO
@qqaatw, for contributing REALM
@stancld, for adding support for ELECTRA as a decoder, and porting RoFormer to Flax
@ydshieh, for a myriad of documentation fixes, the port of CLIP to TensorFlow, the addition of the TensorFlow vision encoder-decoder model, and the contribution of an image captioning example in Flax.

New Contributors

@YangDong2002 made their first contribution in https://github.com/huggingface/transformers/pull/14894
@Sanster made their first contribution in https://github.com/huggingface/transformers/pull/14917
@kleinay made their first contribution in https://github.com/huggingface/transformers/pull/14930
@MihaiBalint made their first contribution in https://github.com/huggingface/transformers/pull/14946
@milyiyo made their first contribution in https://github.com/huggingface/transformers/pull/15002
@mattchurgin made their first contribution in https://github.com/huggingface/transformers/pull/15037
@itsTurner made their first contribution in https://github.com/huggingface/transformers/pull/14963
@kct22aws made their first contribution in https://github.com/huggingface/transformers/pull/14982
@yoquankara made their first contribution in https://github.com/huggingface/transformers/pull/14082
@cody-moveworks made their first contribution in https://github.com/huggingface/transformers/pull/15019
@MaximovaIrina made their first contribution in https://github.com/huggingface/transformers/pull/14868
@JejuWayfarer made their first contribution in https://github.com/huggingface/transformers/pull/15099
@novice03 made their first contribution in https://github.com/huggingface/transformers/pull/14659
@banda-larga made their first contribution in https://github.com/huggingface/transformers/pull/15125
@manuelciosici made their first contribution in https://github.com/huggingface/transformers/pull/14744
@carlos-aguayo made their first contribution in https://github.com/huggingface/transformers/pull/15142
@gante made their first contribution in https://github.com/huggingface/transformers/pull/15146
@AK391 made their first contribution in https://github.com/huggingface/transformers/pull/15106
@MrinalTyagi made their first contribution in https://github.com/huggingface/transformers/pull/14777
@jsnfly made their first contribution in https://github.com/huggingface/transformers/pull/15056
@jkuball made their first contribution in https://github.com/huggingface/transformers/pull/15230
@wangyems made their first contribution in https://github.com/huggingface/transformers/pull/15235
@evandrosks made their first contribution in https://github.com/huggingface/transformers/pull/15278
@Pawloch247 made their first contribution in https://github.com/huggingface/transformers/pull/15300
@deppen8 made their first contribution in https://github.com/huggingface/transformers/pull/15337
@ngoquanghuy99 made their first contribution in https://github.com/huggingface/transformers/pull/15346

Full Changelog: https://github.com/huggingface/transformers/compare/v4.15.0...v4.16.0

Notes

If you use this software, please cite it using these metadata.

Files

Name	Size
huggingface/transformers-v4.16.0.zip md5:c7645f6587b93c480e6bb38e91591fb0	10.0 MB

Additional details

Is supplement to: https://github.com/huggingface/transformers/tree/v4.16.0 (URL)

	All versions	This version
Views	69,671	720
Downloads	2,029	10
Data volume	20.5 GB	100.2 MB
