Software Open Access
Wolf, Thomas; Debut, Lysandre; Sanh, Victor; Chaumond, Julien; Delangue, Clement; Moi, Anthony; Cistac, Perric; Ma, Clara; Jernite, Yacine; Plu, Julien; Xu, Canwen; Le Scao, Teven; Gugger, Sylvain; Drame, Mariama; Lhoest, Quentin; Rush, Alexander M.
Eight new models are released as part of the Perceiver implementation, including `PerceiverForMultimodalAutoencoding`, in PyTorch.
The Perceiver IO model was proposed in Perceiver IO: A General Architecture for Structured Inputs & Outputs by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
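The core idea in Perceiver IO is that a small, fixed-size latent array attends to the (potentially very large) input array, so the cost of the attention bottleneck scales with the number of latents rather than the input length. A minimal NumPy sketch of that cross-attention read step (all shapes and names here are illustrative, not the library's API):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, inputs, d_qk=32):
    """One cross-attention read: latents (num_latents, d_latent) query inputs (seq_len, d_input)."""
    rng = np.random.default_rng(0)
    Wq = rng.standard_normal((latents.shape[1], d_qk)) / np.sqrt(latents.shape[1])
    Wk = rng.standard_normal((inputs.shape[1], d_qk)) / np.sqrt(inputs.shape[1])
    Wv = rng.standard_normal((inputs.shape[1], latents.shape[1])) / np.sqrt(inputs.shape[1])
    q, k, v = latents @ Wq, inputs @ Wk, inputs @ Wv
    attn = softmax(q @ k.T / np.sqrt(d_qk))   # (num_latents, seq_len)
    return attn @ v                           # (num_latents, d_latent)

# A 5,000-element input is compressed through only 64 latents:
latents = np.zeros((64, 128))
inputs = np.random.default_rng(1).standard_normal((5_000, 16))
out = cross_attend(latents, inputs)
print(out.shape)  # (64, 128)
```

The attention matrix is `num_latents × seq_len`, not `seq_len × seq_len`, which is what lets Perceiver handle very long multimodal inputs.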
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=perceiver

mLUKE
The mLUKE tokenizer is added. The tokenizer can be used for the multilingual variant of LUKE.
The mLUKE model was proposed in mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka. It's a multilingual extension of the LUKE model trained on the basis of XLM-RoBERTa.
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=luke

ImageGPT
Three new models are released as part of the ImageGPT integration, including `ImageGPTForImageClassification`, in PyTorch.
The ImageGPT model was proposed in Generative Pretraining from Pixels by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever. ImageGPT (iGPT) is a GPT-2-like model trained to predict the next pixel value, allowing for both unconditional and conditional image generation.
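ImageGPT flattens an image into a 1-D sequence of color-quantized pixel tokens and trains on next-token prediction, exactly as GPT-2 does for text. A toy NumPy sketch of that input preparation (the 512-entry palette stands in for the paper's 9-bit k-means color codebook; function and variable names are illustrative, not the library's API):

```python
import numpy as np

def to_pixel_sequence(image, palette):
    """Quantize pixels to the nearest palette entry, raster-scan flatten,
    and return shifted (input, target) pairs for next-pixel prediction."""
    flat = image.reshape(-1, 3).astype(np.float64)                 # (H*W, 3)
    dists = ((flat[:, None, :] - palette[None, :, :]) ** 2).sum(-1)
    tokens = dists.argmin(axis=1)                                  # (H*W,) token ids
    return tokens[:-1], tokens[1:]

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
palette = rng.integers(0, 256, size=(512, 3)).astype(np.float64)   # stand-in 9-bit palette
inputs, targets = to_pixel_sequence(image, palette)
print(inputs.shape, targets.shape)  # (1023,) (1023,)
```

Each target token is simply the next pixel in raster order, which is what allows both unconditional sampling and completion of a partially observed image.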
Compatible checkpoints can be found on the hub: https://huggingface.co/models?other=imagegpt

QDQBert
Eight new models are released as part of the QDQBert implementation, including `QDQBertForQuestionAnswering`, in PyTorch.
The QDQBERT model can be referenced in Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
The semantic segmentation models' API is unstable and bound to change between this version and the next.
The first semantic segmentation models are added. In semantic segmentation, the goal is to predict a class label for every pixel of an image. The models that are added are SegFormer (by NVIDIA) and BEiT (by Microsoft Research). BEiT was already available in the library, but this release includes the model with a semantic segmentation head.
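Concretely, a semantic segmentation head produces one logit per class for every pixel, and the predicted label map is the argmax over the class dimension. A minimal sketch of that post-processing step (shapes and names are illustrative, not the library's API):

```python
import numpy as np

def logits_to_label_map(logits):
    """logits: (num_classes, H, W) -> (H, W) map of per-pixel class indices."""
    return logits.argmax(axis=0)

rng = np.random.default_rng(0)
num_classes, H, W = 150, 4, 6   # ADE20K, for example, has 150 classes
logits = rng.standard_normal((num_classes, H, W))
labels = logits_to_label_map(logits)
print(labels.shape)  # (4, 6)
```

Every entry of `labels` is a class index in `[0, num_classes)`, one per pixel.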
The SegFormer model was proposed in SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo. The model consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on image segmentation benchmarks such as ADE20K and Cityscapes.
The BEiT model was proposed in BEiT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong, Furu Wei. Rather than pre-training the model to predict the class of an image (as done in the original ViT paper), BEiT models are pre-trained to predict visual tokens from the codebook of OpenAI's DALL-E model given masked patches.
Adds the VisionTextDualEncoder model in PyTorch and Flax, making it possible to load any pre-trained vision model (ViT, DeiT, BEiT, CLIP's vision model) and text model (BERT, RoBERTa) in the library for vision-text tasks like CLIP.
This model pairs a vision and text encoder and adds projection layers to project the embeddings into a shared embedding space with matching dimensions, which can then be used to align the two modalities.
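The alignment works CLIP-style: each encoder's pooled output is linearly projected into the shared space, L2-normalized, and compared via a similarity matrix between image and text embeddings. An illustrative NumPy sketch (the dimensions and names are made up, not the library's API):

```python
import numpy as np

def project_and_normalize(features, W):
    """Project pooled encoder features into the shared space and L2-normalize."""
    z = features @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d_vision, d_text, d_shared = 768, 512, 256
W_vision = rng.standard_normal((d_vision, d_shared)) * 0.02
W_text = rng.standard_normal((d_text, d_shared)) * 0.02

image_feats = rng.standard_normal((4, d_vision))  # pooled vision-encoder outputs
text_feats = rng.standard_normal((4, d_text))     # pooled text-encoder outputs

img = project_and_normalize(image_feats, W_vision)
txt = project_and_normalize(text_feats, W_text)
similarity = img @ txt.T                          # (4, 4) cosine similarities
print(similarity.shape)  # (4, 4)
```

Training pushes the diagonal of this matrix (matching image-text pairs) up and the off-diagonal entries down, aligning the two modalities.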
CodeParrot, a model trained to generate code, has been open-sourced in the research projects by @lvwerra.
See https://huggingface.co/patrickvonplaten/wav2vec2-xlsr-53-es-kenlm for more information.

Flax-specific additions
Adds Flax version of the vision encoder-decoder model, and adds a Flax version of GPT-J.
Vision transformers are here! Convnets are so 2012, now that ML is converging on self-attention as a universal model.
Want to handle real-world tables, where text and data are positioned in a 2D grid? TAPAS is now here for both TensorFlow and PyTorch.
Automatic checkpointing and cloud saves to the HuggingFace Hub during training are now live, allowing you to resume training when it's interrupted, even if your initial instance is terminated. This is an area of very active development - watch this space for future developments, including automatic model card creation and more.
A new class to automatically select processors is added: `AutoProcessor`. It can be used for all models that require a processor, in both computer vision and audio.
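Conceptually, an auto class is a registry that maps a model type (read from the checkpoint's config) to the matching concrete class. A stripped-down sketch of that dispatch pattern in pure Python (purely illustrative; the class names and registry here are hypothetical, not the transformers implementation):

```python
# Hypothetical registry illustrating auto-class dispatch; not the real API.
PROCESSOR_REGISTRY = {}

def register(model_type):
    """Decorator that records a processor class under its model type."""
    def wrap(cls):
        PROCESSOR_REGISTRY[model_type] = cls
        return cls
    return wrap

@register("wav2vec2")
class Wav2Vec2LikeProcessor:
    pass

@register("clip")
class CLIPLikeProcessor:
    pass

def auto_processor_for(config):
    """Pick the right processor class from the config's model_type field."""
    return PROCESSOR_REGISTRY[config["model_type"]]()

proc = auto_processor_for({"model_type": "clip"})
print(type(proc).__name__)  # CLIPLikeProcessor
```

The same registry pattern covers audio and vision models alike, since dispatch only depends on the declared model type.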
A new documentation frontend is out for the `transformers` library! The goal of this documentation is to be better aligned with the rest of our website, and it contains tools to improve readability. The documentation can now be written in Markdown rather than RST.
The LayoutLMv2 feature extractor now supports non-English languages, and LayoutXLM gets its own processor.
You can now take advantage of the Ampere hardware with the Trainer:
`--bf16`: do training or eval in mixed precision of bfloat16
`--bf16_full_eval`: do eval in full bfloat16
`--tf32`: control having TF32 mode on/off
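bfloat16 keeps float32's 8-bit exponent but truncates the mantissa to 7 bits, so it trades precision for range (unlike fp16, which narrows the exponent and overflows much earlier). A stdlib sketch of the rounding effect, simulating bf16 by zeroing the low 16 bits of a float32 (illustrative only; the Trainer flags above use the hardware's native bfloat16, which rounds rather than truncates):

```python
import struct

def to_bfloat16(x):
    """Simulate bfloat16 by zeroing the low 16 mantissa bits of a float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

print(to_bfloat16(1.0))         # exactly representable: 1.0
print(to_bfloat16(3.14159265))  # coarse mantissa: 3.140625
print(to_bfloat16(1e38))        # large magnitudes survive (fp16 tops out near 6.5e4)
```

The retained exponent range is why bf16 training usually needs no loss scaling, while the 7-bit mantissa is why full-bf16 eval (`--bf16_full_eval`) can shift metrics slightly.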
`batch_size` support for (almost) all pipelines by @Narsil in https://github.com/huggingface/transformers/pull/13724
`BlenderbotTokenizerFast` by @stancld in https://github.com/huggingface/transformers/pull/13720
`text-generation` pipeline, by @Narsil in https://github.com/huggingface/transformers/pull/14118
`image-segmentation` tests, by @Narsil in https://github.com/huggingface/transformers/pull/14223
`image_utils.py` & fix image rotation issue by @mishig25 in https://github.com/huggingface/transformers/pull/14062
`feature-extraction` pipeline, by @Narsil in https://github.com/huggingface/transformers/pull/14193
`ignore_labels`, by @Narsil in https://github.com/huggingface/transformers/pull/14274
`DPRPretrainedModel` from docs by @xhlulu in https://github.com/huggingface/transformers/pull/14300
`pipeline`, by @Narsil in https://github.com/huggingface/transformers/pull/14316
`np.ndarray` before converting to PyTorch tensors by @eladsegal in https://github.com/huggingface/transformers/pull/14306
`pipeline` function, by @Narsil in https://github.com/huggingface/transformers/pull/14322
`generator` in addition to `Dataset` for pipelines by @Narsil in https://github.com/huggingface/transformers/pull/14352
`AlbertConverter` for FNet instead of using FNet's own converter by @qqaatw in https://github.com/huggingface/transformers/pull/14365
`inputs_embeds` as an input by @patrickvonplaten in https://github.com/huggingface/transformers/pull/14443
`attentions` in unbatching support, by @Narsil in https://github.com/huggingface/transformers/pull/14420
`hf-internal-testing`, by @Narsil in https://github.com/huggingface/transformers/pull/14463
`add-new-pipeline` docs a bit by @stancld in https://github.com/huggingface/transformers/pull/14485
`~/.cache/torch_extensions` between builds by @stas00 in https://github.com/huggingface/transformers/pull/14520
`__call__` method by @xhlulu in https://github.com/huggingface/transformers/pull/14379