Published May 14, 2020 | Version v2.9.1
Software | Open
huggingface/transformers: Marian
Creators
- Thomas Wolf¹
- Lysandre Debut²
- Julien Chaumond²
- Victor SANH¹
- Patrick von Platen
- Aymeric Augustin³
- Funtowicz Morgan⁴
- Rémi Louf
- Sam Shleifer⁵
- Stefan Schweter
- Manuel Romero
- Denis
- erenup
- Matt
- Piero Molino
- Grégory Châtel⁶
- Bram Vanroy⁷
- Tim Rault¹
- Gunnlaugur Thor Briem⁸
- Anthony MOI²
- Malte Pietsch⁹
- Catalin Voss¹⁰
- Bilal Khan
- Fei Wang¹¹
- Louis Martin
- Davide Fiocco
- Martin Malmsten
- Lorenzo Ampil¹²
- HUSEIN ZOLKEPLI
- Clement¹
- 1. @huggingface
- 2. Hugging Face
- 3. @canalplus
- 4. HuggingFace
- 5. Huggingface
- 6. DisAItek & Intel AI Innovators
- 7. @UGent
- 8. Qlik
- 9. deepset
- 10. Stanford University
- 11. University of Southern California
- 12. @thinkingmachines
Description
Marian (@sshleifer)
- A new model architecture, `MarianMTModel`, with 1,008+ pretrained weights is available for machine translation in PyTorch.
- The corresponding `MarianTokenizer` uses a `prepare_translation_batch` method to prepare model inputs (see the sketch after this list).
- All pretrained model names use the following format: `Helsinki-NLP/opus-mt-{src}-{tgt}`
- See the docs for information on pretrained model discovery and naming, or find your language here.
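A minimal sketch of the workflow above, assuming transformers v2.9.1 with its sentencepiece dependency installed; the en-de language pair is chosen only as an example of the `Helsinki-NLP/opus-mt-{src}-{tgt}` naming convention:

```python
# Sketch: translate English to German with a Marian checkpoint.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # {src}-{tgt} = English -> German
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# prepare_translation_batch tokenizes and pads a list of source sentences
batch = tokenizer.prepare_translation_batch(["Hello, how are you?"])
generated = model.generate(**batch)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```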
A new model architecture has been added: `AlbertForPreTraining`, in both PyTorch and TensorFlow.
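A short sketch of the new pretraining head in PyTorch; the `albert-base-v2` checkpoint is an assumption, and in this release the forward pass returns a plain tuple of masked-LM and sentence-order-prediction logits:

```python
# Sketch: ALBERT's pretraining heads (masked LM + sentence order prediction).
from transformers import AlbertTokenizer, AlbertForPreTraining

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertForPreTraining.from_pretrained("albert-base-v2")

input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")
prediction_scores, sop_scores = model(input_ids)  # MLM logits, SOP logits
```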
Changes have been made to both the TensorFlow scripts and our internals so that we are compatible with TensorFlow 2.2.
TFTrainer now supports new tasks:
- Multiple choice has been added to the TFTrainer (@ViktorAlm)
- Question Answering has been added to the TFTrainer (@jplu)
- Fixed a bug in the TensorFlow generation pipeline (@patrickvonplaten)
- Fixed the XLA spawn (@julien-c)
- Fixed the sentiment analysis pipeline, which used a cased tokenizer with an uncased model (@mfuntowicz)
- Albert was added to the conversion CLI (@fgaim)
- CamemBERT's token type ID generation was removed from the tokenizer, as with RoBERTa, since the model does not use them (@LysandreJik)
- Additional migration documentation was added (@guoquan)
- GPT-2 can now be exported to ONNX (@tianleiwu); see the first sketch after this list
- Simplified cache variables and added support for the `TRANSFORMERS_CACHE` environment variable (@BramVanroy); see the second sketch after this list
- Removed the hard-coded pad token ID in DistilBERT and ALBERT (@monologg)
- BART tests were fixed on GPU (@julien-c)
- Better wandb integration (@vanpelt, @borisdayma, @julien-c)
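On the GPT-2 ONNX item: the release ships its own conversion tooling, but rather than guess its exact flags, here is a generic `torch.onnx.export` sketch that traces the PyTorch model; the opset version and output naming are assumptions:

```python
# Sketch: trace GPT-2 and export it to ONNX with PyTorch's generic exporter.
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("Hello world", return_tensors="pt")
torch.onnx.export(
    model,
    (input_ids,),
    "gpt2.onnx",
    input_names=["input_ids"],
    output_names=["last_hidden_state"],  # remaining outputs get default names
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"}},
    opset_version=11,
)
```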
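And a quick sketch of the cache override: set `TRANSFORMERS_CACHE` before the library resolves its cache directory (the path below is a placeholder):

```python
# Sketch: redirect the pretrained-weight cache via TRANSFORMERS_CACHE.
import os
os.environ["TRANSFORMERS_CACHE"] = "/tmp/hf_cache"  # placeholder path

from transformers import AutoModel  # import after setting the variable
model = AutoModel.from_pretrained("bert-base-uncased")  # cached under /tmp/hf_cache
```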
Files (3.7 MB)
| Name | Size |
|---|---|
| huggingface/transformers-v2.9.1.zip (md5:93e3080e5e9d50be32106758e15d9b99) | 3.7 MB |
Additional details
Related works
- Is supplement to: https://github.com/huggingface/transformers/tree/v2.9.1