huggingface/transformers: ELECTRA, Bad word filters, bugfixes & improvements
Creators
- Thomas Wolf (1)
- Lysandre Debut (2)
- Julien Chaumond (2)
- Victor SANH (1)
- Patrick von Platen
- Aymeric Augustin (3)
- Rémi Louf
- Funtowicz Morgan (4)
- Stefan Schweter
- Denis
- Sam Shleifer (5)
- erenup
- Manuel Romero
- Matt
- Piero Molino
- Grégory Châtel (6)
- Bram Vanroy (7)
- Tim Rault (1)
- Gunnlaugur Thor Briem (8)
- Julien Plu (9)
- Anthony MOI (2)
- Malte Pietsch (10)
- Catalin Voss (11)
- Bilal Khan
- Fei Wang (12)
- Martin Malmsten
- Louis Martin
- Davide Fiocco
- Clement (1)
- Ananya Harsh Jha
- 1. @huggingface
- 2. Hugging Face
- 3. @canalplus
- 4. HuggingFace
- 5. Huggingface
- 6. DisAItek & Intel AI Innovators
- 7. @UGent
- 8. Qlik
- 9. Leboncoin Lab
- 10. deepset
- 11. Stanford University
- 12. University of Southern California
Description
ELECTRA Model (@LysandreJik)
ELECTRA is a new method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens from "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.
This release comes with 6 ELECTRA checkpoints:
google/electra-small-discriminator
google/electra-small-generator
google/electra-base-discriminator
google/electra-base-generator
google/electra-large-discriminator
google/electra-large-generator
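As a quick illustration (not part of the original notes), the sketch below loads one of the checkpoints listed above and uses the discriminator to score each token as original or replaced. It assumes the ElectraForPreTraining head and ElectraTokenizer ship with this release; the example sentence is only illustrative.

```python
# Minimal sketch, assuming ElectraForPreTraining is available in this release.
# The discriminator outputs one logit per token; higher values mean the token
# is predicted to be a replacement rather than an original input token.
import torch
from transformers import ElectraForPreTraining, ElectraTokenizer

tokenizer = ElectraTokenizer.from_pretrained("google/electra-small-discriminator")
model = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")

input_ids = tokenizer.encode("The quick brown fox jumps over the lazy dog", return_tensors="pt")
logits = model(input_ids)[0]                      # shape: (batch_size, sequence_length)
predictions = torch.round(torch.sigmoid(logits))  # 1 = predicted "fake", 0 = predicted "real"
print(predictions)
```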
Related:
- Paper
- Official code
- Models available among the community models
- Docs
Thanks to the author @clarkkev for his help during the implementation.
Bad word filters in generate (@patrickvonplaten)
The generate method now has a bad word filter that lets you specify words that must not appear in the generated sequence.
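A minimal sketch of the filter in use, assuming the new argument is passed as bad_words_ids (a list of token-id sequences); the model, prompt, and banned words are only illustrative.

```python
# Minimal sketch: ban a few words from generated text with the new filter.
# bad_words_ids expects token ids, so each banned word is tokenized first
# (a leading space is added so the ids match mid-sentence GPT-2 tokens).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

bad_words_ids = [tokenizer.encode(" " + word) for word in ["terrible", "boring"]]

input_ids = tokenizer.encode("The movie was", return_tensors="pt")
output = model.generate(input_ids, max_length=20, bad_words_ids=bad_words_ids)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```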
- Decoder input ids are no longer necessary for T5 training (@patrickvonplaten)
- Update encoder and decoder on set_input_embeddings for BART (@sshleifer)
- Use the loaded checkpoint with --do_predict (instead of random initialization) in the PyTorch Lightning scripts (@ethanjperez)
- Clean summarization and translation example testing files for T5 and Bart (@patrickvonplaten)
- Cleaner examples (@julien-c)
- Extensive testing for T5 model (@patrickvonplaten)
- Force model outputs to always have batch_size as their first dim (@patrickvonplaten)
- Fix for continuing training in some scripts (@xeb)
- Resizing embedding matrix before sending it to the optimizer (@ngarneau)
- BertJapaneseTokenizer now accepts options for MeCab (@tamuhey)
- Speed up GELU computation with torch.jit (@mryab); a sketch of the idea follows this list
- Fix argument order of the update_mems function in the TF version (@patrickvonplaten, @dmytyar)
- Split the generate test function into beam search and no beam search variants (@patrickvonplaten)
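The GELU speed-up mentioned above is easy to picture with a small sketch: scripting the activation with torch.jit so the chain of element-wise ops is compiled. This only illustrates the general idea, not the exact code that was merged.

```python
# Sketch of the idea: compile the "new" (tanh-approximation) GELU with TorchScript.
import math
import torch

@torch.jit.script
def gelu_new(x):
    # GPT-2 style tanh approximation of the Gaussian Error Linear Unit
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))

print(gelu_new(torch.randn(4)))
```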
Files
- huggingface/transformers-v2.8.0.zip (3.4 MB, md5:a809f09fdaceadc5e1191492adbd4078)
Additional details
Related works
- Is supplement to: https://github.com/huggingface/transformers/tree/v2.8.0