huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline
Creators
- Thomas Wolf1
- Lysandre Debut2
- Julien Chaumond2
- Victor SANH1
- Patrick von Platen
- Aymeric Augustin3
- Rémi Louf
- Funtowicz Morgan4
- Stefan Schweter
- Denis
- Sam Shleifer5
- erenup
- Manuel Romero
- Matt
- Piero Molino
- Grégory Châtel6
- Bram Vanroy7
- Tim Rault1
- Gunnlaugur Thor Briem8
- Anthony MOI2
- Malte Pietsch9
- Julien Plu10
- Catalin Voss11
- Bilal Khan
- Fei Wang12
- Martin Malmsten
- Louis Martin
- Davide Fiocco
- Clement1
- Ananya Harsh Jha
- 1. @huggingface
- 2. Hugging Face
- 3. @canalplus
- 4. HuggingFace
- 5. Huggingface
- 6. DisAItek & Intel AI Innovators
- 7. @UGent
- 8. Qlik
- 9. deepset
- 10. Leboncoin Lab
- 11. Stanford University
- 12. University of Southern California
Description
T5 is a powerful encoder-decoder model that casts every NLP problem into a text-to-text format. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...).
Five sets of pre-trained weights, trained on a multi-task mixture of unsupervised and supervised tasks, are released, ranging from 60 million to 11 billion parameters:
- t5-small
- t5-base
- t5-large
- t5-3b
- t5-11b
T5 can now be used with the translation and summarization pipelines.
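As a rough illustration (not taken from the release notes), one of these checkpoints can be loaded for text-to-text generation as follows; the input text and generation parameters are arbitrary examples:

```python
# Minimal sketch: text-to-text generation with a released T5 checkpoint.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 casts every task as text-to-text, so the task is expressed as a prefix in the input string.
input_ids = tokenizer.encode(
    "summarize: The tower is 324 metres tall, about the same height as an 81-storey building.",
    return_tensors="pt",
)
output_ids = model.generate(input_ids, max_length=40, num_beams=4, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```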
Related:
- paper
- official code
- model available in Hugging Face's community models
- docs
Big thanks to the original authors, especially @craffel who helped answer our questions, reviewed PRs and tested T5 extensively.
New BART checkpoint: bart-large-xsum (@sshleifer)
These weights come from BART fine-tuned on the XSum abstractive summarization dataset, which encourages shorter (more abstractive) summaries. The checkpoint achieves state-of-the-art results on that task.
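A minimal sketch of summarizing with the new checkpoint; the article text and generation parameters are placeholders, and the checkpoint is referenced here by its current model-hub identifier (facebook/bart-large-xsum):

```python
# Minimal sketch: abstractive summarization with the XSum-finetuned BART checkpoint.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-xsum")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-xsum")

article = "Replace this with a long news article to compress into a one-sentence summary."
inputs = tokenizer.encode(article, return_tensors="pt", max_length=1024, truncation=True)
summary_ids = model.generate(inputs, num_beams=4, max_length=60, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```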
BART summarization example with pytorch-lightning (@acarrera94)
New example: BART for summarization, using pytorch-lightning. It trains on CNN/DailyMail and runs evaluation.
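The example script itself lives in the repository; the following is only a heavily simplified sketch of the pattern it follows (a LightningModule wrapping BART's conditional-generation loss), written against current transformers and pytorch-lightning APIs and using made-up toy data in place of CNN/DailyMail:

```python
# Simplified sketch: fine-tuning BART for summarization with pytorch-lightning.
import pytorch_lightning as pl
import torch
from torch.utils.data import DataLoader
from transformers import BartForConditionalGeneration, BartTokenizer


class BartSummarizer(pl.LightningModule):
    def __init__(self, model_name="facebook/bart-large"):
        super().__init__()
        self.tokenizer = BartTokenizer.from_pretrained(model_name)
        self.model = BartForConditionalGeneration.from_pretrained(model_name)

    def training_step(self, batch, batch_idx):
        articles, summaries = batch  # lists of strings after default collation
        enc = self.tokenizer(articles, return_tensors="pt", padding=True, truncation=True).to(self.device)
        labels = self.tokenizer(summaries, return_tensors="pt", padding=True, truncation=True).input_ids.to(self.device)
        # In a real setup, pad positions in labels are usually set to -100 so the loss ignores them.
        out = self.model(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=labels)
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=3e-5)


# Toy (article, summary) pairs standing in for the CNN/DailyMail dataset.
pairs = [("A long news article about something ...", "A short summary.")] * 8
trainer = pl.Trainer(max_epochs=1)
trainer.fit(BartSummarizer(), DataLoader(pairs, batch_size=2))
```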
Translation pipeline (@patrickvonplaten)
A new translation pipeline is available, leveraging the T5 model. T5 was also added to the summarization pipeline.
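A minimal sketch of the pipeline usage; the task names follow the translation_{src}_to_{tgt} pattern, and the sentences and generation parameters below are illustrative:

```python
# Minimal sketch: translation and summarization pipelines backed by T5.
from transformers import pipeline

translator = pipeline("translation_en_to_de", model="t5-base")
print(translator("The house is wonderful.", max_length=40))
# e.g. [{'translation_text': 'Das Haus ist wunderbar.'}]

summarizer = pipeline("summarization", model="t5-base")
text = (
    "The Eiffel Tower is 324 metres tall, about the same height as an 81-storey "
    "building, and was the tallest man-made structure in the world for 41 years."
)
print(summarizer(text, max_length=40, min_length=5))
# e.g. [{'summary_text': '...'}]
```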
Memory improvements with BART (@sshleifer)
To reduce the memory footprint and compute required to run inference with BART, several improvements have been made to the model (a sketch of the last item follows the list):
- Remove the LM head and use the embedding matrix instead (~200MB)
- Call encoder before expanding input_ids (~1GB)
- SelfAttention only returns weights if config.output_attentions (~500MB)
- Two separate, smaller decoder attention masks (~500MB)
- Drop columns that consist exclusively of pad_token_id from input_ids in the evaluate_cnn example.
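As a standalone illustration of the last bullet (not the library's internal code), columns that contain only padding carry no information and can be dropped before the forward pass:

```python
# Sketch: drop input_ids columns that consist exclusively of pad_token_id.
import torch


def drop_pad_only_columns(input_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Keep only columns that contain at least one non-padding token."""
    keep = (input_ids != pad_token_id).any(dim=0)
    return input_ids[:, keep]


batch = torch.tensor([[5, 6, 7, 0, 0],
                      [8, 9, 0, 0, 0]])
print(drop_pad_only_columns(batch, pad_token_id=0))
# tensor([[5, 6, 7],
#         [8, 9, 0]])
```

The corresponding attention mask would be trimmed with the same column mask.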
A new head was added to XLM: XLMForTokenClassification.
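A minimal sketch of the new head; the checkpoint name, number of labels, and tokenizer call below are illustrative assumptions (the classification layer is randomly initialized until fine-tuned):

```python
# Minimal sketch: token classification (e.g. NER) with the new XLM head.
from transformers import XLMForTokenClassification, XLMTokenizer

tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")
model = XLMForTokenClassification.from_pretrained("xlm-mlm-en-2048", num_labels=9)

inputs = tokenizer("Hugging Face is based in New York City", return_tensors="pt")
logits = model(**inputs)[0]          # shape: (batch, seq_len, num_labels)
predictions = logits.argmax(dim=-1)  # one predicted label id per token
print(predictions)
```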
Files (3.4 MB)
Name | Size
---|---
huggingface/transformers-v2.7.0.zip (md5:34c956d2341df1df2bc00c1f6d692b60) | 3.4 MB
Additional details
Related works
- Is supplement to: https://github.com/huggingface/transformers/tree/v2.7.0 (URL)