Published March 30, 2020 | Version v2.7.0
Software Open

huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline

Description

T5 Model (@patrickvonplaten, @thomwolf)

T5 is a powerful encoder-decoder model that casts every NLP problem into a text-to-text format. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...).
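
To make the text-to-text idea concrete, here is a minimal sketch of how a task can be expressed as a plain input string. The prefixes follow the convention described in the T5 paper; the `TASK_PREFIXES` dictionary and the `to_text_to_text` helper are hypothetical names used for illustration only and are not part of the transformers API.

```python
# Task prefixes as described in the T5 paper. The dictionary keys and the
# helper below are illustrative assumptions, not library code.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
}


def to_text_to_text(task: str, text: str) -> str:
    """Prepend the task prefix so the model input is one plain string."""
    try:
        prefix = TASK_PREFIXES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return prefix + text.strip()


if __name__ == "__main__":
    print(to_text_to_text("translate_en_de", "The house is wonderful."))
    print(to_text_to_text("summarize", "Some long article text."))
```

Because both the input and the expected output are strings, the same model and the same loss can in principle serve translation, summarization, and classification-style tasks.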

Five sets of pre-trained weights (pre-trained on a multi-task mixture of unsupervised and supervised tasks) are released, listed here in ascending size from 60 million to 11 billion parameters:

t5-small, t5-base, t5-large, t5-3b, t5-11b

T5 can now be used with the translation and summarization pipelines.

Big thanks to the original authors, especially @craffel who helped answer our questions, reviewed PRs and tested T5 extensively.

New BART checkpoint: bart-large-xsum (@sshleifer)

These weights come from BART fine-tuned on the XSum abstractive summarization challenge, which encourages shorter (more abstractive) summaries. The checkpoint achieves state-of-the-art results on this task.

BART summarization example with pytorch-lightning (@acarrera94)

New example: BART for summarization, using pytorch-lightning. It trains on CNN/DailyMail (CNN/DM) and then runs evaluation.

Translation pipeline (@patrickvonplaten)

A new translation pipeline is available, leveraging the T5 model. T5 was also added to the summarization pipeline.

Memory improvements with BART (@sshleifer)

To reduce the memory footprint and compute needed to run inference on BART, several improvements have been made to the model (approximate memory impact in parentheses):

  • Remove the LM head and use the embedding matrix instead (~200MB)
  • Call encoder before expanding input_ids (~1GB)
  • SelfAttention only returns weights if config.output_attentions (~500MB)
  • Two separate, smaller decoder attention masks (~500MB)
  • Drop columns that consist exclusively of pad_token_id from input_ids in the evaluate_cnn example.
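
The last item in the list above can be sketched in plain Python. This is a minimal illustration rather than the code used in the example script: `trim_pad_columns` is a hypothetical helper name, the batch is assumed to be a rectangular list of lists, and the pad id of 1 is chosen only for the demonstration.

```python
from typing import List


def trim_pad_columns(input_ids: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Drop every column in which all rows hold pad_token_id.

    Assumes a rectangular batch (all rows have the same length).
    """
    if not input_ids:
        return []
    num_cols = len(input_ids[0])
    # Keep a column if at least one row has a real (non-pad) token there.
    keep = [
        col
        for col in range(num_cols)
        if any(row[col] != pad_token_id for row in input_ids)
    ]
    return [[row[col] for col in keep] for row in input_ids]


if __name__ == "__main__":
    batch = [
        [0, 11, 12, 2, 1, 1],  # the last two positions are padding
        [0, 21, 2, 1, 1, 1],   # the last three positions are padding
    ]
    print(trim_pad_columns(batch, pad_token_id=1))
```

Only columns that are padding in every row are removed, so shorter rows keep the padding they need to stay aligned with the longest row in the batch.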

New model: XLMForTokenClassification (@sakares)

A new head was added to XLM: XLMForTokenClassification.
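
As a rough illustration of what a token classification head computes, the sketch below applies one linear layer to each token's hidden state and picks the highest-scoring label per token. It assumes the usual design for such heads (a per-token linear projection to label logits); the function names and the tiny hand-written weights are hypothetical and do not reproduce the library's implementation.

```python
from typing import List, Sequence


def token_classification_logits(
    hidden_states: Sequence[Sequence[float]],  # shape: [seq_len][hidden_size]
    weight: Sequence[Sequence[float]],         # shape: [num_labels][hidden_size]
    bias: Sequence[float],                     # shape: [num_labels]
) -> List[List[float]]:
    """Apply the same linear layer to every token's hidden state."""
    return [
        [sum(h * w for h, w in zip(token, row)) + b for row, b in zip(weight, bias)]
        for token in hidden_states
    ]


def predict_labels(logits: Sequence[Sequence[float]]) -> List[int]:
    """Pick the index of the highest logit for each token."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


if __name__ == "__main__":
    hidden = [[1, 0], [0, 1], [1, 1]]  # three tokens, hidden size 2
    weight = [[2, 0], [0, 3]]          # two labels
    bias = [0, 0]
    logits = token_classification_logits(hidden, weight, bias)
    print(logits)
    print(predict_labels(logits))
```

In a real model the hidden states would come from the XLM encoder and the weights would be learned during fine-tuning, for example on a named-entity recognition dataset.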

Files

huggingface/transformers-v2.7.0.zip (3.4 MB)
md5:34c956d2341df1df2bc00c1f6d692b60

Additional details