
Software (Open Access)

Transformers: State-of-the-Art Natural Language Processing

Wolf, Thomas; Debut, Lysandre; Sanh, Victor; Chaumond, Julien; Delangue, Clement; Moi, Anthony; Cistac, Perric; Ma, Clara; Jernite, Yacine; Plu, Julien; Xu, Canwen; Le Scao, Teven; Gugger, Sylvain; Drame, Mariama; Lhoest, Quentin; Rush, Alexander M.


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.5608580</identifier>
  <creators>
    <creator>
      <creatorName>Wolf, Thomas</creatorName>
      <givenName>Thomas</givenName>
      <familyName>Wolf</familyName>
    </creator>
    <creator>
      <creatorName>Debut, Lysandre</creatorName>
      <givenName>Lysandre</givenName>
      <familyName>Debut</familyName>
    </creator>
    <creator>
      <creatorName>Sanh, Victor</creatorName>
      <givenName>Victor</givenName>
      <familyName>Sanh</familyName>
    </creator>
    <creator>
      <creatorName>Chaumond, Julien</creatorName>
      <givenName>Julien</givenName>
      <familyName>Chaumond</familyName>
    </creator>
    <creator>
      <creatorName>Delangue, Clement</creatorName>
      <givenName>Clement</givenName>
      <familyName>Delangue</familyName>
    </creator>
    <creator>
      <creatorName>Moi, Anthony</creatorName>
      <givenName>Anthony</givenName>
      <familyName>Moi</familyName>
    </creator>
    <creator>
      <creatorName>Cistac, Perric</creatorName>
      <givenName>Perric</givenName>
      <familyName>Cistac</familyName>
    </creator>
    <creator>
      <creatorName>Ma, Clara</creatorName>
      <givenName>Clara</givenName>
      <familyName>Ma</familyName>
    </creator>
    <creator>
      <creatorName>Jernite, Yacine</creatorName>
      <givenName>Yacine</givenName>
      <familyName>Jernite</familyName>
    </creator>
    <creator>
      <creatorName>Plu, Julien</creatorName>
      <givenName>Julien</givenName>
      <familyName>Plu</familyName>
    </creator>
    <creator>
      <creatorName>Xu, Canwen</creatorName>
      <givenName>Canwen</givenName>
      <familyName>Xu</familyName>
    </creator>
    <creator>
      <creatorName>Le Scao, Teven</creatorName>
      <givenName>Teven</givenName>
      <familyName>Le Scao</familyName>
    </creator>
    <creator>
      <creatorName>Gugger, Sylvain</creatorName>
      <givenName>Sylvain</givenName>
      <familyName>Gugger</familyName>
    </creator>
    <creator>
      <creatorName>Drame, Mariama</creatorName>
      <givenName>Mariama</givenName>
      <familyName>Drame</familyName>
    </creator>
    <creator>
      <creatorName>Lhoest, Quentin</creatorName>
      <givenName>Quentin</givenName>
      <familyName>Lhoest</familyName>
    </creator>
    <creator>
      <creatorName>Rush, Alexander M.</creatorName>
      <givenName>Alexander M.</givenName>
      <familyName>Rush</familyName>
    </creator>
  </creators>
  <titles>
    <title>Transformers: State-of-the-Art Natural Language Processing</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2020</publicationYear>
  <dates>
    <date dateType="Issued">2020-10-01</date>
  </dates>
  <resourceType resourceTypeGeneral="Software"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/5608580</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo">https://github.com/huggingface/transformers/tree/v4.12.0</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3385997</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/zenodo</relatedIdentifier>
  </relatedIdentifiers>
  <version>v4.12.0</version>
  <rightsList>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">TrOCR and VisionEncoderDecoderModel
&lt;p&gt;One new model is released as part of the TrOCR implementation: &lt;code&gt;TrOCRForCausalLM&lt;/code&gt;, in PyTorch. It comes along with a new &lt;code&gt;VisionEncoderDecoderModel&lt;/code&gt; class, which allows mixing and matching any vision Transformer encoder with any text Transformer decoder, similar to the existing &lt;code&gt;SpeechEncoderDecoderModel&lt;/code&gt; class.&lt;/p&gt;
&lt;p&gt;The TrOCR model was proposed in &lt;a href="https://arxiv.org/abs/2109.10282"&gt;TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models&lt;/a&gt;, by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.&lt;/p&gt;
&lt;p&gt;The TrOCR model consists of an image Transformer encoder and an autoregressive text Transformer decoder that together perform optical character recognition in an end-to-end manner.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Add TrOCR + VisionEncoderDecoderModel by @NielsRogge in &lt;a href="https://github.com/huggingface/transformers/pull/13874"&gt;https://github.com/huggingface/transformers/pull/13874&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Compatible checkpoints can be found on the Hub: &lt;a href="https://huggingface.co/models?other=trocr"&gt;https://huggingface.co/models?other=trocr&lt;/a&gt;&lt;/p&gt;
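&lt;p&gt;As a quick illustration, here is a minimal inference sketch, assuming the &lt;code&gt;microsoft/trocr-base-handwritten&lt;/code&gt; checkpoint from the Hub collection above and a placeholder image path:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Load the processor (image pre-processing + tokenizer) and the model.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

# "line.png" is a placeholder for an image containing a single line of text.
image = Image.open("line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Autoregressively generate and decode the recognized text.
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
&lt;/code&gt;&lt;/pre&gt;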
SEW &amp;amp; SEW-D
&lt;p&gt;SEW and SEW-D (Squeezed and Efficient Wav2Vec) were proposed in &lt;a href="https://arxiv.org/abs/2109.06870"&gt;Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition&lt;/a&gt; by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.&lt;/p&gt;
&lt;p&gt;SEW and SEW-D models use a Wav2Vec-style feature encoder and introduce temporal downsampling to reduce the sequence length processed by the transformer encoder. SEW-D additionally replaces the transformer encoder with a DeBERTa one. Both models achieve significant inference speedups without sacrificing speech recognition quality.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Add the SEW and SEW-D speech models by @anton-l in &lt;a href="https://github.com/huggingface/transformers/pull/13962"&gt;https://github.com/huggingface/transformers/pull/13962&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add SEW CTC models by @anton-l in &lt;a href="https://github.com/huggingface/transformers/pull/14158"&gt;https://github.com/huggingface/transformers/pull/14158&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Compatible checkpoints are available on the Hub: &lt;a href="https://huggingface.co/models?other=sew"&gt;https://huggingface.co/models?other=sew&lt;/a&gt; and &lt;a href="https://huggingface.co/models?other=sew-d"&gt;https://huggingface.co/models?other=sew-d&lt;/a&gt;&lt;/p&gt;
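&lt;p&gt;A minimal CTC transcription sketch; the &lt;code&gt;asapp/sew-tiny-100k-ft-ls100h&lt;/code&gt; checkpoint name and the 16 kHz &lt;code&gt;waveform&lt;/code&gt; array are assumptions for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
from transformers import Wav2Vec2Processor, SEWForCTC

processor = Wav2Vec2Processor.from_pretrained("asapp/sew-tiny-100k-ft-ls100h")
model = SEWForCTC.from_pretrained("asapp/sew-tiny-100k-ft-ls100h")

# `waveform` stands in for a 1-D float array sampled at 16 kHz.
inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
&lt;/code&gt;&lt;/pre&gt;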
DistilHuBERT
&lt;p&gt;DistilHuBERT was proposed in &lt;a href="https://arxiv.org/abs/2110.01900"&gt;DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT&lt;/a&gt;, by Heng-Jui Chang, Shu-wen Yang, Hung-yi Lee.&lt;/p&gt;
&lt;p&gt;DistilHuBERT is a distilled version of the HuBERT model. Using only two transformer layers, the model scores competitively on the SUPERB benchmark tasks.&lt;/p&gt;
&lt;p&gt;A compatible checkpoint is available on the Hub: &lt;a href="https://huggingface.co/ntu-spml/distilhubert"&gt;https://huggingface.co/ntu-spml/distilhubert&lt;/a&gt;&lt;/p&gt;
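&lt;p&gt;DistilHuBERT is typically used as a speech feature extractor; a minimal sketch, where the 16 kHz &lt;code&gt;waveform&lt;/code&gt; array is a placeholder:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
from transformers import AutoFeatureExtractor, AutoModel

feature_extractor = AutoFeatureExtractor.from_pretrained("ntu-spml/distilhubert")
model = AutoModel.from_pretrained("ntu-spml/distilhubert")

# `waveform` stands in for a 1-D float array sampled at 16 kHz.
inputs = feature_extractor(waveform, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, frames, hidden)
print(hidden_states.shape)
&lt;/code&gt;&lt;/pre&gt;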
TensorFlow improvements
&lt;p&gt;Several bug fixes and UX improvements for TensorFlow&lt;/p&gt;
Keras callback
&lt;p&gt;Introduction of a Keras callback to push to the Hub each epoch, or after a given number of steps; a usage sketch follows the list below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Keras callback to push to hub each epoch, or after N steps by @Rocketknight1 in &lt;a href="https://github.com/huggingface/transformers/pull/13773"&gt;https://github.com/huggingface/transformers/pull/13773&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
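&lt;p&gt;A minimal usage sketch; the output directory, repository name, and the previously defined &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;tokenizer&lt;/code&gt; and datasets are placeholders:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from transformers.keras_callbacks import PushToHubCallback

# Push the model to the Hub at the end of every epoch
# (use save_strategy="steps" with save_steps=N to push every N steps).
push_callback = PushToHubCallback(
    output_dir="./model_output",
    save_strategy="epoch",
    tokenizer=tokenizer,  # assumes a tokenizer defined earlier
    hub_model_id="my-user/my-finetuned-model",  # placeholder repository name
)
model.fit(train_dataset, validation_data=eval_dataset, callbacks=[push_callback])
&lt;/code&gt;&lt;/pre&gt;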
Updates on the encoder-decoder framework
&lt;p&gt;The encoder-decoder framework is now available in TensorFlow, allowing you to mix and match different encoders and decoders in a single encoder-decoder architecture (see the sketch after the list below)!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Add TFEncoderDecoderModel + Add cross-attention to some TF models by @ydshieh in &lt;a href="https://github.com/huggingface/transformers/pull/13222"&gt;https://github.com/huggingface/transformers/pull/13222&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
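&lt;p&gt;A minimal sketch of composing a TensorFlow encoder-decoder model; the BERT and GPT-2 checkpoint names are just one compatible pairing:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from transformers import TFEncoderDecoderModel

# Combine a pretrained BERT encoder with a pretrained GPT-2 decoder;
# the cross-attention weights are newly initialized and need fine-tuning.
model = TFEncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "gpt2"
)
&lt;/code&gt;&lt;/pre&gt;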
&lt;p&gt;Besides this, the &lt;code&gt;EncoderDecoderModel&lt;/code&gt; classes have been updated to work like models such as BART and T5. From now on, users no longer need to pass &lt;code&gt;decoder_input_ids&lt;/code&gt; to the model themselves. Instead, they are created automatically based on the &lt;code&gt;labels&lt;/code&gt; (namely by shifting them one position to the right, replacing -100 with the &lt;code&gt;pad_token_id&lt;/code&gt; and prepending the &lt;code&gt;decoder_start_token_id&lt;/code&gt;); a sketch of this shifting logic follows the list below. Note that this may result in training discrepancies when fine-tuning a model trained with versions prior to 4.12.0 that set &lt;code&gt;decoder_input_ids&lt;/code&gt; = &lt;code&gt;labels&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fix EncoderDecoderModel classes to be more like BART and T5 by @NielsRogge  in &lt;a href="https://github.com/huggingface/transformers/pull/14139"&gt;https://github.com/huggingface/transformers/pull/14139&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
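&lt;p&gt;For reference, the automatic creation of &lt;code&gt;decoder_input_ids&lt;/code&gt; from &lt;code&gt;labels&lt;/code&gt; behaves like the following sketch (modeled on the library's &lt;code&gt;shift_tokens_right&lt;/code&gt; helper):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch

def shift_tokens_right(labels, pad_token_id, decoder_start_token_id):
    # Shift labels one position to the right and prepend the start token.
    decoder_input_ids = labels.new_zeros(labels.shape)
    decoder_input_ids[:, 1:] = labels[:, :-1].clone()
    decoder_input_ids[:, 0] = decoder_start_token_id
    # Replace the -100 loss-masking value with the padding token id.
    decoder_input_ids.masked_fill_(decoder_input_ids == -100, pad_token_id)
    return decoder_input_ids
&lt;/code&gt;&lt;/pre&gt;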
Speech improvements
&lt;ul&gt;
&lt;li&gt;Add DistilHuBERT  by @anton-l in &lt;a href="https://github.com/huggingface/transformers/pull/14174"&gt;https://github.com/huggingface/transformers/pull/14174&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Speech Examples] Add pytorch speech pretraining by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/13877"&gt;https://github.com/huggingface/transformers/pull/13877&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Speech Examples] Add new audio feature by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/14027"&gt;https://github.com/huggingface/transformers/pull/14027&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add ASR colabs by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/14067"&gt;https://github.com/huggingface/transformers/pull/14067&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[ASR] Make speech recognition example more general to load any tokenizer by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/14079"&gt;https://github.com/huggingface/transformers/pull/14079&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Examples] Add an official audio classification example by @anton-l in &lt;a href="https://github.com/huggingface/transformers/pull/13722"&gt;https://github.com/huggingface/transformers/pull/13722&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Examples] Use Audio feature in speech classification by @anton-l in &lt;a href="https://github.com/huggingface/transformers/pull/14052"&gt;https://github.com/huggingface/transformers/pull/14052&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
Auto-model API
&lt;p&gt;To make it easier to extend the Transformers library, every Auto class has a new &lt;code&gt;register&lt;/code&gt; method that allows you to register your own custom models, configurations or tokenizers; a short sketch follows the list below. See more in the &lt;a href="https://huggingface.co/transformers/model_doc/auto.html#extending-the-auto-classes"&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Add an API to register objects to Auto classes by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13989"&gt;https://github.com/huggingface/transformers/pull/13989&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
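&lt;p&gt;A short sketch of the new API; &lt;code&gt;NewModelConfig&lt;/code&gt; and &lt;code&gt;NewModel&lt;/code&gt; are hypothetical classes used only for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel

class NewModelConfig(PretrainedConfig):
    model_type = "new-model"  # hypothetical model type

class NewModel(PreTrainedModel):
    config_class = NewModelConfig

# Register the custom classes so the Auto API can resolve "new-model".
AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
&lt;/code&gt;&lt;/pre&gt;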
Bug fixes and improvements
&lt;ul&gt;
&lt;li&gt;Fix filtering in test fetcher utils by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13766"&gt;https://github.com/huggingface/transformers/pull/13766&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix warning for gradient_checkpointing by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13767"&gt;https://github.com/huggingface/transformers/pull/13767&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Implement len in IterableDatasetShard by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13780"&gt;https://github.com/huggingface/transformers/pull/13780&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Wav2Vec2] Better error message by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/13777"&gt;https://github.com/huggingface/transformers/pull/13777&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix LayoutLM ONNX test error by @nishprabhu in &lt;a href="https://github.com/huggingface/transformers/pull/13710"&gt;https://github.com/huggingface/transformers/pull/13710&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Enable readme link synchronization by @qqaatw in &lt;a href="https://github.com/huggingface/transformers/pull/13785"&gt;https://github.com/huggingface/transformers/pull/13785&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix length of IterableDatasetShard and add test by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13792"&gt;https://github.com/huggingface/transformers/pull/13792&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[docs/gpt-j] addd instructions for how minimize CPU RAM usage by @patil-suraj in &lt;a href="https://github.com/huggingface/transformers/pull/13795"&gt;https://github.com/huggingface/transformers/pull/13795&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[examples &lt;code&gt;run_glue.py&lt;/code&gt;] missing requirements &lt;code&gt;scipy&lt;/code&gt;, &lt;code&gt;sklearn&lt;/code&gt; by @stas00 in &lt;a href="https://github.com/huggingface/transformers/pull/13768"&gt;https://github.com/huggingface/transformers/pull/13768&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[examples/flax] use Repository API for push_to_hub by @patil-suraj in &lt;a href="https://github.com/huggingface/transformers/pull/13672"&gt;https://github.com/huggingface/transformers/pull/13672&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix gather for TPU by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13813"&gt;https://github.com/huggingface/transformers/pull/13813&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[testing] auto-replay captured streams by @stas00 in &lt;a href="https://github.com/huggingface/transformers/pull/13803"&gt;https://github.com/huggingface/transformers/pull/13803&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add MultiBERTs conversion script by @gchhablani in &lt;a href="https://github.com/huggingface/transformers/pull/13077"&gt;https://github.com/huggingface/transformers/pull/13077&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Examples] Improve mapping in accelerate examples by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/13810"&gt;https://github.com/huggingface/transformers/pull/13810&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[DPR] Correct init by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/13796"&gt;https://github.com/huggingface/transformers/pull/13796&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;skip gptj slow generate tests by @patil-suraj in &lt;a href="https://github.com/huggingface/transformers/pull/13809"&gt;https://github.com/huggingface/transformers/pull/13809&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix warning situation: UserWarning: max_length is ignored when padding=True" by @shirayu in &lt;a href="https://github.com/huggingface/transformers/pull/13829"&gt;https://github.com/huggingface/transformers/pull/13829&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Updating CITATION.cff to fix GitHub citation prompt BibTeX output. by @arfon in &lt;a href="https://github.com/huggingface/transformers/pull/13833"&gt;https://github.com/huggingface/transformers/pull/13833&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add TF notebooks by @Rocketknight1 in &lt;a href="https://github.com/huggingface/transformers/pull/13793"&gt;https://github.com/huggingface/transformers/pull/13793&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Bart: check if decoder_inputs_embeds is set by @silviu-oprea in &lt;a href="https://github.com/huggingface/transformers/pull/13800"&gt;https://github.com/huggingface/transformers/pull/13800&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;include megatron_gpt2 in installed modules by @stas00 in &lt;a href="https://github.com/huggingface/transformers/pull/13834"&gt;https://github.com/huggingface/transformers/pull/13834&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Delete MultiBERTs conversion script by @gchhablani in &lt;a href="https://github.com/huggingface/transformers/pull/13852"&gt;https://github.com/huggingface/transformers/pull/13852&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Remove a duplicated bullet point in the GPT-J doc by @yaserabdelaziz in &lt;a href="https://github.com/huggingface/transformers/pull/13851"&gt;https://github.com/huggingface/transformers/pull/13851&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add Mistral GPT-2 Stability Tweaks by @siddk in &lt;a href="https://github.com/huggingface/transformers/pull/13573"&gt;https://github.com/huggingface/transformers/pull/13573&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix broken link to distill models in docs by @Randl in &lt;a href="https://github.com/huggingface/transformers/pull/13848"&gt;https://github.com/huggingface/transformers/pull/13848&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;:sparkles: update image classification example by @nateraw in &lt;a href="https://github.com/huggingface/transformers/pull/13824"&gt;https://github.com/huggingface/transformers/pull/13824&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Update no_* argument (HfArgumentParser) by @BramVanroy in &lt;a href="https://github.com/huggingface/transformers/pull/13865"&gt;https://github.com/huggingface/transformers/pull/13865&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Update Tatoeba conversion by @Traubert in &lt;a href="https://github.com/huggingface/transformers/pull/13757"&gt;https://github.com/huggingface/transformers/pull/13757&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixing 1-length special tokens cut. by @Narsil in &lt;a href="https://github.com/huggingface/transformers/pull/13862"&gt;https://github.com/huggingface/transformers/pull/13862&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix flax summarization example: save checkpoint after each epoch and push checkpoint to the hub by @ydshieh in &lt;a href="https://github.com/huggingface/transformers/pull/13872"&gt;https://github.com/huggingface/transformers/pull/13872&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixing empty prompts for text-generation when BOS exists. by @Narsil in &lt;a href="https://github.com/huggingface/transformers/pull/13859"&gt;https://github.com/huggingface/transformers/pull/13859&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Improve error message when loading models from Hub by @aphedges in &lt;a href="https://github.com/huggingface/transformers/pull/13836"&gt;https://github.com/huggingface/transformers/pull/13836&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Initial support for symbolic tracing with torch.fx allowing dynamic axes by @michaelbenayoun in &lt;a href="https://github.com/huggingface/transformers/pull/13579"&gt;https://github.com/huggingface/transformers/pull/13579&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Allow dataset to be an optional argument for (Distributed)LengthGroupedSampler by @ZhaofengWu in &lt;a href="https://github.com/huggingface/transformers/pull/13820"&gt;https://github.com/huggingface/transformers/pull/13820&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixing question-answering with long contexts  by @Narsil in &lt;a href="https://github.com/huggingface/transformers/pull/13873"&gt;https://github.com/huggingface/transformers/pull/13873&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;fix(integrations): consider test metrics by @borisdayma in &lt;a href="https://github.com/huggingface/transformers/pull/13888"&gt;https://github.com/huggingface/transformers/pull/13888&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;fix: replace asserts by value error by @m5l14i11 in &lt;a href="https://github.com/huggingface/transformers/pull/13894"&gt;https://github.com/huggingface/transformers/pull/13894&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Update parallelism.md by @hyunwoongko in &lt;a href="https://github.com/huggingface/transformers/pull/13892"&gt;https://github.com/huggingface/transformers/pull/13892&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Autodocument the list of ONNX-supported models by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13884"&gt;https://github.com/huggingface/transformers/pull/13884&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixing GPU for token-classification in a better way. by @Narsil in &lt;a href="https://github.com/huggingface/transformers/pull/13856"&gt;https://github.com/huggingface/transformers/pull/13856&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Update FSNER code in examples-&amp;gt;research_projects-&amp;gt;fsner by @sayef in &lt;a href="https://github.com/huggingface/transformers/pull/13864"&gt;https://github.com/huggingface/transformers/pull/13864&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Replace assert statements with exceptions by @ddrm86 in &lt;a href="https://github.com/huggingface/transformers/pull/13871"&gt;https://github.com/huggingface/transformers/pull/13871&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixing Backward compatiblity for zero-shot by @Narsil in &lt;a href="https://github.com/huggingface/transformers/pull/13855"&gt;https://github.com/huggingface/transformers/pull/13855&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Update run_qa.py - CorrectTypo by @akulagrawal in &lt;a href="https://github.com/huggingface/transformers/pull/13857"&gt;https://github.com/huggingface/transformers/pull/13857&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;T5ForConditionalGeneration: enabling using past_key_values and labels in training by @yssjtu in &lt;a href="https://github.com/huggingface/transformers/pull/13805"&gt;https://github.com/huggingface/transformers/pull/13805&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix trainer logging_nan_inf_filter in torch_xla mode by @ymwangg in &lt;a href="https://github.com/huggingface/transformers/pull/13896"&gt;https://github.com/huggingface/transformers/pull/13896&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix hp search for non sigopt backends by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13897"&gt;https://github.com/huggingface/transformers/pull/13897&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Trainer] Fix nan-loss condition by @anton-l in &lt;a href="https://github.com/huggingface/transformers/pull/13911"&gt;https://github.com/huggingface/transformers/pull/13911&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Raise exceptions instead of asserts in utils/download_glue_data by @hirotasoshu in &lt;a href="https://github.com/huggingface/transformers/pull/13907"&gt;https://github.com/huggingface/transformers/pull/13907&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add an example of exporting BartModel + BeamSearch to ONNX module. by @fatcat-z in &lt;a href="https://github.com/huggingface/transformers/pull/13765"&gt;https://github.com/huggingface/transformers/pull/13765&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;#12789 Replace assert statements with exceptions by @djroxx2000 in &lt;a href="https://github.com/huggingface/transformers/pull/13909"&gt;https://github.com/huggingface/transformers/pull/13909&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add missing whitespace to multiline strings by @aphedges in &lt;a href="https://github.com/huggingface/transformers/pull/13916"&gt;https://github.com/huggingface/transformers/pull/13916&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Wav2Vec2] Fix mask_feature_prob by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/13921"&gt;https://github.com/huggingface/transformers/pull/13921&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixes a minor doc issue (missing character) by @mishig25 in &lt;a href="https://github.com/huggingface/transformers/pull/13922"&gt;https://github.com/huggingface/transformers/pull/13922&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix LED by @Rocketknight1 in &lt;a href="https://github.com/huggingface/transformers/pull/13882"&gt;https://github.com/huggingface/transformers/pull/13882&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Add BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese by @datquocnguyen in &lt;a href="https://github.com/huggingface/transformers/pull/13788"&gt;https://github.com/huggingface/transformers/pull/13788&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[trainer] memory metrics: add memory at the start report by @stas00 in &lt;a href="https://github.com/huggingface/transformers/pull/13915"&gt;https://github.com/huggingface/transformers/pull/13915&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Image Segmentation pipeline by @mishig25 in &lt;a href="https://github.com/huggingface/transformers/pull/13828"&gt;https://github.com/huggingface/transformers/pull/13828&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Adding support for tokens being suffixes or part of each other. by @Narsil in &lt;a href="https://github.com/huggingface/transformers/pull/13918"&gt;https://github.com/huggingface/transformers/pull/13918&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Adds &lt;code&gt;PreTrainedModel.framework&lt;/code&gt; attribute by @StellaAthena in &lt;a href="https://github.com/huggingface/transformers/pull/13817"&gt;https://github.com/huggingface/transformers/pull/13817&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fixed typo: herBERT -&amp;gt; HerBERT by @adamjankaczmarek in &lt;a href="https://github.com/huggingface/transformers/pull/13936"&gt;https://github.com/huggingface/transformers/pull/13936&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Generation] Fix max_new_tokens by @patrickvonplaten in &lt;a href="https://github.com/huggingface/transformers/pull/13919"&gt;https://github.com/huggingface/transformers/pull/13919&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Fix typo in README.md by @fullyz in &lt;a href="https://github.com/huggingface/transformers/pull/13883"&gt;https://github.com/huggingface/transformers/pull/13883&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Update bug-report.md by @LysandreJik in &lt;a href="https://github.com/huggingface/transformers/pull/13934"&gt;https://github.com/huggingface/transformers/pull/13934&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;fix issue #13904 -attribute does not exist-  by @oraby8 in &lt;a href="https://github.com/huggingface/transformers/pull/13942"&gt;https://github.com/huggingface/transformers/pull/13942&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Raise ValueError instead of asserts in src/transformers/benchmark/benchmark.py by @AkechiShiro in &lt;a href="https://github.com/huggingface/transformers/pull/13951"&gt;https://github.com/huggingface/transformers/pull/13951&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Honor existing attention mask in tokenzier.pad by @sgugger in &lt;a href="https://github.com/huggingface/transformers/pull/13926"&gt;https://github.com/huggingface/transformers/pull/13926&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;[Gradient checkpoining] Correct disabling &lt;code&gt;find_unused_parameters&lt;/code&gt; in Trainer when gradient checkpointing is enabled by @patrickvonplaten in &lt;a href="https://github.com/hugging</description>
    <description descriptionType="Other">If you use this software, please cite it using these metadata.</description>
  </descriptions>
</resource>
Record statistics (All versions / This version):
Views: 37,139 / 187
Downloads: 1,293 / 6
Data volume: 10.0 GB / 73.7 MB
Unique views: 30,889 / 151
Unique downloads: 667 / 6
