Intra-Layer Recurrence in Transformers for Language Modeling

Nguyen, Anthony; Lin, Wenjun

doi:10.5281/zenodo.19503476

Published January 1, 2025 | Version v1

Conference paper | Open access

Transformer models have established new benchmarks in natural language processing; however, their increasing depth results in substantial growth in parameter counts. While existing recurrent transformer methods address this issue by reprocessing layers multiple times, they often apply recurrence indiscriminately across entire blocks of layers. In this work, we investigate Intra-Layer Recurrence (ILR), a more targeted approach that applies recurrence selectively to individual layers within a single forward pass. Our experiments show that allocating more iterations to earlier layers yields optimal results. These findings suggest that ILR offers a promising direction for optimizing recurrent structures in transformer architectures.
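To make the idea in the abstract concrete, the following is a minimal sketch of per-layer recurrence, not the authors' implementation. It assumes each layer can be treated as a function from a hidden state to a hidden state, and it uses a hypothetical list called reuse_map to say how many consecutive times each layer is applied within one forward pass. The toy make_layer function is a stand-in for a real transformer layer and is purely illustrative.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_layer(scale: float, shift: float) -> Layer:
    """Hypothetical stand-in for a transformer layer: a residual affine map."""
    def layer(x: Vector) -> Vector:
        # Residual form: input plus a simple transformation of the input.
        return [v + scale * v + shift for v in x]
    return layer


def ilr_forward(layers: Sequence[Layer], reuse_map: Sequence[int], x: Vector) -> Vector:
    """Apply layer i reuse_map[i] times in a row, then move on to layer i + 1.

    reuse_map is an assumed name for the per-layer iteration counts.
    """
    if len(layers) != len(reuse_map):
        raise ValueError("reuse_map needs exactly one entry per layer")
    if any(r < 1 for r in reuse_map):
        raise ValueError("every layer must be applied at least once")
    for layer, repeats in zip(layers, reuse_map):
        for _ in range(repeats):
            # The same layer (same parameters) is re-applied to its own output.
            x = layer(x)
    return x


def effective_depth(reuse_map: Sequence[int]) -> int:
    """Number of layer applications in one forward pass."""
    return sum(reuse_map)


# Two unique layers; the earlier one is applied twice, the later one once.
layers = [make_layer(0.5, 0.0), make_layer(0.0, 1.0)]
output = ilr_forward(layers, [2, 1], [1.0])
print(output, effective_depth([2, 1]))
```

In this sketch the number of unique layers, and therefore the parameter count, stays at two, while the number of layer applications grows to three. Giving the larger count to the first entry of reuse_map mirrors the abstract's observation that earlier layers benefit most from extra iterations.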

Files

nguyen-2025-intralayer.pdf (645.2 kB)
md5:1b874377ef1be157933fabd5c15a5499

Additional details

DOI: 10.21428/594757db.834c24c6

Resource type

Conference paper

Publisher

Zenodo

Conference

Proceedings of the 38th Canadian Conference on Artificial Intelligence

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: April 11, 2026
Modified: April 11, 2026
