Empirical evaluation of low-rank adaptation for efficient fine-tuning of large language models
Description
This study evaluates Low-Rank Adaptation (LoRA) as a parameter-efficient method for fine-tuning large language models (LLMs), with a focus on the Qwen2.5-3B-Instruct model and the S1.1 dataset. Traditional full fine-tuning of LLMs demands substantial memory and computation, whereas LoRA introduces trainable low-rank matrices into selected layers, significantly reducing the number of trainable parameters while keeping the base model weights fixed.
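As a minimal sketch of this mechanism, the snippet below attaches LoRA adapters to Qwen2.5-3B-Instruct using the Hugging Face PEFT library; the rank, scaling factor, dropout, and target modules shown are illustrative assumptions rather than the exact configurations evaluated in the report.

```python
# Minimal sketch: adding LoRA adapters to a frozen base model with Hugging Face PEFT.
# All hyperparameter values below are assumed for illustration only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct")

lora_config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices (assumed)
    lora_alpha=32,                         # scaling factor applied to the update (assumed)
    target_modules=["q_proj", "v_proj"],   # projection layers that receive adapters (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Base weights stay frozen; only the injected low-rank matrices are trainable.
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameter counts
```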
A series of experiments was conducted on the Baskerville HPC cluster to analyse LoRA's effects on memory usage, training time, and model performance under various configurations and GPU settings. Results indicate that LoRA reduces trainable parameters by over 99% and lowers memory usage by up to 34% on a single GPU, with further reductions observed in multi-GPU environments. Although residual memory usage increases due to LoRA's adapter layers, the overall memory footprint remains considerably lower than that of full fine-tuning.
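The scale of the parameter reduction can be seen with a back-of-the-envelope count for a single linear layer: LoRA replaces a full d_out × d_in weight update with two low-rank factors holding r·(d_in + d_out) parameters. The dimensions and rank below are assumed for illustration and are not taken from the report.

```python
# Back-of-the-envelope count showing why LoRA trains well under 1% of a layer's weights.
# Dimensions and rank are illustrative assumptions, not the Qwen2.5-3B-Instruct configuration.
d_out, d_in, r = 2048, 2048, 16

full_params = d_out * d_in         # parameters updated by full fine-tuning of this layer
lora_params = r * (d_in + d_out)   # parameters in the low-rank factors A (r x d_in) and B (d_out x r)

print(f"full fine-tuning: {full_params:,} trainable parameters")
print(f"LoRA (r={r}):     {lora_params:,} trainable parameters "
      f"({100 * lora_params / full_params:.2f}% of the full layer)")
```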
Performance comparisons reveal that carefully chosen LoRA configurations can approach baseline accuracy, especially when higher rank values and optimised learning rates are used. However, slower convergence and early loss plateaus were observed in some settings. These findings highlight LoRA as an effective solution for enabling scalable fine-tuning of LLMs in resource-constrained environments.
Files
| Name | Size |
|---|---|
| lazauskas-2025.pdf (md5:5dc09a153f5025601046248de14d8c22) | 304.2 kB |
Additional details
Funding
- UK Research and Innovation
- Baskerville: a national accelerated compute resource EP/T022221/1
Catalog number
- Turing Technical Report No. 9
References
- [1] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025.
- [2] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models, 2021. URL https://arxiv.org/abs/2106.09685.
- [3] Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei, Hannaneh Hajishirzi, Luke Zettlemoyer, Percy Liang, Emmanuel Candès, and Tatsunori Hashimoto. s1: Simple test-time scaling, 2025. URL https://arxiv.org/abs/2501.19393.