Replication package for "A Comparative Study on Large Language Models for Log Parsing"
Description
This repository contains the replication package for the paper:
"A Comparative Study on Large Language Models for Log Parsing"
Merve Astekin, Max Hort, and Leon Moonen
Accepted for the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM '24)
The paper is deposited on arXiv and will be available at the publisher's site; a preprint is also included in this repository.
The replication package is archived on Zenodo with DOI: 10.5281/zenodo.13625383. The source code is distributed under the MIT license, and the data is distributed under the CC BY 4.0 license.
Organization
The repository is organized as follows:
- Archived source code in the src folder, with a dedicated README.
- Archived scripts in the ollama_scripts folder, with a dedicated README.
- Analysis of the results in the analysis folder, with a dedicated README.
- The file requirements.txt in the root contains the frozen requirements for the Python environment used at the time of running the experiments.
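One way to recreate the Python environment from the frozen requirements is sketched below. It is only an illustration, not a procedure documented by this package: the environment directory `.venv` and the helper names `env_python` and `create_env` are assumptions, and the Python version used for the experiments is not stated here, so some pinned packages may need a matching interpreter.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Return the path of the interpreter inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir: Path, requirements: Path) -> None:
    """Create a virtual environment and install the pinned requirements."""
    # with_pip=True bootstraps pip inside the new environment.
    venv.EnvBuilder(with_pip=True).create(env_dir)
    subprocess.run(
        [str(env_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)],
        check=True,  # raise if pip reports a failure
    )


if __name__ == "__main__":
    # Hypothetical locations: run from the root of the unpacked repository.
    create_env(Path(".venv"), Path("requirements.txt"))
```

Running the script from the repository root would install the packages into `.venv` without touching the system-wide Python installation.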
Citation
If you build on this data or code, please cite this work by referring to the paper:
@inproceedings{astekin2024:comparative,
title = {A Comparative Study on Large Language Models for Log Parsing},
author = {Merve Astekin and Max Hort and Leon Moonen},
booktitle = {Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM’24)},
year = {2024},
publisher = {ACM},
doi = {10.1145/3674805.3686684}
}
Files
| Name | Size |
|---|---|
| Large Language Models for Log Parsing.zip (md5:b554b0aa1af90333a5d2dc6e302239fb) | 3.2 MB |
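The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of` and `verify` and the local file location are assumptions for illustration, not part of the replication package.

```python
import hashlib
from pathlib import Path

# Checksum as listed for the archive in the Files section.
EXPECTED_MD5 = "b554b0aa1af90333a5d2dc6e302239fb"


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive was saved to the current directory.
    archive = Path("Large Language Models for Log Parsing.zip")
    if archive.exists():
        print("checksum OK" if verify(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

MD5 is suitable here only as an integrity check against accidental corruption; it does not authenticate the origin of the file.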
Additional details
Funding
- The Research Council of Norway
- cureIT - Adaptive Immunity for Software: Making Systems and Services Autonomously Self-Healing 300461
- The Research Council of Norway
- secureIT - Reducing Digital Vulnerabilities by Providing Software Engineers with Intelligent Automated Software Security Assessment Technology 288787
- European Commission
- condenSE - Sustainable Training of Code Language Models through Data Refinement 101151798
- The Research Council of Norway
- eX3 - Experimental Infrastructure for Exploration of Exascale Computing 270053
Dates
- Created: 2024-09-01