
There is a newer version of the record available.

Published January 14, 2023 | Version 1.0
Dataset Open

LogPMDataset

  • 1. University of Oulu, M3S

Description

 

LogPM is a log parser benchmark that emphasizes precise detection of parameters within each message rather than template-based message clustering. This dataset combines the smaller datasets used by the benchmark. The logs were collected from LogHub, parsed with handcrafted regexes, and stored as CSV files. Each CSV file has three columns and no header row: the first column is the message, the second is the parameter mask, and the third is the index of the matching regex. The LogPM benchmark downloads the dataset parts it needs automatically, so no manual download is required for benchmarking.
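For readers who want to inspect a file outside the benchmark, the three-column layout described above could be read roughly as follows. This is a minimal sketch, not the benchmark's own loader: the names `LogPMRow` and `read_logpm_csv` are hypothetical, UTF-8 encoding is assumed, the regex index is assumed to be an integer, and the parameter mask is kept as an opaque string because its exact encoding is not described here.

```python
import csv
from typing import Iterator, NamedTuple


class LogPMRow(NamedTuple):
    """One record of a LogPM CSV file (field names are illustrative)."""
    message: str      # column 1: the raw log message
    mask: str         # column 2: the parameter mask, left uninterpreted
    regex_index: int  # column 3: index of the matching regex (assumed integer)


def read_logpm_csv(path: str) -> Iterator[LogPMRow]:
    """Yield rows from a headerless, three-column CSV file."""
    # newline="" lets the csv module handle quoted fields with embedded newlines.
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.reader(handle):
            if len(record) != 3:
                raise ValueError(f"expected 3 columns, got {len(record)}")
            message, mask, regex_index = record
            yield LogPMRow(message, mask, int(regex_index))
```

Because the function is a generator, a large file can be streamed row by row, for example with `for row in read_logpm_csv("android.csv")`, instead of being loaded into memory at once.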

The benchmark includes the following datasets:

  • Android
  • Apache
  • Hadoop
  • HDFS
  • HPC
  • Linux
  • OpenStack
  • Proxifier
  • SSH
  • ZooKeeper

Files

android.csv

Files (2.1 GB)

MD5 checksum                          Size
md5:b3af2be0510a98122a42ba6abdb4a1aa  7.6 MB
md5:1854600feb4a85afa41285a0dd0822a4  1.7 MB
md5:6c15be7dae072eb470f6ca5c44ded0eb  11.3 MB
md5:ce074f5544ef7febb7bc9a6d78f0d5fd  2.0 GB
md5:e32fdfe07bd4f704b80151001a9456bf  1.6 MB
md5:26d8cf52eacf042e588f0afccc4c176f  1.2 MB
md5:8e9e58cf041eb991c55c4957baf9d395  31.1 MB
md5:82eecc81544a962fd61da02cdcc230e4  1.8 MB
md5:a996af00935df3c0950a099a53dc8b8b  25.0 MB
md5:cad66efebcbc72a961e1e7578664f7e0  3.1 MB
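
A manually downloaded file can be checked against the MD5 values listed above. The following is a minimal sketch using Python's standard `hashlib`; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the listing here does not say which checksum belongs to which file, so that pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or bare hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks matters mainly for the 2.0 GB file; the smaller files would also fit in memory in a single read.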

Additional details

Funding

Detecting Technical Debt with Natural Language Processing (grant no. 328058)
Research Council of Finland