
Published January 14, 2023 | Version 1.0
Dataset Open

LogPMDataset

  • 1. University of Oulu, M3S

Description

 

LogPM is a log parser benchmark that emphasizes precise detection of parameters within each message rather than template-based message clustering. This dataset combines the smaller datasets used in the benchmark. The logs are collected from LogHub, parsed with handcrafted regexes, and stored in CSV files. Each CSV file has three columns and no header row: the first column is the message, the second is the parameter mask, and the third is the index of the matching regex. The LogPM benchmark downloads the dataset parts it needs automatically, so no direct download is required for benchmarking.
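As a rough illustration of the three-column, header-less layout described above, the following sketch reads rows into named records. It assumes comma-delimited files with standard CSV quoting and an integer regex index; neither detail is stated in the description. The names LogRecord and read_records, the sample rows, and the mask placeholders (MASK_A, MASK_B) are invented for this example, and the mask is kept as an opaque string because its encoding is not specified here.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class LogRecord(NamedTuple):
    message: str      # column 1: the log message
    mask: str         # column 2: the parameter mask, kept as an opaque string
    regex_index: int  # column 3: index of the matching regex (assumed integer)


def read_records(handle: TextIO) -> Iterator[LogRecord]:
    """Yield one LogRecord per row of a header-less, three-column CSV."""
    for line_no, row in enumerate(csv.reader(handle), start=1):
        if len(row) != 3:
            raise ValueError(f"row {line_no}: expected 3 columns, got {len(row)}")
        message, mask, regex_index = row
        yield LogRecord(message, mask, int(regex_index))


# Invented sample rows for illustration only; they are not taken from the dataset.
SAMPLE = (
    "session opened for user root,MASK_A,3\n"
    '"Accepted password, port 22",MASK_B,7\n'
)

records = list(read_records(io.StringIO(SAMPLE)))
for record in records:
    print(record.regex_index, record.message)
```

To read a downloaded file instead of the in-memory sample, pass a handle opened with open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved and the UTF-8 encoding is likewise an assumption.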

The benchmark includes the following datasets:

  • Android
  • Apache
  • Hadoop
  • HDFS
  • HPC
  • Linux
  • OpenStack
  • Proxifier
  • SSH
  • ZooKeeper

Files (2.1 GB)

Checksum                              Size
md5:3c1e45b4e7dab016169c83cd19eb4831  7.6 MB
md5:c7b5e931d8bd16975ecc43aefdf44ea6  1.7 MB
md5:3c18ed5afd8e6b2a5a6657b1610ae336  11.3 MB
md5:ce074f5544ef7febb7bc9a6d78f0d5fd  2.0 GB
md5:2f72164304f453750cb0421d26f35fdf  1.6 MB
md5:d9227aac7bc0dee4ceff184b3a82f0a0  1.2 MB
md5:65a417012d77a626298db079ca1ba656  31.1 MB
md5:8c1ab5aba7c4c943dd8976a774a47bc5  1.8 MB
md5:bfb12faf111744240eb1e919fed90b07  25.0 MB
md5:3172114c581a62d5a9f6f2b80760a24c  3.1 MB
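
The MD5 checksums listed above can be used to check that a manually downloaded file arrived intact. The sketch below is one minimal way to do that with Python's standard hashlib module. The helper names md5_hex and matches_listing are invented for this example, and the demonstration hashes an empty in-memory stream rather than a real dataset file.

```python
import hashlib
import io
from typing import BinaryIO


def md5_hex(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    # Chunked reads keep memory use flat even for the 2.0 GB file.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listing(stream: BinaryIO, listed: str) -> bool:
    """Compare a stream's digest with a listed value such as 'md5:3c1e...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_hex(stream) == expected


# Demonstration on an empty stream, whose MD5 digest is a well-known constant.
EMPTY_MD5 = "md5:d41d8cd98f00b204e9800998ecf8427e"
print(matches_listing(io.BytesIO(b""), EMPTY_MD5))
```

For a real file, open it in binary mode with open(path, "rb") and pass the handle together with the corresponding checksum from the listing.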

Additional details

Funding

Research Council of Finland: Detecting Technical Debt with Natural Language Processing (grant 328058)