Published June 11, 2021 | Version 1
Dataset Open

Linux Kernel binary size

  • 1. IRISA

Description

Dataset containing measurements of the Linux kernel binary size after compilation. The reported size, in the column "perf", is the size in bytes of the vmlinux file. The dataset also contains a column "active_options" reporting the number of activated options (set to "y"). All other columns, listed in the file "Linux_options.json", are Linux kernel options. The configurations were sampled using randconfig. The Linux version used is 4.13.3.

Not all available options are present. First, the dataset only contains options for the x86, 64-bit version. Second, all non-tristate options have been ignored. Finally, options that take a single value across the whole dataset, because of insufficient variability in the sampling, are ignored. All options are encoded as 0 for the values "n" and "m", and 1 for "y".
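The encoding and the filtering of constant options described above can be illustrated on a toy table. This is only a sketch: the option names OPT_A, OPT_B and OPT_C are hypothetical, and it assumes the constant-column filter is applied to the encoded values, which the description does not state explicitly.

```python
import pandas as pd

# Toy tristate configurations; option names are hypothetical, not real kernel options.
raw = pd.DataFrame({
    "OPT_A": ["y", "n", "m"],
    "OPT_B": ["n", "n", "m"],  # never "y", so it is constant (0) once encoded
    "OPT_C": ["y", "y", "n"],
})

# Encode "y" as 1 and both "n" and "m" as 0, stored as int8.
encoded = (raw == "y").astype("int8")

# Drop options that take a single value across all rows
# (assumption: the filter is applied after encoding).
kept = encoded.loc[:, encoded.nunique() > 1]

print(kept)
```

Here OPT_B is dropped because it is 0 in every row after encoding, while OPT_A and OPT_C are kept.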

In Python, importing the dataset with pandas assigns the int64 type to all columns, which consumes a large amount of memory (~50 GB). The following import uses less than 1 GB of memory by setting the option columns to int8.

import pandas as pd
import json
import numpy

# Read the list of option column names
with open("Linux_options.json", "r") as f:
    linux_options = json.load(f)

# Load the CSV, setting option columns to int8 to save a lot of memory
df = pd.read_csv("Linux.csv", dtype={f: numpy.int8 for f in linux_options})
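To see why the column type matters, the following sketch compares the memory footprint of the same 0/1 values stored as int64 and as int8. It uses a small synthetic table with hypothetical column names, not the real dataset, so the absolute numbers are only illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset: 1,000 rows x 50 binary option columns.
# Column names are hypothetical.
rng = np.random.default_rng(0)
options = [f"OPT_{i}" for i in range(50)]
df64 = pd.DataFrame(
    rng.integers(0, 2, size=(1000, 50), dtype=np.int64), columns=options
)

# Same values, one byte per cell instead of eight.
df8 = df64.astype(np.int8)

bytes64 = int(df64.memory_usage(index=False).sum())
bytes8 = int(df8.memory_usage(index=False).sum())
print(f"int64: {bytes64} bytes, int8: {bytes8} bytes, ratio: {bytes64 // bytes8}")
```

Since every option is encoded as 0 or 1, int8 loses no information and takes one eighth of the space of int64 for those columns.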

 

Files

Linux.csv

Files (1.8 GB)

Checksum | Size
md5:22b454de0caeec915e11ac0986c99af1 | 1.8 GB
md5:b66ea9c31888982433e17f11c5d83713 | 168.6 kB
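The MD5 checksums listed above can be used to check a download. The sketch below is one possible way to do so with the Python standard library. The helper name file_md5 is hypothetical, and pairing the 1.8 GB checksum with "Linux.csv" is an assumption based on the file sizes, since the listing does not name the files.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumption: the 1.8 GB entry in the file listing corresponds to Linux.csv.
EXPECTED_CSV_MD5 = "22b454de0caeec915e11ac0986c99af1"

# Only run the check if the file has been downloaded to the current directory.
if os.path.exists("Linux.csv"):
    print("checksum matches:", file_md5("Linux.csv") == EXPECTED_CSV_MD5)
```

Reading in chunks keeps memory use small even for the 1.8 GB file.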

Additional details

Funding

Agence Nationale de la Recherche
VaryVary - Varying Variability ANR-17-CE25-0010