Journal article Open Access
Recently, a growing number of researchers have applied machine learning to assist users of interactive theorem provers.
However, the expressiveness of the underlying logics and the esoteric structure of proof documents are obstacles for machine learning practitioners,
who often have little expertise in formal logic, let alone Isabelle/HOL, and have so far kept them from achieving large-scale success in this field.
In this data description, we present a simple dataset covering over 400k proof method applications, each with over 100 extracted features, in a format that can be processed easily without any knowledge of formal logic.
Our simple data format lets machine learning practitioners apply standard machine learning tools to predict proof methods in Isabelle/HOL without domain expertise in logic.
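As a rough illustration of the workflow such a format is meant to enable, the sketch below predicts a proof method from a feature vector with a nearest-neighbour vote. The file layout is not specified in this description, so everything concrete here is an assumption: the CSV layout (method name first, then binary features), the column names `f1` to `f4`, the sample rows, and the helper names `load_rows`, `hamming`, and `predict` are all hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data. The real dataset's layout is not described here;
# we assume one row per proof method application: the method name followed
# by binary feature values (the real data has over 100 features per row).
SAMPLE = """method,f1,f2,f3,f4
auto,1,0,0,1
auto,1,0,1,1
simp,0,1,0,0
simp,0,1,1,0
induct,0,0,1,1
"""


def load_rows(text):
    """Parse CSV text into (method_name, feature_vector) pairs."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header line
    return [(row[0], [int(v) for v in row[1:]]) for row in reader if row]


def hamming(a, b):
    """Count the positions at which two feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(train, features, k=3):
    """Majority vote among the k training rows closest in Hamming distance."""
    nearest = sorted(train, key=lambda r: hamming(r[1], features))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


rows = load_rows(SAMPLE)
prediction = predict(rows, [1, 0, 0, 1], k=3)
print(prediction)
```

Nothing in this sketch requires knowing what a feature means in logical terms, which is the point of the format. In practice one would replace the toy nearest-neighbour vote with an off-the-shelf classifier and the inline sample with the dataset files.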