There is a newer version of the record available.

Published December 1, 2023 | Version v2.0.0
Software Open

SIRUS.jl: Interpretable Machine Learning via Rule Extraction

Description

SIRUS v2.0.0

Diff since v1.3.4

This v2 release contains some fairly big changes. First, confusing terms such as Split and TreePath have been renamed to SubClause and Clause respectively, which is more in line with how these types are used on the client side. Furthermore, a sorting step was removed (🎉) from the algorithm because it wasn't necessary (the best code is no code). Next, the plotting functionality has been simplified by exporting more functions. And finally, the root cause of the performance problem on regression tasks has been narrowed down to the random forest implementation. Multiclass classification performs much better now that the lambda hyperparameter has been tuned. It turned out that the model is very sensitive to the choice of lambda, and this is now documented in various places. In the benchmarks, SIRUS.jl now outperforms the original R algorithm, without fine-tuning, on all but one task.

Merged pull requests:

  • Add JOSS paper (#36) (@rikhuijzer)
  • [Breaking] Rename confusing terms such as Split (#67) (@rikhuijzer)
  • Remove rule sorting step (#68) (@rikhuijzer)
  • Add spell checker to repo (#69) (@rikhuijzer)
  • Bump actions/checkout from 3 to 4 (#70) (@dependabot[bot])
  • Another shot at fixing regression (#71) (@rikhuijzer)
  • fix: typo (#73) (@storopoli)
  • Extend API (#74) (@rikhuijzer)

Closed issues:

  • Add linear models to the benchmarks (#31)
  • Easily access condition and consequence (then/otherwise) of a Rule (#44)
  • Add API to obtain rules for visualizations (#66)

Notes

If you use this software, please cite our article in the Journal of Open Source Software.

Files

rikhuijzer/SIRUS.jl-v2.0.0.zip (694.5 kB)
md5:1a66fdd62b7b0aaff643df981d5da9ab
