Enabling open and reproducible research at computer systems conferences: the good, the bad and the ugly

Grigori Fursin

doi:10.5281/zenodo.3908799

Published March 14, 2017 | Version v2

Presentation Open

Grigori Fursin¹

1. cTuning foundation

14 March 2017, CNRS webinar, Grenoble, France

A decade ago my research nearly stalled. I was investigating how to crowdsource performance analysis and optimization of realistic workloads across diverse hardware provided by volunteers, and how to combine it with machine learning [1]. Often, it was simply impossible to reproduce crowdsourced empirical results and build predictive models because the software and hardware stacks kept changing. Worse still, the lack of realistic workloads and representative data sets in our community severely limited the usefulness of such models.

All these problems motivated me to create a public portal (cTuning.org) to share, validate and reuse workloads, data sets, tools, experimental results, and predictive models while involving the community in this effort [2]. This experience, in turn, helped us to initiate the so-called Artifact Evaluation (AE) at ACM conferences on parallel programming, architecture and code generation (ASPLOS, CGO, PPoPP, PACT, SC and MLSys). AE aims to independently validate experimental results reported in the publications and to encourage code and data sharing.

These slides are from my webinar “Enabling open and reproducible research at computer systems conferences: the good, the bad and the ugly” at CNRS Grenoble (14 March 2017). I shared my practical experience of organizing Artifact Evaluation over the past years, along with the problems we encountered and possible solutions.

On the one hand, we have received incredible support from the research community, ACM, universities, and companies. We have even received a record number of artifact submissions at the CGO/PPoPP'17 AE (27 vs 17 two years ago) sponsored by NVIDIA and the cTuning foundation. We have also introduced Artifact Appendices and co-authored the new ACM Result and Artifact Review and Badging policy now used at Supercomputing.

On the other hand, the use of proprietary benchmarks, rare hardware platforms, and totally ad-hoc scripts to set up, run and process experiments all place a huge burden on evaluators. It is simply too difficult and time-consuming to customize and rebuild experimental setups, reuse artifacts and eventually build upon others’ efforts - the main pillars of open science!

I then present Collective Knowledge (CK), my attempt to introduce a customizable workflow framework with a unified JSON API and a cross-platform package manager that can automate ML and systems R&D and enable live papers while automatically adapting to continuously evolving software and hardware [3]. I also demonstrate a practical CK workflow to collaboratively optimize deep learning across different compilers, libraries, data sets and diverse platforms, from resource-constrained mobile devices to data centers (see our Android app to crowdsource DNN optimization across diverse mobile devices provided by volunteers, and the public repository with results) [4].
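To make the idea of a "unified JSON API" more concrete, here is a minimal Python sketch of a single entry point that takes a JSON request and always answers with JSON. It is not CK's actual implementation or API: the `access` function, the `action` decorator, the `run_benchmark` action and the `return`/`error` reply keys are all hypothetical names chosen for illustration.

```python
import json

# Hypothetical action registry; none of these names are taken from CK itself.
ACTIONS = {}


def action(name):
    """Register a handler under a given action name."""
    def register(func):
        ACTIONS[name] = func
        return func
    return register


@action("run_benchmark")
def run_benchmark(request):
    # Placeholder: a real workflow would build and run the workload here.
    return {
        "workload": request["workload"],
        "repetitions": request.get("repetitions", 1),
    }


def access(request_json):
    """Dispatch one JSON request and always reply with a JSON string."""
    try:
        request = json.loads(request_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"return": 1, "error": f"invalid JSON: {exc}"})
    if not isinstance(request, dict):
        return json.dumps({"return": 1, "error": "request must be an object"})

    name = request.get("action")
    handler = ACTIONS.get(name) if isinstance(name, str) else None
    if handler is None:
        return json.dumps({"return": 1, "error": "unknown action"})

    try:
        result = handler(request)
    except KeyError as exc:
        return json.dumps({"return": 1, "error": f"missing key: {exc}"})
    # A zero "return" code marks success in this sketch.
    return json.dumps({"return": 0, **result})


if __name__ == "__main__":
    print(access(json.dumps({"action": "run_benchmark", "workload": "demo"})))
```

Because every request and reply in this sketch is plain JSON, the same call could be issued from a script, a command line wrapper or a web service without changing the handlers, which is the property the "unified API" wording points at.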

Finally, I describe our novel publication model to reproduce results from published papers with the help of the community [5].

Please feel free to contact me at Grigori.Fursin@cTuning.org if you have any questions or comments! I am looking forward to your feedback!

References

[1] “Milepost GCC: Machine learning enabled self-tuning compiler”, International Journal of Parallel Programming, Volume 39, Issue 3, pp. 296-327, 2009
[2] “Collective Tuning Initiative: automating and accelerating development and optimization of computing systems”, GCC Developers' Summit, Montreal, Canada, 2009
[3] “Collective Knowledge: towards R&D sustainability”, Proceedings of the Conference on Design, Automation, and Test in Europe (DATE), 2016
[4] “Optimizing Convolutional Neural Networks on Embedded Platforms with OpenCL”, IWOCL'16, Vienna, Austria, 2016
[5] “Community-driven reviewing and validation of publications”, Proceedings of the 1st ACM SIGPLAN Workshop on Reproducible Research Methodologies and New Publication Models in Computer Engineering @ PLDI’14, Edinburgh, UK

Files

Files (6.0 MB)

Name	Size
presentation.pdf md5:8155b59bee74dfd461f414f5b24e3509	6.0 MB

Additional details

Is supplement to: 10.3850/9783981537079_1018 (DOI); arXiv:1801.06378 (arXiv)
Is supplemented by: arXiv:1406.4020 (arXiv); 10.1007/s10766-010-0161-2 (DOI); arXiv:1407.3487 (arXiv)

Metric	All versions	This version
Views	676	219
Downloads	307	74
Data volume	1.5 GB	443.3 MB
