Published May 29, 2020 | Version v1
Presentation Open

The Collective Knowledge project: closing the gap between ML&systems research and practice with portable workflows, reusable automation actions, and reproducible crowd-benchmarking

  • 1. cTuning foundation and cKnowledge SAS


Developing novel applications based on deep tech (ML, AI, HPC, quantum, IoT) and deploying them in production is a painful, ad hoc, time-consuming, and expensive process because software, hardware, models, data sets, and research techniques evolve continuously.

After struggling with these problems for many years, I started the Collective Knowledge project (CK) to decompose complex systems and research projects into reusable, portable, customizable and non-virtualized CK components with unified automation actions, Python APIs, CLI and JSON meta descriptions.
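To make the idea of components with unified actions and JSON meta descriptions more concrete, here is a minimal sketch. It is not the CK API: the `REPOSITORY` layout, the `action` decorator, the `access` function, and the `"return"` status field are all hypothetical names chosen for illustration. It only shows the general pattern of one dictionary-in, dictionary-out entry point that dispatches to actions and reads each component's JSON meta.

```python
import json

# Hypothetical in-memory "repository": component type -> entry name -> JSON meta.
# None of these names come from the real CK code base.
REPOSITORY = {
    "dataset": {
        "demo-images": json.dumps({"name": "demo-images", "tags": ["images", "demo"]}),
    },
}

# Registry that maps (component type, action name) to a Python function.
ACTIONS = {}


def action(module, name):
    """Decorator that registers a function as an action of a component type."""
    def register(func):
        ACTIONS[(module, name)] = func
        return func
    return register


@action("dataset", "info")
def dataset_info(meta, request):
    """Example action: report the name and tags stored in the JSON meta."""
    return {"return": 0, "name": meta["name"], "tags": meta["tags"]}


def access(request):
    """Single entry point: dictionary in, dictionary out, for every action."""
    key = (request.get("module"), request.get("action"))
    if key not in ACTIONS:
        return {"return": 1, "error": "unknown action: {}".format(key)}
    entries = REPOSITORY.get(request["module"], {})
    if request.get("data") not in entries:
        return {"return": 1, "error": "unknown entry"}
    meta = json.loads(entries[request["data"]])
    return ACTIONS[key](meta, request)


result = access({"module": "dataset", "action": "info", "data": "demo-images"})
print(json.dumps(result))
```

Because every action takes and returns a plain dictionary, the same call could in principle be exposed through a Python API, a CLI, or a web service without changing the action itself.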

My idea is to gradually abstract all existing artifacts (software, hardware, models, data sets, results) and use the DevOps methodology to connect such components together into functional CK solutions. Such solutions can automatically adapt to evolving models, data sets and bare-metal platforms with the help of customizable program workflows, a list of all dependencies (models, data sets, frameworks), and a portable meta package manager.

CK is essentially our intermediate language for connecting researchers and practitioners so that they can collaboratively design, benchmark, optimize, and validate innovative computational systems. It then makes it possible to find the most efficient system configurations on a Pareto frontier (trading off speed, accuracy, energy, size, and various costs) using an open repository of knowledge with live SOTA scoreboards and reproducible papers.
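The Pareto frontier mentioned above can be illustrated with a short sketch. The configurations, metric names, and numbers below are invented for illustration and are not CK results. All metrics are expressed so that lower is better (error stands in for accuracy). A configuration is on the frontier if no other configuration is at least as good on every metric and strictly better on at least one.

```python
from typing import Any, Dict, List

# Illustrative measurements only; lower is better for every metric.
configs: List[Dict[str, Any]] = [
    {"name": "A", "latency_ms": 10.0, "error": 0.30, "energy_j": 2.0},
    {"name": "B", "latency_ms": 20.0, "error": 0.20, "energy_j": 3.0},
    {"name": "C", "latency_ms": 25.0, "error": 0.25, "energy_j": 3.5},
    {"name": "D", "latency_ms": 50.0, "error": 0.10, "energy_j": 6.0},
]

METRICS = ["latency_ms", "error", "energy_j"]


def dominates(a: Dict[str, Any], b: Dict[str, Any], metrics: List[str]) -> bool:
    """True if a is no worse than b on all metrics and strictly better on one."""
    no_worse = all(a[m] <= b[m] for m in metrics)
    strictly_better = any(a[m] < b[m] for m in metrics)
    return no_worse and strictly_better


def pareto_frontier(items: List[Dict[str, Any]], metrics: List[str]) -> List[Dict[str, Any]]:
    """Keep every item that is not dominated by any other item."""
    return [
        c for c in items
        if not any(dominates(other, c, metrics) for other in items)
    ]


frontier = pareto_frontier(configs, METRICS)
frontier_names = [c["name"] for c in frontier]
print(frontier_names)
```

In this toy data set, configuration C is dropped because B is faster, more accurate, and uses less energy; A, B, and D each remain because each is best on at least one trade-off.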

Even though the CK technology has been used in production for more than 5 years, it is still a proof-of-concept prototype that requires further improvement and standardization. We plan to develop a user-friendly web front-end to make it easier for researchers and practitioners to create and share CK workflows, artifacts, SOTA scoreboards, and live papers, and to participate in collaborative, trustworthy, and reproducible R&D.


