Presentation Open Access

Static datasets aren't enough: where deployed systems differ from research

Bernease Herman

Slides from csvconf,v6 presentation, "Static datasets aren't enough: where deployed systems differ from research" by Bernease Herman (WhyLabs).

The focus on static datasets in machine learning and AI training fails to translate to how these systems are deployed in industry. As a result, data scientists and engineers aren't considering how these systems perform in changing, real-world environments, or the feedback mechanisms and societal implications that these systems can cause. In the session, we will highlight existing tools that work with dynamic (and perhaps streaming) data. We will suggest some preliminary studies of activities and lessons that may bridge the gap in data science training for realistic data.

The goals of the talk are to:
- Point to resources for AI practitioners to engage with dynamic datasets
- Engage in discussion about the impact of feedback loops and other consequences on the real world
- Brainstorm new approaches to teaching skills on dynamic datasets

Files (9.7 MB)
- csvconf 2021 Static Datasets Aren’t Enough.pdf (3.2 MB)
- csvconf 2021 Static Datasets Aren’t Enough.pptx (6.5 MB)

