Published May 5, 2021 | Version v1
Presentation Open

Static datasets aren't enough: where deployed systems differ from research

  • 1. WhyLabs

Description

Slides from the csv,conf,v6 presentation "Static datasets aren't enough: where deployed systems differ from research" by Bernease Herman (WhyLabs).

The focus on static datasets in machine learning and AI training does not translate to how these systems are deployed in industry. As a result, data scientists and engineers aren't considering how these systems perform in changing, real-world environments, or the feedback mechanisms and societal effects that these systems can cause. In the session, we will highlight existing tools that work with dynamic (and perhaps streaming) data. We will suggest some preliminary studies of activities and lessons that may bridge the gap in data science training for realistic data.

The goal of the talk is to:
- Point to resources for AI practitioners to engage with dynamic datasets
- Engage in discussion about the real-world impact of feedback loops and other consequences
- Brainstorm new approaches to teaching skills on dynamic datasets

Files

csvconf 2021 Static Datasets Aren’t Enough.pdf

Files (9.7 MB)

Name Size
md5:65ba096fda63f8a881abb6fb5a64e24c 3.2 MB
md5:992c2d49679bc94a97d5b494b92ebd24 6.5 MB