Presentation Open Access
Grant R. Vousden-Dishington
Data science work often requires computing resources that aren’t available to practitioners from disadvantaged backgrounds or those located in “data deserts” with low technology access. Cloud computing is an option in such situations, but it can be costly and unfriendly to beginners who don’t know what infrastructure to pick. Most of the data science community thinks of GitHub as only a code storage repository with some management features, but GitHub Actions provides a powerful and often free computing resource that can meet many data collection and analysis needs. This introduction to GitHub Actions will provide an overview of everything GitHub users need to get started applying GitHub Actions to their projects. We’ll see both simple and advanced examples of the YAML format that controls workflows and how it ties into traditional data science needs, like testing and automation. Time permitting, we’ll also see how these workflows can be used for special applications, such as open source intelligence and machine learning.