Exploring Data at Scale with Arkouda: A Practical Introduction to Scalable Data Science
Description
Data scientists can be thought of as modern-day explorers, venturing into the vast unknown of information. This journey is not without its hurdles, and one of the biggest is the sheer size of the data. Many modern datasets contain terabytes or even petabytes of information and cannot fit in a laptop's memory, so extracting meaningful insights from them requires specialized tools and techniques. As datasets grow ever larger, data science demands both interactivity, so that scientists can learn while working with the data, and scalability, so that they can work with datasets in their entirety. Data scientists have naturally been drawn to Python because it provides interactivity through its read-evaluate-print loop (REPL) and performance through libraries written in other languages, such as C and Fortran. These libraries are typically not designed for high-performance computing (HPC), however, and run into problems when attempting to scale. Arkouda fills this gap in the data science landscape: it is both interactive, providing a familiar Python API, and scalable, relying on a Chapel server in the backend. Arkouda is a framework for scalable Python packages for interactive data science, with applications ranging from oceanography to netflow analysis.
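To make the "familiar Python API" point concrete, here is a minimal sketch rather than code from the talk. It assumes the `arkouda` client package is installed and that an Arkouda server is already running and reachable; the host, the port, and the helper names `even_square_sum` and `run_on_arkouda` are hypothetical choices for illustration. The same array expression is written once and can be handed either NumPy (in laptop memory) or Arkouda (evaluated by the Chapel server).

```python
def even_square_sum(xp, n):
    """Sum of the squares of the even integers in [0, n).

    `xp` is an array module exposing a NumPy-style `arange`,
    for example `numpy` or `arkouda` (hypothetical helper).
    """
    values = xp.arange(n)              # 0, 1, ..., n-1
    evens = values[values % 2 == 0]    # boolean-mask indexing
    return int((evens * evens).sum())  # reduce to a plain Python int


def run_on_arkouda(n, host="localhost", port=5555):
    """Evaluate the same expression on an Arkouda server.

    Assumes a server is listening at host:port (illustrative values).
    """
    import arkouda as ak  # imported lazily so the sketch loads without Arkouda

    ak.connect(server=host, port=port)
    try:
        # The arrays live on the server; only the final scalar comes back.
        return even_square_sum(ak, n)
    finally:
        ak.disconnect()
```

With NumPy, `even_square_sum(numpy, 10)` runs locally. With a server available, `run_on_arkouda(10)` sends the same operations to the backend, which is what lets the interactive session scale beyond a single machine's memory.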
Files
Name | Size |
---|---|
Arkouda final demo cut.mp4 (md5:20e2ba32dfe79de5cf4cdacb5c7e2850) | 170.6 MB |
Additional details
Dates
- Submitted: 2024-08-15
Software
- Repository URL: https://github.com/bmcdonald3/chapelcon-2024-arkouda
- Programming language: Python, Chapel
- Development Status: Active