Kaggle Data Community Survey - Data Summary/Analysis report
Creators
Description
Having greater access to data leads to many benefits, from advancing science and promoting transparency and accountability in government to boosting innovation. However, merely releasing the data does not make it easy to use; even when the data is openly available online, people may struggle to work with it. We aim to understand what makes some data reused more than other data, through the lens of one of the largest data-sharing platforms worldwide, Kaggle. This report presents summarised findings from an online survey taken by 434 active members of the Kaggle community in February 2021. We identify several factors that demonstrably support data use: some relate to the data itself, others to how people engage with it. Key findings highlight the importance of textual descriptions of the data, which relate to 'understandability', perceived as a key dimension of data quality. Our insights can inform the design of data platforms, in areas such as community building and user retention, and also support data publishers in prioritising data maintenance work.
Files
Name | Size |
---|---|
Kaggle Data Community Survey_ Data Summary_Analysis report.pdf (md5:f6060e6c709a0d45ee15faba8033c875) | 668.6 kB |
Additional details
Related works
- Requires
- Other: 10.5281/zenodo.10576981 (DOI)