Published November 19, 2023 | Version v0.0
Dataset
Open
tags-stack-overflow
Description
Overview
This dataset is derived from tags on Stack Overflow posts. Each hyperedge is the set of all tags used in one post, and each node in a hyperedge is a tag. Post timestamps have millisecond resolution, are in ISO 8601 format, and are shifted so that the earliest one starts at 0.
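As a rough illustration of the construction described above, the sketch below turns a few made-up posts into timestamped hyperedges. The post data, the `to_hyperedges` helper, and the choice to express shifted times as millisecond offsets are all assumptions for the example; the actual layout of the JSON file is not described here and may differ.

```python
from datetime import datetime

# Hypothetical toy posts: (ISO 8601 timestamp, tags used on the post).
posts = [
    ("2008-08-01T12:00:00.250", ["python", "list"]),
    ("2008-08-01T12:00:00.000", ["c#", ".net", "datetime"]),
    ("2008-08-01T12:00:01.500", ["list", "python"]),
]

def to_hyperedges(posts):
    """Return (offset_ms, hyperedge) pairs sorted by time, earliest at 0."""
    parsed = [(datetime.fromisoformat(ts), frozenset(tags)) for ts, tags in posts]
    start = min(t for t, _ in parsed)
    # Shift every timestamp so the earliest post sits at offset 0 (milliseconds).
    shifted = [(round((t - start).total_seconds() * 1000), edge) for t, edge in parsed]
    return sorted(shifted, key=lambda pair: pair[0])

hyperedges = to_hyperedges(posts)
```

Storing each hyperedge as a `frozenset` makes tag order irrelevant, so the first and third toy posts yield the same hyperedge at two different times.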
Statistics
Some basic statistics of this dataset are:
- number of nodes: 49,998
- number of timestamped hyperedges: 14,458,875
- number of unique hyperedges: 5,675,497
- component sizes:
Component size | Number of components |
---|---|
49,931 | 1 |
2 | 7 |
1 | 53 |
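The statistics above could be recomputed from a list of hyperedges along the following lines. This is a minimal sketch on toy data, not the procedure used to produce the published figures: `hypergraph_stats` is a hypothetical helper, and it assumes two tags are in the same component whenever they share a hyperedge.

```python
from collections import Counter

def hypergraph_stats(hyperedges):
    """Summarize a list of hyperedges, each an iterable of node labels."""
    edges = [frozenset(e) for e in hyperedges]
    parent = {}  # union-find forest over nodes

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for edge in edges:
        nodes = list(edge)
        for node in nodes:
            find(node)  # register every node, including singletons
        for other in nodes[1:]:
            parent[find(other)] = find(nodes[0])  # merge into one component

    sizes = Counter(find(node) for node in list(parent))
    return {
        "nodes": len(parent),
        "timestamped_hyperedges": len(edges),
        "unique_hyperedges": len(set(edges)),
        "component_sizes": dict(Counter(sizes.values())),
    }

# Toy input: one repeated hyperedge and three components.
toy = [["a", "b"], ["b", "c"], ["a", "b"], ["d", "e"], ["f"]]
stats = hypergraph_stats(toy)

# Consistency check on the reported figures: component sizes sum to the node count.
reported = {49931: 1, 2: 7, 1: 53}
reported_total = sum(size * count for size, count in reported.items())
```

Here `reported_total` comes to 49,998, which matches the number of nodes listed above.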
Source of original data
References
If you use this data, please cite the following paper:
- Simplicial closure and higher-order link prediction. Austin R. Benson, Rediet Abebe, Michael T. Schaub, Ali Jadbabaie, and Jon Kleinberg. Proceedings of the National Academy of Sciences (PNAS), 2018.
Files
Name | Size |
---|---|
tags-stack-overflow.json (md5:bccd5752e09a7383f7c53a0fccb67d44) | 2.0 GB |