Social Graph Inference Reddit
Description
We construct a large-scale, empirically grounded dataset from Reddit to support the development and evaluation of agent-based social simulations. The dataset includes 33 technology-focused, 14 climate-focused, and 7 COVID-related agents, with each domain encompassing over one million posts and comments. Using publicly available posts and comments, we define agent categories based on content and interaction patterns, derive inter-agent relationships from temporal commenting behaviors, and build a directed, weighted network that reflects empirically observed user connections. The resulting dataset enables researchers to calibrate and benchmark agent behavior, network structure, and information diffusion processes against real social dynamics. Quantitative and qualitative analyses reveal distinctive patterns in user connectivity, engagement life cycles, and triadic closure growth, illustrating the potential of Reddit-derived interaction networks for realistic social simulation.
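As a rough illustration of how a directed, weighted network can be derived from commenting behavior, the sketch below counts replies between users. It is not the authors' method: the record does not state the exact rule used, so the reply-based edge definition, the optional time window `max_delay_seconds`, and the field names `id`, `parent_id`, `author`, and `created_utc` are all assumptions made for this example.

```python
from collections import Counter


def build_reply_graph(comments, max_delay_seconds=None):
    """Return {(source, target): weight} for a directed, weighted reply graph.

    Assumed rule (hypothetical, not taken from the dataset record):
    an edge source -> target is added each time `source` replies to a
    comment written by `target`; the weight is the number of such replies.
    """
    by_id = {c["id"]: c for c in comments}
    weights = Counter()
    for comment in comments:
        parent = by_id.get(comment.get("parent_id"))
        if parent is None:
            continue  # top-level comment or parent not in the sample
        if comment["author"] == parent["author"]:
            continue  # ignore self-replies
        delay = comment["created_utc"] - parent["created_utc"]
        if delay < 0:
            continue  # inconsistent timestamps
        if max_delay_seconds is not None and delay > max_delay_seconds:
            continue  # reply arrived outside the assumed time window
        weights[(comment["author"], parent["author"])] += 1
    return dict(weights)


# Toy records with hypothetical field names.
comments = [
    {"id": "c1", "parent_id": None, "author": "alice", "created_utc": 0},
    {"id": "c2", "parent_id": "c1", "author": "bob", "created_utc": 60},
    {"id": "c3", "parent_id": "c2", "author": "alice", "created_utc": 120},
    {"id": "c4", "parent_id": "c1", "author": "bob", "created_utc": 7200},
]

edges = build_reply_graph(comments, max_delay_seconds=3600)
```

With the one-hour window, the late reply `c4` is dropped, so each direction between the two toy users carries weight 1; without the window, the edge from `bob` to `alice` would carry weight 2.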
Files
climate_14_agents.json
Files (1.4 GB)
| MD5 | Size |
|---|---|
| 6b116a3d62f121ead559679fe138532f | 443.1 MB |
| daa3b5271e27ece9206cda543c4e6c87 | 105.8 kB |
| f602692d48f933d0f71078757bdc887f | 490.7 MB |
| 3da83f5a45e493725e4c1a5125e39258 | 12.3 kB |
| cb6cb769d2479bdc3207896959f27410 | 67.0 kB |
| 5304041f3302208c11009a7b9581cb2f | 502.0 MB |
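Because the largest files are several hundred megabytes, it may be worth checking a download against the MD5 checksums listed above. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_download` are hypothetical, and the record as extracted here does not show which checksum belongs to which file name, so the expected value has to be taken from the original listing for the file in question.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file at `path` matches `expected_md5`.

    Accepts the checksum with or without the leading "md5:" prefix.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat regardless of file size. A call such as `verify_download("some_file.json", "md5:<checksum from the listing>")` returns `False` for a truncated or corrupted download; the path here is a placeholder.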
Additional details
Software
- Repository URL: https://github.com/abdulsittar/Social-Graph-Inference-Reddit
- Programming language: Python
- Development Status: Active