Published May 30, 2018 | Version v1
Other Open

Selection of detailed results of the analysis done in "Stochastic Block Model Reveals Maps of Citation Patterns and Their Evolution in Time"

  • 1. Aalto University

Description

Introduction

This record contains a selection of the detailed results of the analysis in the article by Darko Hric, Kimmo Kaski, and Mikko Kivelä, "Stochastic Block Model Reveals Maps of Citation Patterns and Their Evolution in Time". The results consist of aggregated networks of citations between scientific journals and the blocks inferred from them using a stochastic block model. See the article for details on the methods and data sources.


Network construction
The article contains a detailed description of the network and block construction.

The full citation data is split into 10-year windows (1900s to 1970s) and 5-year windows (after the 1970s), and each network is constructed using only the data inside its window. In each time window, a node corresponds to an active journal, that is, a journal that has publications in that period. The connections between journals are built from outgoing citations: there is a directed link from journal a to journal b if an article in journal a cites an article in journal b, and the weight of the link is the number of such citations. For each time window, only the contemporary citations satisfying the following two criteria are used:
  (1) the cited article is published in a journal that is active in the time window, and
  (2) the time difference between the citing and the cited article is shorter than the length of the window. 
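
The construction above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name build_window_edges, the tuple layout of the input records, and the toy journals are all hypothetical, and publication dates are simplified to whole years. The sketch applies only the criteria stated above (citing article inside the window, cited journal active in the window, time difference shorter than the window length).

```python
from collections import Counter


def build_window_edges(publications, citations, window_start, window_length):
    """Return {(citing_journal, cited_journal): weight} for one time window.

    publications: iterable of (journal, year)             -- hypothetical layout
    citations: iterable of (citing_journal, citing_year,
                            cited_journal, cited_year)    -- hypothetical layout
    """
    window_end = window_start + window_length  # exclusive upper bound

    # A journal is active if it has at least one publication in the window.
    active = {journal for journal, year in publications
              if window_start <= year < window_end}

    weights = Counter()
    for citing_journal, citing_year, cited_journal, cited_year in citations:
        # Only citations made from articles published inside the window.
        if not (window_start <= citing_year < window_end):
            continue
        # Criterion (1): the cited journal (and the citing one) must be active.
        if citing_journal not in active or cited_journal not in active:
            continue
        # Criterion (2): time difference shorter than the window length.
        if abs(citing_year - cited_year) >= window_length:
            continue
        weights[(citing_journal, cited_journal)] += 1
    return weights


# Toy data for the window 1980-1984 (5 years).
publications = [("A", 1978), ("A", 1981), ("A", 1983),
                ("B", 1980), ("B", 1982), ("B", 1984),
                ("C", 1975), ("C", 1979)]
citations = [
    ("A", 1981, "B", 1980),  # kept
    ("A", 1983, "B", 1982),  # kept
    ("B", 1982, "A", 1981),  # kept
    ("A", 1983, "C", 1979),  # dropped: journal C is not active in the window
    ("B", 1984, "A", 1978),  # dropped: 6-year gap is not shorter than 5
    ("C", 1979, "A", 1978),  # dropped: citing article is outside the window
]
edges = build_window_edges(publications, citations, 1980, 5)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

Counting with a Counter keyed by journal pairs gives the link weight directly as the number of qualifying citations.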

The results of this procedure are stored in the files named "edgelist_level_0.txt" and "blocks_level_0.txt". Higher levels are the result of fitting a hierarchical, degree-corrected stochastic block model (as implemented in the Python package graph-tool, version 2.19dev) to the level_0 networks.


Data format
Data for each time window is placed in a separate directory.
Files with filenames starting with 'edgelist_' contain one edge per line, in the format:

source target weight

Here, source and target are zero-indexed block numbers; the content of each block is listed in the files whose names start with 'blocks_'. Each line of a 'blocks_' file contains a space-separated list of the journals belonging to one block, and the line number gives the index of the block (note: since blocks are zero-indexed, line 1 corresponds to block 0, and so on).
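
As a reading aid, the following Python sketch parses the two file types described above. The function names parse_edgelist and parse_blocks are hypothetical, the in-memory sample content is invented for illustration, and the sketch assumes that journal identifiers contain no spaces and that weights can be read as floating-point numbers.

```python
import io


def parse_edgelist(lines):
    """Parse 'source target weight' lines into (int, int, float) tuples."""
    edges = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        source, target, weight = parts
        edges.append((int(source), int(target), float(weight)))
    return edges


def parse_blocks(lines):
    """Return a list whose item i holds the journals of block i.

    Blank lines are kept as empty blocks so that line positions,
    and therefore block indices, are not shifted.
    """
    return [line.split() for line in lines]


# Invented sample content standing in for one 'edgelist_' / 'blocks_' pair.
edge_file = io.StringIO("0 1 12\n1 0 3\n1 1 7\n")
block_file = io.StringIO("journal_a journal_b\njournal_c\n")

edges = parse_edgelist(edge_file)
blocks = parse_blocks(block_file)

# Resolve block indices in the edge list to their journal lists.
for source, target, weight in edges:
    print(blocks[source], "->", blocks[target], weight)
```

With the real data, the io.StringIO objects would be replaced by files opened from the directory of the time window of interest.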

The field assignment of each journal is given in files named 'blocks_fields.txt', in the same format as the other 'blocks_' files. The names of the fields are given in files named 'blocks_fields.txt.names', with line numbers matching those in the corresponding 'blocks_' files.
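
The pairing of the two field files by line number can be sketched in Python as follows. The function name journal_fields and the sample field and journal names are hypothetical; since the description does not say whether a journal can appear under more than one field, the sketch collects a list of field names per journal rather than assuming a single one.

```python
import io


def journal_fields(field_block_lines, field_name_lines):
    """Map each journal to the names of the fields it is listed under.

    Line i of the 'blocks_fields.txt' content lists the journals of field i;
    line i of the 'blocks_fields.txt.names' content gives that field's name.
    """
    names = [line.strip() for line in field_name_lines]
    mapping = {}
    for index, line in enumerate(field_block_lines):
        for journal in line.split():
            mapping.setdefault(journal, []).append(names[index])
    return mapping


# Invented sample content standing in for the two files.
fields_file = io.StringIO("journal_a journal_b\njournal_c\n")
names_file = io.StringIO("Physics\nBiology\n")

fields = journal_fields(fields_file, names_file)
print(fields["journal_c"])
```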

Citing

When using or referring to these results, please cite:

Stochastic Block Model Reveals Maps of Citation Patterns and Their Evolution in Time, D. Hric, K. Kaski, M. Kivelä
 

Files

results.zip (594.4 MB)
md5:14e6835d890ae8dfaa865329f02ce3d2

Additional details

Related works

Is supplement to
https://arxiv.org/abs/1705.00018 (URL)