Published February 26, 2022 | Version 1
Dataset Open

Pagerank Dataset for Bitcoin Blockchain - Part 1 of 2

  • 1. Bogazici University

Description

This dataset contains the Pagerank values and rankings of Bitcoin addresses and transaction IDs (TXIDs). It covers 1,608,748,675 labels in total, counting addresses and TXIDs together.

Part 2 is available at https://zenodo.org/deposit/6077428

 

File format

The dataset is compressed with bzip2 and can be decompressed with the bunzip2 command. Because of its size, the dataset is split into multiple files. Each file is a space-delimited plain text file in which every line has the following five fields:

<Label> <Label type> <Rank> <Rank with ties> <Pagerank value>

Label: An alphanumeric Bitcoin address (e.g. 1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV) or a 64-character hexadecimal transaction ID (e.g. 000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51). Type: String

Label type: Its value is 0 if the label is a transaction ID and 1 if the label is a Bitcoin address. Type: Integer

Rank: Unique Pagerank rank where the ties (addresses having the same Pagerank value) are resolved by sorting the addresses. Type: Integer

Rank with ties: Pagerank rank where the ties (addresses having the same Pagerank value) have the same rank. Type: Integer

Pagerank value: The Pagerank of the address or transaction ID, computed with the Pagerank algorithm. Type: Floating-point number
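To make the difference between the two rank fields concrete, here is a minimal Python sketch on made-up labels. It rests on assumptions the description does not confirm: that rank 1 is the highest Pagerank value, that ties are broken by ascending label order, and that "Rank with ties" increases by one per distinct value (dense ranking). The function name assign_ranks and the toy labels are hypothetical.

```python
from itertools import groupby


def assign_ranks(scores):
    """Map each label to (rank, rank_with_ties).

    Assumptions (not confirmed by the dataset description):
    - higher Pagerank value means better (smaller) rank,
    - ties in the unique rank are broken by ascending label,
    - tied labels share a dense rank (1, 2, 2, 3, ...).
    """
    # Sort by descending value, then by label to break ties deterministically.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    ranks = {}
    position = 0
    # Each group holds all labels that share one Pagerank value.
    grouped = groupby(ordered, key=lambda item: item[1])
    for tie_rank, (_, group) in enumerate(grouped, start=1):
        for label, _ in group:
            position += 1
            ranks[label] = (position, tie_rank)
    return ranks


# Toy values, not taken from the dataset.
toy_scores = {"addr_b": 0.5, "addr_a": 0.5, "tx_1": 0.9, "addr_c": 0.1}
toy_ranks = assign_ranks(toy_scores)
```

In this toy input, addr_a and addr_b share a value, so they receive different unique ranks but the same rank with ties.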

 

Sample lines:

000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51 0 427225664 266976712 0.979246
1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV 1 1114666798 508037940 0.877961
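Lines in this format can be read without decompressing the files to disk first. The following Python sketch streams a bzip2-compressed file and converts each line into a typed record. The names Record, parse_line and iter_records are hypothetical helpers, and the file path is supplied by the caller.

```python
import bz2
from typing import Iterator, NamedTuple


class Record(NamedTuple):
    label: str            # Bitcoin address or transaction ID
    label_type: int       # 0 = transaction ID, 1 = Bitcoin address
    rank: int             # unique rank
    rank_with_ties: int   # rank shared by tied labels
    pagerank: float       # Pagerank value


def parse_line(line: str) -> Record:
    """Split one space-delimited line into its five typed fields."""
    label, label_type, rank, rank_with_ties, pagerank = line.split()
    return Record(label, int(label_type), int(rank),
                  int(rank_with_ties), float(pagerank))


def iter_records(path: str) -> Iterator[Record]:
    """Yield records from a bzip2-compressed file, one line at a time."""
    with bz2.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            if line.strip():  # skip blank lines, if any
                yield parse_line(line)
```

Because iter_records is a generator, memory use stays roughly constant regardless of file size, which matters for files of several gigabytes.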

 

"head.txt" contains the first 10 lines of each file. "tail.txt" contains the last 10 lines of each file.

 

Dataset Generation

The Bitcoin transactions between block 0 (mined on 3 January 2009) and block 713,999 (mined on 13 December 2021) are extracted. A transaction graph is constructed in which Bitcoin addresses and transaction IDs are the nodes and transaction inputs and outputs are the edges. Pagerank is then computed on this transaction graph. The computation is performed using the system presented in the paper 'Parallel analysis of Ethereum blockchain transaction data using cluster computing'.
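The graph construction described above can be illustrated on a toy scale. The sketch below is not the authors' cluster-computing system; it is a plain power-iteration Pagerank under assumptions that the description does not state: edges run from input addresses to the transaction and from the transaction to output addresses, the damping factor is 0.85, scores are normalised to sum to 1, and nodes without outgoing edges spread their score uniformly. All names and the two toy transactions are hypothetical.

```python
def build_edges(transactions):
    """Turn (txid, input_addresses, output_addresses) tuples into edges.

    Assumed direction: input address -> txid -> output address.
    """
    edges = []
    for txid, inputs, outputs in transactions:
        edges.extend((addr, txid) for addr in inputs)
        edges.extend((txid, addr) for addr in outputs)
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration Pagerank; returns scores that sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing edges is shared by all nodes.
        dangling = sum(score[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        updated = {node: base for node in nodes}
        for src, targets in out_links.items():
            if targets:
                share = damping * score[src] / len(targets)
                for dst in targets:
                    updated[dst] += share
        score = updated
    return score


# Toy transactions, not taken from the blockchain.
toy_transactions = [
    ("tx1", ["A"], ["B", "C"]),
    ("tx2", ["B"], ["C"]),
]
toy_scores = pagerank(build_edges(toy_transactions))
```

The values in the dataset's sample lines are close to 1, which suggests a different normalisation than the sum-to-1 convention used in this sketch, so the toy scores are not directly comparable to the published ones.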

 

Note

If you use our dataset in your research, please cite our paper: https://link.springer.com/article/10.1007/s10586-021-03511-0

@article{kilic2022parallel, 
  title={Parallel Analysis of Ethereum Blockchain Transaction Data using Cluster Computing}, 
  journal={Cluster Computing},
  author={K{\i}l{\i}{\c{c}}, Baran and {\"O}zturan, Can and Sen, Alper},
  year={2022},
  month={Jan} 
}

 

Other Datasets

If you are interested, please also check out our Pagerank Dataset for Ethereum Blockchain.

 

Files (48.2 GB)

Checksum Size
md5:352824b41343851dfe8cebdc6c6e9d11 4.4 GB
md5:6bdc3323cdf5f19854cb7cf62aa8391d 4.4 GB
md5:5fd97a40ad2bfaf86269ba27f688efc7 4.4 GB
md5:89067a18bfbc7155079eb6c53ca6aeea 4.4 GB
md5:521e017e9ca6851d0f9d3ba561710612 4.4 GB
md5:e3d23e5df94018564b0a475ec418e85d 4.4 GB
md5:40bc89b26762a3c6a06b9a1627904414 4.4 GB
md5:e81a02349cef76ec50888454942aaaca 3.5 GB
md5:dbfad9d04fa8a1833eb67112f1cfb577 3.5 GB
md5:34d6846b170e393918f9757494340979 3.5 GB
md5:50c1ffc98de47df37b5de06c65541acb 3.5 GB
md5:ad695e77c02510774ba5b79ab504cc35 3.5 GB
md5:3603a7799a579975b92fc5559e6a7caf 13.9 kB
md5:3ff77d832b27b9c3ca2ee6afb8eacb47 13.9 kB
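The md5 values listed above can be used to check a download for corruption. The following Python sketch computes a file's MD5 digest in chunks so that multi-gigabyte files do not have to fit in memory. The function names md5_of_file and matches_checksum are hypothetical, and the caller supplies the local path and the expected digest from the list.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against an expected value such as 'md5:3528...'."""
    expected_hex = expected.removeprefix("md5:").lower()
    return md5_of_file(path) == expected_hex
```

The removeprefix call requires Python 3.9 or later; on older versions the prefix can be stripped with a slice instead.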

Additional details

Funding

INFINITECH – Tailored IoT & BigData Sandboxes and Testbeds for Smart, Autonomous and Personalized Services in the European Finance and Insurance Services Ecosystem 856632
European Commission