Published March 28, 2020 | Version v1
Dataset Open

Protein Graphs Dataset from PDB

  • 1. RPI

Description

This dataset contains the protein graphs constructed from PDB, the Protein Data Bank (www.rcsb.org/pdb), used in the paper:

Nilothpal Talukder and Mohammed J. Zaki. A distributed approach for graph mining in massive networks. Data Mining and Knowledge Discovery: Special Issue on ECML/PKDD 2016 Journal Track Papers, 30(5):1024–1052, 2016. URL: http://link.springer.com/article/10.1007/s10618-016-0466-x.

Each graph is stored as a header line, followed by one line per vertex and one line per edge:

t # GID

v VID VLABEL

e VID1 VID2 ELABEL

where

GID is a graph identifier (integer)

VID is a vertex identifier (integer) with VLABEL its vertex label (integer)

VID1 VID2 denotes an edge between the two vertices, with ELABEL the edge label (integer)
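The format above can be read with a short line-oriented parser. The following is a minimal sketch in Python, not the code used in the paper: the names `LabeledGraph` and `parse_graphs` are hypothetical, and the sketch assumes whitespace-separated fields, that every `v` and `e` line belongs to the most recent `t` line, and that the GID is the last token of the `t` line.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass
class LabeledGraph:
    """One graph from the file (hypothetical container, not part of the dataset)."""
    gid: int
    vertices: Dict[int, int] = field(default_factory=dict)           # VID -> VLABEL
    edges: List[Tuple[int, int, int]] = field(default_factory=list)  # (VID1, VID2, ELABEL)


def parse_graphs(lines: Iterable[str]) -> Iterator[LabeledGraph]:
    """Yield graphs one at a time so a large file never has to fit in memory."""
    current: Optional[LabeledGraph] = None
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        tag = parts[0]
        if tag == "t":
            # "t # GID" starts a new graph; emit the previous one first.
            if current is not None:
                yield current
            current = LabeledGraph(gid=int(parts[-1]))
        elif tag == "v" and current is not None:
            # "v VID VLABEL"
            current.vertices[int(parts[1])] = int(parts[2])
        elif tag == "e" and current is not None:
            # "e VID1 VID2 ELABEL"
            current.edges.append((int(parts[1]), int(parts[2]), int(parts[3])))
    if current is not None:
        yield current


# Illustrative input only; these labels are made up, not taken from the dataset.
sample = ["t # 0", "v 0 3", "v 1 5", "e 0 1 2", "t # 1", "v 0 7"]
graphs = list(parse_graphs(sample))
```

Because `parse_graphs` accepts any iterable of lines, an open file handle can be passed directly, which streams the file one graph at a time instead of loading it whole.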

Files

Files (959.2 MB)

md5:3699b0cf9b747310cae5a27e9b7c52ac
959.2 MB
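The MD5 checksum listed above can be used to confirm that the download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_published_md5` are hypothetical, and the path argument is whatever local name the downloaded file was saved under.

```python
import hashlib

# Checksum as listed for this record.
EXPECTED_MD5 = "3699b0cf9b747310cae5a27e9b7c52ac"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path: str) -> bool:
    """Compare a local file's digest with the checksum listed for this record."""
    return md5_of_file(path) == EXPECTED_MD5
```

Reading in chunks keeps memory use constant, which matters for a file of this size.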

Additional details

Funding

National Science Foundation, award 1302231: III: Medium: Mining petabytes of data using cloud computing and a massively parallel cyberinstrument

References