Published March 28, 2020 | Version v1
Dataset
Open
Protein Graphs Dataset from PDB
Description
This dataset contains the protein graphs constructed from PDB, the Protein Data Bank (www.rcsb.org/pdb), used in the paper:
Nilothpal Talukder and Mohammed J. Zaki. A distributed approach for graph mining in massive networks. Data Mining and Knowledge Discovery: Special Issue on ECML/PKDD 2016 Journal Track Papers, 30(5):1024–1052, 2016. URL: http://link.springer.com/article/10.1007/s10618-016-0466-x.
The format of graphs is as follows:
t # GID
v VID VLABEL
e VID1 VID2 ELABEL
where
- GID is the graph identifier (integer); each `t # GID` line starts a new graph
- VID is a vertex identifier (integer) and VLABEL is that vertex's label (integer)
- VID1 and VID2 are the identifiers of the two vertices joined by an edge, and ELABEL is the edge label (integer)
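The format above can be read with a small streaming parser. The following is a minimal sketch, not code distributed with the dataset: the names `Graph` and `parse_graphs` are hypothetical, and it assumes whitespace-separated fields, integer values throughout, and that every `v` and `e` line belongs to the most recent `t` line.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass
class Graph:
    """One graph from the file (hypothetical container, not part of the dataset)."""
    gid: int
    vertices: Dict[int, int] = field(default_factory=dict)            # VID -> VLABEL
    edges: List[Tuple[int, int, int]] = field(default_factory=list)   # (VID1, VID2, ELABEL)


def parse_graphs(lines: Iterable[str]) -> Iterator[Graph]:
    """Yield graphs one at a time so a large file need not fit in memory."""
    current: Optional[Graph] = None
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue  # skip blank lines (assumption: they carry no meaning)
        tag = parts[0]
        if tag == "t":
            # "t # GID": a new graph begins, so emit the previous one first
            if current is not None:
                yield current
            current = Graph(gid=int(parts[2]))
        elif tag == "v" and current is not None:
            # "v VID VLABEL"
            current.vertices[int(parts[1])] = int(parts[2])
        elif tag == "e" and current is not None:
            # "e VID1 VID2 ELABEL"
            current.edges.append((int(parts[1]), int(parts[2]), int(parts[3])))
    if current is not None:
        yield current  # emit the last graph in the input


# Usage on a tiny made-up sample (illustrative values, not taken from the dataset)
sample = ["t # 0", "v 0 3", "v 1 5", "e 0 1 2", "t # 1", "v 0 7"]
for graph in parse_graphs(sample):
    print(graph.gid, len(graph.vertices), len(graph.edges))
```

Because `parse_graphs` accepts any iterable of lines, an open file handle can be passed to it directly, which avoids loading the whole 959.2 MB file at once.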
Files
(959.2 MB)
Name | Size |
---|---|
md5:3699b0cf9b747310cae5a27e9b7c52ac | 959.2 MB |
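The MD5 checksum listed for the file can be used to confirm that a download completed intact. The sketch below is an illustration only: the helper names `md5_of_file` and `verify_download` are hypothetical, and the path passed to them is whatever local name the downloaded file was saved under.

```python
import hashlib

# Checksum copied from the file listing on this page
EXPECTED_MD5 = "3699b0cf9b747310cae5a27e9b7c52ac"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use constant regardless of file size, which matters for a file close to 1 GB.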
Additional details
Funding
- National Science Foundation, award 1302231: III: Medium: Mining petabytes of data using cloud computing and a massively parallel cyberinstrument
References
- Nilothpal Talukder and Mohammed J. Zaki. A distributed approach for graph mining in massive networks. Data Mining and Knowledge Discovery: Special Issue on ECML/PKDD 2016 Journal Track Papers, 30(5):1024–1052, 2016. URL: http://link.springer.com/article/10.1007/s10618-016-0466-x.