Graphs and Attributes used for the attribute-structure correlation pattern mining
Description
## SCPM: An implementation of an algorithm for structural correlation pattern mining.
Structural correlation measures how strongly a set of attributes induces dense subgraphs in an attributed graph. A structural correlation pattern is a dense subgraph induced by a particular attribute set. Structural correlation pattern mining is useful for analyzing how different attribute sets correlate with dense subgraphs in real-life attributed graphs.
**Relevant Publications**
* Arlei Silva, Wagner Meira, Jr., and Mohammed J. Zaki. Structural correlation pattern mining for large graphs. In Proceedings of the Eighth Workshop on Mining and Learning with Graphs (MLG '10).
* Arlei Silva, Wagner Meira, Jr., and Mohammed J. Zaki. Mining Attribute-structure Correlated Patterns in Large Attributed Graphs. In Proceedings of the VLDB Endowment (PVLDB '12).
* Arlei Silva. Structural correlation pattern mining for large graphs. M.Sc Thesis, Computer Science Department, Universidade Federal de Minas Gerais, 2011.
* Arlei Silva, Wagner Meira Jr. Structural correlation pattern mining for large graphs. Thesis and Dissertation Contest of the Brazilian Computer Society (CTD'12).
## HOW TO
Change to the trunk directory and run make. See the README file in trunk for further instructions.
## Datasets:
### Description:
#### ATTRIBUTE FILE:
Format: Lists the attributes of each vertex from the graph.
<VERTEX_ID>,<ATTRIBUTE_ID>,<ATTRIBUTE_ID>...,<ATTRIBUTE_ID>
Example:
1,A,C
2,A
3,A,C,D
4,A,D
5,A,E
6,A,B,C
7,A,B,E
8,A,B
9,A,B
10,A,B,D
11,A,B
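
As an illustration of the attribute file format, the following is a minimal Python sketch that reads the example above into a dictionary and selects the vertices carrying a given attribute set. The function names `read_attributes` and `vertices_with` are hypothetical helpers introduced here; they are not part of SCPM.

```python
import io

def read_attributes(lines):
    """Map each vertex ID (kept as a string) to its set of attribute IDs."""
    attrs = {}
    for line in lines:
        fields = [f.strip() for f in line.strip().split(",")]
        if not fields[0]:
            continue  # skip blank lines
        vertex, *attribute_ids = fields
        attrs[vertex] = {a for a in attribute_ids if a}
    return attrs

def vertices_with(attrs, attribute_set):
    """Return the set of vertices that carry every attribute in attribute_set."""
    wanted = set(attribute_set)
    return {v for v, a in attrs.items() if wanted <= a}

# The example attribute file from this section.
EXAMPLE_ATTRIBUTES = """\
1,A,C
2,A
3,A,C,D
4,A,D
5,A,E
6,A,B,C
7,A,B,E
8,A,B
9,A,B
10,A,B,D
11,A,B
"""

attrs = read_attributes(io.StringIO(EXAMPLE_ATTRIBUTES))
# Vertex IDs happen to be numeric in the example, so sort them as integers.
print(sorted(vertices_with(attrs, {"A", "B"}), key=int))
# prints ['6', '7', '8', '9', '10', '11']
```

For a file on disk, pass an open file object instead of the `io.StringIO` wrapper.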
#### GRAPH FILE:
Format: Lists the neighbors of each vertex from the graph (adjacency list). Although the graph is undirected, each edge must be included in both directions.
<VERTEX_ID>,<NEIGHBOR_ID>,<NEIGHBOR_ID>...,<NEIGHBOR_ID>
Example:
1,4
2,3
3,2,4,5,6,7
4,1,3,5,6
5,3,4,6
6,3,4,5,7,8,9,10
7,3,6,8,11
8,6,7,9,10,11
9,6,8,10,11
10,6,8,9,11
11,7,8,9,10
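
The adjacency-list format and the requirement that each edge appear in both directions can be checked with a short Python sketch such as the one below. The helper names (`read_graph`, `missing_reverse_edges`, `induced_subgraph`) are assumptions made for illustration and do not come from the SCPM code. The last step restricts the graph to vertices 6 through 11, which are the vertices carrying both A and B in the attribute example.

```python
import io

def read_graph(lines):
    """Map each vertex ID (kept as a string) to the set of its neighbor IDs."""
    adj = {}
    for line in lines:
        fields = [f.strip() for f in line.strip().split(",")]
        if not fields[0]:
            continue  # skip blank lines
        vertex, *neighbors = fields
        adj.setdefault(vertex, set()).update(n for n in neighbors if n)
    return adj

def missing_reverse_edges(adj):
    """List (u, v) pairs where v is listed for u but u is not listed for v."""
    return sorted((u, v) for u, nbrs in adj.items() for v in nbrs
                  if u not in adj.get(v, set()))

def induced_subgraph(adj, vertices):
    """Keep only the given vertices and the edges between them."""
    keep = set(vertices)
    return {u: adj.get(u, set()) & keep for u in keep}

# The example graph file from this section.
EXAMPLE_GRAPH = """\
1,4
2,3
3,2,4,5,6,7
4,1,3,5,6
5,3,4,6
6,3,4,5,7,8,9,10
7,3,6,8,11
8,6,7,9,10,11
9,6,8,10,11
10,6,8,9,11
11,7,8,9,10
"""

adj = read_graph(io.StringIO(EXAMPLE_GRAPH))
print(missing_reverse_edges(adj))  # prints [] because every edge is listed both ways

sub = induced_subgraph(adj, {"6", "7", "8", "9", "10", "11"})
print(sum(len(nbrs) for nbrs in sub.values()) // 2)  # prints 12 (undirected edges)
```

Whether such an induced subgraph counts as dense depends on the density criterion defined in the publications listed above; this sketch only builds the subgraph.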
### REAL DATASETS
* Lastfm
  * attributes: attrLastFm.csv.tar.bz2
  * network: graphLastFm.csv.tar.gz
* DBLP
  * attributes: newAttrDBLP.csv.tar.bz2
  * network: newGraphDBLP.csv.tar.bz2
* CITESEER
  * attributes: attrCiteseer.csv.tar.bz2
  * network: graphCiteseer.csv.tar.bz2
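
The archives use both bzip2 and gzip compression. A minimal Python sketch for peeking at their contents without unpacking them is shown below; it relies on the standard `tarfile` module, whose `r:*` mode detects the compression automatically. The helper name `head_of_archive` is hypothetical, and the names of the files inside each archive are not stated on this page, so the sketch simply reports whatever members it finds.

```python
import tarfile
from pathlib import Path

def head_of_archive(archive_path, n_lines=3):
    """Return {member name: first n_lines lines} for regular files in a tar archive."""
    preview = {}
    # "r:*" lets tarfile detect gzip or bzip2 compression transparently.
    with tarfile.open(archive_path, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            with tar.extractfile(member) as fh:
                lines = [fh.readline().decode("utf-8", errors="replace").rstrip("\r\n")
                         for _ in range(n_lines)]
            preview[member.name] = [line for line in lines if line]
    return preview

# Archive names as listed above; only archives present in the
# current directory are inspected.
ARCHIVES = [
    "attrLastFm.csv.tar.bz2", "graphLastFm.csv.tar.gz",
    "newAttrDBLP.csv.tar.bz2", "newGraphDBLP.csv.tar.bz2",
    "attrCiteseer.csv.tar.bz2", "graphCiteseer.csv.tar.bz2",
]

for name in ARCHIVES:
    if Path(name).exists():
        for member, lines in head_of_archive(name).items():
            print(name, member, lines)
```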
Files
Total size: 482.7 MB

MD5 checksum | Size |
---|---|
d86909436e70af09c529dbdd3512bf1e | 23.5 MB |
a7a6374457c8576ed0adbeaeaf741cac | 427.4 MB |
b9c6ff802f172aee31633d09cea2ffc7 | 23.2 MB |
06e1513c274f989dea179111d7d1d73f | 2.7 MB |
619cf8400c93db11c3f819ae92844834 | 2.8 MB |
550c3cc57f6b1ab118328a603ab5132b | 3.2 MB |
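
A downloaded archive can be checked against the MD5 checksums listed above. The sketch below is one way to do this in Python with the standard `hashlib` module; `md5_of_file` is a hypothetical helper. Because this listing does not pair checksums with file names, the sketch only tests whether a file's digest is among the six listed values.

```python
import hashlib
from pathlib import Path

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Checksums copied from the file listing in this section.
LISTED_MD5 = {
    "d86909436e70af09c529dbdd3512bf1e",
    "a7a6374457c8576ed0adbeaeaf741cac",
    "b9c6ff802f172aee31633d09cea2ffc7",
    "06e1513c274f989dea179111d7d1d73f",
    "619cf8400c93db11c3f819ae92844834",
    "550c3cc57f6b1ab118328a603ab5132b",
}

# Archive names from the REAL DATASETS list; missing files are skipped.
for name in ["attrLastFm.csv.tar.bz2", "graphLastFm.csv.tar.gz",
             "newAttrDBLP.csv.tar.bz2", "newGraphDBLP.csv.tar.bz2",
             "attrCiteseer.csv.tar.bz2", "graphCiteseer.csv.tar.bz2"]:
    if Path(name).exists():
        status = "listed" if md5_of_file(name) in LISTED_MD5 else "NOT listed"
        print(f"{name}: checksum {status}")
```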
Additional details
Funding
- National Science Foundation, 0829835: EMT/BSSE: Discovery of Gene and Protein Expression Patterns and Networks
References
- Arlei Silva, Mohammed J. Zaki, and Wagner Meira Jr. Mining attribute-structure correlated patterns in large attributed graphs. PVLDB, 5(5):466–477, 2012.