Published March 28, 2020 | Version v1
Dataset Open

Graphs and Attributes used for the attribute-structure correlation pattern mining

  • 1. UCSB
  • 2. UFMG
  • 3. RPI

Description

## SCPM: An implementation of an algorithm for structural correlation pattern mining.

Structural correlation measures how strongly a set of attributes induces dense subgraphs in an attributed graph. A structural correlation pattern is a dense subgraph induced by a particular attribute set. Structural correlation pattern mining helps analyze how different attribute sets correlate with dense subgraphs in real-life attributed graphs.
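
To make "the subgraph induced by an attribute set" concrete, here is a minimal Python sketch on the 11-vertex toy graph from the file-format examples in this description. The helper names `induced_vertices` and `edge_density` are hypothetical, and plain edge density is only an illustrative stand-in for the density criterion used in the publications; this is not the SCPM implementation.

```python
from itertools import combinations

# Toy attributed graph: the 11-vertex example from the file-format section.
attributes = {
    1: {"A", "C"}, 2: {"A"}, 3: {"A", "C", "D"}, 4: {"A", "D"},
    5: {"A", "E"}, 6: {"A", "B", "C"}, 7: {"A", "B", "E"}, 8: {"A", "B"},
    9: {"A", "B"}, 10: {"A", "B", "D"}, 11: {"A", "B"},
}
adjacency = {
    1: {4}, 2: {3}, 3: {2, 4, 5, 6, 7}, 4: {1, 3, 5, 6}, 5: {3, 4, 6},
    6: {3, 4, 5, 7, 8, 9, 10}, 7: {3, 6, 8, 11}, 8: {6, 7, 9, 10, 11},
    9: {6, 8, 10, 11}, 10: {6, 8, 9, 11}, 11: {7, 8, 9, 10},
}


def induced_vertices(attribute_set):
    """Vertices that carry every attribute in attribute_set."""
    return {v for v, attrs in attributes.items() if attribute_set <= attrs}


def edge_density(vertices):
    """Fraction of vertex pairs joined by an edge (illustrative measure only)."""
    n = len(vertices)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(sorted(vertices), 2) if v in adjacency[u])
    return edges / (n * (n - 1) / 2)


core = induced_vertices({"A", "B"})
print(sorted(core), edge_density(core))
```

On this toy graph, the attribute set {A, B} selects vertices 6 through 11, which share 12 of the 15 possible edges, whereas attribute A alone covers the whole, much sparser graph.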

**Relevant Publications**

* Arlei Silva, Wagner Meira, Jr., and Mohammed J. Zaki. Structural correlation pattern mining for large graphs. In Proceedings of the Eighth Workshop on Mining and Learning with Graphs (MLG '10).

* Arlei Silva, Wagner Meira, Jr., and Mohammed J. Zaki. Mining Attribute-structure Correlated Patterns in Large Attributed Graphs. In Proceedings of the VLDB Endowment (PVLDB '12).

* Arlei Silva. Structural correlation pattern mining for large graphs. M.Sc Thesis, Computer Science Department, Universidade Federal de Minas Gerais, 2011.

* Arlei Silva, Wagner Meira Jr. Structural correlation pattern mining for large graphs. Thesis and Dissertation Contest of the Brazilian Computer Society (CTD'12).


## HOW TO

Change to the `trunk` directory and run `make`.
See the README in `trunk`.


## Datasets:

### Description:

#### ATTRIBUTE FILE:

Format: Each line lists the attributes of one vertex of the graph.

    <VERTEX_ID>,<ATTRIBUTE_ID>,<ATTRIBUTE_ID>...,<ATTRIBUTE_ID>

        Example: 
        1,A,C 
        2,A 
        3,A,C,D 
        4,A,D 
        5,A,E 
        6,A,B,C 
        7,A,B,E 
        8,A,B 
        9,A,B 
        10,A,B,D 
        11,A,B
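
A reader for this format can be sketched in a few lines of Python. The function name `parse_attribute_lines` is hypothetical, and keeping vertex IDs as strings is an assumption, since the description does not say whether IDs in the real datasets are always numeric.

```python
def parse_attribute_lines(lines):
    """Map each vertex ID to the set of attribute IDs listed on its line."""
    vertex_attributes = {}
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if not fields[0]:
            continue  # skip blank lines
        # The first field is the vertex ID; the rest are its attributes.
        vertex_attributes[fields[0]] = {field for field in fields[1:] if field}
    return vertex_attributes


example = ["1,A,C ", "2,A ", "3,A,C,D "]
parsed = parse_attribute_lines(example)
print(parsed["3"])
```

Stripping each field tolerates the trailing spaces that appear in the example lines.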

#### GRAPH FILE:

Format: Each line lists the neighbors of one vertex of the graph (adjacency list). The graph is undirected, but each edge must appear in both directions, once in the line of each endpoint.

    <VERTEX_ID>,<NEIGHBOR_ID>,<NEIGHBOR_ID>...,<NEIGHBOR_ID> 
    
        Example: 
        1,4 
        2,3 
        3,2,4,5,6,7 
        4,1,3,5,6 
        5,3,4,6 
        6,3,4,5,7,8,9,10 
        7,3,6,8,11 
        8,6,7,9,10,11 
        9,6,8,10,11 
        10,6,8,9,11 
        11,7,8,9,10
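
Because every edge has to be listed in both directions, it can be worth validating a graph file before mining it. The following Python sketch parses the adjacency list and reports edges whose reverse direction is missing; `parse_adjacency_lines` and `missing_reverse_edges` are hypothetical helper names, not part of SCPM.

```python
def parse_adjacency_lines(lines):
    """Map each vertex ID to the set of neighbor IDs listed on its line."""
    adjacency = {}
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if not fields[0]:
            continue  # skip blank lines
        adjacency.setdefault(fields[0], set()).update(f for f in fields[1:] if f)
    return adjacency


def missing_reverse_edges(adjacency):
    """Return (u, v) pairs where v is listed under u but u is not listed under v."""
    return sorted(
        (u, v)
        for u, neighbors in adjacency.items()
        for v in neighbors
        if u not in adjacency.get(v, set())
    )


example = [
    "1,4", "2,3", "3,2,4,5,6,7", "4,1,3,5,6", "5,3,4,6", "6,3,4,5,7,8,9,10",
    "7,3,6,8,11", "8,6,7,9,10,11", "9,6,8,10,11", "10,6,8,9,11", "11,7,8,9,10",
]
graph = parse_adjacency_lines(example)
print(missing_reverse_edges(graph))
```

For the example graph above the check returns an empty list, since each edge is present in both directions.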

### REAL DATASETS

Lastfm:

attributes: attrLastFm.csv.tar.bz2

network: graphLastFm.csv.tar.gz

DBLP:

attributes: newAttrDBLP.csv.tar.bz2

network: newGraphDBLP.csv.tar.bz2

CITESEER:

attributes: attrCiteseer.csv.tar.bz2

network: graphCiteseer.csv.tar.bz2
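
The dataset files are compressed tar archives, so they can be read without unpacking them to disk. The sketch below uses Python's standard `tarfile` module; the function name `iter_archive_lines` is hypothetical, and both the UTF-8 encoding and the idea that every regular file in an archive is CSV data are assumptions, because the archive contents are not described here.

```python
import io
import tarfile


def iter_archive_lines(archive_path, encoding="utf-8"):
    """Yield text lines from every regular file inside a tar archive.

    Mode "r:*" lets tarfile detect gzip or bzip2 compression on its own.
    """
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other special entries
            with archive.extractfile(member) as raw:
                text = io.TextIOWrapper(raw, encoding=encoding, errors="replace")
                for line in text:
                    yield line.rstrip("\n")
```

For example, `iter_archive_lines("graphCiteseer.csv.tar.bz2")` would yield one adjacency-list line at a time, assuming the archive has been downloaded to the working directory.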

Files (482.7 MB)

  • md5:d86909436e70af09c529dbdd3512bf1e (23.5 MB)
  • md5:a7a6374457c8576ed0adbeaeaf741cac (427.4 MB)
  • md5:b9c6ff802f172aee31633d09cea2ffc7 (23.2 MB)
  • md5:06e1513c274f989dea179111d7d1d73f (2.7 MB)
  • md5:619cf8400c93db11c3f819ae92844834 (2.8 MB)
  • md5:550c3cc57f6b1ab118328a603ab5132b (3.2 MB)
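
A downloaded archive can be checked against the MD5 values listed above with Python's standard `hashlib` module. The helper names `md5_hex` and `matches_listed_checksum` are hypothetical. The listing does not say which checksum belongs to which file, so the pairing has to come from the download page itself.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_hex(path) == expected
```

Reading in chunks keeps memory use flat, which matters for the largest file in this record (427.4 MB).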

Additional details

Funding

EMT/BSSE: Discovery of Gene and Protein Expression Patterns and Networks 0829835
National Science Foundation

References

  • Arlei Silva, Mohammed J. Zaki, and Wagner Meira Jr. Mining attribute-structure correlated patterns in large attributed graphs. PVLDB, 5(5):466–477, 2012.