Published January 12, 2018 | Version 1.0.3
Dataset Open

NetVotes iKnow Dataset

  • 1. Avignon Université

Description

Description. This is the data used in the experiment of the following conference paper:

  • N. Arınık, R. Figueiredo, and V. Labatut, “Signed Graph Analysis for the Interpretation of Voting Behavior,” in International Conference on Knowledge Technologies and Data-driven Business - International Workshop on Social Network Analysis and Digital Humanities, Graz, AT, 2017, vol. 2025. ⟨hal-01583133⟩

Source code. The source code is available on GitHub: https://github.com/CompNet/NetVotes

Citation. If you use the data or source code, please cite the above paper.


@InProceedings{Arinik2017,
  author    = {Arınık, Nejat and Figueiredo, Rosa and Labatut, Vincent},
  title     = {Signed Graph Analysis for the Interpretation of Voting Behavior},
  booktitle = {International Conference on Knowledge Technologies and Data-driven Business - International Workshop on Social Network Analysis and Digital Humanities},
  year      = {2017},
  volume    = {2025},
  series    = {CEUR Workshop Proceedings},
  address   = {Graz, AT},
  url       = {http://ceur-ws.org/Vol-2025/paper_rssna_1.pdf},
}

----------------------

Details. 


# RAW INPUT FILES
The 'itsyourparliament' folder contains all raw input files for further data processing (such as network extraction).
The folder structure is as follows:
* itsyourparliament/
** domains: There are 28 domain files. Each file corresponds to a domain (such as Agriculture, Economy, etc.) and contains the corresponding vote identifiers and their "itsyourparliament.eu" links.
** meps: There are 870 Member of the European Parliament (MEP) files. Each file contains information about one MEP (such as name, country, address, etc.).
** votes: There are 7513 vote files. Each file contains the votes expressed by MEPs.
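After extracting the archive, the file counts listed above can be checked with a short script. This is a minimal sketch: the function names are illustrative, and it assumes that each subfolder holds only the data files, with no extra index or hidden files.

```python
from pathlib import Path

# Expected counts as stated in the folder description above.
EXPECTED_COUNTS = {"domains": 28, "meps": 870, "votes": 7513}


def count_raw_files(root):
    """Count the regular files in each expected subfolder of `root`."""
    root = Path(root)
    counts = {}
    for name in EXPECTED_COUNTS:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[name] = 0  # a missing subfolder is reported as empty
    return counts


def report_mismatches(counts):
    """Return {subfolder: (found, expected)} for every count that differs."""
    return {
        name: (counts.get(name, 0), expected)
        for name, expected in EXPECTED_COUNTS.items()
        if counts.get(name, 0) != expected
    }
```

Calling `report_mismatches(count_raw_files("itsyourparliament"))` returns an empty dictionary when all three counts match; the path passed in is whatever location the folder was extracted to.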
# NETWORKS AND CORRESPONDING PARTITIONS
This work studies the voting behavior of French and Italian MEPs on "Agriculture and Rural Development" (AGRI) and "Economic and Monetary Affairs" (ECON) for each separate year of the 7th EP term (2009-10, 2010-11, 2011-12, 2012-13, 2013-14). Note that the interpretation part (section 4) of the published paper is limited to only a few of these instances (2009-10 in ECON and 2012-13 in AGRI).
The extracted networks are located in the "networks" folder and the corresponding partitions are in the "partitions" folder. Both folders have the same structure, which is as follows:
COUNTRY-NAME
|__DOMAIN-NAME
   |__2009-10
   |__2010-11
   |__2011-12
   |__2012-13
   |__2013-14
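The layout above can be walked programmatically. The sketch below builds one path per country, domain and year; the folder names "France", "Italy", "AGRI" and "ECON" are placeholders chosen for illustration, so check the extracted archive for the actual spelling.

```python
from itertools import product
from pathlib import Path

# The five yearly periods of the 7th EP term, as listed above.
YEARS = ["2009-10", "2010-11", "2011-12", "2012-13", "2013-14"]


def instance_dirs(root, countries, domains, years=YEARS):
    """Build root/COUNTRY-NAME/DOMAIN-NAME/YEAR for every combination."""
    return [Path(root) / c / d / y for c, d, y in product(countries, domains, years)]


# Placeholder folder names (assumptions, not taken from the archive).
paths = instance_dirs("networks", ["France", "Italy"], ["AGRI", "ECON"])
```

Since the "networks" and "partitions" folders share the same structure, the same helper can be reused by changing only the `root` argument.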
## NETWORKS
The networks in this folder are those used in the article, i.e. the ones obtained after the filtering step (as explained in the article). They are in GraphML format, and each node is enriched with properties of the corresponding MEP (such as name, political party, etc.).
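GraphML is an XML format, so the files can be inspected with the Python standard library alone. The following is a minimal sketch run on a small made-up sample: the attribute names "name" and "weight", the node identifiers and the values are assumptions for illustration, and the actual files may use different keys.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Made-up sample; attribute names and values are illustrative only.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="name" attr.type="string"/>
  <key id="d1" for="edge" attr.name="weight" attr.type="double"/>
  <graph edgedefault="undirected">
    <node id="n0"><data key="d0">MEP A</data></node>
    <node id="n1"><data key="d0">MEP B</data></node>
    <node id="n2"><data key="d0">MEP C</data></node>
    <edge source="n0" target="n1"><data key="d1">0.8</data></edge>
    <edge source="n1" target="n2"><data key="d1">-0.5</data></edge>
  </graph>
</graphml>"""


def parse_graphml(root):
    """Return (nodes, edges) from a parsed GraphML root element.

    nodes maps node id -> {attribute name: text value};
    edges is a list of (source, target, {attribute name: text value}).
    """
    # Map key ids (e.g. "d0") to their human-readable attribute names.
    keys = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}
    graph = root.find("g:graph", NS)

    def attributes(element):
        return {
            keys.get(d.get("key"), d.get("key")): d.text
            for d in element.findall("g:data", NS)
        }

    nodes = {n.get("id"): attributes(n) for n in graph.findall("g:node", NS)}
    edges = [
        (e.get("source"), e.get("target"), attributes(e))
        for e in graph.findall("g:edge", NS)
    ]
    return nodes, edges


nodes, edges = parse_graphml(ET.fromstring(SAMPLE))
```

For a file on disk, `parse_graphml(ET.parse(path).getroot())` applies the same logic. Attribute values are returned as text, so numeric ones such as edge weights need an explicit `float(...)` conversion.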
## ALL NETWORKS
For those interested in other countries or domains, we make available all the networks that can be extracted from the raw data, both with and without the filtering step.
COUNTRY-NAME
|__m3
   |__negtr=NA_postr=NA: This folder contains all filtered networks. Note that the filtering step is explained in Section 2.1.2 of the article.
      |__bygroup
      |__bycountry
   |__negtr=0_postr=0: This folder contains all original networks (i.e. no filtering step).
      |__bygroup
      |__bycountry
## PARTITIONS
The partitions are obtained as follows. First, the Ex-CC (exact) method is run, and 'k' denotes the number of clusters it detects. This 'k' value serves as the reference for the ILS-RCC (heuristic) method, which takes the desired number of clusters as input. ILS-RCC is then run with several values ('k', 'k+1', 'k+2'). All results are integrated into the initial network GraphML files and then converted into Gephi format, so that they can be explored interactively.
Note that absent MEPs need special handling in the clustering results, because they correspond to isolated nodes in the networks. Ex-CC places each isolated node in its own single-node cluster. We simply omit those nodes when determining 'k' (the number of detected clusters) before running ILS-RCC. Note also that ILS-RCC does not treat isolated nodes separately, so an isolated node can end up as part of a cluster.
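The way 'k' is derived can be sketched as follows. This is an illustration rather than the authors' implementation: the function names are hypothetical, edges are assumed to be given as (u, v) pairs, and the Ex-CC result is assumed to be a mapping from node to cluster label.

```python
def isolated_nodes(nodes, edges):
    """Return the set of nodes that have no incident edge."""
    touched = set()
    for u, v in edges:
        touched.add(u)
        touched.add(v)
    return {n for n in nodes if n not in touched}


def reference_k(membership, nodes, edges):
    """Number of clusters in an Ex-CC result once isolated nodes are omitted."""
    isolated = isolated_nodes(nodes, edges)
    return len({membership[n] for n in nodes if n not in isolated})


def ils_rcc_k_values(k):
    """The cluster counts passed to ILS-RCC: k, k+1 and k+2."""
    return [k, k + 1, k + 2]


# Toy example: node "e" is isolated and sits alone in cluster 3.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
membership = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}

k = reference_k(membership, nodes, edges)
k_values = ils_rcc_k_values(k)
```

In this toy example the Ex-CC output has three clusters, but the single-node cluster of the isolated node "e" is not counted, so 'k' is 2 and ILS-RCC would be run with 2, 3 and 4.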

----------------------
# COMPARISON RESULTS
The 'material-stats' folder contains all the comparison results obtained for Ex-CC and ILS-CC. The CSV files associated with the plots are also provided.
The folder structure is as follows:
* material-stats/
** execTimePerf: The plot shows the execution time of Ex-CC and ILS-CC on randomly generated complete networks of different sizes.
** graphStructureAnalysis: The plots show the weight and link statistics for all instances.
** ILS-CC-vs-Ex-CC: The folder contains 4 different comparisons between Ex-CC and ILS-CC: imbalance difference, number of detected clusters, difference in the number of detected clusters, and NMI (Normalized Mutual Information).
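Two of the measures named above can be illustrated with a short sketch. It assumes the usual correlation clustering definition of imbalance (total absolute weight of negative links inside clusters plus total weight of positive links between clusters) and an NMI normalized by the arithmetic mean of the two entropies. The article may use a different normalization for either measure, so treat these as illustrative definitions, not as the exact ones behind the provided CSV files.

```python
from collections import Counter
from math import log


def imbalance(edges, membership):
    """Weight of misplaced links: negative inside clusters, positive between."""
    total = 0.0
    for u, v, w in edges:
        same_cluster = membership[u] == membership[v]
        if same_cluster and w < 0:
            total += -w
        elif not same_cluster and w > 0:
            total += w
    return total


def entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(labels_a, labels_b):
    """Mutual information normalized by the mean of the two entropies."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    denom = entropy(labels_a) + entropy(labels_b)
    # Two single-cluster partitions are identical; define their NMI as 1.
    return 1.0 if denom == 0 else 2 * mutual / denom
```

Here `edges` is a list of (u, v, weight) triples and `membership` maps each node to its cluster label; for `nmi`, the two label lists must follow the same node order.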

----------------------
Funding: Agorantic FR 3621, FMJH Program Gaspard Monge in optimization and operations research (Project 2015-2842H)

Files

all-networks.zip

Files (517.3 MB)

md5:a8f7a6132f04784a1e83a8a61712aab1 (477.2 MB)
md5:a5ef8425052a112f150b856fa2a7364b (34.2 MB)
md5:9d329d9859f7b10b85c52be078e29cbf (1.0 MB)
md5:87094eefbb8487aaa91968a8451ed7f0 (4.0 MB)
md5:a9bd3d712f6a207c914fe00b8d51cf69 (837.1 kB)
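
A downloaded archive can be checked against the MD5 checksums listed above. The sketch below uses Python's standard `hashlib` module; the function names are illustrative, and the file path is whatever location the archive was saved to.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters for the largest archive in the list.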

Additional details

Related works

Is documented by
Conference paper: http://ceur-ws.org/Vol-2025/paper_rssna_1.pdf (URL)
Is required by
Software: https://github.com/CompNet/NetVotes (URL)
Obsoletes
Dataset: 10.6084/m9.figshare.5785833 (DOI)