Published November 19, 2023 | Version v0.0
Dataset Open

email-EU

  • 1. University of Vermont

Description

Overview

This hypergraph dataset was generated from the email data of a large European research institution, covering the period from October 2003 to May 2005 (18 months). Information about all incoming and outgoing emails between members of the research institution has been anonymized. The emails represent only communication between institution members (the core); the dataset does not contain incoming messages from or outgoing messages to the rest of the world.

This is a temporal hypergraph dataset, which here means a sequence of timestamped hyperedges where each hyperedge is a set of nodes. Timestamps are in ISO 8601 format. In this dataset, nodes are email addresses at a European research institution. Because an email can be sent to multiple recipients, a single message can involve more than two nodes. The original data source contains only directed temporal edge tuples (sender, receiver, timestamp), with timestamps recorded at 1-second resolution. Each hyperedge is undirected and consists of a sender and all receivers, grouped so that the emails from the sender to each of those receivers share the same timestamp.
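The grouping described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used to build the dataset: the function name `build_hyperedges`, the toy tuples, and the treatment of timestamps as Unix epoch seconds in UTC are all hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def build_hyperedges(edges):
    """Group (sender, receiver, timestamp) tuples into undirected hyperedges.

    Tuples that share a sender and a timestamp are treated as one email,
    so the sender and all of its receivers end up in one node set.
    """
    groups = defaultdict(set)
    for sender, receiver, ts in edges:
        groups[(ts, sender)].update((sender, receiver))

    # Sort by (timestamp, sender) to get a deterministic temporal sequence.
    # Assumption: timestamps are Unix epoch seconds, rendered as ISO 8601 UTC.
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(), frozenset(nodes))
        for (ts, _sender), nodes in sorted(groups.items())
    ]


# Toy input with hypothetical node ids and timestamps (seconds).
toy_edges = [
    (1, 2, 0),   # node 1 emails node 2 at t=0
    (1, 3, 0),   # same sender and timestamp: same email, extra recipient
    (4, 1, 60),  # a separate email one minute later
]
hyperedges = build_hyperedges(toy_edges)
```

In this sketch the first two tuples collapse into the single hyperedge {1, 2, 3}, while the third becomes {1, 4}. Direction is discarded because the resulting hyperedges are plain node sets.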

Statistics

Some basic statistics of this dataset are:

  • number of nodes: 1,005
  • number of timestamped hyperedges: 235,263
  • distribution of the connected components (component size, number of components):
      • 986, 1
      • 1, 19
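A component size distribution like the one above can be computed with a union-find pass over the hyperedges. The following is a minimal sketch, assuming two nodes are connected whenever they share a hyperedge. The function name `component_size_distribution` and the toy data are hypothetical, and this is not necessarily how the published statistics were produced.

```python
from collections import Counter


def component_size_distribution(nodes, hyperedges):
    """Return a Counter mapping component size -> number of components."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edge in hyperedges:
        members = list(edge)
        if not members:
            continue
        root = find(members[0])
        for other in members[1:]:
            # Merge every member's component into the first member's.
            parent[find(other)] = root

    component_sizes = Counter(find(n) for n in nodes)
    return Counter(component_sizes.values())


# Toy example (hypothetical): node 5 appears in no hyperedge.
toy_nodes = [1, 2, 3, 4, 5]
toy_hyperedges = [{1, 2, 3}, {3, 4}]
distribution = component_size_distribution(toy_nodes, toy_hyperedges)
```

On the toy input this yields one component of size 4 and one of size 1. The sizes weighted by their counts should sum to the number of nodes, which also holds for the figures above: 986 × 1 + 1 × 19 = 1,005.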

Source of original data

Source: email-Eu dataset

References

If you use this dataset, please cite these references:

Files

email-eu.json (27.2 MB)
md5:0e07119a91c0fdbb2629eac71f219820
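
After downloading, the file can be checked against the MD5 checksum listed above. This is a minimal sketch using Python's standard `hashlib`. The helper name `md5_of_file` and the local path in the usage comment are assumptions for illustration.

```python
import hashlib

# Checksum as listed for email-eu.json on this page.
EXPECTED_MD5 = "0e07119a91c0fdbb2629eac71f219820"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes the file was saved as "email-eu.json" in the working directory):
#   md5_of_file("email-eu.json") == EXPECTED_MD5
```

Reading in chunks keeps memory use flat even though the file is about 27 MB. A mismatch most likely indicates an incomplete or corrupted download.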