2669079
doi
10.5281/zenodo.2669079
oai:zenodo.org:2669079
Smeriga, Juraj
Masaryk University
Host network traffic time series 2019/01
Jirsik, Tomas
Masaryk University
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
network traffic
time series
host
clustering
classification
<p><em><strong>General info</strong></em></p>
<p>The dataset was collected over a <strong>one-month period in January 2019</strong>. The observation points for the collection of IP flows were located at the borders of the university campus network. The campus network has a /16 IPv4 CIDR range at its disposal and comprises various segments, ranging from dormitory segments and server segments to a segment containing the workstations of university administrative staff. The raw IP flows used to create the dataset totaled over 860 GB. <strong>A host in our dataset is identified by its source IPv4 address. </strong><br>
</p>
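<p>As a quick sanity check on the /16 range mentioned above, the number of addresses in such a range can be computed with Python's standard <code>ipaddress</code> module. This is only a sketch: the prefix <code>10.0.0.0/16</code> is a placeholder assumption, since the real campus range is anonymized.</p>

```python
import ipaddress

# Placeholder prefix (assumption): the actual campus range is anonymized.
campus = ipaddress.ip_network("10.0.0.0/16")

# A /16 leaves 32 - 16 = 16 host bits, i.e. 2**16 addresses in total
# (this count includes the network and broadcast addresses).
host_count = campus.num_addresses
print(host_count)
```

<p>The result, 65536, equals the number of host records reported under the dataset size.</p>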
<p><em><strong>Variables</strong></em></p>
<p>The dataset contains the following variables:</p>
<ul>
<li><strong>Aggregations</strong> - created from five-minute total volumes aggregated over one-hour disjoint windows using mean/max/min aggregation functions
<ul>
<li><strong># of flows (FL) </strong>- number of flows for a given source IP </li>
<li><strong># of packets (PKT)</strong> - number of packets for a given source IP</li>
<li><strong># of bytes (BYT)</strong> - number of bytes for a given source IP</li>
<li><strong>flow duration (DUR)</strong> - average flow duration in seconds</li>
</ul>
</li>
<li><strong>Distinct Counts </strong>- count of distinct values of each variable in each five-minute window, aggregated over one-hour disjoint windows using mean/max/min aggregation functions
<ul>
<li><strong># of peers (PEER)</strong> - number of distinct communication peers for a given source IP</li>
<li><strong># of ports (PORTS)</strong> - number of distinct destination ports for a given source IP</li>
<li><strong># of protocols (PROTO)</strong> - number of distinct communication protocols for a given source IP</li>
<li><strong># of AS numbers (AS)</strong> - number of distinct destination AS numbers for a given source IP</li>
<li><strong># of countries (CTRY)</strong> - number of distinct destination countries for a given source IP</li>
</ul>
</li>
<li><strong>Labels</strong>
<ul>
<li><strong>Range (RNG)</strong> - a network range a host belongs to (anonymized)</li>
<li><strong>Unit (UNT) </strong>- an administrative unit owning the network range</li>
<li><strong>Sub-unit (SUB-UNT)</strong> - a sub-unit of the unit</li>
</ul>
</li>
</ul>
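<p>The aggregation described above (five-minute values reduced to one mean/min/max triple per one-hour window) might be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function name <code>hourly_aggregates</code>, the input layout, and the sample flow counts are all hypothetical, and the sketch covers a single day only, since the description does not state how days of the month are combined.</p>

```python
from collections import defaultdict
from statistics import mean


def hourly_aggregates(samples):
    """Reduce five-minute values to mean/min/max per one-hour window.

    samples: iterable of (minute_of_day, value) pairs, one pair per
    five-minute window of a single day (hypothetical input layout).
    Returns {hour: {"mean": ..., "min": ..., "max": ...}}.
    """
    by_hour = defaultdict(list)
    for minute_of_day, value in samples:
        # Disjoint one-hour windows: minutes 0-59 -> hour 0, 60-119 -> hour 1, ...
        by_hour[minute_of_day // 60].append(value)
    return {
        hour: {"mean": mean(vals), "min": min(vals), "max": max(vals)}
        for hour, vals in sorted(by_hour.items())
    }


# Made-up flow counts (FL) for one host: twelve five-minute totals in hour 00.
flow_counts = [4, 0, 7, 3, 3, 9, 1, 0, 2, 6, 5, 8]
example = list(zip(range(0, 60, 5), flow_counts))
result = hourly_aggregates(example)
print(result[0])
```

<p>The same reduction would apply to the distinct counts, with each five-minute value being a number of distinct peers, ports, protocols, AS numbers, or countries rather than a total volume.</p>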
<p> </p>
<p><em><strong>Dataset format</strong></em></p>
<ul>
<li>The dataset is in <strong>comma-separated values (CSV)</strong> format. </li>
<li><strong>Header</strong> - multilevel, first 3 lines
<ul>
<li>Level 1 - aggregation type {mean|min|max}</li>
<li>Level 2 - variable {see above}</li>
<li>Level 3 - hour of the day {00,01,02,03,...,22,23}</li>
</ul>
</li>
<li><strong>Labels</strong> - last 4 columns</li>
<li><strong>Dataset size </strong>
<ul>
<li>rows: 65536 host records + 3 headers</li>
<li>columns: 648 variables + 4 labels</li>
</ul>
</li>
</ul>
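<p>The column count follows from the header structure: 3 aggregation types × 9 variables × 24 hours = 648. The sketch below checks that arithmetic and shows one possible way to read a three-line header with Python's standard <code>csv</code> module. It rests on assumptions: the short variable names are taken from the description and may not match the exact strings in the file, the column order is not specified, the helper <code>read_multilevel_header</code> is hypothetical, and the inline sample is made-up data rather than an excerpt from the dataset. How the label columns appear in the header rows is also not covered here.</p>

```python
import csv
import io
from itertools import product

# Header levels as described above; exact spellings in the file are an assumption.
AGGREGATIONS = ["mean", "min", "max"]
VARIABLES = ["FL", "PKT", "BYT", "DUR", "PEER", "PORTS", "PROTO", "AS", "CTRY"]
HOURS = [f"{h:02d}" for h in range(24)]

# Every (aggregation, variable, hour) combination is one variable column.
expected_columns = list(product(AGGREGATIONS, VARIABLES, HOURS))
print(len(expected_columns))  # 3 * 9 * 24


def read_multilevel_header(handle, levels=3):
    """Read the first `levels` CSV lines and zip them into per-column tuples.

    Returns (columns, reader), where `reader` continues at the first data row.
    """
    reader = csv.reader(handle)
    header_rows = [next(reader) for _ in range(levels)]
    return list(zip(*header_rows)), reader


# Tiny made-up sample with the same three-level header shape.
sample = io.StringIO(
    "mean,mean,min\n"
    "FL,FL,FL\n"
    "00,01,00\n"
    "4.0,2.5,0\n"
)
columns, rows = read_multilevel_header(sample)
first_record = next(rows)
print(columns)
print(first_record)
```

<p>For the real file, the same helper could be given a file object opened on the downloaded CSV. The values are returned as strings and would still need converting to numbers.</p>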
<p> </p>
Zenodo
2019-05-06
info:eu-repo/semantics/other
2669078
1.0.0
1579893896.382838
158716697
md5:1a72f130f9bfd95c3107309419221ad2
https://zenodo.org/records/2669079/files/host-network-traffic-time-series-2019-01-annon.csv
public
10.5281/zenodo.2669078
isVersionOf
doi