<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <title>Stratosphere Datasets</title>
  <style>
    code{white-space: pre-wrap;}
    span.smallcaps{font-variant: small-caps;}
    span.underline{text-decoration: underline;}
    div.column{display: inline-block; vertical-align: top; width: 50%;}
    div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
    ul.task-list{list-style: none;}
  </style>
</head>
<body>
<header id="title-block-header">
<h1 class="title">Stratosphere Datasets</h1>
</header>
<h1 id="ctu-sme-11-a-labeled-dataset-with-real-benign-and-malicious-network-traffic-mimicking-a-small-medium-size-enterprise-environment">CTU-SME-11: a labeled dataset with real benign and malicious network traffic mimicking a small medium-size enterprise environment</h1>
<h2 id="about">About</h2>
<p>CTU-SME-11 is a labeled network dataset designed to address the limitations of previous datasets. The dataset was captured in a real network that mimics a small-medium enterprise setting. Raw network traffic (packets) was captured from 11 devices using tcpdump for 7 days, from 20 to 26 February 2023, in Prague, Czech Republic. The devices were chosen to match the enterprise setting and consist of IoT, desktop, and mobile devices, both bare metal and virtualized. The devices were infected with malware or exposed to Internet attacks, and then factory reset to restore benign behavior.</p>
<p>The CTU-SME-11 dataset was created in 2023 by Štěpán Bendl as part of his master's thesis at the Stratosphere Laboratory, AIC, FEL, Czech Technical University in Prague, under the supervision of Sebastian Garcia and Veronica Valeros.</p>
<h2 id="citation">Citation</h2>
<p>Cite as: Bendl, Štěpán, Valeros, Veronica, &amp; Garcia, Sebastian. (2023). CTU-SME-11: a labeled dataset with real benign and malicious network traffic mimicking a small medium-size enterprise environment (1.0.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7958259</p>
<h2 id="dataset-specifications">Dataset specifications</h2>
<ul>
<li>Dataset name: CTU-SME-11</li>
<li>Dataset description: Network traffic captured from 11 devices over 7 days</li>
<li>Dataset duration: 24 hours × 7 days × 11 devices (1,848 device-hours)</li>
</ul>
<h2 id="dataset-file-description">Dataset file description</h2>
<p>The following files are included for each day of the dataset:</p>
<ul>
<li>raw/
<ul>
<li>.pcap: original packet capture file in pcap format</li>
</ul></li>
<li>zeek/
<ul>
<li>conn.log.labeled: Zeek connection log with labels added</li>
<li>Additional Zeek logs depending on the network traffic of each capture</li>
</ul></li>
<li>artifacts/
<ul>
<li>labels.config: netflow labeler rule configuration file</li>
</ul></li>
<li>README.md: Markdown README</li>
<li>README.html: HTML README</li>
</ul>
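<p>As a starting point for working with the Zeek logs, the following Python sketch reads a log such as conn.log.labeled into one dictionary per record. It assumes the files use Zeek's default tab-separated text format, in which a <code>#fields</code> header line names the columns. The function name <code>read_zeek_tsv</code>, the sample rows, and the <code>label</code> column name are hypothetical and chosen only for illustration; check the <code>#fields</code> line of the actual files for the real column names.</p>
<pre class="python"><code>import io


def read_zeek_tsv(handle):
    """Yield one dict per record from a Zeek TSV log, keyed by the #fields names."""
    fields = []
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("#fields"):
            # Column names follow the "#fields" token, separated by tabs.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            # Any other non-comment line is a data record.
            yield dict(zip(fields, line.split("\t")))


# Made-up sample for illustration only; it is not taken from the dataset.
SAMPLE = (
    "#separator \\x09\n"
    "#fields\tts\tid.orig_h\tid.resp_h\tlabel\n"
    "1676851200.0\t192.0.2.10\t198.51.100.7\tBenign\n"
    "1676851201.5\t192.0.2.11\t203.0.113.9\tMalicious\n"
    "#close\t2023-02-20-00-00-02\n"
)

records = list(read_zeek_tsv(io.StringIO(SAMPLE)))
print(len(records), records[1]["label"])
</code></pre>
<p>To read a real file, pass an open file handle instead of the in-memory sample, for example <code>open("zeek/conn.log.labeled", encoding="utf-8")</code>. All values are returned as strings, so convert timestamps and counters as needed.</p>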
<h2 id="dataset-timeline">Dataset timeline</h2>
<ul>
<li>Capture started on: 2023-02-20</li>
<li>Capture stopped on: 2023-02-26</li>
</ul>
<h2 id="ethical-statement">Ethical Statement</h2>
<p>This dataset contains human behavior. Consent was obtained from everyone who took part in generating the dataset, ensuring that participation was voluntary and that participants understood the research objectives, methods, and potential risks. To protect anonymity and confidentiality, participants were forbidden from using personal accounts on the devices they interacted with.</p>
<h2 id="contact">Contact</h2>
<p>These files were generated in the Stratosphere Laboratory, AIC, FEL, Czech Technical University in Prague, Czech Republic. Contact us at stratosphere@aic.fel.cvut.cz or sebastian.garcia@agents.fel.cvut.cz.</p>
</body>
</html>