Dataset Open Access
Constantinos Patsakis;
Fran Casino
Name: Exploiting Statistical and Structural Features for the Detection of Domain Generation Algorithms
DOI: https://doi.org/10.5281/zenodo.4010620
URL: https://zenodo.org/record/4010620
Published: 2020-09-01
Version: 1.0
License: https://creativecommons.org/licenses/by/4.0/legalcode
Keywords: DGAs
Creators: Constantinos Patsakis (University of Piraeus, https://orcid.org/0000-0002-4460-9331); Fran Casino (University of Piraeus, https://orcid.org/0000-0003-4296-2876)
Download (zip): https://zenodo.org/api/files/08c510cc-418b-4dc7-9bf0-6475a01478a4/dictionary_DGAs_dataset.zip

This repository contains a dataset for research on domain generation algorithms (DGAs) and machine learning. More precisely, it targets dictionary-based DGAs.

Constantinos Patsakis, Fran Casino: "Exploiting Statistical and Structural Features for the Detection of Domain Generation Algorithms", Journal of Information Security and Applications, 2021.

Features, ordered as in the shared dataset:

- Family: DGA that the domain belongs to
- SLD: Second-level domain (SLD) of the domain
- L-LEN: The length of Domain
- L-DIG: The number of digits in Domain
- L-CON-MAX: The maximum number of consecutive consonants in Domain
- R-CON-VOW: Number of consonants divided by L-LEN
- L-SYM: The number of special characters
- R-SYM-LEN: L-SYM divided by L-LEN
- R-Dom-3G: Ratio of benign grams in Dom-3G
- R-Dom-4G: Ratio of benign grams in Dom-4G
- R-Dom-5G: Ratio of benign grams in Dom-5G
- L-W2: Number of words with more than 2 characters in Domain
- L-W3: Number of words with more than 3 characters in Domain
- R-WS-LEN: Dom-WS divided by L-LEN
- R-WDS-LEN: Dom-WDS divided by L-LEN
- R-W2-LEN: Dom-W2 divided by L-LEN
- R-W3-LEN: Dom-W3 divided by L-LEN
- M2-Dom-Ws: 2-Chain Markov English grams applied to Dom-WS
- M2-Dom-WDS: 2-Chain Markov English grams applied to Dom-WDS
- E-Dom-WS: Entropy of Dom-WS
- E-Dom-WDS: Entropy of Dom-WDS
- E-Dom-W2: Entropy of Dom-W2
- E-Dom-W3: Entropy of Dom-W3
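
To make the simpler lexical features in the list above concrete, the following is a minimal Python sketch of how L-LEN, L-DIG, L-CON-MAX, R-CON-VOW, L-SYM and R-SYM-LEN might be computed for a single SLD. It is not the authors' implementation. The function names `lexical_features` and `shannon_entropy` are hypothetical, and several definitions are assumptions that the record does not spell out: vowels are taken to be `a, e, i, o, u`, a "special character" is taken to be anything that is neither a letter nor a digit, and "Domain" is taken to mean the SLD string. The entropy helper is a generic character-level Shannon entropy; the record does not describe how Dom-WS, Dom-WDS, Dom-W2 and Dom-W3 are built, so the helper is shown on plain strings only.

```python
import math
from collections import Counter

# Assumption: only a, e, i, o, u are vowels ('y' counts as a consonant).
VOWELS = set("aeiou")


def lexical_features(sld: str) -> dict:
    """Compute a few simple lexical features for one SLD (illustrative only)."""
    s = sld.lower()
    length = len(s)
    digits = sum(ch.isdigit() for ch in s)
    # Assumption: a "special character" is anything that is not a letter or digit.
    symbols = sum(not ch.isalnum() for ch in s)

    consonants = 0
    longest_run = 0  # longest streak of consecutive consonants seen so far
    run = 0          # length of the current consonant streak
    for ch in s:
        if ch.isalpha() and ch not in VOWELS:
            consonants += 1
            run += 1
            longest_run = max(longest_run, run)
        else:
            run = 0

    return {
        "L-LEN": length,
        "L-DIG": digits,
        "L-CON-MAX": longest_run,
        # Follows the record's wording: consonants divided by L-LEN.
        "R-CON-VOW": consonants / length if length else 0.0,
        "L-SYM": symbols,
        "R-SYM-LEN": symbols / length if length else 0.0,
    }


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits (generic helper)."""
    total = len(text)
    if total == 0:
        return 0.0
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(lexical_features("strong-box42"))
    print(shannon_entropy("aabb"))
```

The values in the published dataset may differ from what this sketch produces wherever the authors' definitions differ from the assumptions stated above.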
Metric | All versions | This version |
---|---|---|
Views | 134 | 134 |
Downloads | 11 | 11 |
Data volume | 574.4 MB | 574.4 MB |
Unique views | 109 | 109 |
Unique downloads | 11 | 11 |