CISA TTP Articles Data Set
Description
This dataset contains 77 cybersecurity articles crawled from the public CISA website. All articles were publicly available at the time of crawling, with no subscription or paid service required. They were published between July 2020 and February 2024 and were selected for this dataset if they explicitly mentioned MITRE ATT&CK TTPs (Tactics, Techniques, and Procedures).
The data set supports research in the domain of Cyber Threat Intelligence as it may act as a ground truth for TTP labeling. Specifically, this dataset is designed to facilitate research and analysis related to the identification and classification of TTPs in cybersecurity advisories.
Each crawled article is represented by the following four columns:
- RawText: The unfiltered text extracted from the main content of each article (class: "l-full__main").
- TTP: A set of MITRE ATT&CK TTP (Tactics, Techniques, and Procedures) IDs identified within the article's RawText. These IDs are extracted using the regex pattern: (?:TA\d{4}|T\d{4,5}(?:\.\d{3})?).
- CleanText: A cleaned version of the RawText, with tables and TTP IDs removed for clarity.
- URL: The URL to the original article.
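As an illustration of how the TTP column could be derived from RawText, the following minimal sketch applies the regex pattern quoted above. The function name `extract_ttp_ids` and the sample sentence are hypothetical and not part of the dataset tooling; only the pattern itself is taken from the description.

```python
import re

# Pattern quoted in the dataset description: tactic IDs (TA0001),
# technique IDs (T1566) and sub-technique IDs (T1566.001).
TTP_PATTERN = re.compile(r"(?:TA\d{4}|T\d{4,5}(?:\.\d{3})?)")


def extract_ttp_ids(raw_text: str) -> set[str]:
    """Return the set of distinct TTP IDs found in the text (hypothetical helper)."""
    # All groups are non-capturing, so findall returns the full matches.
    return set(TTP_PATTERN.findall(raw_text))


# Made-up sample sentence, for illustration only.
sample = "Actors used Phishing [T1566.001] for Initial Access [TA0001] and T1059."
print(sorted(extract_ttp_ids(sample)))  # ['T1059', 'T1566.001', 'TA0001']
```

Note that the pattern has no word-boundary anchors, so it would also match an ID-like sequence embedded in a longer token; whether the original extraction guarded against this is not stated.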
About the crawling process
All advisories were gathered on September 27, 2024 from the CISA website by going through all advisory URLs backwards in time until 2020. All articles that explicitly mentioned TTPs were selected for the data set. To detect the presence of TTP IDs, each article's main content was checked for any of the following phrases:
- "MITRE ATT&CK Tactics and Techniques"
- "Tactics and Techniques"
- "MITRE ATT&CK Techniques"
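The phrase check described above can be sketched as a simple substring test. This is only an assumption about how the selection was implemented: the helper name `mentions_ttps` is hypothetical, and the description does not say whether matching was case-sensitive (this sketch assumes it was).

```python
# Marker phrases listed in the dataset description.
MARKER_PHRASES = (
    "MITRE ATT&CK Tactics and Techniques",
    "Tactics and Techniques",
    "MITRE ATT&CK Techniques",
)


def mentions_ttps(main_content: str) -> bool:
    """Return True if any marker phrase occurs in the text (hypothetical helper)."""
    # Assumption: exact, case-sensitive substring matching.
    return any(phrase in main_content for phrase in MARKER_PHRASES)
```

Under plain substring matching, the second phrase already covers the first, since "Tactics and Techniques" is contained in "MITRE ATT&CK Tactics and Techniques".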
The data set is available in CSV and JSON formats, both containing the same data.
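A minimal sketch for reading the CSV file with the Python standard library is shown below. It assumes the file is UTF-8 encoded and that its header row uses the four column names listed above; neither is confirmed by the description, and `load_articles` is a hypothetical helper name.

```python
import csv


def load_articles(path: str) -> list[dict[str, str]]:
    """Read the CSV into a list of row dictionaries (hypothetical helper)."""
    # RawText cells can exceed the csv module's default 131072-character
    # field limit, so raise the limit before reading.
    csv.field_size_limit(10**9)
    # Assumption: the file is UTF-8 encoded with a header row.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling `load_articles("CISA-crawl-rt-ttp-ct.csv")` should then yield one dictionary per article, 77 in total if the file matches the description. How the TTP set is serialized inside its cell is not documented here, so it is left as a raw string.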
Acknowledgments: Funded by the European Union under the European Defence Fund (GA no. 101121403 - NEWSROOM and GA no. 101121418 - EUCINF). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for them. This work is co-funded by the Austrian FFG Kiras project ASOC (GA no. FO999905301).
Files
CISA-crawl-rt-ttp-ct.csv
(8.1 MB)
| Checksum | Size |
|---|---|
| md5:2c135f512d9312177c070e37782eaf0e | 4.0 MB |
| md5:8497bf685c04c5b68730946dbe203557 | 4.1 MB |
Additional details
Dates
- Collected: 2024-09-27