SecLLM
Description
Generative Artificial Intelligence, and in particular Large Language Models (LLMs) and LLM-based agents, have significantly changed the way we perform our daily activities. Features such as context-aware conversation and the ability to answer questions from virtually any domain are attracting growing attention from the researcher and practitioner communities, who need to understand and assess the strengths and weaknesses of these models. For instance, hallucination is a well-known issue in LLMs, as is the possibility of producing inappropriate answers when models lack filters or are biased or poisoned. Previous work has assessed LLMs in different contexts and scenarios, e.g., code generation. However, few studies address the context of information security; to our knowledge, no previous work has analyzed the quality of answers provided by LLMs to cybersecurity-related questions. Therefore, we present a dataset of 5K+ questions extracted from StackExchange, including their top-10 answers and the answers generated by three GPT models (3.5-Turbo, 4, and 4o); the dataset also includes similarity metrics (e.g., ROUGE, SacreBLEU, BERTScore) comparing the LLM-generated answers to the human-accepted ones.
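To illustrate the kind of lexical-overlap scoring the dataset reports, here is a minimal pure-Python sketch of a ROUGE-1-style unigram F1 between an LLM answer and a human-accepted one. This is an illustrative simplification, not the exact implementation (the dataset's metrics would normally come from libraries such as `rouge-score`, `sacrebleu`, and `bert-score`); the example strings are hypothetical.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (ROUGE-1-style) between two whitespace-tokenized texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each shared token counts at most min(cand, ref) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical human-accepted answer vs. LLM-generated answer.
human = "use a salted hash such as bcrypt to store passwords"
llm = "store passwords with a salted bcrypt hash"
print(round(rouge1_f1(llm, human), 3))  # prints 0.706
```

Semantic metrics such as BERTScore go beyond this by comparing contextual token embeddings rather than surface tokens, which is why the dataset reports both families of scores.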
Files (219.8 MB)

SecLLM.md

MD5 | Size
---|---
md5:21867b526ad85204474bef0bc51cb49e | 18.9 MB
md5:ff9c3d5bb474398ffd0b0f2a9a48b35d | 74.4 MB
md5:98f58d1296c460040b5250ea7f784f34 | 954 Bytes
md5:99ee05ffbac4b16a8c036ef2d4f7f94d | 72.1 MB
md5:39fd291fe7f76934de21587616c28cb2 | 54.4 MB
md5:2c1fb4747ad8922f69bd81cf110c8481 | 5.6 kB
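Since the record publishes an MD5 checksum per file, downloads can be verified locally. A minimal sketch using only the standard library follows; the file name and its pairing with a checksum are hypothetical, since the listing does not map names to hashes.

```python
import hashlib

def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading in chunks to bound memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

# Compare against the published checksum; file-to-hash pairing is hypothetical:
# assert md5sum("SecLLM.md") == "2c1fb4747ad8922f69bd81cf110c8481"
```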