Published February 8, 2023 | Version v0.5
Dataset Open

A StackExchange Dataset of Developer Questions Related to Checked-in Secrets in Software Artifacts

  • 1. North Carolina State University

Description

Throughout 2021, GitGuardian's monitoring of public GitHub repositories revealed a two-fold increase in the number of secrets (database credentials, API keys, and other credentials) exposed compared to 2020, accumulating more than six million secrets. To our knowledge, the challenges developers face in avoiding checked-in secrets have not yet been characterized. In our artifact, we provide a dataset of 779 questions related to checked-in secrets, asked by developers and mined from three StackExchange sites. In addition, we provide 434 accepted answers posted by other StackExchange users to mitigate the challenge of checked-in secrets.

 

An overview of the StackExchange artifact

Field Name         Description
Id                 A unique identifier of the question.
Title              The title of the question.
Body               The description of the question.
Tags               The tags related to the question, such as "security", "git" and "key-management".
CreationDate       The date when the question was posted.
Score              The number of upvotes on the question.
ViewCount          The number of users who viewed the question.
AnswerCount        The total number of answers posted on the question.
CommentCount       The total number of comments posted on the question.
FavouriteCount     The total number of users who marked the question as a favourite.
ClosedDate         The date when the community marked the question as closed.
URL                The URL of the question.
AcceptedAnswerId   The unique identifier of the accepted answer for the question.
Answer             The accepted answer of the question.
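
The fields above can be modelled as a typed record when working with the dataset in code. The following is a minimal sketch, not part of the artifact: it assumes each row is available as a mapping from the field names listed above to strings (for example, as produced by `csv.DictReader`), which may not match the actual file format inside the archive. The names `QuestionRecord`, `from_row`, and `_to_int` are hypothetical. Dates and tags are kept as raw strings because their exact encoding is not stated here.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


def _to_int(value: Optional[str]) -> Optional[int]:
    """Parse an integer-like cell; return None for empty or missing values."""
    if value is None or value.strip() == "":
        return None
    # Tolerate values exported as floats, such as "3.0".
    return int(float(value))


@dataclass
class QuestionRecord:
    """One question row, mirroring the field names in the overview table."""
    id: int
    title: str
    body: str
    tags: str                        # raw string; tag delimiter is not assumed
    creation_date: str               # raw string; date format is not assumed
    score: Optional[int]
    view_count: Optional[int]
    answer_count: Optional[int]
    comment_count: Optional[int]
    favourite_count: Optional[int]
    closed_date: Optional[str]       # None when the question was never closed
    url: str
    accepted_answer_id: Optional[int]
    answer: Optional[str]            # None when there is no accepted answer

    @property
    def has_accepted_answer(self) -> bool:
        return self.accepted_answer_id is not None


def from_row(row: Mapping[str, str]) -> QuestionRecord:
    """Build a QuestionRecord from a mapping keyed by the dataset field names."""
    return QuestionRecord(
        id=int(row["Id"]),
        title=row.get("Title", ""),
        body=row.get("Body", ""),
        tags=row.get("Tags", ""),
        creation_date=row.get("CreationDate", ""),
        score=_to_int(row.get("Score")),
        view_count=_to_int(row.get("ViewCount")),
        answer_count=_to_int(row.get("AnswerCount")),
        comment_count=_to_int(row.get("CommentCount")),
        favourite_count=_to_int(row.get("FavouriteCount")),
        closed_date=row.get("ClosedDate") or None,
        url=row.get("URL", ""),
        accepted_answer_id=_to_int(row.get("AcceptedAnswerId")),
        answer=row.get("Answer") or None,
    )
```

Since the dataset contains more questions than accepted answers, the sketch treats `AcceptedAnswerId` and `Answer` as optional and exposes `has_accepted_answer` for filtering.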

Notes

If you use this dataset, please cite it using the metadata from this file.

Files

setu1421/ICSE-2023-Artifacts-v0.5.zip

Files (961.4 kB)

md5:119508d3343e3fa376c26dbf4dab2465
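
The published MD5 checksum can be used to confirm that a downloaded copy of the archive is intact. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_published_md5` are hypothetical, and the local file path is whatever name the archive was saved under.

```python
import hashlib

# Checksum as listed for the archive on this page.
EXPECTED_MD5 = "119508d3343e3fa376c26dbf4dab2465"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path: str) -> bool:
    """Check a local file against the checksum listed on this page."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_published_md5("ICSE-2023-Artifacts-v0.5.zip")` would return `True` for an uncorrupted download saved under that (assumed) local name. MD5 is suitable here only as an integrity check against accidental corruption, not as a defence against tampering.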

Additional details