Dataset Open Access

The Clarity Software Documentation Dataset

Anonymous Authors


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.5822884", 
  "title": "The Clarity Software Documentation Dataset", 
  "issued": {
    "date-parts": [
      [
        2022, 
        1, 
        5
      ]
    ]
  }, 
  "abstract": "<p>This repository holds the Clarity Dataset, which is a companion to the SANER&#39;22 paper entitled &quot;An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation&quot;. The dataset consists of 45,998 captions, 10,204 GUI screenshots, and XML metadata files (akin to the &quot;html&quot; for stipulating GUIs) of Android applications. The NL captions were obtained from human labelers, underwent several quality control mechanisms, and contain both high-level (screen-level) and low-level (component-level) descriptions of screen functionality. This dataset is meant as a new source of data to augment techniques for software documentation that can take advantage of the rich pixel-based information contained within screenshots.</p>", 
  "author": [
    {
      "family": "Anonymous Authors"
    }
  ], 
  "version": "1.0", 
  "type": "dataset", 
  "id": "5822884"
}
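
A minimal sketch of how a record like the one above could be read in Python, assuming the export has been saved as text. The helper names `issued_date`, `plain_abstract`, and `doi_url` are hypothetical and not part of any Zenodo tooling, and the abstract in the embedded sample is shortened for brevity. It shows three things: turning the CSL `date-parts` array into a date, stripping the HTML markup and entities from the abstract, and building a resolvable DOI link.

```python
import html
import json
import re
from datetime import date

# Trimmed copy of the record above; the abstract is shortened (assumption for brevity).
CSL_RECORD = """
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.5822884",
  "title": "The Clarity Software Documentation Dataset",
  "issued": {"date-parts": [[2022, 1, 5]]},
  "abstract": "<p>This repository holds the Clarity Dataset&nbsp;which is a companion to the SANER&#39;22 paper.</p>",
  "version": "1.0",
  "type": "dataset",
  "id": "5822884"
}
"""


def issued_date(record):
    """Convert the first CSL date-parts entry to a date.

    CSL allows year-only or year-month entries, so missing
    month/day values default to 1.
    """
    parts = record["issued"]["date-parts"][0]
    year, month, day = (list(parts) + [1, 1])[:3]
    return date(int(year), int(month), int(day))


def plain_abstract(record):
    """Strip tags, decode HTML entities, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", "", record["abstract"])
    text = html.unescape(text)
    # str.split() with no arguments also splits on non-breaking spaces.
    return " ".join(text.split())


def doi_url(record):
    """Build a link through the doi.org resolver."""
    return "https://doi.org/" + record["DOI"]


if __name__ == "__main__":
    record = json.loads(CSL_RECORD)
    print(record["title"])
    print(issued_date(record).isoformat())
    print(plain_abstract(record))
    print(doi_url(record))
```

A regular expression is enough here because the abstract contains only simple paragraph tags; a real HTML parser would be the safer choice for richer markup.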
