Dataset Open Access

All Your Script Are Belong to Us: Collecting and Analyzing JavaScript Code from 10K Sites for 9 Months

Dimitris Mitropoulos; Panos Louridas; Vitalis Salis; Diomidis Spinellis


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.2593266", 
  "title": "All Your Script Are Belong to Us: Collecting and Analyzing JavaScript Code from 10K Sites for 9 Months", 
  "issued": {
    "date-parts": [
      [
        2019, 
        3, 
        14
      ]
    ]
  }, 
  "abstract": "<p>We present a massive dataset (~2 TB) of client-side JavaScript code. Specifically, we have collected and stored on a daily basis JavaScript code from Alexa's Top 10000 web sites (~7.5 GB per day) for nine consecutive months. Our collection involved both inline scripts extracted from each web site's main page and external scripts linked from it. To help researchers identify similar scripts and examine their popularity and evolution, we have produced hashes that represent the scripts' logical structure. Furthermore, we have analyzed the resulting dataset with well-established static analysis tools, generating additional metadata including reports with quality bugs and vulnerable libraries.</p>", 
  "author": [
    {
      "family": "Mitropoulos",
      "given": "Dimitris"
    }, 
    {
      "family": "Louridas",
      "given": "Panos"
    }, 
    {
      "family": "Salis",
      "given": "Vitalis"
    }, 
    {
      "family": "Spinellis",
      "given": "Diomidis"
    }
  ], 
  "type": "dataset", 
  "id": "2593266"
}
                   All versions   This version
Views              82             82
Downloads          10             10
Data volume        363.5 GB       363.5 GB
Unique views       73             73
Unique downloads   6              6
