Dataset Open Access

115th U.S. Congress Member Website (Full JavaScript-enabled Scrape) Collection

Rudis, Bob

This data set is a point-in-time, full JavaScript-enabled scrape of all available 115th U.S. Congress member websites. Data collection started and finished on 2018-04-13, and the results are stored in ndjson (also called JSON Lines or streaming JSON) format. The file format is described in the enclosed README.md file.
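
Because the results are gzip-compressed ndjson, they can be read one record at a time without loading the full 1.9 GB file into memory. The following is a minimal sketch in Python, assuming only that each non-empty line holds one complete JSON value; the helper name `iter_records` is hypothetical, and the actual field names are documented in README.md, not here.

```python
import gzip
import json
from typing import Any, Iterator


def iter_records(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a gzipped ndjson file.

    Hypothetical helper: assumes UTF-8 text with one JSON value per line.
    """
    # "rt" makes gzip decompress and decode to text as the file is iterated,
    # so only one line is held in memory at a time.
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines between records
                yield json.loads(line)
```

A reasonable first step is to run `next(iter_records("sample.json.gz"))` against the small sample file and inspect the result before streaming congress.json.gz.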

The data was used to evaluate the privacy profile of each U.S. Congress member's official (.gov hosted) website for the discussion in <https://rud.is/b/2018/04/13/does-congress-really-care-about-your-privacy/>.

ScrapingHub's "Splash" platform (<https://github.com/scrapinghub/splash>) was used along with the "splashr" R package (<https://github.com/hrbrmstr/splashr>) to retrieve the content.

Files (1.9 GB)

Name              Size      Checksum
congress.json.gz  1.9 GB    md5:fa06420a2a38d74c3d7b79fff917217d
LICENSE.txt       33 Bytes  md5:63f9a9d76e5388688597d204cecb9d5b
README.md         1.0 kB    md5:8c6db81cfc046a341fa04ecd306a69ca
sample.json.gz    1.4 kB    md5:6c26a5d59079b9056fdb869a5b69dda3
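
The MD5 checksums listed for each file can be used to confirm that a download arrived intact. The following is a minimal sketch in Python using the standard `hashlib` module; the function names `md5_of_file` and `verify_download` are hypothetical, and the sketch assumes the files were saved under their listed names.

```python
import hashlib

# Checksums copied from the file listing for this record.
EXPECTED_MD5 = {
    "congress.json.gz": "fa06420a2a38d74c3d7b79fff917217d",
    "LICENSE.txt": "63f9a9d76e5388688597d204cecb9d5b",
    "README.md": "8c6db81cfc046a341fa04ecd306a69ca",
    "sample.json.gz": "6c26a5d59079b9056fdb869a5b69dda3",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for the 1.9 GB archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, name: str) -> bool:
    """Compare a local file's digest with the listed checksum for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify_download("congress.json.gz", "congress.json.gz")` should return `True` for an uncorrupted copy. MD5 is suitable here as a transfer-integrity check only, not as protection against deliberate tampering.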
                  All versions  This version
Views             99            99
Downloads         14            14
Data volume       7.8 GB        7.8 GB
Unique views      93            93
Unique downloads  5             5
