Oregon State University Superfund Research Center: Automated Python Superfund NPL Site Scraper

Barton, Michael; Deal, Carter; Germano, Francesca; Anderson, Kim; Rohlman, Diana

doi:10.5281/zenodo.17595062

Published November 13, 2025 | Version v1

Software Open

1. Oregon State University

The Superfund NPL Site Scraper automates the collection and standardization of data from U.S. EPA Superfund resources. Built in Python, the tool retrieves site-level information from EPA online tables, Microsoft Excel files, and individual site profile pages. It uses requests, BeautifulSoup, and pandas to parse structured and semi-structured content, extract cleanup milestones, and normalize outputs into consistent CSV schemas (e.g., site ID, site name, location, operational status, milestone history).
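The parse-and-normalize step described above might look roughly like the following. This is a minimal sketch, not the code in superfund.py: the sample markup, the site IDs, the `COLUMN_MAP` names, and the `parse_site_table` helper are all hypothetical, and fetching the page (which the tool does with requests) is left out so the example runs offline.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample markup; real EPA pages are laid out differently.
SAMPLE_HTML = """
<table>
  <tr><th>EPA ID</th><th>Site Name</th><th>State</th><th>Status</th></tr>
  <tr><td> TEST0000001 </td><td>Example Harbor</td><td>or</td><td>Final NPL</td></tr>
  <tr><td>TEST0000002</td><td> Sample Landfill </td><td>WA</td><td>Deleted NPL</td></tr>
</table>
"""

# Hypothetical mapping from source headers to normalized CSV column names.
COLUMN_MAP = {
    "EPA ID": "site_id",
    "Site Name": "site_name",
    "State": "state",
    "Status": "status",
}


def parse_site_table(html: str) -> pd.DataFrame:
    """Parse the first HTML table into a DataFrame with normalized columns."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    rows = table.find_all("tr") if table is not None else []
    if not rows:
        # No table found: return an empty frame with the expected schema.
        return pd.DataFrame(columns=list(COLUMN_MAP.values()))

    # First row holds the headers; remaining rows hold one site each.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = [
        dict(zip(headers, (td.get_text(strip=True) for td in row.find_all("td"))))
        for row in rows[1:]
    ]

    df = pd.DataFrame(records, columns=headers).rename(columns=COLUMN_MAP)
    df = df[list(COLUMN_MAP.values())]       # enforce a fixed column order
    df["state"] = df["state"].str.upper()    # normalize state codes
    return df


sites = parse_site_table(SAMPLE_HTML)
csv_text = sites.to_csv(index=False)
```

Keeping the header mapping in one dictionary means the output schema stays stable even if the source table reorders its columns.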
The scraper is fully configurable, enabling users to add or modify target data fields without restructuring the codebase. Designed for repeated use, it supports research tracking, program reporting, and integration with Google Sheets and other database systems.
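One way such configurability could be arranged is to drive extraction from a field table, so that adding an output column means adding one entry rather than editing the parsing logic. The sketch below only illustrates that idea under assumed names (`FIELDS`, `normalize`, `rows_to_csv`, and the source labels are hypothetical). It does not reflect how superfund.py is actually organized.

```python
import csv
import io

# Hypothetical field configuration: output column -> source label and cleaner.
# Adding a new target field means adding one entry here.
FIELDS = {
    "site_id": {"source": "EPA ID", "clean": str.strip},
    "site_name": {"source": "Site Name", "clean": str.strip},
    "npl_status": {"source": "NPL Status", "clean": str.strip},
    "construction_complete": {
        "source": "Construction Completion Date",
        "clean": str.strip,
    },
}


def normalize(raw: dict, fields: dict = FIELDS) -> dict:
    """Map one scraped label/value record onto the configured output schema."""
    row = {}
    for out_name, spec in fields.items():
        value = raw.get(spec["source"], "")
        # Missing source labels become empty cells rather than errors.
        row[out_name] = spec["clean"](value) if value else ""
    return row


def rows_to_csv(rows: list, fields: dict = FIELDS) -> str:
    """Serialize normalized rows to CSV text using the configured column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up scraped record, missing the construction completion date.
raw_records = [
    {"EPA ID": " TEST0000001 ", "Site Name": "Example Harbor", "NPL Status": "Final"},
]
rows = [normalize(r) for r in raw_records]
output = rows_to_csv(rows)
```

Because the CSV header is derived from the same configuration, the output columns and the extraction rules cannot drift apart.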

Files (12.6 kB)

Name	MD5	Size
README.md	309b5029db59fb7c1ff6800fb753d005	3.4 kB
requirements.txt	0b660262f2b61c640e4f0ce66e4927b3	126 Bytes
superfund.py	bcabff2cd42582df00cbb449a6e0f427	9.1 kB

Additional details

Funding: National Institutes of Health
Identification of Remediation Technologies and Conditions that Minimize Formation of Hazardous PAH Breakdown Products at Superfund Sites (award 5P42ES016465-15)

Repository URL: https://github.com/bartonmike/superfund-npl-scraper
Programming language: Python
Development Status: Active

	All versions	This version
Views	22	22
Downloads	7	7
Data volume	34.3 kB	34.3 kB
