Oregon State University Superfund Research Center: Automated Python Superfund NPL Site Scraper
Authors/Creators
- Oregon State University
Description
The Superfund NPL Site Scraper automates the collection and standardization of data from U.S. EPA Superfund resources. Built in Python, the tool retrieves site-level information from EPA online tables, Microsoft Excel files, and individual site profile pages. It uses requests, BeautifulSoup, and pandas to parse structured and semi-structured content, extract cleanup milestones, and normalize outputs into consistent CSV schemas (e.g., site ID, site name, location, operational status, milestone history).
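The parse-and-normalize step described above can be sketched roughly as follows. This is an illustrative sketch, not the repository's code: to stay dependency-free it swaps in the standard-library `html.parser` and `csv` modules for the BeautifulSoup and pandas calls the tool itself uses. The sample markup, the site IDs, the `COLUMN_MAP` schema, and the names `TableParser`, `parse_sites`, and `to_csv` are all assumptions made up for the example; real EPA tables will have different columns.

```python
import csv
import io
from html.parser import HTMLParser

# Made-up sample markup and site IDs; real EPA tables will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Site ID</th><th>Site Name</th><th>State</th><th>Status</th></tr>
  <tr><td> ORD000000001 </td><td>Example Landfill</td><td>or</td><td>Final NPL</td></tr>
  <tr><td>ORD000000002</td><td> Sample  Smelter </td><td>OR</td><td>Deleted NPL</td></tr>
</table>
"""

# Hypothetical mapping from source column headers to output CSV column names.
COLUMN_MAP = {
    "Site ID": "site_id",
    "Site Name": "site_name",
    "State": "state",
    "Status": "status",
}


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Trim the cell and collapse runs of whitespace to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_sites(html):
    """Turn the first row into headers and the rest into normalized records."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    records = []
    for row in body:
        record = {COLUMN_MAP[name]: value for name, value in zip(header, row)}
        record["state"] = record["state"].upper()  # normalize state codes
        records.append(record)
    return records


def to_csv(records):
    """Serialize records to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


sites = parse_sites(SAMPLE_HTML)
csv_text = to_csv(sites)
print(csv_text)
```

Keeping the column mapping in one dictionary means the output schema stays the same even if a source table renames or reorders its headers; only the mapping needs updating.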
The scraper is configurable: users can add or modify target data fields without restructuring the codebase. Designed for repeated runs, it supports research tracking, program reporting, and integration with Google Sheets and database systems.
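One way such field-level configuration could work is to describe each target field as data and let a single generic function do the extraction. The sketch below assumes a hypothetical `FIELD_CONFIG` list, a hypothetical `extract_fields` helper, and made-up profile text in a simple "Label: value" layout; the repository's actual configuration format may be quite different.

```python
import re

# Made-up profile text; real site profile pages are HTML and laid out differently.
PROFILE_TEXT = """
Site Name: Example Landfill
EPA ID: ORD000000001
Final NPL Listing Date: 01/01/2000
"""

# Hypothetical configuration: adding a field means adding one entry here,
# with no change to the extraction code below.
FIELD_CONFIG = [
    {"name": "site_name", "label": "Site Name"},
    {"name": "site_id", "label": "EPA ID"},
    {"name": "npl_listing_date", "label": "Final NPL Listing Date"},
    {"name": "deletion_date", "label": "Deletion Date"},
]


def extract_fields(text, config):
    """Return one record with a key per configured field (None if absent)."""
    record = {}
    for field in config:
        # Match "<label>: <value>" on a single line; [ \t]* avoids
        # spilling onto the next line when the value is empty.
        pattern = rf"^{re.escape(field['label'])}:[ \t]*(.+)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        record[field["name"]] = match.group(1).strip() if match else None
    return record


profile = extract_fields(PROFILE_TEXT, FIELD_CONFIG)
print(profile)
```

Fields that are missing from a page come back as `None` rather than raising an error, so every record keeps the same set of keys and can be written to a fixed CSV schema.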
Files
README.md
Additional details
Funding
- National Institutes of Health
- Identification of Remediation Technologies and Conditions that Minimize Formation of Hazardous PAH Breakdown Products at Superfund Sites (award 5P42ES016465-15)
Software
- Repository URL
- https://github.com/bartonmike/superfund-npl-scraper
- Programming language
- Python
- Development Status
- Active