Published July 29, 2020 | Version v0.7.0
Software
Open
htmldate: A Python package to extract publication dates from web pages
Description
htmldate finds the original and updated publication dates of web pages using heuristics on HTML code and linguistic patterns. It can be used both within Python and from the command line.
Changes in v0.7.0:
- improved code base and performance
- minimum date available as an option
- support for Turkish patterns and CMS idiosyncrasies (thanks @evolutionoftheuniverse)
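The general idea of combining markup heuristics, text patterns, and a minimum-date bound can be illustrated with a small standard-library sketch. This is not htmldate's implementation or API: the function `sketch_find_date`, the `DATE_META_KEYS` set, and the default bound of 1995-01-01 are all hypothetical names and values chosen for illustration, and only ISO-style `YYYY-MM-DD` patterns are recognised here.

```python
import re
from datetime import date
from html.parser import HTMLParser

# Hypothetical selection of <meta> keys that often carry a publication date.
DATE_META_KEYS = {
    "article:published_time", "og:published_time",
    "date", "dc.date", "datepublished",
}

# YYYY-MM-DD not embedded in a longer run of digits.
ISO_DATE = re.compile(r"(?<!\d)(\d{4})-(\d{2})-(\d{2})(?!\d)")


class MetaDateParser(HTMLParser):
    """Collect `content` values of <meta> tags whose key looks date-related."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        key = (attributes.get("property") or attributes.get("name")
               or attributes.get("itemprop") or "").lower()
        if key in DATE_META_KEYS:
            self.candidates.append(attributes.get("content", ""))


def iso_dates(text):
    """Yield every valid calendar date written as YYYY-MM-DD in `text`."""
    for match in ISO_DATE.finditer(text):
        try:
            yield date(*(int(part) for part in match.groups()))
        except ValueError:
            continue  # e.g. 2020-13-45 is not a real date


def sketch_find_date(html, min_date=date(1995, 1, 1), max_date=None):
    """Return the first plausible date as an ISO string, or None.

    Markup hints (<meta> tags) are tried first, then the raw page text.
    Dates outside [min_date, max_date] are rejected (assumed bounds).
    """
    upper = max_date or date.today()
    parser = MetaDateParser()
    parser.feed(html)
    for source in (*parser.candidates, html):
        for found in iso_dates(source):
            if min_date <= found <= upper:
                return found.isoformat()
    return None
```

Calling `sketch_find_date(page_html, min_date=date(2015, 1, 1))` would skip any candidate older than 2015, which is the kind of filtering a minimum-date option makes possible. The real package handles many more formats and languages than this sketch.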
Files
Name | Size |
---|---|
adbar/htmldate-v0.7.0.zip (md5:af35a7b7d48d36bddb7b628d80d16090) | 5.4 MB |
Additional details
Related works
- Is supplement to: https://github.com/adbar/htmldate/tree/v0.7.0 (URL)