There is a newer version of the record available.

Published July 29, 2020 | Version v0.7.0
Software Open

htmldate: A Python package to extract publication dates from web pages

  • 1. Berlin-Brandenburg Academy of Sciences

Description

htmldate finds the original and updated publication dates of web pages using heuristics on HTML code and linguistic patterns. It can be used both within Python and from the command line.
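To make the idea of combining markup heuristics with linguistic patterns more concrete, here is a minimal standard-library sketch. It is not htmldate's implementation or API: the function `guess_date`, the set `DATE_META_KEYS`, the single English text pattern, and the default bounds are all hypothetical names and assumptions chosen for illustration. It first looks at date-bearing markup (`<meta>` and `<time>` elements), then falls back to a text pattern, and rejects candidates outside a configurable minimum and maximum date.

```python
import re
from datetime import date
from html.parser import HTMLParser

# Assumption: a small, hypothetical set of meta keys that often hold a date.
DATE_META_KEYS = {"article:published_time", "date", "dc.date", "datepublished"}

# ISO-like date (YYYY-MM-DD), possibly followed by a time part.
ISO_DATE = re.compile(r"(?<!\d)(\d{4})-(\d{2})-(\d{2})(?!\d)")

# One English linguistic pattern, e.g. "July 29, 2020".
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}
TEXT_DATE = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s+(\d{4})\b", re.IGNORECASE)


class MetaDateParser(HTMLParser):
    """Collects raw date strings from <meta> and <time> elements."""

    def __init__(self):
        super().__init__()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = (attrs.get("property") or attrs.get("name")
                   or attrs.get("itemprop") or "").lower()
            if key in DATE_META_KEYS and attrs.get("content"):
                self.candidates.append(attrs["content"])
        elif tag == "time" and attrs.get("datetime"):
            self.candidates.append(attrs["datetime"])


def safe_date(year, month, day):
    """Return a date object, or None if the numbers do not form a valid date."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def guess_date(html, min_date=date(1995, 1, 1), max_date=None):
    """Return the first plausible date as YYYY-MM-DD, or None."""
    max_date = max_date or date.today()
    parser = MetaDateParser()
    parser.feed(html)
    found = []
    # Heuristic 1: dates declared in the markup are tried first.
    for value in parser.candidates:
        match = ISO_DATE.search(value)
        if match:
            found.append(safe_date(*map(int, match.groups())))
    # Heuristic 2: fall back to a linguistic pattern in the page source.
    for match in TEXT_DATE.finditer(html):
        month = MONTHS[match.group(1).lower()]
        found.append(safe_date(int(match.group(3)), month, int(match.group(2))))
    # Keep only candidates inside the allowed date range.
    for candidate in found:
        if candidate and min_date <= candidate <= max_date:
            return candidate.isoformat()
    return None


sample = '<meta property="article:published_time" content="2020-07-29T10:00:00Z">'
print(guess_date(sample))  # prints 2020-07-29
```

Trying markup before free text reflects the assumption that explicitly declared metadata is more reliable than a date that merely appears somewhere in the page; the date bounds discard implausible matches such as years far in the past or future.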

  • code base and performance improved
  • minimum date available as an option
  • support for Turkish patterns and CMS idiosyncrasies (thanks @evolutionoftheuniverse)

Files

  • adbar/htmldate-v0.7.0.zip (5.4 MB), md5:af35a7b7d48d36bddb7b628d80d16090

Additional details

Related works