Project deliverable Open Access

LoCloud D2.6: Crawler ready tagging tools

Bergheim, Stein Runar; Slettvåg, Siri

The LoCloud Crawler Ready Tagging Tools (henceforth CRTT) are a set of experimental tools for automatically extracting structured metadata from the HTML mark-up of web documents. The objective is to assess whether the crawling and indexing method applied by mainstream search engines could be a viable, simplified supplement to the comprehensive Europeana ingestion process. To this end, the CRTT have been validated using small institutions as a test case.
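As a rough illustration of what extracting structured metadata from HTML mark-up can look like, the following is a minimal sketch using only the Python standard library. It is not the CRTT implementation: the class name `MetaTagExtractor`, the function `extract_metadata`, the sample page and the choice of reading `<title>` and `<meta>` tags are all assumptions made for this example.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Hypothetical helper: collects the <title> text and <meta> name/content pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Accept both name="..." (e.g. Dublin Core) and property="..." (e.g. Open Graph).
            key = attr.get("name") or attr.get("property")
            value = attr.get("content")
            if key and value:
                self.metadata[key.lower()] = value

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Keep the first non-empty text found inside <title>.
        if self._in_title and data.strip():
            self.metadata.setdefault("title", data.strip())


def extract_metadata(html):
    """Parse an HTML string and return the metadata found in its mark-up."""
    parser = MetaTagExtractor()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Invented sample page, for illustration only.
SAMPLE_HTML = """
<html><head>
<title>Old fishing boat</title>
<meta name="DC.creator" content="Example Museum">
<meta property="og:type" content="article">
</head><body><p>Description of the object.</p></body></html>
"""

if __name__ == "__main__":
    for key, value in extract_metadata(SAMPLE_HTML).items():
        print(f"{key}: {value}")
```

A crawler built along these lines would fetch each page, pass the HTML to `extract_metadata`, and map the resulting keys onto a target metadata schema. The fetching and mapping steps are left out here because the abstract does not describe them.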

This deliverable describes the rationale, technology, validation testing and next steps for the LoCloud CRTT.

LoCloud was funded by the European Commission's ICT Policy Support Programme. Grant Agreement number: 325099
Files (2.9 MB)
LoCloud-D2.6_Crawler_ready_tagging_tools.pdf (2.9 MB)
md5:2d841a9925fe60a4768eed2354650dcf