Project deliverable Open Access

LoCloud D2.6: Crawler ready tagging tools

Bergheim, Stein Runar; Slettvåg, Siri

The LoCloud Crawler Ready Tagging Tools (henceforth CRTT) are a set of experimental tools for automatically extracting structured metadata from the HTML mark-up of web documents. The objective is to test whether the crawling and indexing method used by mainstream search engines could be a viable, simplified supplement to the comprehensive Europeana ingestion process. To this end, the CRTT have been validated using small institutions as a test case.

This deliverable describes the rationale, technology, validation testing and next steps for the LoCloud CRTT.
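To illustrate the general idea of extracting structured metadata from HTML mark-up, the following is a minimal sketch using only the Python standard library. It is not the CRTT implementation. The class name `MetaTagExtractor`, the function `extract_metadata`, the sample page, and the choice to read only `<title>` and `<meta>` elements are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Hypothetical helper: collects <title> text and <meta> name/content pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}      # maps a lower-cased key to a list of values
        self._in_title = False  # True while the parser is inside <title>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Assumption: metadata is published via name= (e.g. Dublin Core)
            # or property= (e.g. Open Graph) attributes.
            key = attributes.get("name") or attributes.get("property")
            content = attributes.get("content")
            if key and content:
                self.metadata.setdefault(key.lower(), []).append(content.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.metadata.setdefault("title", []).append(data.strip())


def extract_metadata(html):
    """Parse an HTML string and return the collected metadata dictionary."""
    parser = MetaTagExtractor()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Invented sample page, used only to exercise the sketch.
SAMPLE_PAGE = """<html><head>
<title>Old fishing boat</title>
<meta name="DC.creator" content="Example Museum">
<meta property="og:type" content="article">
</head><body><p>Text</p></body></html>"""

if __name__ == "__main__":
    for key, values in extract_metadata(SAMPLE_PAGE).items():
        print(key, values)
```

A crawler-based approach of this kind would fetch each page, run an extractor like this over the mark-up, and map the resulting keys onto a target metadata schema. The mapping step is left out here because this record does not describe it.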

LoCloud was funded by the European Commission's ICT Policy Support Programme. Grant Agreement number: 325099
Files (2.9 MB)
LoCloud-D2.6_Crawler_ready_tagging_tools.pdf (2.9 MB)
md5:2d841a9925fe60a4768eed2354650dcf