Published February 19, 2015 | Version Final
Project deliverable
Open
LoCloud D2.6: Crawler ready tagging tools
Description
The LoCloud Crawler Ready Tagging Tools (henceforth CRTT) are a set of experimental tools for automatically extracting structured metadata from the HTML mark-up of web documents. The objective is to determine whether the crawling and indexing method used by mainstream search engines could be a viable, simplified supplement to the comprehensive Europeana ingestion process. To this end, the CRTT have been validated with small institutions as a test case.
This deliverable describes the rationale, technology, validation testing and next steps for the LoCloud CRTT.
Files
Name | Size |
---|---|
LoCloud-D2.6_Crawler_ready_tagging_tools.pdf (md5:2d841a9925fe60a4768eed2354650dcf) | 2.9 MB |