Published November 18, 2024 | Version v1
Project deliverable Open

Crawler Coordination Software Stack & Demonstrator V2

  • 1. University of Passau
  • 2. Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR)
  • 3. Leipzig University
  • 4. Radboud University Nijmegen
  • 5. European Organization for Nuclear Research

Description

This document provides a detailed overview of the second deliverable for D1.2 "The OpenWebSearch Crawler and the Crawling Frontier", part of the OpenWebSearch.eu initiative, which is supported by the European Commission (EC) under the Horizon Europe Framework Programme, grant agreement number GA 101070014. It outlines the achievements in developing and launching the Open Web Crawler (OWLer) and its related software components. These are reviewed in the context of their initial proposal stage as a Proof-of-Concept, whose main motivation is to investigate the feasibility of a distributed, heterogeneous, yet scalable crawling system. Our work sheds light on what a future software system, backed by more engineering and infrastructure resources, may look like in order to support the creation of the Open Web Index through the continuous collection of relevant web documents.

Notes (English)

The deliverable expresses the opinion of the authors and has not yet been approved by the EC.

Files

OpenWebSearch.eu Project - Deliverable D1.2 Crawler Coordination Software Stack & Demonstrator V2.pdf