Published July 10, 2024 | Version v1
Presentation Open

Bridging the Semantic Gap: Innovations in AI-driven Access to Czech Digital Libraries

  • 1. Moravian Library in Brno
  • 2. Library of Czech Academy of Sciences

Description

In recent years, the digitization of documents within Czech libraries and archives has reached unprecedented levels, presenting new challenges and opportunities for information retrieval. The traditional full-text search methods employed in the Kramerius digital library system, while effective in locating words or phrases, fall short in finding images or capturing the semantic nuances and meanings within the text. This proposal outlines two groundbreaking projects, Orbis Pictus and SemANT, that aim to revolutionize the accessibility and exploration of digitized content within Czech libraries, transcending the limitations of conventional search methods.

SemANT Project: Enhancing Meaningful Search and Navigation

The SemANT project is dedicated to overcoming the limitations of conventional full-text search methods by introducing a semantically enhanced search system. This approach allows users to delve beyond specific words and search for meaning and context within the text. Through the incorporation of automatic topic identification, users can seamlessly navigate related documents and explore areas of interest. The project also allows users to search by text segments (e.g., paragraphs) and to specify their own topics, creating a personalized and intuitive search experience.

Orbis Pictus Project: Unlocking Graphic Content through Machine Learning

While textual information dominates library collections, a significant portion of cultural heritage lies within graphic elements, including drawings, maps, photographs, and diagrams. The Orbis Pictus project leverages machine learning methods to identify, categorize, and contextualize these graphic elements within digitized documents. By extending the capabilities of digital libraries to include graphic content, the project enhances user access and opens new possibilities for creative industries. By bridging the semantic gap between textual and graphic content, Orbis Pictus contributes to a more holistic and inclusive exploration of our cultural heritage.

In this lecture, we will delve into the methodologies, challenges, and anticipated outcomes of these projects, showcasing their potential to redefine how users interact with digitized materials within digital libraries. The innovations presented in SemANT and Orbis Pictus mark a significant step forward in leveraging AI for a more enriched and meaningful exploration of cultural and historical artifacts. Additionally, the presentation will explore the necessary changes to the core of the Kramerius digital library system and its user interface. We will discuss how these enhancements align with the broader goal of transforming the user experience and expanding the capabilities of digital libraries in the context of evolving technologies and user expectations.

Files

Session1_FilipJebavy_LIBER 2024_updated.pdf

Size: 9.4 MB
md5:fa70818afcf0783cd4240573802ee221