Published July 2, 2025 | Version v1

The Information Potential of Books

Authors/Creators

  • 1. École Polytechnique Fédérale de Lausanne

Description

For practical and legal reasons, Large Language Models are primarily trained on contemporary, web-based texts and not on the vast array of content found in published books. As a consequence, their competence does not capture the rich diversity of knowledge that libraries have worked to preserve and make accessible. Because of this epistemic gap, libraries can potentially play a crucial role in the development of future versions of these models. In this presentation, I will discuss a computational strategy designed to effectively quantify and utilize the knowledge contained within books, addressing the opportunities and challenges for libraries in this process.

Files

Frédéric Kaplan_LIBER2025.pdf (14.2 MB)
md5:834c5374fde25a4ace9454143b468563