Thesis Open Access
The mission of libraries is to serve users’ information needs. Inevitably, the Library Catalog as a tool, and its objectives, evolve with users’ needs and seeking behavior. During the 20th century, the evolution of the catalog’s objectives prompted the identification of bibliographic entities, their attributes and their relationships. Thanks to entity-relationship modeling, these conceptualizations were first expressed by the FRBR model (Functional Requirements for Bibliographic Records). Nowadays, bibliographic entities need to be remodeled using the technologies of the Semantic Web, which render data machine-understandable and provide structure, meaning and trust to the existing World Wide Web. Linked data goes a step further by enabling the linking of these machine-understandable representations. In this context, bibliographic relationships and families may support the linking of bibliographic entities and exploration within and beyond the Library Catalog. Thus, library data will be linked to other data to serve new user tasks outside the library environment and in a wide variety of domains.
An overview of current bibliographic conceptual models reveals an abundance of them, differing in the number of bibliographic entities and relationships they define. Existing library linked datasets that have exploited these models differ considerably from one another in modeling and in the selection of vocabularies. Thus, even though linked data technologies are used, the understanding of the data in the datasets is not ensured. This is a semantic interoperability issue that needs to be resolved to avoid the development of library linked datasets that end up isolated and unused. Some related initiatives have been undertaken: two mappings between non-library models (schema.org and the Europeana Data Model) and FRBR have been attempted, and studies have been conducted, mostly on the interoperability between the models’ core entities. No mappings between library models exist, and almost no study addresses the preservation of bibliographic relationships as linking mechanisms in the linked data environment. Toward the goal of semantic interoperability and mappings, bibliographic conceptual models need to be compared to discover similarities and divergences in terms of modeling, granularity, constructs, and linking mechanisms.
The main research question of the thesis is: “Is semantic interoperability between conceptual bibliographic data models feasible?” To answer this question, the thesis sets four objectives: 1) to study and compare bibliographic models, identifying similarities and differences, 2) to develop mappings between the models, 3) to assess the mappings using a testbed, and 4) to identify any prerequisites or good cataloging practices for better mappings. The study focuses on five models of the library domain (FRBR and its consolidation LRM, FRBRoo, RDA, and BIBFRAME), as well as the EDM, a cultural heritage domain model. The inspection uses real-world cases to discover how each model represents core bibliographic entities, common bibliographic relationships (derivative, equivalence, and aggregates), and bibliographic families. This study reveals similarities that may enable semantic interoperability, as well as important differences that may impede it. The results have been organized using the Haslhofer and Klas categorization of metadata heterogeneities. A BIBFRAME-EDM application profile and three mappings (FRBR-BIBFRAME, RDA-BIBFRAME, and BIBFRAME-RDA) have been developed in an attempt to reconcile the heterogeneities identified between the models. The mappings are assessed using Gold Datasets, and the assessment ultimately demonstrates their success. In some cases semantics are lost after the conversions, but these losses stem from the models’ conceptualizations rather than from the mappings.
The results of the thesis confirm that semantic interoperability may be achieved under specific conditions. All the conditions, prerequisites and good practices identified during the study of the models, the development of the mappings, and their assessment with the Gold Datasets approach involve cataloging policy decisions. Thus, the final thesis statement advocates better cooperation between stakeholders and the adoption of a common mindset and common practices, both to resolve the heterogeneities of the past and to prevent new ones from arising.