Published June 26, 2024 | Version v1
Presentation Open

Hybrid ML/AI driven search as cataloging aid in Archipelago Commons

Authors/Creators

  • 1. Metropolitan New York Library Council, United States of America

Description

In this presentation we explain and showcase how Semantic and Image Similarity Search is being developed and implemented in Archipelago Commons, an open source software (OSS) repository system. This new feature will be publicly available and deployed by default for all our users starting with version 1.5.0. It brings a hybrid approach to AI/ML-driven search, and to its application, in our repository system. By adding new services to our existing Docker stack, using NLP and Linked Open Data (LoD) entities for important validation context, and providing end-user-facing tools, we ensure our community has complete control over what is included or excluded and over how query results are exposed to the world. Our hybrid approach also entails integrating an embeddings/feature extraction pipeline, vector-based search on our current Solr indexes, Spotify's Annoy as a service, and a custom IIIF Content Search API 2.x implementation with binary search capability.
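
As a rough illustration of what combining vector similarity with a conventional text constraint can look like, the following minimal Python sketch ranks records by cosine similarity between embeddings after an optional keyword filter. It is not Archipelago's implementation: the record layout, the function names (cosine_similarity, hybrid_search), and the toy three-dimensional vectors are all hypothetical, and the exact brute-force comparison used here stands in for the approximate nearest-neighbor lookup that a service such as Annoy or a Solr vector field would provide.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(query_vector, records, keyword=None, top_k=3):
    """Hypothetical helper: keyword filter first, then rank by embedding similarity."""
    # Lexical step: keep only records whose label contains the keyword, if one is given.
    candidates = [
        r for r in records
        if keyword is None or keyword.lower() in r["label"].lower()
    ]
    # Vector step: score every remaining record against the query embedding.
    scored = [(cosine_similarity(query_vector, r["vector"]), r["id"]) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record_id for _, record_id in scored[:top_k]]


# Toy data: in practice the vectors would come from an embeddings/feature extraction step.
RECORDS = [
    {"id": "obj-1", "label": "Harbor photograph", "vector": [1.0, 0.0, 0.0]},
    {"id": "obj-2", "label": "Harbor map", "vector": [0.9, 0.1, 0.0]},
    {"id": "obj-3", "label": "Portrait photograph", "vector": [0.0, 1.0, 0.0]},
]

if __name__ == "__main__":
    print(hybrid_search([1.0, 0.0, 0.0], RECORDS))
    print(hybrid_search([1.0, 0.0, 0.0], RECORDS, keyword="photograph"))
```

Filtering before scoring keeps the text constraint strict; a production system might instead blend a lexical relevance score with the vector score, which is a design choice this sketch does not attempt to model.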

Files

Pino_Hybrid_ML_AI_driven_search_as_cataloging_aid_OR_2024.pptx.pdf

Files (3.4 MB)