Published December 2, 2024 | Version v1
Conference paper Open

StreamSense: Policy-driven Semantic Video Search in Streaming Systems

  • 1. Universidad Rovira i Virgili
  • 2. DELL Technologies
  • 3. National Center for Tumor Diseases

Description

Streaming systems are an increasingly appealing substrate for managing video data via the stream abstraction. However, in a large stream collection, it can be hard for data scientists to discover and locate relevant videos, let alone specific video fragments. In this paper, we propose StreamSense: a policy-driven, semantic video search solution for streaming systems. StreamSense allows users to deploy, via policies, AI models that generate embeddings from video frames. Our system uses these embeddings to build a two-level index in a vector DB that efficiently handles inter- and intra-video queries. StreamSense hides vector DB interactions from users, so they can perform semantic search using images as input and visualize the results. We built our prototype on top of a tiered streaming storage system (Pravega) and validated it on a health-related use case. We show that StreamSense allows data scientists to search for video fragments in real surgery datasets in under 30 ms. StreamSense also reduces the data ingested when loading AI training data by more than 80% compared to simply bulk-loading video streams.
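
The two-level index mentioned in the description can be illustrated with a minimal in-memory sketch. This is not the StreamSense implementation: the class TwoLevelIndex, its methods, the mean-embedding video summary, and the 2-dimensional toy vectors are all assumptions made for illustration, and plain NumPy arrays stand in for the vector DB. The first level ranks whole videos (inter-video query); the second level ranks frames inside the selected videos (intra-video query).

```python
import numpy as np


def normalize(v):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


class TwoLevelIndex:
    """Hypothetical stand-in for a two-level index kept in a vector DB."""

    def __init__(self):
        self.frames = {}     # video_id -> (n_frames, dim) unit-norm frame embeddings
        self.summaries = {}  # video_id -> (dim,) unit-norm mean of the frame embeddings

    def add_video(self, video_id, frame_embeddings):
        frames = normalize(frame_embeddings)
        self.frames[video_id] = frames
        # Assumption: a video is summarized by the mean of its frame embeddings.
        self.summaries[video_id] = normalize(frames.mean(axis=0))

    def search(self, query, top_videos=1, top_frames=3):
        q = normalize(query)
        # Level 1 (inter-video): rank videos by similarity of their summary vector.
        ranked = sorted(self.summaries, key=lambda vid: -float(self.summaries[vid] @ q))
        hits = []
        # Level 2 (intra-video): rank frames only inside the selected videos.
        for vid in ranked[:top_videos]:
            scores = self.frames[vid] @ q
            for idx in np.argsort(-scores)[:top_frames]:
                hits.append((vid, int(idx), float(scores[idx])))
        # Best-scoring (video_id, frame_index, similarity) tuples first.
        return sorted(hits, key=lambda h: -h[2])


# Toy usage: in practice the query vector would come from embedding an input image.
index = TwoLevelIndex()
index.add_video("video_a", [[1.0, 0.0], [0.9, 0.1]])
index.add_video("video_b", [[0.0, 1.0], [0.1, 0.9]])
hits = index.search([0.0, 1.0], top_videos=1, top_frames=1)
print(hits)
```

In this sketch, the first level prunes the search to a few candidate videos, so frame-level comparisons are only computed where they are likely to matter. A real deployment would presumably delegate both levels to the vector DB's own similarity search rather than to in-process arrays.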

Files

embeddings_middleware_industry_2024 (2).pdf (904.2 kB)
md5:3de9dd1935182aa33114e2977c562754

Additional details

Funding

European Commission
NEARDATA - Extreme Near-Data Processing Platform 101092644
European Commission
CloudSkin - Adaptive virtualization for AI-enabled Cloud-edge Continuum 101092646