Published April 16, 2022 | Version 1.0.0
Report Open

Approximate Nearest Neighbor based information extraction with text and image data

  • 1. Kennesaw State University

Description

Most search engines use ranking algorithms such as Best Match 25 (BM25), which perform well and return top-ranked results but lack the ability to understand the semantics of what the user is searching for. The main task of this project is to apply deep learning techniques, using pre-trained models, to build a search engine that returns the most relevant results for a search query. Our main objective is to use deep learning to rank highly similar results at scale. The project also handles image data through an Image Captioning model trained on the Open Images V6 dataset [2]. We have successfully vectorized the text data from the 200,000+ Jeopardy! Questions dataset and written a search engine that vectorizes the search query and fetches the results with the least cosine distance.
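
The retrieval step described above (vectorize the query, then return the entries with the least cosine distance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `embed` function is a toy bag-of-words stand-in for the pre-trained model used in the report, the sample documents are invented, and the search is an exact brute-force scan rather than an approximate nearest neighbor index.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a pre-trained encoder (assumption, not the report's model).

    Returns a sparse bag-of-words count vector keyed by lowercase token.
    """
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def search(query, documents, top_k=3):
    """Exact brute-force search: score every document, keep the closest top_k."""
    query_vec = embed(query)
    scored = [(cosine_distance(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0])  # smallest distance first
    return scored[:top_k]


# Hypothetical sample data, not taken from the Jeopardy! dataset.
documents = [
    "this planet is known as the red planet",
    "the capital of france",
    "largest planet in the solar system",
]
results = search("red planet", documents, top_k=2)
for distance, doc in results:
    print(f"{distance:.3f}  {doc}")
```

At the scale of 200,000+ questions, the document vectors would normally be computed once and stored in an approximate nearest neighbor index, so that each query avoids a full scan.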

Files

Approximate_Nearest_Neighbor_based_information_extraction_with_text_and_image_data.pdf