Published April 16, 2024 | Version v1
Conference paper
Open
Bidirectional Image-to-Text and Text-to-Image Conversion using Deep Learning Models
Authors/Creators
Description
Abstract: In this paper, we introduce a system for bidirectional conversion between images and text using pre-trained deep learning models. Our approach uses a Vision Encoder-Decoder model for image captioning and Stable Diffusion for generating images from text prompts. Both models are integrated into a Streamlit application with a single interface: users can upload an image to receive an automatically generated caption, or enter a text prompt to generate a corresponding image. This lets them explore the interplay between visual and textual data in either direction.
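The two-way flow described in the abstract can be sketched as a small dispatcher that routes text prompts to an image generator and everything else to a captioner. This is a minimal illustration and not the authors' implementation. The abstract does not name specific checkpoints, so `nlpconnect/vit-gpt2-image-captioning` and `CompVis/stable-diffusion-v1-4` are assumptions, and the helper names `load_captioner`, `load_generator`, and `convert` are hypothetical. The Streamlit layer is left out.

```python
from typing import Any, Callable, Tuple

# Assumed checkpoints: the abstract names the model families, not the weights.
CAPTION_MODEL_ID = "nlpconnect/vit-gpt2-image-captioning"
DIFFUSION_MODEL_ID = "CompVis/stable-diffusion-v1-4"


def load_captioner(model_id: str = CAPTION_MODEL_ID) -> Callable[[Any], str]:
    """Build an image -> caption function on a Vision Encoder-Decoder model."""
    # Imported lazily so this module loads without the heavy dependency.
    from transformers import pipeline

    pipe = pipeline("image-to-text", model=model_id)

    def caption(image: Any) -> str:
        # The pipeline returns a list of dicts with a "generated_text" key.
        return pipe(image)[0]["generated_text"].strip()

    return caption


def load_generator(model_id: str = DIFFUSION_MODEL_ID) -> Callable[[str], Any]:
    """Build a prompt -> PIL image function on Stable Diffusion."""
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id)

    def generate(prompt: str) -> Any:
        # The pipeline output exposes the generated images as a list.
        return pipe(prompt).images[0]

    return generate


def convert(
    user_input: Any,
    captioner: Callable[[Any], str],
    generator: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Route one input to the matching direction.

    A str is treated as a text prompt and yields ("image", generated_image).
    Any other object (for example a PIL image) is treated as an image and
    yields ("text", caption).
    """
    if isinstance(user_input, str):
        prompt = user_input.strip()
        if not prompt:
            raise ValueError("prompt must not be empty")
        return ("image", generator(prompt))
    return ("text", captioner(user_input))
```

In an application, `load_captioner()` and `load_generator()` would be called once at startup and the resulting functions passed to `convert` for each upload or prompt. Injecting the two backends as plain callables keeps the routing logic testable without downloading any model weights.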
Files
| Name | Size | MD5 |
|---|---|---|
| Bidirectional Image-to-Text and Text-to-Image Conversion using Deep Learning Models.pdf | 555.9 kB | d7987dc66f74cd7da1f4d6bb10a213fe |