Image caption generation
Creators
Description
ABSTRACT
Image caption generation is a multidisciplinary task at the intersection of computer vision and natural language processing, which aims to automatically produce descriptive and coherent textual descriptions for given images. This process involves extracting meaningful visual features from images using techniques such as convolutional neural networks (CNNs), followed by generating relevant captions through language models, often utilizing recurrent neural networks (RNNs) or transformer architectures. Image captioning has significant applications in accessibility for visually impaired individuals, image retrieval, and content summarization. Recent advances leverage attention mechanisms and large-scale datasets to improve the accuracy and contextual relevance of generated captions, making the technology increasingly effective in understanding and describing complex visual scenes..
Keywords: Deep learning, Image Caption Generation, Visual Features, Attention Mechanism, Descriptive Text, , Large-scale Datasets, Contextual Relevance..
Files
12(47)3721-3725.pdf
Files
(1.1 MB)
Name | Size | Download all |
---|---|---|
md5:777bcd1425d4543dd6b7f1246f2b0307
|
1.1 MB | Preview Download |