Image caption generation

Kusa Anjana; M. Swetha

doi:10.5281/zenodo.16021723

Published July 15, 2025 | Version v1

Journal article | Open access

ABSTRACT

Image caption generation is a multidisciplinary task at the intersection of computer vision and natural language processing that aims to automatically produce coherent textual descriptions of given images. The process involves extracting meaningful visual features from an image, typically with convolutional neural networks (CNNs), and then generating a caption from those features with a language model, often a recurrent neural network (RNN) or a transformer architecture. Image captioning has significant applications in accessibility for visually impaired individuals, image retrieval, and content summarization. Recent advances leverage attention mechanisms and large-scale datasets to improve the accuracy and contextual relevance of generated captions, making the technology increasingly effective at understanding and describing complex visual scenes.

Keywords: Deep Learning, Image Caption Generation, Visual Features, Attention Mechanism, Descriptive Text, Large-scale Datasets, Contextual Relevance.
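
The attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain dot-product attention, and the names `attend`, `region_features`, and `decoder_state`, along with the toy numbers, are hypothetical stand-ins for CNN region features and a language-model hidden state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, decoder_state):
    """Dot-product attention over image regions (illustrative assumption).

    region_features: one feature vector per image region
                     (a stand-in for CNN encoder outputs).
    decoder_state:   current hidden state of the caption decoder,
                     with the same dimensionality as each region vector.
    Returns (weights, context): the attention weights over regions and
    the weighted sum of region features fed to the decoder.
    """
    # Similarity between each region and the decoder state.
    scores = [
        sum(f * h for f, h in zip(region, decoder_state))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: attention-weighted average of the region features.
    context = [
        sum(w * region[i] for w, region in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy, hand-made values chosen for illustration only (not from the paper).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

In this toy case the first and third regions align equally well with the decoder state, so they receive equal weight and the second region is largely ignored. A real captioning model would learn the feature vectors and the scoring function rather than fix them by hand.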

Files

12(47)3721-3725.pdf (1.1 MB, md5:777bcd1425d4543dd6b7f1246f2b0307)

Resource type

Journal article

Publisher

Advaita Innovative Research Association

Published in

International Journal of Research and Applications, 12(47), 3721-3725, ISSN: 2349-0020, 2025.

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: July 17, 2025
Modified: July 17, 2025
