Image Generation from Caption
Description
Generating images from a text description is as challenging as it is interesting. In a generative adversarial network (GAN), two networks are trained in competition, each acting as the other's rival. Since the introduction of GANs, rapid progress has been made in the field of computer vision. Taking GANs as the baseline model, this paper studies StackGAN, which consists of two GANs arranged in stages, and explains it step by step in an easily understood manner. The paper also presents a visual comparative study of other models that generate images conditioned on a text description. A single sentence can correspond to many images; to capture this multi-modal characteristic, conditioning augmentation is also performed. StackGAN performs better at generating images from captions because of its unique architecture: with two GANs instead of one, the first stage draws a rough sketch and the second corrects its defects, yielding a high-resolution image.
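The conditioning augmentation mentioned above can be sketched as follows. Rather than feeding the fixed text embedding directly to the generator, a latent conditioning vector is sampled from a Gaussian whose mean and variance are functions of the embedding, so one sentence can yield many valid conditioning vectors. This is a minimal NumPy sketch; the embedding size, latent size, and linear projections are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def conditioning_augmentation(text_embedding, w_mu, w_logvar):
    """Sample c ~ N(mu(e), diag(sigma(e)^2)) via the reparameterization trick."""
    mu = text_embedding @ w_mu            # mean as a linear function of the embedding
    logvar = text_embedding @ w_logvar    # log-variance, for numerical stability
    eps = rng.standard_normal(mu.shape)   # noise makes each sample a different c
    return mu + np.exp(0.5 * logvar) * eps

# Hypothetical sizes: a 1024-d text embedding mapped to a 128-d latent vector c.
e = rng.standard_normal(1024)
w_mu = rng.standard_normal((1024, 128)) * 0.01
w_logvar = rng.standard_normal((1024, 128)) * 0.01
c = conditioning_augmentation(e, w_mu, w_logvar)
print(c.shape)  # (128,)
```

Each call draws a different `c` for the same sentence embedding, which is what gives the generator its multi-modal behavior.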
Files

| Name | Size |
|---|---|
| 7201ijscai01.pdf (md5:d82cc88efa65e97df65949df1f5cc812) | 429.7 kB |