There is a newer version of the record available.

Published March 7, 2026 | Version v1
Journal article Open

AI-Assisted Manga Creation: A Workflow for Non-Artists

  • 1. Independent Researcher

Description

Making manga has traditionally required years of artistic training, a barrier that has kept many storytellers out of the medium. This paper asks whether generative AI can lower that barrier. I developed and tested a five-stage production pipeline that combines large language models for narrative writing with diffusion-based image synthesis for visuals, covering every step from initial story concept to finished page layout. To validate the approach, I produced a complete five-page manga chapter from scratch, without any formal drawing training, using ChatGPT (OpenAI, 2023), Stable Diffusion (Stability AI, 2022), Midjourney (Midjourney, 2023), and Clip Studio Paint. The results are encouraging: production time fell substantially compared with conventional methods, and three independent readers found the chapter coherent and visually engaging. However, keeping characters visually consistent across panels remained difficult, and current tools cannot fully replicate the emotional depth of a skilled human artist's work. Beyond the technical findings, the paper also discusses what AI-assisted creation means for professional artists, who owns the resulting work, and what it means to call such work creative.

Files

Mahadi_Islam_Alif_AI_Manga_Workflow_2026.pdf

Additional details

Dates

Created
2026-03-07

References

  • Zhang, L., Rao, A., & Agrawala, M. (2023). Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2023) (pp. 3836-3847). IEEE.
  • Midjourney, Inc. (2023). Midjourney v5: Image synthesis platform documentation. Retrieved March 7, 2026, from https://docs.midjourney.com/
  • McCloud, S. (1993). Understanding comics: The invisible art. HarperCollins Publishers.
  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998-6008.
  • McCormack, J., Gifford, T., & Hutchings, P. (2019). Autonomy, authenticity, authorship and intention in computer generated art. In Proceedings of the 8th International Conference on Computational Intelligence in Music, Sound, Art and Design (EvoMUSART 2019) (Vol. 11453, pp. 35-50). Springer.
  • Stability AI Ltd. (2022). Stable Diffusion: A latent text-to-image diffusion model [Computer software]. Retrieved March 7, 2026, from https://stability.ai/
  • Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840-6851.
  • Ho, J., Chan, W., Saharia, C., Whang, J., Gao, R., Gritsenko, A., Kingma, D. P., Poole, B., Norouzi, M., Fleet, D. J., & Salimans, T. (2022). Imagen video: High definition video generation with diffusion models. arXiv Preprint. arXiv:2210.02303
  • Adobe Inc. (2023). Adobe Firefly: Generative AI for creative workflows [Computer software].
  • Andersen v. Stability AI Ltd., No. 3:23-cv-00201 (N.D. Cal. 2023).
  • Animation Guild, IATSE Local 839. (2023). AI and the entertainment industry: A survey of members' experiences. IATSE.
  • Anthropic, PBC. (2023). Claude: A large-scale language model [Computer software].
  • Banet-Weiser, S. (2012). Authentic: The politics of ambivalence in a brand culture. NYU Press.
  • Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., & Ganguli, S. (2015). Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of the 32nd International Conference on Machine Learning (ICML 2015) (Vol. 37, pp. 2256-2265). PMLR.
  • Samuelson, P. (2023). Generative AI meets copyright. Science, 381(6654), 158-161. https://doi.org/10.1126/science.adi0656
  • European Union. (2024). Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 on Artificial Intelligence. Official Journal of the European Union.
  • Eloundou, T., Manning, S., Mishkin, P., & Rock, D. (2023). GPTs are GPTs: An early look at the labor market impact potential of large language models. arXiv Preprint. arXiv:2303.10130
  • Epstein, Z., Levine, S., Rand, D. G., & Rahwan, I. (2020). Who gets credit for AI-generated art? iScience, 23(9), Article 101515. https://doi.org/10.1016/j.isci.2020.101515
  • Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21) (pp. 610-623). Association for Computing Machinery.
  • Runway AI Inc. (2023). Gen-2: Multimodal AI research and video synthesis [Computer software]. Retrieved March 7, 2026, from https://runwayml.com/
  • Celsys Inc. (2023). AI functions in Clip Studio Paint [Computer software]. Retrieved March 7, 2026, from https://www.clip-studio.com/
  • Berndt, J. (2008). Considering manga discourse: Position, literature, authenticity. In M. W. MacWilliams (Ed.), Japanese visual culture: Explorations in the world of manga and anime (pp. 295-310). M.E. Sharpe.
  • Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., ... Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901.
  • Brynjolfsson, E., & McAfee, A. (2014). The second machine age: Work, progress, and prosperity in a time of brilliant technologies. W. W. Norton & Company.
  • Elgammal, A., Liu, B., Elbadawy, M., & Mazzone, M. (2017). CAN: Creative adversarial networks: Generating art by learning about styles and deviating from style norms. In Proceedings of the 8th International Conference on Computational Creativity (pp. 96-103). Association for Computational Creativity.
  • Getty Images (US), Inc. v. Stability AI Ltd., No. 1:23-cv-00135-UNA (D. Del. filed Feb. 3, 2023).
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2672-2680.
  • Grand View Research. (2022). Manga market size, share & trends analysis report, 2022-2030 (Report No. GVR-4-68039-225-7). Grand View Research, Inc.
  • HAKUREI. (2022). Waifu Diffusion v1.3: Anime-style latent diffusion model [Computer software]. Hugging Face. Retrieved March 7, 2026, from https://huggingface.co/hakurei/waifu-diffusion
  • NovelAI. (2022). NovelAI Diffusion: Anime image generation [Computer software]. Anlatan. Retrieved March 7, 2026, from https://novelai.net/
  • OpenAI. (2023). GPT-4 technical report. arXiv Preprint. arXiv:2303.08774
  • Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) (pp. 10684-10695). IEEE.
  • Ruiz, N., Li, Y., Jampani, V., Pritch, Y., Rubinstein, M., & Aberman, K. (2023). DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) (pp. 22500-22510). IEEE.