
Published December 31, 2022 | Version v1
Journal article | Open Access

Advancing Healthcare with Language Models: Leveraging the Power of Large Language Models for Transformative Impact

Authors/Creators

Description

Large language models (LLMs) have emerged as powerful tools in healthcare, offering transformative solutions to improve patient care, streamline clinical workflows, and enhance medical research. These models, built upon advanced natural language processing (NLP) techniques and trained on vast amounts of text data, can understand, generate, and analyze human language with unprecedented accuracy and sophistication. This paper provides a comprehensive overview of LLMs in healthcare, covering their fundamentals, applications, advantages, challenges, and future directions. We discuss the evolution and development of LLMs, their key components and architectures, and the training and fine-tuning processes involved. Furthermore, we explore a range of applications of LLMs in healthcare, including clinical documentation, medical literature analysis, diagnostic assistance, and patient engagement. We also examine the advantages of LLMs in improving healthcare delivery, such as enhancing clinical decision-making, reducing administrative burden, and facilitating patient-provider communication. However, adopting LLMs in healthcare poses challenges, including ethical and privacy considerations, technical limitations, and the need for bias mitigation strategies. Through case studies and use cases, we highlight successful implementations of LLMs in healthcare settings and discuss lessons learned and best practices. Finally, we provide recommendations and guidelines for researchers, practitioners, and policymakers to harness the full potential of LLMs while ensuring their ethical and responsible use. This paper underscores the significance of LLMs in shaping the future of healthcare and calls for continued research and innovation in this rapidly evolving field.
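
To make the clinical-documentation use case concrete, the sketch below shows how a generic pretrained transformer can summarize a clinical note. It is a minimal Python illustration assuming the Hugging Face transformers library; the model checkpoint and the sample note are placeholder assumptions for this sketch, not taken from the paper.

from transformers import pipeline

# Illustrative only: any seq2seq summarization checkpoint could be used here;
# "t5-small" is a small, publicly available placeholder, not a model the
# paper evaluates.
summarizer = pipeline("summarization", model="t5-small")

# Hypothetical clinical note, invented for this example.
note = (
    "Patient is a 64-year-old male presenting with intermittent chest pain "
    "radiating to the left arm, onset two days ago. History of hypertension "
    "and type 2 diabetes. ECG shows nonspecific ST changes."
)

result = summarizer(note, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])

In practice, a production system would fine-tune such a model on de-identified clinical text and keep a clinician in the loop; the privacy, bias, and technical limitations discussed in the paper make fully automated use inadvisable.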

Files

1.5 MB (md5:991ee95ebf7008927563030001ddfcaf)

Additional details

References

  • [1]. E. Bender, et al., "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?", FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, March 2021, pp. 610–623. [Online]. Available: https://dl.acm.org/doi/abs/10.1145/3442188.3445922
  • [2]. T. Brown, et al., "Language Models are Few-Shot Learners," Advances in Neural Information Processing Systems 33 (NeurIPS 2020). [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html
  • [3]. R. Bommasani, et al., "On the Opportunities and Risks of Foundation Models (2022)." [Online]. Available: https://arxiv.org/abs/2108.07258
  • [4]. L. Ouyang, et al., "Training language models to follow instructions with human feedback," Advances in Neural Information Processing Systems 35 (NeurIPS 2022). [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/hash/b1efde53be364a73914f58805a001731-Abstract-Conference.html
  • [5]. H. W. Chung, L. Hou, S. Longpre, et al., "Scaling Instruction-Finetuned Language Models (2022)." [Online]. Available: https://arxiv.org/abs/2210.11416
  • [6]. A. Radford, et al., "Language Models are Unsupervised Multitask Learners," OpenAI Blog, 2019. [Online]. Available: https://insightcivic.s3.us-east-1.amazonaws.com/language-models.pdf
  • [7]. Z. Dai, et al., "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (2019)." [Online]. Available: https://arxiv.org/abs/1901.02860
  • [8]. W. Chan, N. Jaitly, Q. Le, and O. Vinyals, "Listen, Attend and Spell: A Neural Network for Large Vocabulary Conversational Speech Recognition," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). [Online]. Available: https://arxiv.org/abs/1508.01211
  • [9]. S. Bowman, et al., "Generating Sentences from a Continuous Space (2016)." [Online]. Available: https://arxiv.org/abs/1511.06349
  • [10]. J. Lu, D. Batra, D. Parikh, and S. Lee, "ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks," Advances in Neural Information Processing Systems 32 (NeurIPS 2019). [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2019/hash/c74d97b01eae257e44aa9d5bade97baf-Abstract.html
  • [11]. V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, "LibriSpeech: An ASR corpus based on public domain audio books," 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, QLD, Australia, 2015, pp. 5206–5210. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/7178964
  • [12]. Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le, "XLNet: Generalized Autoregressive Pretraining for Language Understanding," Advances in Neural Information Processing Systems 32 (NeurIPS 2019). [Online]. Available: https://arxiv.org/abs/1906.08237
  • [13]. K. S. Kalyan, A. Rajasekharan, and S. Sangeetha, "AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing (2021)." [Online]. Available: https://arxiv.org/abs/2108.05542
  • [14]. T. Wolf, et al., "Transformers: State-of-the-Art Natural Language Processing," Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. [Online]. Available: https://aclanthology.org/2020.emnlp-demos.6/
  • [15]. I. Tenney, et al., "BERT Rediscovers the Classical NLP Pipeline," presented at ACL 2019. [Online]. Available: https://arxiv.org/abs/1905.05950
  • [16]. M. V. Koroteev, "BERT: A Review of Applications in Natural Language Processing and Understanding (2021)." [Online]. Available: https://arxiv.org/abs/2103.11943
  • [17]. S. González-Carvajal and E. C. Garrido-Merchán, "Comparing BERT against traditional machine learning text classification (2021)." [Online]. Available: https://arxiv.org/abs/2005.13012
  • [18]. K. Ethayarajh, "How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings (2019)." [Online]. Available: https://arxiv.org/abs/1909.00512
  • [19]. I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and Harnessing Adversarial Examples," December 2014. [Online]. Available: http://dx.doi.org/
  • [20]. A. Graves, "Sequence Transduction with Recurrent Neural Networks (2012)." [Online]. Available: http://arxiv.org/abs/1211.3711
  • [21]. S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory (1997)." [Online]. Available: https://doi.org/10.1162/neco.1997.9.8.1735
  • [22]. D. Jin, Z. Jin, J. T. Zhou, and P. Szolovits, "Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment (2019)." [Online]. Available: http://arxiv.org/abs/1907.11932
  • [23]. M. Joshi, D. Chen, Y. Liu, D. S. Weld, L. Zettlemoyer, and O. Levy, "SpanBERT: Improving Pre-training by Representing and Predicting Spans (2020)," Transactions of the Association for Computational Linguistics 8, pp. 64–77. [Online]. Available: https://doi.org/10.1162/tacl_a_00300
  • [24]. D. Jurgens, S. Kumar, R. Hoover, D. McFarland, and D. Jurafsky, "Measuring the Evolution of a Scientific Field through Citation Frames (2018)," Transactions of the Association for Computational Linguistics 6, pp. 391–406. [Online]. Available: https://doi.org/10.1162/tacl_a_00028
  • [25]. N. Kalchbrenner, E. Grefenstette, and P. Blunsom, "A Convolutional Neural Network for Modelling Sentences (2014)," Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 655–665. [Online]. Available: https://doi.org/10.3115/v1/P14-1062
  • [26]. Y. Kim, "Convolutional Neural Networks for Sentence Classification (2014)," Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751. [Online]. Available: https://doi.org/10.3115/v1/D14-1181
  • [27]. E. Sezgin, J. Sirrianni, and S. L. Linwood, "Operationalizing and Implementing Pretrained, Large Artificial Intelligence Linguistic Models in the US Health Care System: Outlook of Generative Pretrained Transformer 3 (GPT-3) as a Service Model," JMIR Med Inform 2022;10(2):e32875. [Online]. Available: https://medinform.jmir.org/2022/2/e32875
  • [28]. B. Chintagunta, N. Katariya, X. Amatriain, and A. Kannan, "Medically Aware GPT-3 as a Data Generator for Medical Dialogue Summarization," Proceedings of the 6th Machine Learning for Healthcare Conference, PMLR 149:354–372, 2021. [Online]. Available: https://proceedings.mlr.press/v149/chintagunta21a.html