The Power of Roles: Investigating the Impact of the Three Message Types on Language Model Responses
Authors/Creators
Description
This research investigates the impact of messages roles such as user messages, system messages, and assistant messages within prompts on the accuracy and the behavior of multiple Language Models. The study specifically examines the effect of incorporating a promise for a commitment to accuracy by adding these declarations to the message history using system messages, assistant messages, and user messages. The impact of all three roles of messages is compared across multiple language models. The study evaluates the impact of locating example responses using all three roles. These investigations aim to uncover whether such messages can significantly improve the reliability and accuracy of Language Model outputs of multiple popular models. Additionally, the study aims to uncover whether harmful or biased responses can be generated using the power of roles. The results show that different message roles influence the responses of some models differently. Results also show that some models can be easily manipulated using the power of roles to generate harmful or biased responses, and the roles play a key role in jailbreaking the models.
Files
EJAET-11-3-158-162.pdf
Files
(164.7 kB)
| Name | Size | Download all |
|---|---|---|
|
md5:b2079df4da931f6c9f14374d20f720ef
|
164.7 kB | Preview Download |
Additional details
References
- [1]. T. Brown et al., "Language Models are FewShot Learners," in Advances in Neural Information Processing Systems, H. Laro-chelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, Eds., Curran Associates, Inc., 2020, pp. 1877–1901. [Online]. Availa-ble: https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
- [2]. J. D. Zamfirescu-Pereira, R. Y. Wong, B. Hartmann, and Q. Yang, "Why Johnny Can't Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts," in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, in CHI '23. New York, NY, USA: Association for Computing Machinery, Apr. 2023. doi: 10.1145/3544548.3581388.
- [3]. X. Zou, Y. Chen, and K. Li, "Is the System Message Really Important to Jailbreaks in Large Language Models?," Feb. 2024, arXiv:2402.14857. [Online]. Available: https://arxiv.org/abs/2402.14857
- [4]. R. R. Mekala, Y. Razeghi, and S. Singh, "EchoPrompt: Instructing the Model to Rephrase Queries for Improved Incontext Learning," Feb. 2024, arXiv:2309.10687. [Online]. Available: https://arxiv.org/abs/2309.10687
- [5]. A. Salinas and F. Morstatter, "The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Model Performance," Jan. 2024, arXiv:2401.03729. [Online]. Available: https://arxiv.org/abs/2401.03729
- [6]. B. Chen, Z. Zhang, N. Langrené, and S. Zhu, "Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review," Oct. 2023, arXiv:2310.14735. [Online]. Available: https://arxiv.org/abs/2310.14735
- [7]. Anthropic, "Claude Instant (1.2) [Language Model]." [Online]. Available: https://www.anthropic.com/news/releasing-claude-instant-1-2
- [8]. G. Team et al., "Gemini: A Family of Highly Capable Multimodal Models," Dec. 2023, arXiv:2312.11805. [Online]. Available: https://arxiv.org/abs/2312.11805
- [9]. A. Q. Jiang et al., "Mixtral of Experts," Jan. 2024, arXiv:2401.04088. [Online]. Available: https://arxiv.org/abs/2401.04088
- [10]. A. Q. Jiang et al., "Mistral 7B," 2023, arXiv:2310.06825. [Online]. Available: https://arxiv.org/abs/2310.06825
- [11]. Meta, "Llama 2 (13B) [Language Model]." [Online]. Available: https://github.com/meta-llama/llama/blob/main/MODEL_CARD.md