
Presentation (Open Access)

Reinforcement Learning from Human Feedback: A Tutorial at ICML 2023

Lambert, Nathan; Ustalov, Dmitry

Project manager(s)
Fedorova, Natalia
Researcher(s)
Pavlichenko, Nikita; Ryabinin, Max; Rajani, Nazneen; Tunstall, Lewis; Koshelev, Sergey

Reinforcement learning from human feedback (RLHF) has dramatically improved the real-world performance and user experience of large machine learning models. Still, this approach has primarily been applied at a scale of compute and data curation that limits academic availability. In this tutorial, we will describe the general framework of RLHF and explain the technical procedures required to apply this framework. The tutorial begins with a detailed conceptual overview and continues with an explanation of human-in-the-loop data collection procedures used when scaling state-of-the-art systems.
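
As a rough illustration of the preference-modelling step at the core of this framework (a minimal sketch, not taken from the tutorial materials), the snippet below trains a scalar reward model on pairwise human comparisons with a Bradley-Terry style loss. The encoder module and hidden_dim are hypothetical placeholders for whatever backbone produces a pooled sequence representation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    # Wraps a hypothetical `encoder` that maps token ids to a pooled
    # hidden state of shape (batch, hidden_dim).
    def __init__(self, encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.encoder = encoder
        self.value_head = nn.Linear(hidden_dim, 1)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.encoder(input_ids)            # (batch, hidden_dim)
        return self.value_head(hidden).squeeze(-1)  # one scalar reward per sequence

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: the response the human annotator preferred
    # should receive a higher scalar reward than the rejected one.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

In the full RLHF pipeline, the reward model is typically initialized from the same pretrained language model being fine-tuned, and the learned reward then drives a policy-optimization stage, commonly with PPO.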

Files (21.4 MB)
ICML2023-RLHF-Tutorial.pdf, 21.4 MB (md5:5be2cee9234f88f4b80ea03ce08412cd)
                   All versions   This version
Views                       696            696
Downloads                   589            589
Data volume             12.6 GB        12.6 GB
Unique views                646            646
Unique downloads            487            487
