RLHF
English
Noun
RLHF (uncountable)
- (machine learning) Initialism of reinforcement learning from human feedback.
  - 2023, Mohak Agarwal, Generative AI for Entrepreneurs in a Hurry, Notion Press:
    - ChatGPT and reinforcement learning with human feedback (RLHF) have revolutionized the AI landscape, providing an accessible and reliable platform for AI-enabled applications.
  - 2025 May 9, Mike Caulfield, “AI Is Not Your Friend”, in The Atlantic, retrieved 10 May 2025:
    - RLHF now seems more like a process by which machines learn humans, including our weaknesses and how to exploit them. Chatbots tap into our desire to be proved right or to feel special.
  - 2025 June 14, Melissa Heikkilä, “AI leaders rein in ‘sycophantic’ chatbots that flatter users”, in FT Weekend, Companies & Markets, page 12:
    - The “yeasayer effect” arises in AI models trained using reinforcement learning from human feedback (RLHF)—human “data labellers” rate the answer generated by the model as being either acceptable or not.