RLHF

English

Noun

RLHF (uncountable)

(machine learning) Initialism of reinforcement learning from human feedback.
- 2023, Mohak Agarwal, Generative AI for Entrepreneurs in a Hurry^[1], Notion Press, →ISBN:
  ChatGPT and reinforcement learning with human feedback (RLHF) have revolutionized the AI landscape, providing an accessible and reliable platform for AI-enabled applications.
- 2025 May 9, Mike Caulfield, “AI Is Not Your Friend”, in The Atlantic^[2], retrieved 10 May 2025:
  RLHF now seems more like a process by which machines learn humans, including our weaknesses and how to exploit them. Chatbots tap into our desire to be proved right or to feel special.
- 2025 June 14, Melissa Heikkilä, “AI leaders rein in ‘sycophantic’ chatbots that flatter users”, in FT Weekend, Companies & Markets, page 12:
  The “yeasayer effect” arises in AI models trained using reinforcement learning from human feedback (RLHF)—human “data labellers” rate the answer generated by the model as being either acceptable or not.

See also