What is Reinforcement Learning from Human Feedback?
Reinforcement learning from human feedback (RLHF) trains AI systems to generate text or take actions that align with human preferences. It has become one of the central methods for fine-tuning large language models; in particular, it was a key component in training GPT-4, Claude, Bard, and the LLaMA-2 chat models. RLHF