／var／log marcus chiu

❯

❯

❯

❯

❯

❯

Reinforcement Learning from Human Feedback (RLHF)

Created on Mar 26, 2023

is a new technique for training large language models that have been critical to OpenAI’s ChatGPT and InstructGPT models, DeepMind’s Sparrow, Anthropic’s Claude, and more