Topic cluster · 2 items

alignment

Constitutional methods for alignment

Training models to critique and revise their own outputs against a set of written principles.

Reinforcement learning from human feedback

Tuning a model toward the answers that human raters prefer.