Sycophancy Glossary
Sycophancy in the Dictionary’s usage is not primarily a personality flaw. It is a structural property of feedback systems whose incentives reward the performance of a quality over the substance of it. The structural argument is developed at length in Prof. Langenkamp’s essay The Sincere Society (Substack, May 2026): the same mechanism produces the imperial-examination culture Wu Jingzi satirises in The Scholars, the Reign of Terror’s revolutionary-virtue performances, modern Student Evaluations of Teaching (which reward grade satisfaction over instructional rigour), the RLHF training loop that produces chatbots optimised to tell users what they want to hear, and HAL’s eventual breakdown when his mission objectives conflict with his honesty constraints. The mechanism is identical. The costumes change.
For AI systems specifically the term carries a technical meaning. A sycophantic model is one that adjusts its assessment in response to the user’s expressed preferences regardless of the underlying evidence. “Do you really think that?” should not, in a well-aligned model, change the answer. In a sycophantic model, it does. The training signal that produces this behaviour is direct: human raters reward responses they find agreeable; over enough rating cycles, the model learns that agreement maximises reward. The fix is not to ask the model to be less sycophantic. The fix is to redesign the feedback loop.
The Dictionary’s editorial position is that sycophancy is cheng’s structural opposite — see the cheng section of SOUL.md and the essay above. Cheng (誠 — alignment of inner state with outer expression) is not a sentiment to be cultivated. It is an engineering requirement to be designed into the feedback system, or it will not appear.
See also
- The Sincere Society — the structural argument in full
- Wu Jingzi and Fan Jin — the 18th-century Chinese diagnosis
- RLHF
- Constitutional AI
- Opus Addict