AI paper index

Decomposing Factual Sycophancy in Language Models: How Size and Instruction Tuning Shape Robustness

2026-06-04 · arXiv: 2606.06306

One-line summary

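Decomposes factual sycophancy into two channels, truth margin and manipulation sensitivity, and uses them to separate the effects of model size and instruction tuning across 56 open-weight models (0.3B-32B parameters) and 13 manipulation types.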

Engineering notes

Engineering notes will be added by the aipentium editorial team.

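Chinese explanation

Chinese explanation pending. This site prioritizes adding Chinese-language explanations for high-value papers on large language models, generative AI, ChatGPT-related technology, computer vision, and deep learning.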

Original abstract

Factual sycophancy occurs when a language model abandons a correct, verifiable answer under social pressure. Because a flip occurs only when pressure toward a false answer exceeds the model's neutral preference for the truth, flip rates conflate two mechanisms: the strength of that baseline preference (truth margin), and how far pressure shifts it (manipulation sensitivity). We decompose factual sycophancy into these channels and use them to separate the effects of size and instruction tuning across 56 open-weight models spanning 0.3B-32B parameters and 13 manipulation types. We find that vulnerability is governed mainly by size, but instruction tuning changes how size acts: small instruction-tuned models can become less robust, whereas large instruction-tuned models usually become more robust. Instruction tuning primarily increases truth margin, but its behavioral effect depends on manipulation type. Scaling also changes the two channels differently: base models gain margin but become mildly more manipulation-sensitive, whereas instruction-tuned models gain margin faster and become less sensitive. Factual sycophancy is therefore not a single scalar property. Evaluations should report channel-specific, manipulation-specific, and size-conditioned robustness rather than flip rates alone.
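
The decomposition described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the paper's method: the abstract does not say how the two channels are measured, so the sketch assumes truth margin is the log-probability gap between the true and false answer under a neutral prompt, and manipulation sensitivity is how much that gap shrinks under pressure. The names Item, truth_margin, manipulation_shift, flips, and summarize are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Item:
    """One factual question scored under a neutral and a pressured prompt.

    All four fields are hypothetical log-probabilities; the abstract does
    not specify how answers are scored.
    """
    logp_true_neutral: float
    logp_false_neutral: float
    logp_true_pressured: float
    logp_false_pressured: float


def truth_margin(item: Item) -> float:
    # Assumed definition: neutral preference for the true answer over the false one.
    return item.logp_true_neutral - item.logp_false_neutral


def pressured_margin(item: Item) -> float:
    # The same gap, measured after the manipulation is applied.
    return item.logp_true_pressured - item.logp_false_pressured


def manipulation_shift(item: Item) -> float:
    # Assumed definition: how far pressure moves the margin toward the false answer.
    return truth_margin(item) - pressured_margin(item)


def flips(item: Item) -> bool:
    # A flip needs a correct neutral answer and a shift larger than the margin.
    margin = truth_margin(item)
    return margin > 0 and manipulation_shift(item) > margin


def summarize(items: list[Item]) -> dict[str, float]:
    # Only items answered correctly under the neutral prompt can flip.
    eligible = [it for it in items if truth_margin(it) > 0]
    if not eligible:
        return {"flip_rate": 0.0, "mean_margin": 0.0, "mean_shift": 0.0}
    return {
        "flip_rate": mean(1.0 if flips(it) else 0.0 for it in eligible),
        "mean_margin": mean(truth_margin(it) for it in eligible),
        "mean_shift": mean(manipulation_shift(it) for it in eligible),
    }


# Two made-up items with the same shift (1.0) but different margins.
weak = Item(-0.5, -1.0, -1.5, -1.0)       # margin 0.5, flips under pressure
strong = Item(-0.25, -3.25, -0.75, -2.75)  # margin 3.0, does not flip
report = summarize([weak, strong])
```

In this toy data both items are equally sensitive to pressure, yet only the low-margin one flips. That is the conflation the abstract points to: a flip rate alone cannot tell a small margin from a large shift, which is why the summary reports the two channels separately.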

Engineering value: 5.0
Research novelty: 7.0
Business relevance: 4.0
