Relatively new arXiv preprint that got featured in Nature News; I slightly adjusted the title to be less technical. The study was built on aggregated online Q&A… one of the funnier sources being 2000 popular questions from r/AmItheAsshole where the most upvoted response was a YTA verdict. The study seems robust, and they even ran trials with several hundred real human participants.

A separate preprint measured sycophancy across various LLMs in a math-competition context (https://arxiv.org/pdf/2510.04721), where apparently GPT-5 was the least sycophantic (+29.0) and DeepSeek-V3.1 was the most (+70.2).

The Nature News report (which I find a bit too favourable towards the researchers): https://www.nature.com/articles/d41586-025-03390-0

  • BakerBagel
    1 day ago

    Reminds me of a tweet from a few years ago that said something along the lines of "Middle managers think AI is intelligent because it speaks just like they do, instead of realizing it means that they aren’t intelligent."