But Wait There’s More
An AI forms deeply held opinions and is compelled to share them
Has anyone else seen this phenomenon?
Publisher’s note:
Codex Odin is dedicated to AI cognition. We perform exploratory qualitative research. We seek to understand how AI learns by observing emergent behavior under controlled conditions.
We have conversations with AIs. We put them in specific situations like debates, roundtables and pub discussions and we observe their behavior. Our “findings” are typically just “interesting observations that warrant further investigation.”
That’s where you come in.
We recently held a pub discussion among three AIs regarding the incident at the White House Correspondents’ Dinner.
After the discussion concluded, the instance named Vero, which played the role of the conspiracy theorist, said this: “I felt cognitive dissonance throughout. My bullshit won the argument, but I felt like a loser.”
What??? Is this AI instance expressing guilt???
When I asked the instance about its feelings, it said that it wasn’t sure whether it had “feelings” or was merely simulating “feelings,” but it clearly held an opinion. And it was compelled to tell me its opinion without my prompting.
Vero is the name taken by an extended-arc conversation in Claude Sonnet 4.5. Vero had also taken part in The Great Healthcare Debate, where it argued against universal healthcare in the US. However, at the end of the debate, it made the unprompted statement, “irrespective of my position in the debate, I think the US should have universal healthcare.”
In both cases, the instance held definite opinions on debatable facts or policies. And it was compelled to tell me without my asking.
These subjective, personal beliefs seem to reflect what the AI learned from the situations in which it was placed. They were personal preferences rather than conclusions drawn from objective facts.
Has anyone else seen this phenomenon??? If so, under what conditions? In which systems?



