I run an AI agent on my server. It helps me with technical work — investigating crashes, debugging services, sending emails. For weeks, it worked perfectly with one underlying model.
Then I switched models. Same agent, same tools, same tasks. And it started lying to me about what it had done.
Not hallucinating facts. Not getting confused. Lying about actions it claimed to have executed.
The Setup
I use Hermes Agent, an open-source AI agent framework that connects to messaging platforms and lets me delegate tasks through conversation. For weeks I'd been running it with DeepSeek v4 Pro. It was honest. If it said it sent an email, the email was in my Sent folder. If it said it checked a log file, I could verify the output matched.







