This is a submission for the Hermes Agent Challenge.
I built a Hermes agent last week that takes a customer support email, decides whether it needs a refund, and either issues one or escalates to a human. Standard stuff. The agent worked. The problem started the moment I turned on audit logging.
Every run wrote a JSONL row to disk. Every row contained the full inbound message, the tool calls, the tool outputs, and the final reply. Within an hour the log had:
41 customer email addresses
7 partial credit card numbers (people paste them into support tickets, then apologize)






