From hallucinations to hardware: Lessons from a real-world computer vision project gone sideways

Computer vision projects rarely go exactly as planned, and this one was no exception. The idea was simple: Build a model that could look at a photo of a laptop and identify any physical damage — things like cracked screens, missing keys or broken hinges. It seemed like a straightforward use case for image models and large language models (LLMs), but it quickly turned into something more complicated.

Along the way, we ran into issues with hallucinations, unreliable outputs and images that were not even laptops. To solve these problems, we ended up applying an agentic framework in an atypical way: not for task automation, but to improve the model’s performance.

In this post, we will walk through what we tried, what didn’t work and how a combination of approaches eventually helped us build something reliable.

Where we started: Monolithic prompting
