I built a niche AI English conversation app called Mesugaki AI English on Kotonia. "Mesugaki" (メスガキ) is a tsundere-style bratty persona popular in Japanese subculture: imagine a character who constantly mocks you but secretly has your back. At first glance this looks like a one-off gag product, but under the hood it is a two-layer design: a persona managed as code, plus Gemini audio input for actual pronunciation correction. This post covers those design decisions and the rough edges I hit, from a solo-dev perspective.
Why a Sarcastic AI English Tutor?
Strategy first. The AI chat market is a fight between Anthropic, OpenAI, and Google over general-purpose models, and solo devs can't win that head-on. But immersive experiences that combine a specific persona, voice, and roleplay sit low on big-lab R&D priority lists (getting a persona like this through internal approval would be a nightmare too). That's the gap Kotonia as a whole is targeting.
Three reasons I picked this specific persona for English learning:
Zero search competition. No SaaS is fighting for "mesugaki English conversation." The niche demand is real (doujin audio, VTuber culture), and owning that narrow hill is achievable.










