On May 7, OpenAI made gpt-realtime-2 generally available in the Realtime API at roughly $0.25 to $0.35 per minute of conversation, with GPT-5-class reasoning, 128K context, native translation, and 70+ languages. Coverage: TechCrunch, OpenAI's announcement, and a clean teardown at DataCamp. Five days later, on May 12, Vapi closed a $50M Series B at a $500M valuation and publicly positioned itself as the enterprise infrastructure layer, with Amazon Ring running 100% of its inbound calls through Vapi. Two days after that, Synthflow's enterprise page started leading with two deal proof points: a $230M multinational BPO operator running 40+ branded agents at 600K calls per month, and a top U.S.-based CRM platform white-labeling Synthflow at 500K calls per month, both inside 60 days. The full receipts are on Synthflow's enterprise comparison page.
Eleven days. Three events. One outcome. The middle of the AI voice stack collapsed, and the agency owner with five clients is the one customer left in the room whom nobody is engineering for.
Why this matters for AI voice agencies
For the last 18 months, the canonical AI voice agency stack was a wrapper or infra platform (Retell, Vapi, Voicerr, Synthflow) plus GoHighLevel plus Zapier plus Stripe plus Twilio plus a custom dashboard. The wrapper layer existed because the orchestration was hard. Stitching STT, an LLM, TTS, telephony, turn-taking, barge-in, latency tuning, and call quality together was a four-vendor problem with 200ms of jitter to manage. You paid the wrapper for the integration, not the model.
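To make the integration burden concrete, here is a minimal sketch of how per-turn latency adds up across a stitched pipeline. Only the stage names and the 200ms jitter figure come from the paragraph above. The vendor labels and per-stage latencies are hypothetical placeholders chosen for illustration, not benchmarks of any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    vendor: str       # hypothetical vendor label
    latency_ms: int   # illustrative figure, not a measured benchmark


# Hypothetical per-turn figures for a four-vendor stitched pipeline.
PIPELINE = [
    Stage("telephony", "vendor A", 80),
    Stage("STT", "vendor B", 200),
    Stage("LLM", "vendor C", 400),
    Stage("TTS", "vendor D", 150),
]

JITTER_MS = 200  # the jitter figure mentioned in the text


def turn_latency_ms(stages, jitter_ms=0):
    """Return (base, base + jitter) response latency for one conversational turn."""
    base = sum(stage.latency_ms for stage in stages)
    return base, base + jitter_ms


base, with_jitter = turn_latency_ms(PIPELINE, JITTER_MS)
vendors = {stage.vendor for stage in PIPELINE}
print(f"{len(vendors)} vendors, {base} ms base, {with_jitter} ms with jitter")
```

Every stage in the list is a separate contract, a separate failure mode, and a separate line in the latency budget. Under these assumed numbers, the jitter alone is a sizable fraction of the total, which is the kind of tuning the wrapper was being paid to absorb.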