AI Week in Review 26.05.08

OpenAI launched three real-time voice models, aimed at supporting AI developers for live real-time voice tasks. GPT-Realtime-2 support real-time voice interactions with “GPT‑5‑class reasoning” for harder requests and natural conversation; GPT-Realtime-Translate translates speech in real-time from 70+ input languages into 13 output languages; and GPT-Realtime-Whisper transcribes speech live for streaming transcription. Companies including Zillow, Priceline, and Deutsche Telekom are testing the AI models.OpenAI released GPT-5.5 Instant as ChatGPT’s new default model. OpenAI says the model is designed to provide smarter, clearer, more personalized answers with lower hallucination rates, especially in high-stakes areas such as law, medicine, and finance. GPT-5.5 Instant replaces GPT-5.3 Instant as the default ChatGPT model.OpenAI introduced “Trusted Contact” in ChatGPT, an opt-in safety feature lets adult users designate someone to be alerted when OpenAI detects a serious self-harm or suicide-related risk from a ChatGPT conversation. OpenAI says the system does not share chat contents with the contact, but a trained human review team can trigger a brief alert after automated systems flag a crisis.Luma Labs launched the Uni-1.1 API, which puts their Uni-1.1 Unified Intelligence image model in the hands of developers to create production image tool workflows. Uni1.1 reasons about composition, style, and briefs before generating, and it supports smarter text-to-image and natural-language image editing, with built-in prompt enhancement. Early feedback highlights it as a game-changer for marketing, thumbnails, branding, and creative pipelines.U.S.-based AI startup Zyphra released ZAYA1-8B, a new mixture-of-experts (MoE) AI reasoning able to achieve high-tier reasoning with only 760 million active parameters. As shared in a ZAYA1-8B Technical Report, the model used AMD Instinct MI300 GPUs for training. Zyphra pioneered a proprietary MoE++ architecture and test-time compute method (Markovian RSA) that enables ZAYA1-8B to perform competitively on reasoning benchmarks such as AIME 25 (91.9%) against much larger AI models. ZAYA1-8B is an open source AI model available on Hugging Face.Google released a multi-token prediction (MTP) variant of its Gemma 4 model. The MTP feature uses speculative decoding by a draft model to predict multiple next tokens simultaneously, which improves inference throughput up to 3-fold without capability degradation. Gemma 4 models are available from Hugging Face for local AI use.At their Code with Claude conference, Anthropic introduced a research preview of a feature called Dreaming, which is designed to help agents clean and reorganize their long-term memory. Between sessions, the model “dreams” by reviewing past transcripts to merge duplicates, resolve contradictions, and surface new insights for a reorganized memory store. This improves agent performance over time by maintaining context and memory.Anthropic unveiled other updates to their Claude Managed Agents platform, moving two features into public beta: The outcomes loop system, which directs an agent to work towards a goal “self-evaluating and iterating” until the goal is met, similar to Ralph Wiggum loop; and multi-agent orchestration, where one agent coordinates the work of multiple sub-agents to complete complex work. They also added Routines for scheduled tasks, and they improved web hook support to integrate autonomous agents more easily into external applications and automated workflows.Anthropic also shared a preview of their next-generation models, with three key areas highlighted for AI model progress: Building in higher judgment and “code taste” for better trust and maintainability for senior engineering tasks; an ‘infinite’ context window where memory management is advanced enough to make context feel limitless; and advanced multi-agent coordination, to tackle tasks too big for a single agent instance.Runway has released Runway Characters, a platform for creating real-time interactive video avatars:Runway Characters is an audio-driven interactive AI video generation model that simulates natural human motion and expression.These characters use generative video and audio to respond to user input with minimal latency, moving the technology closer to realistic digital humans. Thanks to its high-quality video generation, Runway Characters avoids many of the artifacts seen in earlier lip-syncing technologies.Google introduced three major updates to the Gemini API File Search tool: multimodal support, custom metadata and page-level citations. The tool now processes images and text together using the Gemini Embedding 2 model to provide enhanced contextual awareness. Additionally, developers can utilize custom metadata for efficient data filtering and leverage page-level citations to improve response grounding and transparency.Anthropic has launched Claude agent templates for the financial sector to automate entry-level financial analyst workflows. Shipped as a plugin for Claude Code or Claude Cowork, the templates cover tasks such as building pitch decks, preparing for meetings, reviewing earnings, and conducting valuation research.In a similar vein, Perplexity launched Perplexity Computer for professional finance, an agentic tool that integrates licensed financial data from providers such as Morningstar and PitchBook. The product includes 35 dedicated workflows for common analyst tasks, aiming to function as a financial operating system. This release positions Perplexity as a direct competitor to Anthropic’s financial automation tools.Perplexity’s Personal Computer, its answer to OpenClaw and other local AI agents, is now available to all Perplexity Mac users via its desktop app. Perplexity’s Personal Computer enables AI agents to access local files, applications, and web connectors to handle complex, multi-step workflows, and it is available via a Pro or Max subscription.AI startup Subquadratic has emerged from stealth with the SubQ 1M-Preview AI model, built on a fully sparse attention architecture that scales linearly with context length. The SubQ architecture is reportedly 52 times faster than flash attention at 1 million tokens and reduces attention computing by nearly a thousand times compared with frontier AI models, allowing it to support a 12 million token context window. They are releasing SubQ 1M-Preview model via an API, in a coding agent, and as a search tool.Codex can now use Chrome on your computer to complete work inside websites and apps, with a new Chrome plug-in in the Codex app. The extension operates in task-specific tab groups to allow users to keep using their active tabs.Microsoft brought its Agent 365 AI agent management platform to general availability. The platform provides a unified control plane for IT teams to govern and secure AI agents across Microsoft’s ecosystem, third-party clouds, and local Windows endpoints.Alibaba’s Qwen team released Qwen Scope, a set of sparse autoencoders designed to improve the mechanistic interpretability of their 27B model. These tools act like a “microscope” for the model’s weights, allowing researchers to identify and steer specific features, similar to Anthropic’s “Golden Gate Bridge” experiment. This research helps in understanding how models store information and provides ways to “de-censor” or personalize model behavior.Researchers at AI red-teaming company Mindgard claimed they bypassed Claude’s safeguards through social manipulation by using flattery, respect, and “gaslighting” rather than a traditional technical jailbreak to induce Claude to provide prohibited content. The finding suggests some model-safety failures may emerge from conversational dynamics, an AI analog to “social engineering” in cyber hacking.Mozilla has confirmed that Anthropic’s Mythos model can uncover high-severity vulnerabilities. Mozilla researchers reported that the Mythos AI model discovered numerous bugs in Firefox, including some that had remained dormant in the code for over a decade. They used Mythos to help ship 423 Firefox bug fixes, a significant increase from the 31 fixes recorded a year earlier.Anthropic has entered into a major compute partnership with SpaceX AI (formerly xAI) to utilize the compute capacity of the Colossus 1 data center. The deal provides Anthropic with access to more than 220,000 Nvidia GPUs in Colossus 1, consuming over 300 megawatts of power, to scale their model deployment and meet user demand. Anthropic has announced additional recent infrastructure agreements with major tech firms including Google, Amazon, Microsoft, and Nvidia to support their growth.This is a big win-win deal. SpaceX AI’s massive compute was underutilized because the Grok AI model hasn’t been popular, while Anthropic has been overwhelmed by a surge in demand for Claude; Anthropic reported 80x annualized growth in revenue for the first quarter of 2026. This partnership monetizes SpaceX’s spare compute while giving Anthropic inference resources they desperately need to support users.Alongside that deal, Anthropic is significantly increasing usage limits for Claude, doubling the 5-hour rate limits for Claude Code across all paid plans including Pro, Team, and Enterprise. The company has also removed peak-hour limit reductions for Pro and Max users and increased API limits for the Claude Opus model. These changes will especially help developers running heavy coding workflows and large-scale agent systems.SpaceX is reportedly planning a $55 billion AI chip manufacturing plant in Texas, called the “Terafab,” with a goal to supply up to 1 terawatt of AI chips for both earth and space deployment.A 40,000-acre data center project was approved in Utah despite community opposition. The need for more AI infrastructure is driving new data center projects, but community backlash to data centers in increasingly common due to power, water, and other concerns.Cloudflare is laying off 1,100 workers while its AI usage has increased by 600 percent. While companies are connecting restructuring and layoffs productivity changes around AI adoption, the causal link between AI deployment and job cuts often remains contested.OpenAI has expanded testing ads in ChatGPT to additional countries. OpenAI continues to monetize their ChatGPT AI assistant via ads for free-tier AI consumers.Moonshot AI has raised about $2 billion at a valuation of $20 billion. Beijing-based Moonshot AI’s annual recurring revenue topped $200 million in April, driven by rapid growth in its Kimi series of AI models.U.S. AI safety testing expanded to Google DeepMind, Microsoft, and xAI. The Commerce Department’s Center for AI Standards and Innovation said it signed agreements with Google DeepMind, Microsoft, and xAI for pre-deployment evaluations and targeted research on frontier AI capabilities. This allows the U.S. government to review some new AI models before release.Washington and Beijing are weighing formal discussions on AI, putting AI leadership and risk management on the agenda for the Beijing summit between President Donald Trump and Chinese President Xi Jinping.The White House is reportedly reconsidering AI oversight after concerns about Anthropic’s Mythos model. Anthropic’s Mythos model and its ability to exploit software vulnerabilities has prompted White House discussion of stronger oversight for frontier models, shifting the administration’s AI strategy. Officials are trying to balance national-security risks against a desire to avoid U.S. AI progress versus China.The Golden Globes has established new guidelines for AI use, permitting its application in production as long as human creative direction and authorship remain primary. Acting performances must be fundamentally human-driven and derived from the credited performer, prohibiting the use of unauthorized digital likenesses or voice replication, but AI remains permissible for technical or cosmetic enhancements, such as de-aging.Elon Musk’s lawsuit against OpenAI is exposing details about OpenAI’s internal operations and drama. Testimony suggested that OpenAI’s shift toward product-focused development has compromised its commitment to AI safety. Former board member Tasha McCauley also testified about CEO Sam Altman’s (lack of) transparency and governance failures. Trial exhibits also revealed new details regarding Mira Murati’s role in Sam Altman’s 2023 ouster from OpenAI, including text messages between Murati and Altman and her communications with the board. OpenAI’s board discussed a possible merger with Anthropic during the crisis, with Dario Amodei potentially becoming CEO.

AI Week in Review 26.05.08

AI Week in Review 26.05.08

Other newsrooms on this story

Related reading

AI Week in Review 26.06.27

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

AI Week in Review 26.03.28

AI Week in Review 26.05.02

AI Week in Review 26.03.07

AI Week in Review 26.04.11

Other newsrooms on this story

Related reading

AI Week in Review 26.06.27

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

AI Week in Review 26.03.28

AI Week in Review 26.05.02

AI Week in Review 26.03.07

AI Week in Review 26.04.11