AI Week in Review 26.07.04

Figure 1. Newly released Nano Banana 2 Lite produces an image to celebrate the USA’s 250th anniversary. Happy Independence Day!

Anthropic has restored global access to Claude Fable 5 on Claude and Claude Code following the U.S. Department of Commerce’s withdrawal of emergency export controls. Anthropic’s new release of Fable 5 features additional cybersecurity safety guardrails and a proposed Cyber Jailbreak Severity (CJS) framework to standardize software vulnerability risk assessments. Fable 5’s classifier has a higher safety margin in preventing harmful actions, so it may block some benign requests out of caution.

While Fable 5 is now available to all users, it is available through a Claude subscription only until July 7th; after that, using it requires expensive API usage credits. Subscribers, try it before it goes away. For developers, Anthropic’s Claude Fable 5 models are available again on Amazon Bedrock. Access to the specialized Mythos 5 model is limited to vetted, authorized organizations under Project Glasswing.

Figure 2. Fable 5’s classifier puts more requests into the safety margin, which means more benign prompts get blocked but also gives higher confidence that harmful outcomes are prevented.

Anthropic’s advice for using the powerful but expensive Fable 5 is to reserve the model for high-stakes, long-context judgment tasks rather than brute-forcing small operations. Define high-level outcomes rather than micromanaging step-by-step instructions, and save reusable context in Markdown files.

Anthropic released Claude Sonnet 5, positioning it as its most agentic Sonnet-class model yet, with improved planning, tool use, coding, and autonomous task execution. Sonnet 5 offers clear improvements over Sonnet 4.6, with excellent benchmark results that narrow the performance gap with Opus 4.8: Sonnet 5 scores 1615 on GDP-val, 63.2% on SWE-bench Pro, and 81.2% on OSWorld-verified.

Figure 3.
Sonnet 5 benchmarks place it well above Sonnet 4.6 but not at the Opus 4.8 level.

Claude Sonnet 5 offers near-flagship performance and improved agentic capabilities at a significantly lower cost than Anthropic’s Opus 4.8, with pricing of $2 / $10 per million input / output tokens. However, some testers have noted it may be less token-efficient than Opus, which could undo that cost advantage. Reaction to Claude Sonnet 5 has been mixed: it is close to but not at the frontier (coders might continue to use Claude Opus 4.8 for their hardest agentic coding tasks), and it has been overshadowed by the release of Fable 5.

Anthropic launched Claude Science, an AI workbench that integrates research tools, datasets, and visual generation to facilitate research and analysis. The workbench uses specialized agents to connect with life sciences databases and models like Nvidia’s BioNeMo, and it brings literature analysis, code execution, figure generation, manuscript drafting, and reproducibility tracking into a single environment. Currently in beta for Claude Pro, Max, Team, and Enterprise users, the workflow product can run locally on macOS and Linux platforms or through HPC access. Additionally, the company revealed plans to develop its own drugs, specifically targeting neglected diseases.

Google unveiled Nano Banana 2 Lite, a high-speed image model that generates images in four seconds, far faster than Nano Banana 2 yet at comparable quality. Google is natively integrating Nano Banana 2 Lite across its ecosystem, including Gemini applications, search features, and Google Photos. Aimed at enterprise image generation, it can be run from Google AI Studio or the Gemini API for just $0.034 per 1,000 images.

Figure 4.
Nano Banana 2 Lite is six times faster and cheaper per image than Nano Banana 2, with comparable quality.

Google DeepMind also released a public preview of Gemini Omni Flash, Google’s multimodal Omni model for conversational short video generation and editing. Gemini Omni Flash uses text, images, and clips to produce 720p videos with synchronized audio, scene consistency, and text insertion. Currently, video outputs are capped at 10-second clips and priced at $0.10 per second in the Gemini API and Google AI Studio.

Google has combined both models in demo applications for multimedia generation with AI, including SpaceLift, a tool for interior design exploration, and Omni Product Studio, which turns product images into a brief video showcase.

Google updated its NotebookLM platform to support the automated generation of short-form, vertical video summaries. Users can process existing document notes to render 60-second video overviews complete with synthetic voice commentary and basic visual layouts. Compiling and rendering the final media file takes a notable amount of background processing time.

Chinese AI lab Meituan released LongCat-2.0 as open source under an enterprise-friendly MIT license. LongCat-2.0 is a 1.6T-parameter Mixture-of-Experts model that dynamically activates between 33B and 56B parameters and has a 1-million-token context window. Trained on domestic Chinese ASICs, with its architecture and training specialized for agentic software engineering tasks, LongCat-2.0 delivers frontier-class performance on coding tasks, scoring 59.5% on SWE-bench Pro and 70.8% on Terminal-Bench 2.1.

Figure 5. LongCat-2.0 uses multi-teacher on-policy RL and distillation to train a unified AI model that excels at reasoning, agentic tasks, and instruction-following.

Google launched a new macOS version of Gemini Spark and released other Spark updates for desktop task automation and remote task execution via mobile devices.
The update introduces integrations with services like Canva and Dropbox, custom Model Context Protocol (MCP) support, and real-time tracking for news, sports, and finance. The desktop application integrates directly with local file systems, allowing the AI to interact with local directories and manipulate user files via natural language instructions.

Google expanded Gemini’s personalized image generation to all eligible U.S. users via Personal Intelligence. The update integrates Google Photos, Gmail, and YouTube to provide customized responses and images based on user context. This allows users to generate unique images that reflect their personal preferences and lifestyle.

Exo Labs announced local.ai, a site for comparing how AI models run on a user’s own hardware, including model capability, quantization, hardware, and workload benchmarks. The site is intended to show the best local model for a user’s hardware, the tradeoff versus cloud APIs, and whether local inference is cheaper than API tokens.

Exo Labs also previewed an Exo CLI for consumer-device inference. This “vLLM for consumer devices” handles model and runtime configurations for users wanting to run AI models on their own hardware. The CLI is expected to arrive in the coming weeks.

Sakana AI’s recently released Fugu now works in Codex and OpenCode. Fugu is Sakana AI’s multi-agent orchestration model that routes, coordinates, and verifies work across expert AI models for agents.

Hugging Face announced that Every Eval Ever results are now appearing on Hugging Face model pages, integrating community evaluation results with model discovery.
The post notes that AI evaluation data is currently scattered across papers, leaderboards, benchmark harnesses, and blog posts, and says the combined repository now contains around 229,000 results across more than 22,000 models and 2,200 benchmarks.

Meta AI released Brain2Qwerty v2, a non-invasive brain-to-text research system that decodes sentences from MEG brain recordings without surgical implants. Meta reports 61% word accuracy overall and 78% for the best participant, a major improvement over prior non-invasive approaches, and it is releasing training code plus related data to support open neuroscience research.

DeepSeek open-sourced DSpark, which accelerates AI model inference by up to 85% with improved speculative decoding. DeepSeek released a DSpark checkpoint of its DeepSeek-v4 model to showcase the capability and published the DSpark paper to explain the innovation. DSpark uses semi-autoregressive generation with draft models and confidence-scheduled verification, dynamically tailoring verification to the expected success rate. This reduces overhead and increases throughput. DeepSeek also released the DeepSpec codebase, which enables training speculative decoding for other open-weight models.

Google Research introduced TabFM, a zero-shot foundation model for tabular data. The goal is to bring the same kind of general-purpose model behavior seen in time-series forecasting to classification and regression tasks on structured tables, potentially reducing the need for task-specific model training in common business and scientific workflows.

China’s Kling AI has raised $2.8 billion in venture capital funding, valuing the AI video operation at $18 billion as the company seeks to spin off from Kuaishou Technology and expand its video AI operations.

Mark Zuckerberg informed Meta staff at a recent town hall that the development of AI agents has not progressed at the pace executives previously expected.
Although Meta laid off 8,000 employees and reassigned 7,000 others to AI groups earlier this year, its new structure has yet to yield the intended benefits, and morale in its AI unit is terrible.

Anthropic is exploring development of its own AI chips in collaboration with Samsung. The company is reportedly in contact with Samsung to investigate a partnership for custom hardware development.

OpenAI CEO Sam Altman has proposed giving 5% of the company’s equity to a U.S. sovereign wealth fund, suggesting that providing a public financial interest could blunt public backlash and ease regulatory tensions with the government. The 5% ownership could be worth $42 billion based on company metrics. The plan has sparked policy debates regarding AI oversight, as it could create systemic conflicts of interest for federal regulators.

Microsoft announced a new Microsoft Frontier Company focused on enterprise AI deployments. The project will be backed by a $2.5 billion investment and 6,000 industry and engineering experts. Early partnerships for the venture include the London Stock Exchange Group, Unilever, Land O’Lakes, and Accenture.

The Verge reported on a revived U.S. privacy bill that would restrict AI companies and data brokers from selling sensitive health and location data. As consumer AI assistants increasingly handle intimate personal information, some lawmakers are looking to treat chatbot data flows as a health-data surveillance problem.

AI is eating the world. OpenAI published new data on how ChatGPT adoption has expanded globally, noting that users primarily using non-English languages now represent more than half of active users. The analysis says Spanish, Portuguese, and Arabic are the leading non-English languages used in ChatGPT.
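
As a rough sanity check on a few figures quoted in this issue, the short Python sketch below works through the arithmetic: the company valuation implied by a 5% stake worth $42 billion, the maximum cost of one Gemini Omni Flash clip at the quoted cap and per-second price, and what a Sonnet 5 job costs at $2 / $10 per million tokens. The function names and the example token counts are illustrative assumptions and do not come from any vendor documentation; the 1.5x token multiplier is a made-up value used only to show how lower token efficiency erodes a per-token price advantage.

```python
# Back-of-the-envelope arithmetic on figures quoted in this issue.
# Function names and workload sizes are hypothetical, chosen for illustration.

def implied_valuation(stake_value_usd: float, stake_percent: float) -> float:
    """Company valuation implied by the dollar value of a percentage stake."""
    return stake_value_usd * 100 / stake_percent

def max_clip_cost(price_per_second_usd: float, max_seconds: int) -> float:
    """Cost of the longest clip allowed under a per-second price."""
    return price_per_second_usd * max_seconds

def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD for a job priced per million input/output tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# A 5% stake worth $42 billion implies this total valuation.
valuation = implied_valuation(42e9, 5)
print(f"Implied OpenAI valuation: ${valuation / 1e9:,.0f}B")   # $840B

# Gemini Omni Flash: $0.10 per second, clips capped at 10 seconds.
print(f"Max Omni Flash clip cost: ${max_clip_cost(0.10, 10):.2f}")  # $1.00

# Sonnet 5 at $2 / $10 per million input / output tokens.
# The 200k-in / 20k-out workload is an assumed example, not a measurement.
base = request_cost(200_000, 20_000, 2.0, 10.0)
print(f"Sonnet 5 job cost: ${base:.2f}")                        # $0.60

# If the same job needed 1.5x the tokens (assumed multiplier), cost scales too.
inflated = request_cost(300_000, 30_000, 2.0, 10.0)
print(f"Same job at 1.5x tokens: ${inflated:.2f}")              # $0.90
```

The last two lines illustrate the token-efficiency caveat: cost scales linearly with tokens used, so a lower per-token price only pays off if the model does not need proportionally more tokens to finish the same task.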
