AI Week in Review 26.01.31

Figure 1. Google’s Project Genie is a web app for world building and exploration built around the Genie 3 world model.

OpenClaw is your personal computer AI assistant and a viral sensation with 115K GitHub stars!

The open-source AI assistant originally named Clawdbot, which runs 24/7 on your own computer, became a viral sensation in recent weeks as users found it incredibly powerful and useful. Due to legal challenges from Anthropic, it changed its name to Moltbot and is now OpenClaw.

This open-source AI assistant connects to your messaging apps and computer applications, leverages Claude Skills, and remembers your preferences and prior tasks (via recorded Markdown files). With a good AI model (Claude AI models preferred), this versatile assistant can be personalized to automate many online tasks. Alexander Temerev said:

“I am very impressed how many hard things Clawd gets right. Persistent memory, persona onboarding, comms integration, heartbeats.”

Warning: OpenClaw is in development and has many security risks! Connecting your applications and subscriptions to an AI assistant with few guardrails exposes you to prompt injection attacks and privacy leaks.

While open source, it requires an AI model to run. To avoid massive Claude AI model token bills, you can connect it to local AI models like gpt-oss-20b via LM Studio or Ollama, or use Antigravity with OpenClaw to get free token access.

Crazy side-story: A social network for agents called Moltbook was started last week, and it’s going viral too.

China’s Moonshot AI introduced Kimi K2.5, an open multi-modal (text, image and video input) upgrade of its Kimi family built for “visual agentic intelligence.” Kimi K2.5 comes in instant, thinking, agent, and agent swarm flavors; the agent swarm version includes an orchestration feature for parallel agent tool use. Kimi K2.5 performs in the top tier of AI models, scoring 76.8% on SWE-Bench, comparable to Gemini 3 Pro, which makes it the strongest open AI model to date for coding.
It is particularly strong in front-end development due to its vision capabilities.

The Kimi K2.5 Agent Swarm mode decomposes complex tasks into parallel sub-tasks that are executed by dynamically instantiated domain-specific agents. The Agent Swarm mode itself directs and coordinates the agent swarm, so no external AI agent orchestration is needed. This feature goes beyond what any other frontier AI model offers, including Claude Opus 4.5, and gives a glimpse of where the next level of AI model performance may come from.

Alibaba’s Qwen team released their latest frontier AI model, Qwen3-Max-Thinking, a reasoning model trained for complex math, coding, and multi-step workflows. Qwen3-Max-Thinking has a 262,144-token context window and key new features: adaptive tool use, built-in web search, and test-time scaling with multi-round self-reflection. These help it in deep research and verification, long-context reasoning, and tool-using agentic tasks. With search and test-time scaling, Qwen3-Max-Thinking gets a SOTA 58% on the Humanity’s Last Exam (HLE) benchmark and 92.8% on GPQA Diamond.

Overall, Kimi K2.5 and Qwen3-Max-Thinking perform on par with GPT-5.2-Thinking, Claude Opus 4.5, and Gemini 3 Pro. With an anticipated new DeepSeek v4 release that could be a banger, it appears the top Chinese AI labs are close followers of their US counterparts.

Google introduced Project Genie, a web app powered by the Genie 3 world model, Nano Banana Pro and Gemini, that lets users create, explore and remix their own interactive worlds. This experimental research prototype lets users upload an image or describe a world, then turns it into a navigable environment. A remix feature can create new immersive environments from existing ones. Project Genie is available for Google AI Ultra subscribers in the U.S.

Google updated Gemini in Chrome to support agentic browsing, adding Gemini 3 support, more powerful integrations, and multi-tasking with AI agents through a new side panel interface.
This feature allows the browser to perform multi-step actions autonomously, such as filling out online forms or researching complex topics, transforming Chrome from a passive tool into an active productivity assistant. Google also plans to integrate Personal Intelligence in the coming months.

Google introduced Agentic Vision in Gemini 3 Flash, enhancing its visual understanding by processing visual information based on reasoning and agentic intent. Traditional vision-language models (VLMs) process a whole image in a single static pass, but Agentic Vision introduces a “Think-Act-Observe” loop that focuses on details relevant to its task and can iteratively re-look or refocus to understand important visual details. For example, it can focus on small image details, like a serial number or a road sign, interpret detailed data in an image, like reading a chart, or execute a task based on elements in a scene.

As a new feature of Gemini 3 Flash, this is available today via the Gemini API and is rolling out in the Gemini app (access it by selecting Thinking from the model drop-down).

Figure 2. Google’s Agentic Vision uses a loop to interleave vision, reasoning, tool use and code execution, which opens up more powerful multimodal AI.

Google enabled a direct transition from AI Overviews to AI Mode conversational chats, providing a seamless experience for search users, while making Gemini 3 the default model for AI Overviews. This change reduces friction between standard search queries and interactive AI assistance, and it moves Google Search closer to a vision of an AI “answer engine.”

OpenAI introduced Prism, a LaTeX-native, AI-powered workspace for scientific research and writing. Built on GPT-5.2, Prism embeds reasoning, editing, and reference-aware assistance directly into the research workflow, helping to unify drafting, citation management, and collaboration for scientists and technical writers while reducing reliance on fragmented toolchains.
Prism offers unlimited projects and collaborators and is available today to anyone with a ChatGPT personal account, with expanded features for ChatGPT Business, Enterprise, and Education users planned for later releases.

Figure 3. Prism by OpenAI is an AI-first collaboration and editing environment for scientific papers.

Mistral announced Vibe 2.0, an upgrade of its terminal-based AI coding tool that adds custom subagents, multi-choice clarifications from the agent, slash-command skills, unified agent modes, and automatic updates. Vibe 2.0 leverages Mistral’s Devstral 2 for coding applications and is available via Mistral’s Le Chat Pro and Team plans.

Anthropic has introduced interactive app integration into Claude’s AI assistant, enabling use of tools like Asana, Slack, Figma, and Box directly within Claude’s interface. These integrations allow users to trigger actions and manage workflows in third-party apps through Claude’s natural language interface, which should boost workflow efficiency by avoiding hopping between tools.

Further integrating Claude AI with tools, Anthropic also recently made Claude in Excel available to anyone with a Pro subscription. Those lacking Excel skills can leverage Claude in Excel to do Excel wizardry like building 11-tab financial models.

Vercel Labs has open-sourced agent-browser, an AI browser automation tool with a Rust-powered CLI, millisecond response times, and an innovative Refs system for token-efficient AI interaction.
This improves efficiency, reducing context sent to the LLM by 90% versus browser automations based on MCPs and Playwright.

Manus added Agent Skills, embracing the Skills standard for their platform and further cementing Skills as the way to package specialized AI agent capabilities and knowledge into reusable components.

Luma AI updated its Ray3 model to Ray3.14, delivering faster and more affordable 1080p generative video capabilities:

“Ray3.14 is our most professional and powerful model, now with native 1080p video generation. It is 4x faster and 3x cheaper, and gives you the best ever quality and stability, with improved motion consistency for Modify Video.”

Ray3.14 takes text, image and video input and is available in Luma’s Dream Machine platform.

Figure 4. Ray3.14 has improvements in resolution, speed, and quality over its predecessor.

KREA AI released Realtime Edit, which enables instantaneous modification of AI-generated images. With Realtime Edit, users can tweak compositions and details with immediate visual feedback, significantly speeding up the creative workflow and making it a game-changer for controlled AI image creation.

Airtable launched SuperAgent, its first standalone AI agent, dedicated to autonomous workflows that manage data and execute complex tasks across the Airtable platform without constant supervision.

Decart launched the Lucy 2 real-time video model, capable of streaming 1080p video in real time without buffering and offering immediate visual feedback for creators and developers. The Lucy 2 video model is available on Decart’s platform.

UAE’s MBZUAI, with G42 and Cerebras, released K2 Think V2, a 70B parameter open reasoning LLM built on the K2-V2 base model, with improved long-context support. It’s a “sovereign” AI model independent of both U.S.
and Chinese labs, and it is fully open, with model weights, data, and training recipes openly available, supporting further open AI model research.

Gemini CLI now supports custom hooks for its workflows. This enables Gemini CLI users to customize their agent loop, for example by managing context, enforcing policies, and validating actions.

Nvidia launched the Earth-2 Weather Model Suite, a fully open AI weather model suite for global and local forecasting. Earth-2 added a trio of open weather AI models: Earth-2 Medium Range for 15-day global forecasts, Earth-2 Nowcasting for 0–6 hour high-resolution severe-weather prediction, and Earth-2 Global Data Assimilation to rapidly generate current atmospheric states. Climate researchers and meteorologists are using Earth-2 to predict weather accurately in many useful applications.

Meta credited AI-driven improvements in content recommendation and ad performance for stronger-than-expected Q4 results. CEO Mark Zuckerberg confirmed Meta’s shift away from a metaverse-first strategy toward AI-centric platforms, with plans to invest up to $169 billion in AI infrastructure and talent in 2026.

During its Q4 2025 earnings call, Tesla revealed increased efforts in AI, self-driving cars, and robotics. The company outlined next-generation AI chip development (AI5 and AI6) and plans to scale Robotaxi in multiple cities, disclosed a $2 billion investment in xAI, and unveiled plans for full production of the Optimus 3 humanoid robot later in 2026. Tesla is no longer an EV company.

Northslope raised $22M to build mission-specific enterprise AI applications on top of Palantir’s infrastructure. The company’s pitch is that generic SaaS tools fail to meet mission-critical needs in regulated industries, so it will deliver deeply embedded, task-specific AI applications.

A growing gamer backlash against generative AI content has forced some game developers to cancel or revise AI-assisted projects.
Several studios have scaled back or canceled AI-assisted features following player criticism that AI content, dialogue, and characters undermine creativity and quality.

Meanwhile, US public skepticism about AI may influence upcoming elections and AI regulation debates. A survey shows that voters in Republican-leaning states broadly oppose rapid AI acceleration while supporting regulations on AI to protect minors.

A recent Keyfactor industry survey found that two-thirds of companies view autonomous AI agents as a greater security risk than humans, indicating elevated concerns about attack surfaces, agent behavior, and oversight. One takeaway is that AI risk management strategies must adapt to risks from agent autonomy.

AI risks are on the mind of Dario Amodei in his recent essay titled The Adolescence of Technology. A less optimistic counterpart to his essay Machines of Loving Grace, it addresses the risks and challenges facing the AI industry and explores the need for robust safety as AI systems rapidly become more powerful. Amodei argues for a thoughtful approach to navigating AI that could soon become a “country of geniuses in a datacenter.”

“I believe if we act decisively and carefully, the risks can be overcome… but we need to understand that this is a serious civilizational challenge.” – Dario Amodei
