SAN DIEGO, CA - MARCH 31: Qualcomm's new CEO Cristiano Amon poses for photos in the lobby at Qualcomm headquarters on Wednesday, March 31, 2021 in San Diego, CA. (Photo by Eduardo Contreras / The San Diego Union-Tribune via Getty Images)

Many have heard a lot about agentic AI and how it will affect our lives, our businesses, and even our relationships with computing devices. This article lays out the basic landscape for agentic AI infrastructure, spanning personal devices, edge computing, and hyperscale cloud infrastructure, and assesses how one player, Qualcomm, is hoping that agentic AI is just the opportunity it has been waiting for. (Disclosure: Qualcomm, Nvidia and many other AI semiconductor companies are clients of Cambrian-AI Research.)

What is Agentic AI?

Agentic AI describes systems that do more than generate outputs on request; they exhibit “agency” by setting or decomposing goals, choosing strategies, and taking action (through APIs, apps, tools, or other agents) to achieve those goals. A full solution may orchestrate one or more AI agents, each with its own degree of autonomy.

Whereas traditional and generative AI are reactive, agentic AI can do work, not just predict an outcome or answer a question. For example, one can ask an AI agent to “research a vendor and draft an RFP.” The agent can break goals down into steps, sequence them, and execute those steps via tools, APIs, workflows, bespoke code or other agents.

The agentic AI control plane runs on CPUs and requires only limited human guidance. The critical endgame is for the user to grant the agent approval authority to take actions. Agentic AI uses generative models as a “brain” inside the broader control loop that can call tools, query data sources, update state, and iterate to completion of a task.
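
To make the control loop described above concrete, here is a minimal Python sketch. Everything in it is a hypothetical illustration rather than any vendor's API: `stub_model` stands in for the generative model "brain", `search_vendor` and `draft_rfp` are made-up tools, and `approve` represents the user's approval authority. The loop itself (decide, approve, call a tool, update state, repeat) is ordinary CPU-side code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


# Hypothetical tools; a real agent would wrap APIs, apps, or other agents.
def search_vendor(query: str) -> str:
    return f"profile for {query}"


def draft_rfp(notes: str) -> str:
    return f"RFP draft based on: {notes}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "search_vendor": search_vendor,
    "draft_rfp": draft_rfp,
}


@dataclass
class AgentState:
    goal: str
    history: List[Tuple[str, str]] = field(default_factory=list)  # (tool, result)
    done: bool = False


def stub_model(state: AgentState) -> Tuple[str, str]:
    """Stand-in for the generative model: choose the next tool and its input."""
    if not state.history:
        return "search_vendor", state.goal
    if len(state.history) == 1:
        return "draft_rfp", state.history[-1][1]
    return "finish", ""


def run_agent(
    goal: str,
    approve: Callable[[str, str], bool],
    max_steps: int = 10,
) -> AgentState:
    """Control loop: ask the model, check approval, call the tool, update state."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        tool, arg = stub_model(state)
        if tool == "finish":
            state.done = True
            break
        if not approve(tool, arg):  # the user withholds approval: stop acting
            break
        result = TOOLS[tool](arg)
        state.history.append((tool, result))
    return state


# Usage: approve every action, then inspect the final drafted result.
state = run_agent("Acme Corp", approve=lambda tool, arg: True)
print(state.history[-1][1])
```

In this sketch the model is only consulted for the next decision; sequencing, the approval check, tool dispatch, and state updates all live in the surrounding loop, and `max_steps` is an assumed guard against a loop that never terminates.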
Agentic AI Impact on Computing Infrastructure

As agents run multiple steps over a longer time period, calling on one or more AI models in a loop for elements of a solution, agentic AI will consume far more tokens and demand lower latencies than generative AI does. Consequently, the infrastructure needed to run agentic AI must become more efficient to balance the cost/value equation.

Agentic AI systems demand lower latencies because they operate in feedback loops, orchestrate many tool and model calls per task, and often act in real time; even modest per-step delays compound into unacceptable end-to-end lag and unstable behavior.

Traditional GenAI apps are often one request → one response, so a few seconds is tolerable. Agentic systems plan, act, observe, and re-plan in multiple iterations, so a 1–2 second delay per step can easily turn into many tens of seconds or longer overall. In addition, a single user intent can trigger dozens of retrieval-augmented generation (RAG) calls, API/tool invocations, and inter-agent messages; each additional hop adds network and compute latency that accumulates linearly, or worse.

In customer support, voice bots, and co-pilot interfaces, users expect near-instant turn-taking; agents that take 15–30 seconds per action are perceived as broken, regardless of accuracy. In domains like trading, logistics control, or autonomous systems, decisions must land within tight time budgets; higher latency directly translates into missed opportunities or unsafe behavior.

The industry is shifting from optimizing individual AI models to orchestrating complex, distributed AI systems, and this shift is redefining compute architectures across edge, cloud-edge, and data center.

Agentic AI Puts the CPU Back in the Spotlight

The infrastructure for AI was initially optimized for high-throughput GPU training and now for inference. The CPU has acted as the control plane, sending heavy-duty processing to the GPU or other ASIC.
Inference processing was once thought of as a simple one-shot walk through the neural network. Agentic AI is completely breaking this model, and a new architecture is emerging.

Agentic AI is workflow-driven, placing significant demands on CPUs to plan, schedule and iterate over an optimization loop to find the best answer to a problem. As such, the CPU moves from a supporting role to that of an orchestration and decision-making engine. The orchestration occurs across multiple tool and AI model instantiations, so the accelerator workload also increases.

Agentic workloads introduce much heavier control-plane logic on the CPU side: planning, multi-step tool invocation, retrieval orchestration, memory/context management, API calls, evaluations, multi-agent coordination and task termination. In “AI agent era” data centers, CPU core demand per GW of accelerator capacity could rise some four-fold, driving the move to near-parity ratios for large-scale agentic services. For “classic” LLM assistants, many operators sized roughly 1 CPU socket per 4–8 accelerators. For the emerging agentic AI workloads, guidance and early deployments are moving more toward one or two CPUs per accelerator at the system level.

This shift to CPU reliance, coupled with an increasing focus on energy efficiency, is why Qualcomm sees a tremendous opportunity in agentic AI, but it must move fast.

Hybrid AI Infrastructure: The Scalable Model for Agentic AI Workloads

If one thinks about the continuum of compute resources available to the agentic AI user, it should become obvious that the infrastructure can improve overall efficiency if each layer contributes its logical part to the orchestrated workflow. Properly implemented, a hybrid infrastructure should be able to lower costs and energy consumption per token, and thereby lower the cost of agentic actions, while providing a higher level of responsiveness, reliability and scalability.
This needs to be accomplished with a sharp focus on power consumption to be both affordable and acceptable to society, especially in the data center.

Let’s look at the roles and limitations of the three layers of infrastructure: endpoints, edge servers, and the hyperscale data center.

The mobile endpoints, or devices, provide intent classification, the front end of the workflow. What are the user’s objectives and priorities? A mobile phone can provide personal context and awareness that is key for agents to interpret a request and deliver a relevant result. Here, performance per watt is king; people won’t lug around extra batteries to run agentic AI. It has to be built into the devices we use every day.

At the edge, perhaps an edge-cloud, workstation, or vehicle, the ready availability of power allows for more computation, more sensors, more memory and more storage. This allows the edge to perform intermediate reasoning and aggregation. Qualcomm has already attained leadership status in the intelligent automotive market.

Of course, in the data center we expect nearly limitless computation for large-scale model execution. But massive data centers are becoming a political flashpoint, turning segments of the population against AI. So, it is reasonable to conclude that designs more power-efficient than those currently available will see ready demand.

As I look across this agentic landscape, Qualcomm is clearly strong in power-efficient CPUs and AI, but has been missing out where all the action is: the data center.

How Might Qualcomm Fare in the Agentic AI Age?

Given the angst about data center power consumption and costs, there is considerable interest in Qualcomm’s expected disclosures about its data center strategy at its upcoming Investor Day, June 24. Clearly, Qualcomm’s strength in mobile and edge devices like automobiles provides a launch pad for the company’s push to become a full-scale provider of agentic AI infrastructure.
Most investors already know that Qualcomm Snapdragon has excellent AI at the edge, but without a strong play in the data center, it will be impossible for the company to leverage agentic AI to the extent Nvidia can. How will Qualcomm position its Cloud AI200, recently rebranded as Dragonfly? Will it have a strong enough power efficiency story for inference processing to make up for the fact that Qualcomm is late to the data center party?

Here’s what we know so far about Dragonfly

Qualcomm’s upcoming data center products (Qualcomm AI200 and Qualcomm AI250), under the new brand, Qualcomm Dragonfly, are being positioned as efficiency-first AI inference platforms, not training machines. Qualcomm says it uses an innovative near-memory computing architecture that delivers more than 10x higher effective memory bandwidth with much lower power consumption, and the company ties that to high performance per dollar per watt for data center AI inference. Anyone who has been watching AI of late knows that the battle for AI compute has shifted to a battle for new memory architectures that increase performance while reducing the energy spent on data movement.

The Dragonfly brand was launched at Computex 2026. (Image: Qualcomm)

Qualcomm’s launch material says AI250 is built for rack-scale AI inference, with a “generational leap” around memory efficiency and lower power draw. It also says AI250 will use direct liquid cooling, which suggests Qualcomm is targeting sustained efficiency at rack scale rather than peak burst performance. Qualcomm is clearly aiming at lower power consumption, better utilization, and lower total cost of ownership.
Qualcomm had previously announced its intention to adopt NVLink in its data center roadmap; we don’t know if this first iteration will include the networking technology.

Target Markets: Dragonfly encompasses three main product groups: Central Processing Units (CPUs), custom ASICs (Application-Specific Integrated Circuits), and dedicated AI inference accelerators.

Custom Silicon: The brand relies on interconnect intellectual property and high-speed data transfer tech (such as PCIe, CXL, and Ethernet) obtained through Qualcomm’s acquisition of Alphawave Semi.

Hyperscaler Partnerships: Qualcomm is heavily collaborating with cloud providers and enterprise customers, including rumored early production designs with companies like ByteDance (unverified by Qualcomm), with high-volume production expected to generate billions in revenue.

Form Factors: Dragonfly delivers processing hardware across multiple setups, ranging from standalone accelerator cards to dense servers and industrial-scale server racks.

We Will Learn More at the Qualcomm Investor Day

The next age of AI is already upon us. Agentic AI will transform jobs across every industry, allowing human workers to focus more time and attention where their creativity, values, and world-knowledge are most needed. But to make agentic AI an affordable reality, it must be implemented using power-efficient yet high-performance technologies across a hybrid infrastructure. Qualcomm already has a strong CPU story at the device and the edge. Now we will see if the company can complete the pivot to become a broad-scale AI infrastructure provider.

Disclosures: This article expresses the opinions of the author and is not to be taken as advice to purchase from or invest in the companies mentioned.
My firm, Cambrian-AI Research, is fortunate to have many semiconductor firms as our clients, including Baya Systems, BrainChip, Cadence, Cerebras Systems, D-Matrix, Flex, Groq, IBM, Infleqtion, Intel, Micron, NVIDIA, Qualcomm, SiMa.ai, Synopsys, Taalas, Tenstorrent, Ventana Microsystems, and scores of investors. I have no investment positions in any of the companies mentioned in this article. For more information, please visit our website at https://cambrian-AI.com.