Bigger Isn't Better: The Case For Rightsized AI

Iri Trashanski, Chief Strategy Officer at Ceva, is shaping the future of the Smart Edge with extensive experience across tech sectors.

The conversation around artificial intelligence is still largely centered on AI compute. Faster processors, larger models and more powerful infrastructure dominate the headlines. But as AI moves from the cloud into real-world environments, a different challenge is starting to take shape at the edge. Processing speed is critical there, and the best latency for each use case and device type will come from rightsized, purpose-built inference engines that fit the constraints of real-world systems and the specific needs of each application.

We are entering a phase where AI must operate not just in data centers, but across billions of devices spanning cameras, sensors, vehicles, industrial systems and consumer electronics. These systems must connect, sense and interpret their environment and make decisions locally. That shift fundamentally changes what it takes to make AI work.

Two Worlds Of AI Emerging

Today, AI is split into two distinct domains.

The first is the data center. This is the world of hyperscale and neocloud infrastructure, where performance is measured in throughput and scale, and where power and cost constraints are very different. Architectures in this environment are optimized for training large models and for pre-filling and decoding multitrillion-parameter models.

The second is the edge, where AI interacts with the physical world. Devices must operate under tight constraints: limited power, limited AI compute cycles, real-time responsiveness and strict cost requirements. They must process data from multiple sources, often in unpredictable environments, and react with low latency.

What works in one world does not translate directly to the other.

At the edge, success is not defined just by raw compute performance.
Rather, AI deployments must be rightsized to specific applications, with the appropriate performance, a focus on the relevant steps of inference (i.e., pre-fill, decode or both, depending on model size) and power needs taken into account. Ultimately, success depends on how well multiple functions work together as a cohesive system.

The Real Challenge: Making Systems Work Together

In edge environments, AI is only one part of a larger system.

A device must be able to connect to other devices in its vicinity and to the cloud so that it can support the latest models and updates. It must sense its environment by capturing audio, video or other contextual data. Only when it has the latest data and understands its surroundings can it make decisions by running local models to generate meaningful insights or actions.

Each of these functions has its own requirements, and they are key to enabling AI at the edge.

This is where complexity grows quickly.

Why Flexibility Matters More Than Scale

Unlike the data center, the edge is highly fragmented.

There is no single dominant architecture. Requirements vary widely across industries, applications and devices. A wearable device has very different constraints than an autonomous vehicle or an industrial sensor.

This diversity makes flexibility essential.

Companies need to tailor their solutions to specific use cases. They need the ability to integrate different technologies and communication protocols and to adapt quickly to changing requirements. Not every device needs Wi-Fi or 5G. Not every application requires the same level of AI performance.

Rigid, one-size-fits-all approaches struggle in this environment.
At the same time, solutions must scale across product portfolios and multiple SKUs, adding another layer of complexity.

The Role Of The IP Model In The Edge Era

Rather than building complete systems from scratch or relying entirely on fully integrated platforms, companies can leverage specialized building blocks and assemble solutions tailored to their needs. This approach reduces development time, lowers risk and enables greater differentiation.

As the market evolves, so does the IP model itself.

The industry is moving beyond discrete components toward more integrated, system-level solutions. Instead of licensing individual IP blocks, companies increasingly look for subsystems and platforms that bring together multiple capabilities, reducing complexity and accelerating time to market.

This is particularly important at the edge, where a unified foundation that brings together connectivity, sensing and inference with scalable processing and software is key. We think of this as an AI fabric. It allows companies to build rightsized AI solutions that are optimized for their specific applications, while simplifying integration and enabling faster deployment.

It is not just about delivering compute. It is about providing a flexible, cohesive framework that allows systems to connect, sense and infer efficiently in real-world conditions.

What Comes Next

Looking ahead, AI capabilities will become standard across a wide range of devices. Neural processing units (NPUs) are already following a trajectory similar to CPUs and GPUs before them, moving toward broad adoption across the industry.

At the same time, devices will need to process more data, from more sources, in more complex environments.
The ability to connect, sense and infer as part of a unified system will become a key differentiator.

Success in this next phase will depend not on maximizing compute, but on reducing complexity and delivering solutions that are tailored, efficient and scalable.

A Broader Perspective On AI's Future

The future of AI will not be defined by a single architecture or approach. It will be shaped by how effectively different technologies come together to solve real-world problems. That requires flexibility, specialization and system-level thinking.

As AI continues to move beyond the cloud and into everyday devices, the industry's attention will inevitably shift. The question will no longer be just about compute.

The question will be whether we are building AI systems that are fit for purpose.

For companies building the next generation of intelligent devices, the priority should be clear: Design for the edge from the start. Choose architectures and technologies that are purpose-built, scalable and integrated, and avoid trying to retrofit data center solutions into environments they were never designed for.

That is where the next phase of innovation and competitive advantage will be defined.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
