The scale wall

A computer vision pipeline that works on one image at one resolution isn't a pipeline. It's a prototype. The moment you move beyond controlled inputs, you hit the reality of production images: a 4K video frame, a satellite capture, a whole-slide pathology image, a high-resolution document scan. These images don't fit in a single model call. They're too large, too detailed, and too information-dense for one inference pass to handle well.

So you tile it. You divide the image into a grid of regions and run inference on each region independently. A 3×3 grid means 9 inference calls. An 8×8 grid means 64. A whole-slide pathology image at diagnostic resolution? Tens of thousands of tiles.

The orchestration problem scales directly with the image.

And as that tile count grows, so do the failure modes. Nine concurrent inference calls might all succeed. Sixty-four concurrent calls will occasionally hit a throttle limit or a timeout. At hundreds of tiles, partial failures aren't edge cases. They're expected. You need orchestration for your CV pipeline. The real requirement is that your orchestration scales with your image.