M* is a modular serving system for multimodal models that achieves up to 2.7x higher throughput than vLLM-Omni and 4x higher than SGLang-Omni on composite model workloads.