EMO: Pretraining mixture of experts for emergent modularity | Ai2
EMO is a new mixture-of-experts model trained so that modular expert groups emerge from the data, letting users select small, task-specific expert subsets while preserving near-full-model performance.
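
To make "selecting a task-specific expert subset" concrete, the following is a minimal sketch of generic top-k mixture-of-experts routing with an optional allow-list of experts. It is not EMO's actual routing or training procedure, which this summary does not describe. The function name `route_top_k`, the `allowed` parameter, the logit values, and the subset `{3, 5, 7}` are all hypothetical and chosen only for illustration.

```python
import math


def route_top_k(logits, k, allowed=None):
    """Pick the top-k experts for one token.

    logits  : router scores, one per expert (hypothetical values)
    k       : number of experts to activate
    allowed : optional set of expert indices the router may use;
              None means every expert is available
    Returns {expert_index: weight}, where the weights are a softmax
    over the chosen experts' logits and sum to 1.
    """
    candidates = list(range(len(logits))) if allowed is None else sorted(allowed)
    if not candidates:
        raise ValueError("no experts available to route to")

    # Rank the available experts by router score and keep the k best.
    chosen = sorted(candidates, key=lambda i: logits[i], reverse=True)[:k]

    # Softmax over the chosen logits (subtract the max for stability).
    peak = max(logits[i] for i in chosen)
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Hypothetical router scores for one token over 8 experts.
logits = [0.1, 2.0, -1.0, 1.5, 0.3, 0.9, -0.5, 1.2]

# Full model: any expert may be chosen.
full = route_top_k(logits, k=2)

# Restricted model: only a (hypothetical) task-specific subset is kept.
subset = route_top_k(logits, k=2, allowed={3, 5, 7})
```

In this sketch the restricted call can only return experts from the allow-list, so the remaining experts would not need to be loaded at all. Whether quality holds up under such a restriction depends on how the model was trained, which is the property the description attributes to EMO.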