We’re introducing SilverTorch, a reimagining of recommendation systems that brings all retrieval components for user-generated content under a single architecture.
SilverTorch delivers up to 23.7x higher throughput than state-of-the-art approaches. It is also 20.9x more compute-cost-efficient than a CPU-based solution, while improving accuracy.
Our research paper, “SilverTorch: A Unified Model-based System to Democratize Large-Scale Recommendation on GPUs,” accepted to the full paper track at SIGIR 2026, contains full technical details.
The retrieval stage of industry recommendation systems has consisted of microservices stitched together, with neural networks inconsistently integrated. Our recommendation systems scale to serve people across multiple platforms. Retrieval is responsible for narrowing millions of pieces of content (e.g., reels and photos) down to thousands before passing them to ranking systems, all in less than 100 milliseconds.
However, the microservice-based design imposed hard constraints on model complexity and on the number of candidates evaluated, ultimately creating a ceiling on the quality of recommendations that people on our platforms see.
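To make the retrieval step concrete, here is a minimal sketch of narrowing a large candidate pool to a short list by scoring each item against a user vector. It is a toy illustration only, not SilverTorch's implementation: the function name `retrieve_top_k`, the dot-product scoring, the random embeddings, and the scaled-down sizes (10,000 items narrowed to 100) are all assumptions made for the example.

```python
import heapq
import random


def retrieve_top_k(user_vec, item_vecs, k):
    """Return the ids of the k items with the highest dot-product score.

    Hypothetical helper for illustration; `item_vecs` maps item id -> vector.
    """
    def score(item_id):
        return sum(u * v for u, v in zip(user_vec, item_vecs[item_id]))

    # heapq.nlargest keeps only k candidates in memory while scanning the pool.
    return heapq.nlargest(k, item_vecs, key=score)


random.seed(0)
DIM = 8  # assumed embedding size, chosen arbitrarily
items = {
    f"item_{i}": [random.gauss(0, 1) for _ in range(DIM)]
    for i in range(10_000)
}
user = [random.gauss(0, 1) for _ in range(DIM)]

# Narrow the pool to a short candidate list that a ranking stage would consume.
candidates = retrieve_top_k(user, items, k=100)
print(len(candidates))  # 100
```

Even in this toy form, the cost of the scan grows with both the pool size and the cost of the scoring function, which is the kind of trade-off between model complexity and candidates evaluated that the constraints above refer to.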













