Just as with LLMs, success in other frontiers of AI will depend on access to large volumes of high-quality data. Obtaining that data will require deliberate, research-driven dataset design.