February 27, 2026 · Ai2 · Tech Report · Data

What do researchers actually do with AI-powered science tools? Turns out, their habits aren’t always in line with what agentic tool developers (including our own Asta team) expect. Our new open dataset of over 258,000 real researcher queries reveals that scientists aren’t just using AI to search or synthesize; they’re rewriting the rules of what search even means, submitting queries seven times longer than traditional searches, navigating results out of order, and importing tricks they’ve learned from general-purpose chatbots into tools that were never designed for it. The gap between how these tools were built and how researchers actually use them is wide and, for the builders of AI tools, instructive.

Today we’re releasing the Asta Interaction Dataset (AID): 258,935 queries and 432,059 clickstream interactions from researchers using Asta, our AI-powered research assistant integrated with Semantic Scholar (S2). Collected over six months (February–August 2025) from users across dozens of disciplines, AID captures not just what researchers ask, but how they engage with the results: which sections they expand, which citations they click, which reports they revisit days later, and so on.

To our knowledge, this is the largest open dataset of how researchers interact with AI-powered scientific tools. Prior reports on AI tool usage (from Anthropic, OpenAI, Perplexity, and others) share only aggregate analyses without the underlying data. Existing public conversation datasets like LMSYS-Chat-1M, WildChat, and OpenAssistant contain general-purpose LLM conversations, but none are specific to scientific research tools or include rich clickstream signals.
We’re releasing the full query text, interaction logs, and a reusable query taxonomy because we believe the community needs shared, open data to make progress on understanding how researchers actually use these tools.

In this post, we walk through a few of the things we found.

Asta: Two AI-powered research interfaces

Asta is an open research assistant platform integrated with S2, a major academic search engine. It exposes two AI-powered interfaces:

PaperFinder (PF): An AI-enhanced literature search tool that returns a ranked list of papers with lightweight LLM-generated synthesis. (In Asta, this powers the Find papers feature.)

ScholarQA (SQA): A scientific question-answering tool that produces structured, multi-section reports with inline citations on demand, essentially an automated literature summary tool. (In Asta, this powers the Generate a report feature.)

Both tools use retrieval-augmented generation (RAG) over a scholarly corpus, grounding all claims in retrieved papers via inline citations. As a baseline, we also compare against traditional S2 keyword search.

A note on privacy: We take protecting user data very seriously. In Asta, users can choose to share their de-identified interactions for inclusion in public research datasets, and our study draws exclusively from users who opted in. For these opted-in interactions, we use hashed report identifiers with no user IDs, and we remove queries flagged by an LLM as containing PII (less than 1% of queries).

Queries are longer, more complex, and more demanding

Users of AI-powered tools submit dramatically longer and more complex queries than those submitted to traditional academic search engines:
How do researchers actually use AI-powered science tools? Lessons from 250,000+ queries | Ai2
The Asta Interaction Dataset (AID) contains real researcher queries revealing how scientists actually use AI-powered research tools, and where their habits diverge from what tool builders expect.













