What Happened
BigFinanceBench (928 expert-authored tasks) and Hedge-Bench (102 real hedge-fund analyst tasks) dropped simultaneously, giving the market its first rigorous, rubric-graded measurement of where AI agents actually stand on financial analysis work. Best-in-class models hit 58.8% on BigFinanceBench and stay below 16% on the harder hedge-fund tasks. Both benchmarks grade the derivation, not just the final answer, which makes the results harder to game and more credible to institutional buyers.
Who Gets Hit
Positive: NVDA is the clearest beneficiary. Closing a measurable, well-defined capability gap is the exact story that sustains GPU procurement cycles at major financial institutions. MSFT and GOOGL get a quieter lift: benchmark results hand their cloud AI sales teams a concrete "here's where you score today, here's the roadmap" pitch to every bank and asset manager.
Mixed: FDS (FactSet) is at a crossroads. The benchmarks create a template for differentiated AI analytics products, but only if FactSet moves fast; slower incumbents could cede ground to AI-native data startups. Bloomberg (private) is likely best-positioned of all financial data players but offers no direct equity expression.
The Trade








