NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance | NVIDIA Technical Blog

Large language models (LLMs) are changing financial trading by analyzing large volumes of unstructured data to generate actionable trading insights. These systems can process financial news, social media sentiment, earnings reports, and market data to help predict stock price movements and automate investment strategies.

The Strategic Technology Analysis Center (STAC) has been developing benchmarks for workloads key to the financial industry for over 15 years. It developed the STAC-AI benchmark to help companies assess the end-to-end retrieval-augmented generation (RAG) and LLM inference pipeline.

This post presents the results achieved on the STAC-AI LANG6 benchmark across multiple NVIDIA platforms. It also shares recommendations on how to benchmark NVIDIA TensorRT LLM against the specifications of your own dataset.

STAC-AI LANG6 (Inference-Only) Benchmark

In the broader context of a RAG pipeline, STAC-AI LANG6 is the part of the benchmark focusing on LLM inference performance. The benchmark tests the hardware and software stack on the Llama 3.1 8B Instruct and Llama 3.1 70B Instruct models in combination with the following custom datasets:
