GPU-accelerated query engines are often constrained by memory and I/O bandwidth. NVIDIA hardware advances, including high-bandwidth memory (HBM), NVIDIA NVLink-C2C, and the dedicated decompression engines featured in NVIDIA GB200 NVL4, help remove these bottlenecks by increasing effective storage capacity, accelerating data movement between CPUs and GPUs, and speeding up data access without consuming streaming multiprocessor (SM) resources.

In this post, we show how databases can use these technologies to accelerate GPU query execution. You’ll learn techniques for efficient CPU-GPU data movement, compression, partition pruning, and overlapping data transfer with computation.

Architecture overview of GQE

GQE (GPU Query Engine) is a reference architecture designed to execute SQL queries with high performance over large datasets on modern NVIDIA hardware. Under the hood, GQE uses NVIDIA cuDF and other NVIDIA CUDA-X libraries, including CCCL, nvCOMP, and NVSHMEM.

GQE can serve as a reference to help query engines: