CUDA 13.3 Lands, AI Writes Blackwell Kernels, & FP4 VRAM Optimization for LLMs

Today's Highlights

NVIDIA has released CUDA Toolkit 13.3, with new features and optimizations for GPU developers. Meanwhile, an AI system demonstrates that it can write 'speed-of-light' CUDA kernels for Blackwell GPUs, and a new FP4 mixed-precision technique promises VRAM savings for long-context LLMs.

Info: Nvidia Cuda 13.3 landed (r/LocalLLaMA)

Source: https://reddit.com/r/LocalLLaMA/comments/1tp0vk1/info_nvidia_cuda_133_landed/