Microsoft Azure has delivered industry-leading results for AI inference workloads in the latest MLPerf Inference benchmarks. Using the new NC H100 v5 series virtual machines (VMs), powered by NVIDIA H100 NVL Tensor Core GPUs, Azure reaffirms its commitment to optimizing AI infrastructure for cloud-based training and inferencing.
Generative AI models continue to trend toward larger and more complex architectures, exemplified by models such as Llama 2 with 70 billion parameters. This growth reflects the industry's rising demand for infrastructure that can handle more sophisticated tasks and outputs.
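To make that scale concrete, here is a back-of-the-envelope sketch of what the weights of a 70-billion-parameter model occupy at common precisions. The figures are approximate and cover weights only, excluding KV cache, activations, and runtime overhead:

```python
# Rough weight-memory footprint of a 70-billion-parameter model
# (weights only; KV cache, activations, and overhead come on top).
PARAMS = 70e9

for precision, bytes_per_param in [("FP32", 4), ("FP16/BF16", 2), ("FP8/INT8", 1)]:
    gb = PARAMS * bytes_per_param / 1e9
    print(f"{precision:>10}: {gb:.0f} GB of weights")
```

At FP16, the weights alone come to roughly 140 GB, which already exceeds the capacity of any single GPU and explains why per-GPU memory is a first-order concern for serving models of this size.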
Designed specifically for dense inferencing workloads, the Azure NC H100 v5 VMs represent a significant step forward in performance for generative AI applications. With 94 GB of memory per GPU and higher memory bandwidth, these VMs give organizations the headroom to run complex AI workloads efficiently.
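As an illustration (not part of the original announcement), here is a minimal sketch of serving Llama 2 70B on a two-GPU NC H100 v5 VM using the open-source vLLM library. The model identifier, VM size, and parallelism degree are assumptions for this example; tensor parallelism shards the roughly 140 GB of FP16 weights across the two 94 GB GPUs:

```python
from vllm import LLM, SamplingParams

# Hypothetical setup for a two-GPU NC H100 v5 VM (e.g. Standard_NC80adis_H100_v5).
# tensor_parallel_size=2 shards the 70B weights across both H100 NVL GPUs.
llm = LLM(
    model="meta-llama/Llama-2-70b-chat-hf",  # assumes access to the gated weights
    tensor_parallel_size=2,
    dtype="float16",
)

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)
outputs = llm.generate(
    ["Explain what an MLPerf Inference benchmark measures."],
    sampling,
)
print(outputs[0].outputs[0].text)
```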
In the MLPerf Inference v4.0 benchmarks, the NC H100 v5 series delivered a 46% performance improvement over comparable submissions on GPUs with smaller memory capacities. The additional memory lets large models fit on fewer GPUs, translating to faster inferencing and greater operational efficiency.
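A rough sketch of why per-GPU memory translates into fewer GPUs: the 94 GB figure is the H100 NVL's capacity, the 80 GB figure represents a typical smaller-memory alternative, and the model size and headroom factor are assumptions carried over from the earlier estimate:

```python
import math

MODEL_GB = 140      # ~70B parameters at FP16 (2 bytes/param), weights only
HEADROOM = 1.25     # assumed 25% extra for KV cache and runtime overhead

for gpu_name, gpu_gb in [("H100 NVL (94 GB)", 94), ("80 GB-class GPU", 80)]:
    needed = math.ceil(MODEL_GB * HEADROOM / gpu_gb)
    print(f"{gpu_name}: {needed} GPUs to hold the model with headroom")
```

Under these assumptions the model fits on two 94 GB GPUs but needs three 80 GB GPUs, illustrating how larger per-GPU memory reduces the hardware footprint for the same workload.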
Overall, the launch of the NC H100 v5 series underscores Azure's leadership in AI infrastructure, setting new standards for performance and scalability. As Azure continues to adopt new hardware such as the NVIDIA GB200 Grace Blackwell Superchip, the future promises further advances in AI capabilities across the cloud computing landscape.