During its Vision 2024 event in Phoenix, AZ, Intel introduced a range of hardware and software initiatives aimed at advancing the company’s AI ambitions across diverse market sectors. While the recent debut of the Meteor Lake Core Ultra processors brought NPUs to client systems, today’s focus is squarely on data centers and enterprise solutions. Intel CEO Pat Gelsinger used his keynote to unveil the Gaudi 3 AI accelerator and a new line of Xeon 6 processors, both touted to deliver substantial performance and efficiency gains over their predecessors and to position Intel competitively in the market.
Unveiling Gaudi 3 for Enhanced AI Acceleration in Training and Inference

Intel’s latest Gaudi 3 AI accelerator builds on the foundation of its predecessor while introducing several key advancements. Fabricated on a more advanced 5nm process node, Gaudi 3 improves power efficiency and accommodates more transistors and features. With 64 5th-Gen Tensor Processor Cores (TPCs) and 8 Matrix Math Engines (MMEs), Gaudi 3 doubles FP8 and quadruples BF16 compute performance compared to Gaudi 2. It also offers double the network bandwidth and 1.5x the memory bandwidth of its predecessor, translating into higher throughput for AI workloads and better connectivity for clustered systems.
Intel claims Gaudi 3 delivers roughly 50% better inference performance and approximately 40% better power efficiency than NVIDIA’s H100, all at a lower cost. Other notable features include 96MB of integrated SRAM cache with 12.8TB/s of bandwidth and 128GB of HBM2e memory with 3.7TB/s of peak bandwidth. With 24 RoCE-compliant 200Gb Ethernet ports on board, Gaudi 3 enables flexible networking without proprietary interfaces, allowing seamless integration into a variety of system architectures.
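As a rough illustration, the back-of-envelope sketch below combines only the figures quoted above: the per-accelerator Ethernet port count and speed, and the generation-over-generation multipliers versus Gaudi 2. It is a sanity-check calculation, not an official Intel performance model.

```python
# Back-of-envelope figures taken from the specs quoted in this article.
# Rough aggregates for a single Gaudi 3 accelerator, not Intel benchmarks.

ETHERNET_PORTS = 24          # RoCE-compliant ports per accelerator
PORT_SPEED_GBPS = 200        # 200GbE per port

aggregate_network_gbps = ETHERNET_PORTS * PORT_SPEED_GBPS
print(f"Aggregate Ethernet bandwidth: {aggregate_network_gbps} Gb/s "
      f"({aggregate_network_gbps / 1000:.1f} Tb/s) per accelerator")

# Generation-over-generation multipliers cited versus Gaudi 2.
GEN_OVER_GEN = {
    "FP8 compute": 2.0,
    "BF16 compute": 4.0,
    "network bandwidth": 2.0,
    "memory bandwidth": 1.5,
}
for metric, factor in GEN_OVER_GEN.items():
    print(f"Gaudi 3 {metric}: {factor:.1f}x Gaudi 2")
```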
Intel’s Gaudi 3 AI accelerator is available in multiple form factors, including OAM-compliant mezzanine cards and PCIe add-in cards, with systems configurable for air or liquid cooling. That flexibility, combined with industry-standard Ethernet networking and Intel’s open software tools, lets partners tailor systems from single-node setups to 1024-node clusters delivering exascale-class computing performance.
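To put that single-node-to-1024-node range in perspective, here is a hypothetical sizing sketch. The eight-accelerators-per-node figure is an assumption chosen for illustration (typical of a fully populated OAM baseboard), not a number from Intel’s announcement; only the 128GB HBM2e capacity comes from the quoted specs.

```python
# Hypothetical cluster sizing sketch. The accelerators-per-node count
# is an assumption for illustration, not an Intel-published figure.
ACCELERATORS_PER_NODE = 8     # assumed: one fully populated OAM baseboard
HBM_PER_ACCELERATOR_GB = 128  # from the quoted Gaudi 3 spec

for nodes in (1, 8, 64, 512, 1024):
    accelerators = nodes * ACCELERATORS_PER_NODE
    total_hbm_tb = accelerators * HBM_PER_ACCELERATOR_GB / 1024
    print(f"{nodes:>5} nodes -> {accelerators:>5} accelerators, "
          f"{total_hbm_tb:,.1f} TB of HBM2e")
```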
Performance and Efficiency Claims

Intel positions Gaudi 3 favorably against NVIDIA’s Hopper-based GPUs, projecting roughly 50% faster time-to-train across a range of models along with substantial gains in inference throughput and efficiency. The company expects Gaudi 3 to significantly outperform NVIDIA’s latest offerings, underscoring its potential to reshape AI acceleration across diverse applications.
Expanding Xeon Portfolio and Future Roadmap

Alongside Gaudi 3, Intel unveiled plans for an open enterprise AI platform intended to accelerate the deployment of secure GenAI systems built around retrieval-augmented generation (RAG). The approach balances workloads between Xeon and Gaudi processors to build tailored, efficient systems aligned with partners’ requirements.
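Intel has not published code for this platform, but the RAG pattern it references is well understood: retrieve relevant documents, then ground the model’s answer in them. The sketch below is a generic, self-contained toy illustration; the corpus, the bag-of-words retriever, and the prompt-assembly step are stand-ins for the embedding model and LLM a real Xeon-plus-Gaudi deployment would use.

```python
# Minimal, generic retrieval-augmented generation (RAG) sketch.
# The corpus, retriever, and "generate" step are toy stand-ins: a real
# deployment would use an embedding model and an LLM, not this
# bag-of-words scoring.
from collections import Counter

CORPUS = [
    "Gaudi 3 offers 128GB of HBM2e memory per accelerator.",
    "Gaudi 3 exposes 24 RoCE-compliant 200Gb Ethernet ports.",
    "Xeon 6 with E-cores targets high rack density and efficiency.",
]

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of overlapping lowercase tokens."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus documents that best match the query."""
    return sorted(CORPUS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Assemble a grounded prompt; a real system would send this to an LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How much memory does Gaudi 3 have?"))
```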
Intel also introduced the Xeon 6 family, spanning the Sierra Forest and Granite Rapids architectures and targeting edge, cloud, and compute-intensive workloads. Sierra Forest processors, built with E-cores, offer significant performance-per-watt improvements and higher rack density, while Granite Rapids processors, built with P-cores, deliver stronger performance for AI applications.
Looking Ahead

Intel anticipates OEM availability of Gaudi 3 in the second quarter, with volume production scheduled for Q3 (air-cooled) and Q4 (liquid-cooled). While details on the next-generation Gaudi processor remain scarce, Intel’s broad strategy and the performance gains claimed here signal a strengthened position in the evolving enterprise AI market.