Scale generative AI with new Azure AI infrastructure advancements and availability

Generative AI is a powerful and transformational technology with the potential to advance a wide range of industries, from manufacturing to retail and from financial services to healthcare. Our early investments in hardware and AI infrastructure are helping customers realize the efficiency and innovation generative AI can deliver. Our Azure AI infrastructure is the backbone of how we scale our offerings, with Azure OpenAI Service at the forefront of this transformation, providing developers with the systems, tools, and resources they need to build next-generation, AI-powered applications on the Azure platform. With generative AI, users can create richer user experiences, fuel innovation, and boost productivity for their businesses. 

As part of our commitment to bringing the transformative power of AI to our customers, today we're announcing updates to how we're empowering businesses with Azure AI infrastructure and applications. With the global expansion of Azure OpenAI Service, we are making OpenAI's most advanced models, GPT-4 and GPT-35-Turbo, available in multiple new regions, providing businesses worldwide with unparalleled generative AI capabilities. This scalability is powered by our Azure AI infrastructure, which we continue to invest in and expand. We're also delivering the general availability of the ND H100 v5 Virtual Machine series, equipped with NVIDIA H100 Tensor Core graphics processing units (GPUs) and low-latency networking, propelling businesses into a new era of AI applications.

Here's how these advancements extend Microsoft's unified approach to AI across the stack.

General availability of ND H100 v5 Virtual Machine series: Unprecedented AI processing and scale

Today marks the general availability of our Azure ND H100 v5 virtual machine (VM) series, featuring the latest NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking. This VM series is meticulously engineered with Microsoft's extensive experience in delivering supercomputing performance and scale to tackle the exponentially increasing complexity of cutting-edge AI workloads. As part of our deep and ongoing investment in generative AI, we are leveraging an AI-optimized 4K GPU cluster and will be ramping to hundreds of thousands of the latest GPUs in the next year.

The ND H100 v5 is now available in the East United States and South Central United States Azure regions. Enterprises can register their interest in access to the new VMs or review technical details on the ND H100 v5 VM series at Microsoft Learn.

The ND H100 v5 VMs include the following features today:

  • AI supercomputing GPUs: Equipped with eight NVIDIA H100 Tensor Core GPUs, these VMs promise significantly faster AI model performance than previous generations, empowering businesses with unmatched computational power.
  • Next-generation central processing unit (CPU): Understanding the criticality of CPU performance for AI training and inference, we have chosen the 4th Gen Intel Xeon Scalable processors as the foundation of these VMs, ensuring optimal processing speed.
  • Low-latency networking: The inclusion of NVIDIA Quantum-2 ConnectX-7 InfiniBand with 400 Gb/s per GPU and 3.2 Tb/s per VM of cross-node bandwidth ensures seamless performance across the GPUs, matching the capabilities of top-performing supercomputers globally.
  • Optimized host-to-GPU performance: With PCIe Gen5 providing 64 GB/s of bandwidth per GPU, Azure achieves significant performance advantages between CPU and GPU.
  • Large scale memory and memory bandwidth: DDR5 memory is at the core of these VMs, delivering greater data transfer speeds and efficiency, making them ideal for workloads with larger datasets.
These VMs have already proven their performance: matrix multiplication operations run up to six times faster when using the new 8-bit FP8 floating-point data type compared with FP16 on previous generations, and end-to-end inference on large language models such as BLOOM 175B is up to twice as fast, demonstrating their potential to further optimize AI applications.
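The headline interconnect numbers above are consistent with each other, which a quick sketch makes explicit. The constants and helper below are illustrative only (not part of any Azure SDK); they simply restate the per-GPU InfiniBand figure and aggregate it per VM.

```python
# Sanity-check the ND H100 v5 interconnect figures quoted above.
# GPUS_PER_VM and IB_GBPS_PER_GPU come straight from this post;
# the helper function itself is a hypothetical illustration.

GPUS_PER_VM = 8          # eight NVIDIA H100 Tensor Core GPUs per VM
IB_GBPS_PER_GPU = 400    # NVIDIA Quantum-2 ConnectX-7 InfiniBand, Gb/s per GPU

def cross_node_bandwidth_tbps(gpus: int = GPUS_PER_VM,
                              gbps_per_gpu: int = IB_GBPS_PER_GPU) -> float:
    """Aggregate cross-node InfiniBand bandwidth per VM, in Tb/s."""
    return gpus * gbps_per_gpu / 1000  # 1 Tb/s = 1000 Gb/s

print(cross_node_bandwidth_tbps())  # 8 x 400 Gb/s = 3.2 Tb/s per VM
```

Eight GPUs at 400 Gb/s each account for the quoted 3.2 Tb/s of cross-node bandwidth per VM.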

Azure OpenAI Service goes global: Expanding cutting-edge models worldwide

We are thrilled to announce the global expansion of Azure OpenAI Service, bringing OpenAI's cutting-edge models, including GPT-4 and GPT-35-Turbo, to a wider audience worldwide. Our new live regions in Australia East, Canada East, East United States 2, Japan East, and United Kingdom South extend our reach and support for organizations seeking powerful generative AI capabilities. With the addition of these regions, Azure OpenAI Service is now available in even more locations, complementing our existing availability in East United States, France Central, South Central United States, and West Europe. The response to Azure OpenAI Service has been phenomenal, with our customer base nearly tripling since our last disclosure. We now proudly serve over 11,000 customers, attracting an average of 100 new customers daily this quarter. This remarkable growth is a testament to the value our service brings to businesses eager to harness the potential of AI for their unique needs.
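For teams planning deployments, the regions named in this announcement can be captured as a simple lookup. The snippet below is a hypothetical helper: the two region sets are a snapshot transcribed from this post, not queried live from Azure, and region availability changes over time.

```python
# Snapshot of Azure OpenAI Service regions named in this announcement.
# Region identifiers follow Azure's usual short-name convention; this is
# an illustrative lookup, not a live query of the Azure service catalog.

EXISTING_REGIONS = {"eastus", "francecentral", "southcentralus", "westeurope"}
NEW_REGIONS = {"australiaeast", "canadaeast", "eastus2", "japaneast", "uksouth"}

def is_available(region: str) -> bool:
    """True if the region appears in this announcement's availability list."""
    normalized = region.lower().replace(" ", "")
    return normalized in EXISTING_REGIONS | NEW_REGIONS

print(is_available("Japan East"))   # newly added region
print(is_available("East US"))      # previously available region
```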

As part of this expansion, we are increasing the availability of GPT-4, Azure OpenAI's most advanced generative AI model, across the new regions. This enhancement allows more customers to leverage GPT-4's capabilities for content generation, document intelligence, customer service, and beyond. With Azure OpenAI Service, organizations can propel their operations to new heights, driving innovation and transformation across various industries.
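To ground the scenarios above, here is a minimal sketch of the request body an application would send to a GPT-4 chat completions deployment. The deployment name "gpt-4", the helper function, and the prompts are all illustrative assumptions; authentication and the actual HTTPS call (via the OpenAI SDK or the Azure OpenAI REST API) are deliberately omitted.

```python
# Hypothetical helper that assembles a chat completions request body for an
# Azure OpenAI deployment. No network call is made; endpoint configuration
# and API-key handling are left out of this sketch.

def build_chat_request(deployment: str, user_prompt: str,
                       temperature: float = 0.7) -> dict:
    """Assemble the JSON body for a chat completions call."""
    return {
        "deployment": deployment,  # your Azure OpenAI deployment name
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }

# Example: a customer-service summarization request against a GPT-4 deployment.
req = build_chat_request("gpt-4", "Summarize this support ticket in two lines.")
```

In Azure OpenAI, requests target a named deployment of a model (here assumed to be called "gpt-4") rather than the model name directly, which is what lets the same application code move between the regions listed above.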

A responsible approach to developing generative AI

Microsoft's commitment to responsible AI is at the core of Azure AI. The platform incorporates robust safety systems and leverages human-feedback mechanisms to handle harmful inputs responsibly, ensuring protection for users and end consumers. Businesses can apply for access to Azure OpenAI Service and unlock the full potential of generative AI to propel their operations to new heights.

We invite businesses and developers worldwide to join us in this transformative journey as we lead the way in AI innovation. Azure OpenAI Service stands as a testament to Microsoft's dedication to making AI accessible, scalable, and impactful for businesses of all sizes. Together, let's embrace the power of generative AI and Microsoft's commitment to responsible AI practices to drive positive impact and growth worldwide.

Customer inspiration

Generative AI is revolutionizing various industries and scenarios, including content creation and design, personalized marketing, customer service chatbots, product and service innovation, language translation, autonomous driving, fraud detection, and predictive analytics. We are inspired by the way our customers are innovating with generative AI and look forward to seeing how customers around the world build upon these technologies.

Mercedes-Benz is innovating its in-car experience for drivers, powered by Azure OpenAI Service. The upgraded “Hey Mercedes” feature is more intuitive and conversational than ever before. KPMG, a global professional services firm, leverages our service to improve its service delivery model and enhance the coding lifecycle. Wayve trains large-scale foundational neural networks for autonomous driving using Azure and its AI infrastructure. Microsoft partner SymphonyAI launched Sensa Copilot to empower financial crime investigators to combat the burden of illegal activity on the economy and organizations. By automating the collection, collation, and summarization of financial and third-party information, Sensa Copilot identifies money laundering behaviors and facilitates quick and efficient analysis for investigators. Discover all Azure AI and ML customer stories.

Learn more

Resources for getting started with Azure AI:

  • Azure AI Portfolio
  • Azure AI Infrastructure
  • Azure OpenAI Service

The post Scale generative AI with new Azure AI infrastructure advancements and availability appeared first on Azure Blog.
