What’s new in Azure AI Language: BUILD 2024


At Azure AI Language, we believe that language is at the core of human and artificial intelligence. As part of Azure that offers a comprehensive suite of services and tools for AI developers, Azure AI Language is a service that empowers developers to build intelligent natural language solutions that leverage a set of state-of-the-art language models, including Z-Code++, fine-tuned GPT and more. While LLMs in Azure OpenAI and model catalog are good for general purposes, Azure AI Language provides a set of prebuilt and customizable natural language capabilities that are fine-tuned and optimized for a wide range of scenarios, such as Personal Identifier Information (PII) detection, document and conversation summarization, text analytics for healthcare domain, conversational intent identification, etc., with leading quality and cost efficiency.  These capabilities are available through a unified API that simplifies the integration and orchestration of natural language capabilities with no need of complex prompt engineering.

Today, we're thrilled to announce more new features and capabilities designed to make your workflow more seamless and efficient than ever before at this year's Microsoft Build with the following key highlights: 1) a unified experience for Azure AI Language in Azure AI Studio and improved integration with prompt flow, 2) improvements in existing prebuilt features such as Summarization, PII and NER, and 3) enhancements in custom features, especially in Conversational Language Understanding (CLU) to provide intent identification and entity extraction with higher quality in more regions.


Azure AI Language now available in Azure AI Studio and prompt flow

As part of Azure AI services, Azure AI Language now supports the new Azure AI service resource type for prebuilt capabilities like summarization, Personally Identifiable Information (PII) detection, and many others. It lets you access all Azure AI services, including Language, Speech and Vision, etc., with one single resource, which makes it easier to integrate the AI capabilities from across Azure AI. In the next few months, we will also support the customization capabilities in Azure AI Language in Azure AI Studio.

We are excited to introduce Azure AI Language in Azure AI Studio with two new playgrounds for you to try out: Summarization and Personally Identifiable Information detection. Both help infuse generative AI into your solutions. In Azure AI Studio, you have more options to try out and explore use them effectively for your needs.


Prompt flow in Azure AI Studio is a development tool designed to streamline the entire development cycle of AI applications. We are happy to announce that Language's prompt flow tooling is now available in Azure AI prompt flow gallery. With that, you can explore and use various natural language processing features from Azure AI Language in prompt flow. You can quickly start to make use of Azure AI Language, reduce your time to value, and deploy solutions with reliable evaluation.


What's new in prebuilt features in Azure AI Language service

Azure AI Language's prebuilt capabilities enable customers to set up and running quickly without the need for model training. These prebuilt services are designed to accelerate time-to-value through pretrained models optimized for specific Language AI tasks, including Personally Identifiable Information (PII), Named Entity Recognition (NER)SummarizationText Analytics for HealthLanguage DetectionKey Phrase Extraction and Sentiment Analysis and opinion mining, etc.

As we learned a lot of customers want to use Language AI to derive insights from native documents like Word docs and PDFs, to minimize the time and eliminates the need for data preprocessing, we have recently released a public preview of native documents support for PII detection and Summarization service. More file formats and capabilities will be added into the feature towards its GA.

Here is more information regarding what's new in Azure AI Language's prebuilt features:

Announcing GA general availability of Conversational PII

Azure AI Language's PII service can help to detect and protect an individual's identity and privacy in both generative and non-generative AI applications which are critical for highly regulated industries such as financial services, healthcare or government. This PII service also supports Protected Health Information (PHI) and Payment Card Industry (PCI) data, and it's available in 79 languages for around 30 general entity categories and more than 90 region-specific entity categories. By enabling users to identify, categorize, and redact sensitive information directly from complex text files, and native documents in .pdf, .docx and .txt file format, the PII service enables our customers to adhere to the highest standards of data privacy, security, and compliance with only 1 API call.


Today, we are excited to announce the general availability of conversational PII redaction in English-language contexts to further support customers looking to recognize and redact sensitive information in conversations, particularly now in speech transcriptions from meetings and calls for 6 recognized entity categories for conversations. Customers can now redact transcript, chat, and other text written in a conversational style (i.e. text with “um”s, “ah”s, multiple speakers, sensitive info in non-complete sentences, and the spelling out of words for more clarity) with better confidence in AI quality, Azure SLA support and production environment support, and enterprise-grade security in mind.


Conversational PII will be available starting in late June. Please see here for the full list of supported languages for the PII service and here for supported recognized for PII entities for conversation.

Enhanced address recognition for UK contexts with NER model updates

We are excited to share an updated NER model with improved AI quality and accuracy for both NER and PII detection. This model update will largely benefit location entities (e.g. addresses), finance entities (e.g. bank account numbers), and single letter spell outs where a speaker in a transcript may be spelling out a relevant entity (e.g. “M. I. CRO. S. O. F. and T”) where our new model shows improved F1 scores and decreased false positive recognitions. The updated model will be available starting in late June.

General availability of Recap summary for conversations in Summarization

Azure AI Language's Summarization service enables users to extract key points from the textual content and provide a comprehensive summary of documents or conversations. This service is powered by an ensemble of two sophisticated natural language models in which one is specifically trained for text extraction while the other fine-tuned GPT model is further optimized for text summarization without the need of any prompt engineering. In addition, Azure AI Language's Summarization service comes with built-in hallucination detection capability.

We appreciate customers' enthusiasm for Azure AI Language's Summarization service since we announced its general availability last year.  Document abstractive summarization and Conversation summarization capabilities are currently available in 6 regions and 11 languages whereas Custom Summarization is available in East US in English language. Please see Summarization region support article for the full list of supported regions, and Summarization language support article for supported languages.


Today, we are excited to announce the general availability of Recap summary for conversations in Azure AI Language service. This recap summary compresses a long conversation into one short paragraph and captures key information, which has been highly praised by preview customers, especially for many high-volume call center customers. Check out our product document to learn more about the key features in conversation summarization.

What's new in custom features in Azure AI Language service

Azure AI Language's custom capabilities empower customers to customize their multilingual models based on a few labeled examples according to their specific use case. These custom service include but are not limited to Custom Text ClassificationCustom Named Entity Recognition (NER), and Conversational Language Understanding (CLU). Powered by the state-of-the-art transformer models, Azure AI Language's custom multilingual models can be trained in one language and used for multiple other languages. In addition to custom features in Azure AI Language service, the advanced low-touch customization capability in Azure AI Language now also powers Azure AI Content Safety's Custom Category feature for custom content moderation.

As part of custom services in Azure AI Language, Conversational Language Understanding (CLU) enables reliable conversational AI experience with intent identification and entity extraction. Today, we are excited to announce three new features in CLU as follows:

Enhanced support for CLU applications to automate training data augmentation for diacritics

Today, we are introducing a suite of improvements to increase the AI quality of your CLU apps. Many customers already enjoy our training configuration that allows customers to train in one language and use the app in 100+ languages. Since many customers around the world use English keyboards to type in Germanic and Slavic languages, it can be more difficult to classify the utterance into the correct intent without diacritic characters. Because of this, we're excited to announce a new feature that allows you to automate the training data augmentation for diacritics. When this setting is enabled in your CLU project, CLU will automatically augment your training dataset to reduce the model's sensitivity to diacritic characters.


Derive more insights from additional granular entities in CLU applications

Many of our customers enjoy the ease of leveraging prebuilt entity recognition, like location, in their custom models. However, it can be helpful to know even more information about an entity phrase. We are excited to introduce more granular entities in CLU. So, for an utterance such as “New York”, you can now recognize more than just location, but also additional details such as city or state. Check out CLU supported prebuilt entity components for a full list of support prebuilt entities.

Improved CLU training configuration to address CLU model scoring inconsistencies

We have released a new CLU training configuration that is designed to address scoring inconsistencies, especially related to managing confidence scores and ‘None' intent classification for off-topic utterances. We are excited to see how this new training configuration (available in 2024-06-01-preview via REST API) improves your model's performance.

Availability of CLU authoring service in Azure US Government cloud

As our government and defense customers expand their use of conversational AI, the need for Azure AI in government-compliant clouds has grown, so we are announcing that CLU authoring service is now available in the Azure US Government cloud. This means that you can build, manage, and deploy your custom CLU models for government use cases with the same ease and functionality as in the public cloud.

We are looking forward to seeing how these new CLU capabilities will provide you with more flexibility and control, as you develop conversational AI solutions in your enterprise.


We look forward to seeing our customers use these capabilities to enhance productivity, summarize insights, protect data privacy and build intelligent chat experiences based on content in natural language. As always, Azure AI Language team remains committed to delivering innovative solutions that enable our customers to achieve their goals. We welcome your feedback as we strive to continuously improve and evolve our services with state-of-the-art AI models to offer the best managed and compliant natural language processing capabilities to our customers in Azure AI Language service.

Learn more about Azure AI Language in the following resources:


This article was originally published by Microsoft's Azure AI Services Blog. You can find the original article here.