What’s new in Azure AI Translator: document translation?

Seattle, June 18, 2024. Today, we are happy to announce new releases and enhancements to Azure AI Translator. We are introducing a new endpoint that unifies the asynchronous batch and synchronous document translation operations, and the SDKs have been updated to match. You can now deploy document translation in your organization and translate documents through Azure AI Studio and SharePoint without writing any code. The Azure AI Translator container has also been enhanced to translate both text and documents.

Overview

Document translation offers two operations: asynchronous batch and synchronous. Depending on the scenario, customers may use either operation or a combination of both. Today, we are delighted to announce that both operations have been unified and now share the same endpoint.

Asynchronous batch operation:

  • Asynchronous batch translation supports the processing of multiple documents and large files. The batch operation reads source documents from Azure Blob Storage and uploads the translated documents back to it.

The endpoint for the asynchronous batch operation is being updated to:

{your-document-translation-endpoint}/translator/document/batches?api-version=[Date]

The service will continue to support the deprecated endpoint for backward compatibility. We recommend that new customers adopt the latest endpoint, because future functionality will be added there.

Synchronous operation:

  • The synchronous operation translates a single document. It accepts the source document as part of the request body, processes the document in memory, and returns the translated document as part of the response body.

The endpoint for the synchronous operation is:

{your-document-translation-endpoint}/translator/document:translate?api-version=[Date]

This unification aims to give customers consistency and simplicity when using either of the document translation operations.
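As a minimal sketch of how the two operations differ only in the path that follows the shared endpoint, the following C# builds both request URLs from one base address. The helper names, the example host, and the YYYY-MM-DD version string are illustrative assumptions, not values defined by the service; substitute your own resource endpoint and a supported API version.

```csharp
using System;

// Hypothetical helpers (not part of the SDK): both operations share one base
// endpoint and differ only in the path that follows /translator/.
static Uri BuildBatchUri(string endpoint, string apiVersion) =>
    new Uri($"{endpoint.TrimEnd('/')}/translator/document/batches?api-version={Uri.EscapeDataString(apiVersion)}");

static Uri BuildSyncUri(string endpoint, string apiVersion) =>
    new Uri($"{endpoint.TrimEnd('/')}/translator/document:translate?api-version={Uri.EscapeDataString(apiVersion)}");

// Placeholder values: replace with your own endpoint and API version.
string baseEndpoint = "https://example-resource.cognitiveservices.azure.com/";
string version = "YYYY-MM-DD";

Console.WriteLine(BuildBatchUri(baseEndpoint, version));
Console.WriteLine(BuildSyncUri(baseEndpoint, version));
```

Trimming the trailing slash keeps the result well formed whether or not the configured endpoint ends with one.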

Updated SDK

The updated document translation SDK supports both asynchronous batch operation and synchronous operation. Here's how you can leverage it:

To run a translation operation for a document, you need a Translator endpoint and credentials. You can use DefaultAzureCredential, which tries a number of common authentication methods and suits both running as a service and local development. The samples below use a Translator API key credential by creating an AzureKeyCredential object. You can set endpoint and apiKey from an environment variable, a configuration setting, or any other source that works for your application.
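One way to supply the endpoint and key without hard-coding them is to read both from environment variables, sketched below. The variable names TRANSLATOR_ENDPOINT and TRANSLATOR_API_KEY and the TryGetSetting helper are assumptions made for illustration; neither the service nor the SDK defines them.

```csharp
using System;

// Hypothetical helper: returns true and the value when the variable is set
// to a non-empty string; otherwise returns false with an empty value.
static bool TryGetSetting(string name, out string value)
{
    value = Environment.GetEnvironmentVariable(name) ?? string.Empty;
    return value.Length > 0;
}

// TRANSLATOR_ENDPOINT and TRANSLATOR_API_KEY are assumed names.
if (TryGetSetting("TRANSLATOR_ENDPOINT", out string endpoint) &&
    TryGetSetting("TRANSLATOR_API_KEY", out string apiKey))
{
    // Both values are available here and can be passed to a client constructor.
    Console.WriteLine("Translator endpoint and key found.");
}
else
{
    Console.Error.WriteLine("Set TRANSLATOR_ENDPOINT and TRANSLATOR_API_KEY before creating a client.");
}
```

Checking for missing values up front gives a clearer message than the exception that an empty string would trigger when it is converted to a Uri.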

Asynchronous batch method:

Creating a DocumentTranslationClient

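using System;
using Azure;
using Azure.AI.Translation.Document;
string endpoint = ""; // your Translator endpoint
string apiKey = "";   // your Translator API key
DocumentTranslationClient client = new DocumentTranslationClient(new Uri(endpoint), new AzureKeyCredential(apiKey));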

To start a translation operation for documents in a blob container, call StartTranslationAsync. The result is a long-running operation of type DocumentTranslationOperation, which polls the API for the status of the translation operation. To call StartTranslationAsync, you first initialize an object of type DocumentTranslationInput, which contains the information needed to translate the documents.

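// Replace the placeholders with the URLs of your source and target blob containers (for example, SAS URLs).
Uri sourceUri = new Uri("<source-container-url>");
Uri targetUri = new Uri("<target-container-url>");
// Translate every document in the source container to Spanish ("es").
var input = new DocumentTranslationInput(sourceUri, targetUri, "es");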

DocumentTranslationOperation operation = await client.StartTranslationAsync(input);

await operation.WaitForCompletionAsync();

Console.WriteLine($"  Status: {operation.Status}");
Console.WriteLine($"  Created on: {operation.CreatedOn}");
Console.WriteLine($"  Last modified: {operation.LastModified}");
Console.WriteLine($"  Total documents: {operation.DocumentsTotal}");
Console.WriteLine($"    Succeeded: {operation.DocumentsSucceeded}");
Console.WriteLine($"    Failed: {operation.DocumentsFailed}");
Console.WriteLine($"    In Progress: {operation.DocumentsInProgress}");
Console.WriteLine($"    Not started: {operation.DocumentsNotStarted}");

await foreach (DocumentStatusResult document in operation.Value)
{
    Console.WriteLine($"Document with Id: {document.Id}");
    Console.WriteLine($"  Status: {document.Status}");
    if (document.Status == DocumentTranslationStatus.Succeeded)
    {
        Console.WriteLine($"  Translated Document Uri: {document.TranslatedDocumentUri}");
        Console.WriteLine($"  Translated to language code: {document.TranslatedToLanguageCode}.");
        Console.WriteLine($"  Document source Uri: {document.SourceDocumentUri}");
    }
    else
    {
        Console.WriteLine($"  Error Code: {document.Error.Code}");
        Console.WriteLine($"  Message: {document.Error.Message}");
    }
}

Synchronous method:

Creating a SingleDocumentTranslationClient

string endpoint = "";
string apiKey = "";
SingleDocumentTranslationClient client = new SingleDocumentTranslationClient(new Uri(endpoint), new AzureKeyCredential(apiKey));

To start a synchronous translation operation for a single document, call DocumentTranslate. Before the call, initialize an object of type MultipartFormFileData with the source document (file name, stream, and content type) and wrap it in a DocumentTranslateContent object. You also need to specify the target language for the translation.

try
{
    string filePath = Path.Combine("TestData", "test-input.txt");
    using Stream fileStream = File.OpenRead(filePath);
    // The content type should match the file: a .txt file is text/plain.
    var sourceDocument = new MultipartFormFileData(Path.GetFileName(filePath), fileStream, "text/plain");
    DocumentTranslateContent content = new DocumentTranslateContent(sourceDocument);
    var response = client.DocumentTranslate("hi", content);

    var requestString = File.ReadAllText(filePath);
    var responseString = Encoding.UTF8.GetString(response.Value.ToArray());

    Console.WriteLine($"Request string for translation: {requestString}");
    Console.WriteLine($"Response string after translation: {responseString}");
}
catch (RequestFailedException exception)
{
    Console.WriteLine($"Error Code: {exception.ErrorCode}");
    Console.WriteLine($"Message: {exception.Message}");
}
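The content type passed to MultipartFormFileData has to describe the file being sent. A small lookup by file extension, sketched below, is one way to keep the two in step. GetContentType is a hypothetical helper rather than an SDK method, and the mapping lists only a few standard MIME types; it does not check which formats the service accepts.

```csharp
using System;
using System.IO;

// Hypothetical helper: maps a file extension to its standard MIME type.
// Whether the service accepts a given format is not checked here.
static string GetContentType(string path) =>
    Path.GetExtension(path).ToLowerInvariant() switch
    {
        ".txt" => "text/plain",
        ".html" or ".htm" => "text/html",
        ".docx" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        var other => throw new NotSupportedException($"No content type mapped for '{other}'."),
    };

// Prints text/plain for the sample input file used above.
Console.WriteLine(GetContentType(Path.Combine("TestData", "test-input.txt")));
```

The returned string can be passed as the third argument when constructing MultipartFormFileData.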

Ready-to-use solution in Azure AI Studio

Customers can easily build apps for their document translation needs using the SDK. One such example is the document translation tool in Azure AI Studio, which was announced as generally available at //build 2024. Here is a glimpse of how you can translate documents in this user interface:

[Screenshot: document translation in Azure AI Studio]

SharePoint document translation

The document translation integration in SharePoint lets you easily translate a selected file or a set of files in a SharePoint document library. You can translate files of different types either manually or automatically by creating a rule.

[Screenshot: document translation in SharePoint]

Learn more about the SharePoint integration here.

You can also use the translation feature for translating video transcripts and closed captioning files. More information here.

Document translation in container is generally available

In addition to the above updates, earlier this year, we announced the release of document translation and transliteration features for Azure AI Translator containers as preview. Today, both capabilities are generally available. All Translator container customers will get these new features automatically as part of the update.

Translator containers let users host the Azure AI Translator API on their own infrastructure. They include all the libraries, tools, and dependencies needed to run the service in any private, public, or personal computing environment. Containers are isolated, lightweight, and portable, and they are well suited to implementing specific security or data governance requirements.

With this update, the following operations are now supported in Azure AI Translator containers:

  • Text translation: Translates text phrases between supported source and target languages in real time.
  • Text transliteration: Converts text in a language from one script to another in real time.
  • Document translation: Translates a document between supported source and target languages while preserving the original document's content structure and format.

This article was originally published by Microsoft's Azure AI Services Blog. You can find the original article here.