Azure OpenAI Service Launches GPT-4 Turbo and GPT-3.5-Turbo-1106 Models


At Microsoft Ignite 2023, Satya Nadella announced the imminent launch of OpenAI's most advanced generative models, GPT-4 Turbo and GPT-3.5 Turbo 1106, on Azure.  Today, we're thrilled to announce the global availability of GPT-4 Turbo and GPT-3.5 Turbo 1106 on Azure OpenAI Service, unlocking leading cost performance and generative capabilities for businesses to revolutionize their workflows.

GPT-4 Turbo (gpt-4-1106-preview) and GPT-3.5 Turbo 1106 (gpt-35-turbo-1106) are available to all Azure OpenAI customers immediately.

Model availability by region:

GPT-4 Turbo (gpt-4-1106-preview): Australia East, Canada East, East US 2, France Central, Norway East, South India, Sweden Central, UK South, West US

GPT-3.5 Turbo 1106 (gpt-35-turbo-1106): Australia East, Canada East, France Central, South India, Sweden Central, UK South, West US

Launching simultaneously with the new models are 3 new Azure OpenAI regions: Norway East, South India, and West US.  This brings the total number of Azure OpenAI regions to 14, offering customers unparalleled global accessibility of the most advanced generative models.

Prepare for a transformative leap with the public preview of GPT-4 Turbo. This model offers lower pricing, extended prompt length, tool use, and structured JSON formatting, delivering improved efficiency and control.

GPT-4 Turbo is more capable than GPT-4 and has knowledge of world events up to April 2023.  Its 128K context window lets your applications draw on far more custom data tailored to your use case, using techniques like Retrieval Augmented Generation (RAG).
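To make the RAG idea concrete, here is a minimal, hedged sketch of packing retrieved passages into the prompt up to a context budget. It is not Azure-specific, and the characters-per-token estimate is a crude assumption; a real pipeline should use an actual tokenizer.

```python
# Minimal RAG-style prompt assembly: pack retrieved passages into the prompt
# until an (assumed) token budget is reached. The ~4 chars/token heuristic is
# a rough stand-in for a real tokenizer.

def build_rag_messages(question, passages, budget_tokens=128_000):
    est_tokens = lambda s: len(s) // 4  # crude heuristic, not a real tokenizer
    context, used = [], 0
    for p in passages:
        cost = est_tokens(p)
        if used + cost > budget_tokens:
            break  # stop once the context window budget would be exceeded
        context.append(p)
        used += cost
    system = "Answer using only the context below.\n\n" + "\n---\n".join(context)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Illustrative placeholder documents:
msgs = build_rag_messages("Which regions host GPT-4 Turbo?", ["Doc A ...", "Doc B ..."])
```

The messages list can then be passed to the chat completions API; with a 128K window, far more retrieved context fits than before.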

GPT-4 Turbo is available to all Azure OpenAI customers immediately.  GPT-4 Turbo pricing is 3x more cost effective for input tokens and 2x more cost effective for output tokens compared to GPT-4, while offering more than 15x the context window.  To deploy GPT-4 Turbo from the Studio UI, select “gpt-4” and then select version “1106-preview” in the version dropdown.  Version 1106-preview has separate quota from the existing versions of GPT-4, enabling customers to start experimenting with it immediately without impacting existing GPT-4 deployments.

Model | Input | Output
gpt-4-1106-preview | $0.01 / 1000 tokens | $0.03 / 1000 tokens
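As a quick worked example of the savings, the arithmetic below compares a request at the listed GPT-4 Turbo rates with GPT-4 rates derived from the 3x-input / 2x-output figures above (the derived $0.03 / $0.06 GPT-4 rates are an inference from those ratios, not quoted in this table):

```python
# Cost of one request, with rates expressed in USD per 1,000 tokens.
def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate

# A large 100K-token prompt with a 1K-token completion:
turbo = request_cost(100_000, 1_000, in_rate=0.01, out_rate=0.03)  # GPT-4 Turbo
gpt4  = request_cost(100_000, 1_000, in_rate=0.03, out_rate=0.06)  # GPT-4 (derived)

print(f"GPT-4 Turbo: ${turbo:.2f}  GPT-4: ${gpt4:.2f}")
# GPT-4 Turbo: $1.03  GPT-4: $3.06
```

For input-heavy workloads like RAG, the per-request cost drops roughly 3x.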

Improved Function Calling

Function calling, launched in June 2023, enables builders to use Generative AI to connect applications to external tools using API calls.  GPT-4 Turbo improves the ability to generate function calls based on user natural language inputs.  In addition, GPT-4 Turbo offers the ability to generate multiple function and tool calls in parallel, so that applications can use external systems more efficiently.

JSON Mode

GPT-4 Turbo also introduces JSON Mode, which improves on GPT-4's ability to generate correctly formatted JSON output to interoperate with software systems.  This is a highly requested feature for builders using OpenAI models in their applications.  You can use JSON Mode by setting response_format to {"type": "json_object"}.
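A hedged sketch of a JSON Mode request is below; "gpt-4-turbo" stands in for your own Azure deployment name, and the actual API call is left commented out since it needs an authenticated client. Note that JSON Mode also requires the prompt itself to mention JSON.

```python
import json

# Request parameters for JSON Mode. The deployment name and prompt text are
# placeholders; response_format is the key setting.
params = {
    "model": "gpt-4-turbo",  # your Azure deployment name (assumption)
    "response_format": {"type": "json_object"},
    "messages": [
        # JSON Mode requires instructing the model to produce JSON:
        {"role": "system", "content": "Reply in JSON with keys 'name' and 'score'."},
        {"role": "user", "content": "Grade this essay..."},
    ],
}
# response = client.chat.completions.create(**params)  # needs an AzureOpenAI client

# With JSON Mode, the returned content is parseable JSON, e.g.:
sample_output = '{"name": "essay-1", "score": 8}'
parsed = json.loads(sample_output)
```

Downstream code can then consume the parsed object directly instead of scraping JSON out of free-form text.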

Reproducible Output

Generative AI models like GPT-4 Turbo generate their outputs probabilistically.  In a wide variety of cases, this non-determinism is a benefit, enabling desirable outcomes like creative prose and imaginative drawings.  Application builders sometimes want more predictable output from similar inputs.  The new seed parameter in GPT-4 Turbo gives builders more control over the language model output.
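A minimal sketch of using the seed parameter follows; the deployment name is a placeholder and the API calls are commented out since they need credentials. Even with a fixed seed, determinism is best-effort rather than guaranteed.

```python
# Requesting (mostly) reproducible output: fix the seed and temperature, then
# compare system_fingerprint across responses to check whether the backend
# configuration matched.
params = {
    "model": "gpt-4-turbo",  # placeholder Azure deployment name (assumption)
    "seed": 42,              # same seed + same inputs -> usually the same output
    "temperature": 0,
    "messages": [{"role": "user", "content": "Name three prime numbers."}],
}
# r1 = client.chat.completions.create(**params)
# r2 = client.chat.completions.create(**params)
# If r1.system_fingerprint == r2.system_fingerprint, the serving configuration
# was the same for both requests, and the outputs should typically match.
```

This is useful for regression tests and debugging, where re-running the same prompt should not silently change behavior.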

Preview

The first version of GPT-4 Turbo, gpt-4-1106-preview, is in preview and will be replaced with a stable production-ready version in the coming weeks.  Customer deployments of gpt-4-1106-preview will be automatically updated with the GA version of GPT-4 Turbo when we launch the stable version.

GPT-3.5 Turbo 1106 brings the same new advanced capabilities as GPT-4 Turbo, such as improved function calling and JSON Mode, to the wildly popular GPT-3.5 Turbo family.  GPT-3.5 Turbo 1106 will become the new default GPT-3.5 Turbo model in the coming weeks, featuring a 16K context window at an attractive price.

GPT-3.5 Turbo 1106 is generally available to all Azure OpenAI customers immediately.  GPT-3.5 Turbo 1106 pricing is 3x more cost effective for input tokens and 2x more cost effective for output tokens compared to GPT-3.5 Turbo 16k. To deploy GPT-3.5 Turbo 1106 from the Studio UI, select “gpt-35-turbo” and then select version “1106” from the dropdown.  Version 1106 has separate quota from the existing versions of GPT-3.5 Turbo, enabling customers to start experimenting with it immediately without impacting existing GPT-3.5 deployments.

Model | Input | Output
gpt-35-turbo-1106 | $0.001 / 1000 tokens | $0.002 / 1000 tokens
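The same cost arithmetic applies here; the GPT-3.5 Turbo 16k rates below ($0.003 / $0.004) are derived from the 3x/2x figures above rather than quoted in this table:

```python
# Per-request cost at the listed gpt-35-turbo-1106 rates vs. 16k rates derived
# from the 3x-input / 2x-output figures. Rates are USD per 1,000 tokens.
def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate

new = request_cost(10_000, 2_000, in_rate=0.001, out_rate=0.002)  # gpt-35-turbo-1106
old = request_cost(10_000, 2_000, in_rate=0.003, out_rate=0.004)  # 16k rates (derived)
```

For this mixed workload the new model comes out well under half the cost per request.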

Get started building with GPT-4 Turbo and GPT-3.5 Turbo 1106 today!  We will be making these highly capable and more cost effective models more widely available in the coming weeks, including availability with Provisioned Throughput.  We can't wait to see what you build!


This article was originally published by Microsoft's Azure AI Services Blog. You can find the original article here.