Affordable Innovation: Unveiling the Pricing of Phi-3 SLMs on Models as a Service

At this year's Microsoft Build, we introduced the Phi-3 series of small language models (SLMs), a groundbreaking addition to our Azure model catalog. The Phi-3 models, which include Phi-3-mini, Phi-3-small, and Phi-3-medium, represent a significant advancement in the realm of generative AI, designed to deliver large-model performance in a compact, efficient package.

The Power of Phi-3 Models

The Phi-3 series stands out by offering the capabilities of significantly larger models while requiring far less computational power. This makes Phi-3 models ideal for a wide range of applications, from enhancing mobile apps to powering devices with stringent energy requirements. These models support extensive context lengths—up to 128K tokens—pushing the boundaries of what small models can achieve.

Features and Benefits

  1. Versatility and Scalability: Phi-3 models are versatile across various NLP tasks, including text generation, summarization, and more complex language understanding tasks, making them adaptable to both commercial and academic uses.
  2. Optimized Performance: Designed for efficiency, these models excel in environments where quick response times are crucial without sacrificing the quality of outcomes.
  3. Cost-Effectiveness: By optimizing the quality-cost curve, Phi-3 models ensure that users can deploy cutting-edge AI without the high resource costs typically associated with large models.
  4. Ease of Integration: Available on Azure AI Studio, Hugging Face and Ollama, these models can be seamlessly integrated into existing systems, allowing developers to leverage their capabilities with minimal setup.
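To make the integration point concrete, here is a minimal sketch of building a request for a Phi-3 serverless endpoint. It assumes the endpoint exposes an OpenAI-style chat completions API with bearer-token authentication; the endpoint URL, key placeholders, and the `build_chat_request` helper are illustrative, not part of any official SDK. Consult your deployment's details in Azure AI Studio for the exact URL and schema.

```python
import json

# Hypothetical placeholders for illustration; substitute the endpoint URL
# and key shown for your own serverless deployment in Azure AI Studio.
ENDPOINT = "https://<your-deployment>.<region>.models.ai.azure.com/v1/chat/completions"
API_KEY = "<your-api-key>"

def build_chat_request(user_prompt, max_tokens=256, temperature=0.7):
    """Assemble headers and a JSON body for an OpenAI-style chat call."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    }
    body = {
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return headers, json.dumps(body)

headers, body = build_chat_request("Summarize the benefits of small language models.")
# Send with, e.g.: requests.post(ENDPOINT, headers=headers, data=body)
```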

Pricing and Availability

Experience the efficiency and agility of Phi-3 small language models in the Azure AI model catalog through the Pay-As-You-Go (PAYGO) offering via serverless APIs. PAYGO lets you pay only for what you use, which is ideal for managing costs without compromising on performance. For consistent throughput and minimal latency, Phi-3 models offer competitive per-unit pricing, providing you with a clear and predictable cost structure. Pricing takes effect on June 1, 2024 at 00:00 UTC (5:00 PM Pacific Time on May 31, 2024).

These models are available in the East US 2 and Sweden Central regions.

| Model                      | Context | Input (per 1,000 tokens) | Output (per 1,000 tokens) |
|----------------------------|---------|--------------------------|---------------------------|
| Phi-3-mini-4k-instruct     | 4K      | $0.00028                 | $0.00084                  |
| Phi-3-mini-128k-instruct   | 128K    | $0.0003                  | $0.0009                   |
| Phi-3-small-8k-instruct    | 8K      | $0.00032                 | $0.00096                  |
| Phi-3-small-128k-instruct  | 128K    | $0.00035                 | $0.00105                  |
| Phi-3-medium-4k-instruct   | 4K      | $0.00045                 | $0.00135                  |
| Phi-3-medium-128k-instruct | 128K    | $0.0005                  | $0.0015                   |
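Because PAYGO bills input and output tokens separately, per-request cost is a simple linear combination of the two rates. The sketch below estimates the cost of a single call from the rates in the table above; the `estimate_cost` helper and the example token counts are illustrative, not an official billing tool.

```python
# Serverless PAYGO rates from the table above: (input, output) USD per 1,000 tokens.
RATES = {
    "Phi-3-mini-4k-instruct": (0.00028, 0.00084),
    "Phi-3-mini-128k-instruct": (0.0003, 0.0009),
    "Phi-3-small-8k-instruct": (0.00032, 0.00096),
    "Phi-3-small-128k-instruct": (0.00035, 0.00105),
    "Phi-3-medium-4k-instruct": (0.00045, 0.00135),
    "Phi-3-medium-128k-instruct": (0.0005, 0.0015),
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the PAYGO cost (USD) of one request for the given model."""
    in_rate, out_rate = RATES[model]
    return (input_tokens / 1000) * in_rate + (output_tokens / 1000) * out_rate

# Example: 2,000 input tokens and 500 output tokens on Phi-3-mini-4k-instruct.
cost = estimate_cost("Phi-3-mini-4k-instruct", 2000, 500)
print(f"${cost:.5f}")  # prints $0.00098
```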

Stay tuned for more updates on Phi-3, and prepare to transform your applications with the efficiency, versatility, and power of Phi-3 small language models. For more information, visit our product page or contact our sales team to see how Phi-3 can fit into your technology stack.

 

This article was originally published by Microsoft's AI - Machine Learning Blog.