ogkranthi

Navigating the Generative AI Landscape with Azure AI Services: Insights from Customer Round Table

Introduction: The adoption of Generative AI (GenAI) technologies is accelerating, driven by their transformative potential across sectors. Recently, we hosted a customer round table at Microsoft Build focused on customers' GenAI adoption journeys with Azure AI services. Industry leaders from diverse fields shared their experiences, challenges, and strategies, providing […]


Maximizing Performance: Leveraging PTUs with Client Retry Mechanisms in LLM Applications

Introduction: Achieving maximum performance in Provisioned Throughput Unit (PTU) environments requires careful handling of API interactions, especially rate limits (HTTP 429 errors). This blog post introduces a technique for maintaining optimal performance with the Azure OpenAI API by managing rate limits intelligently. The method strategically switches between PTU and Standard deployments, enhancing throughput and


Load Testing RAG based Generative AI Applications

When developing applications for Large Language Models (LLMs), we usually spend a lot of time on both the development and evaluation phases to ensure the app delivers high-quality responses that are not only accurate but also safe for users. However, a great user experience with an LLM application isn’t just about the quality of responses; it’s also about
