Step-by-Step

Evaluate Small Language Models for RAG using Azure Prompt Flow (LLama3 vs Phi3)

Introduction: Recently, small language models have made significant progress in terms of quality and context size. These advancements have enabled new possibilities, making it increasingly viable to leverage these models for retrieval-augmented generation (RAG) use cases. Particularly in scenarios where cost sensitivity is a key consideration, small language models offer an attractive alternative. This post […]
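
The kind of model comparison the post describes can be prototyped even without Prompt Flow. The sketch below is an illustration only: Prompt Flow's built-in evaluators are LLM-graded, and the model answers here are invented. It scores each model's RAG answer against a reference answer with a simple token-level F1:

```python
# Simplified stand-in for a RAG answer-quality metric of the kind a
# Prompt Flow evaluation node might compute. Real Prompt Flow evaluators
# are LLM-graded; token-overlap F1 is used here only for illustration.

def token_f1(answer: str, reference: str) -> float:
    """Token-level F1 between a model answer and a reference answer."""
    ans = answer.lower().split()
    ref = reference.lower().split()
    if not ans or not ref:
        return 0.0
    # Count overlapping tokens, respecting multiplicity.
    ref_counts = {}
    for t in ref:
        ref_counts[t] = ref_counts.get(t, 0) + 1
    common = 0
    for t in ans:
        if ref_counts.get(t, 0) > 0:
            common += 1
            ref_counts[t] -= 1
    if common == 0:
        return 0.0
    precision = common / len(ans)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical answers from two candidate models, scored against one reference:
reference = "Phi-3 supports a 128k token context window"
scores = {
    "llama3": token_f1("Llama 3 has an 8k context window", reference),
    "phi3": token_f1("Phi-3 supports a 128k token context window", reference),
}
```

In a real evaluation flow you would run both models over the same set of grounded questions and aggregate per-model metrics, rather than score a single pair of strings.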



The LLM Latency Guidebook: Optimizing Response Times for GenAI Applications

Co-authors: Priya Kedia, Julian Lee, Manoranjan Rajguru, Shikha Agrawal, Michael Tremeer. Contributors: Ranjani Mani, Sumit Pokhariyal, Sydnee Mayers. Generative AI applications are transforming how we do business today, creating new, engaging ways for customers to interact with applications. However, these large language models require massive amounts of compute to run, and unoptimized applications can run


Improving RAG performance with Azure AI Search and Azure AI prompt flow in Azure AI Studio

Content authored by: Arpita Parmar. Introduction: If you’ve been delving into the potential of large language models (LLMs) for search and retrieval tasks, you’ve probably encountered Retrieval Augmented Generation (RAG) as a valuable technique. RAG enriches LLM-generated responses by integrating relevant contextual information, particularly when connected to private data sources. This integration empowers the


Deploy a Gradio Web App on Azure with Azure App Service: a Step-by-Step Guide

A teaser image generated by DALL·E 2. Context: Gradio is an open-source Python package that you can use for free to create a demo or web app for your machine learning model, API, Azure AI Services integration, or any Python function. You can run Gradio in Python notebooks or as a script. A Gradio


A Heuristic Method of Merging Cross-Page Tables based on Document Intelligence Layout Model

Introduction: Tables contain valuable structured information that helps businesses manage, share, and analyze data, make informed decisions, and increase efficiency. Cross-page tables are common, especially in lengthy or dense documents. The Azure AI Document Intelligence Layout model extracts tables within each page, so effectively parsing a table that spans pages may require reconstituting the extracted fragments into a single table. This
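
As a hedged sketch of the merging idea (not the article's exact algorithm, which presumably also uses geometric cues from the Layout model's output): treat each page's extracted table as a list of rows, and fold a fragment into the previous table when the column counts match, dropping a repeated header row:

```python
# Illustrative heuristic for merging cross-page table fragments.
# Each table is a list of rows; each row is a list of cell strings.
# Real implementations would also check page adjacency and bounding
# boxes before merging; this sketch uses column width and header
# repetition only.

def merge_cross_page_tables(tables):
    """Merge adjacent table fragments with matching column counts.

    tables: list of tables in page order. Returns a new list in which
    each fragment that looks like a continuation of the previous table
    has been appended to it.
    """
    merged = []
    for table in tables:
        if merged and table and merged[-1] and len(table[0]) == len(merged[-1][0]):
            prev = merged[-1]
            if table[0] == prev[0]:
                # Page 2 repeats the header row: drop it, keep the body.
                prev.extend(table[1:])
            else:
                # Same width, no recognizable header: assume continuation rows.
                prev.extend(table)
            continue
        merged.append([row[:] for row in table])  # copy rows defensively
    return merged

page1 = [["ID", "Name"], ["1", "Alice"]]
page2 = [["ID", "Name"], ["2", "Bob"]]  # header repeated on page 2
combined = merge_cross_page_tables([page1, page2])
```

Matching on column count alone is deliberately aggressive; a production heuristic would at minimum require the fragments to come from consecutive pages.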


Ingesting Non-Microsoft Cloud Security Data into Microsoft Sentinel for Gov & DIB customers part 2

Ingesting AWS Commercial and GovCloud data into Azure Government Sentinel. This blog focuses on how to ingest AWS Commercial and AWS GovCloud data into a Microsoft Sentinel workspace in Azure Government. This picture provides a high-level visual of the architecture we will walk through in this part of the blog series. Overview of


How to Customize an LLM: A Deep Dive to Tailoring an LLM for Your Business

Introduction: In the world of large language models, model customization is key. It’s what transforms a standard model into a powerful tool tailored to your business needs. Let’s explore three techniques to customize a Large Language Model (LLM) for your organization: prompt engineering, retrieval augmented generation (RAG), and fine-tuning. In this blog you will learn


Leveraging Cohere Embed V3 int8 embeddings with Azure AI Search

Last week, we announced our partnership with Cohere, enabling customers to easily leverage Cohere models via the Azure AI Studio Model Catalog, including Cohere’s latest LLM – Command R+. Today, we are thrilled to announce that you can store and search over Cohere’s latest Embed V3 int8 embeddings using Azure AI Search. This capability offers significant
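
To show the principle behind int8 embeddings (Cohere's Embed V3 emits int8 vectors directly; this manual scalar quantizer is purely illustrative), each float component in a known range is mapped to an 8-bit integer, shrinking vector storage to a quarter of float32:

```python
# Illustrative int8 scalar quantization: map each float component in a
# known range (here [-1, 1]) onto the signed 8-bit range [-127, 127].
# Cohere's embedding models produce int8 vectors natively; this sketch
# only demonstrates why the representation is 4x smaller than float32.

def quantize_int8(vector, lo=-1.0, hi=1.0):
    """Map floats in [lo, hi] to integers in [-127, 127]."""
    scale = 127.0 / max(abs(lo), abs(hi))
    out = []
    for x in vector:
        clipped = max(lo, min(hi, x))  # clamp out-of-range components
        out.append(int(round(clipped * scale)))
    return out

def dequantize_int8(qvector, lo=-1.0, hi=1.0):
    """Recover approximate floats from int8 codes."""
    scale = 127.0 / max(abs(lo), abs(hi))
    return [q / scale for q in qvector]

v = [0.25, -1.0, 0.5, 1.0]
q = quantize_int8(v)  # -> [32, -127, 64, 127]
```

The round trip loses at most half a quantization step (about 0.004 per component here), which is why int8 vectors typically cost little search quality relative to the storage they save.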


Building a Document Intelligence Custom Classification Model with the Python SDK

Introduction: In the world of document processing and automation, one of the most frequent use cases is categorizing and organizing documents into predefined classes. For instance, an organization may have a process that ingests documents that then need to be classified into separate categories such as “invoices”, “contracts”, “reports”, etc. Azure AI Document Intelligence
