Azure Cognitive Search and LangChain: A Seamless Integration for Enhanced Vector Search Capabilities

Content and LangChain integration credit to: Fabrizio Ruocco, Principal Tech Lead, AI Global Black Belt, Microsoft



In a fast-paced world, the ability to access relevant and accurate information quickly is critical for enhancing productivity and making informed decisions. With an ever-growing volume of digital data, being able to find the right piece of information has become a higher priority task. Thankfully, recent advancements in LLMs (Large Language Models) have transformed the landscape of information retrieval, making it more efficient and effective.

A significant breakthrough in this area is the development of embedding models, which have revolutionized the way we search for information. Unlike traditional keyword-based search methods, embedding models leverage the power of natural language to deliver more meaningful and contextually relevant results to end users. The embedding models work by converting words, phrases, or even entire documents into mathematical representations known as vectors. These vectors, which exist in a high-dimensional space, capture the meaning and relationships between different words and concepts.


What is vector search?

Vector search is a capability for indexing, storing, and retrieving vector embeddings from a search index. The vector search retrieval technique uses these vector representations to find and rank relevant results. By measuring the distance or similarity between the query vector embeddings and the indexed document vectors, vector search is capable of finding results that are contextually related to the query, even if they don't contain the exact same keywords.
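To make the idea of "distance or similarity" concrete, here is a minimal sketch of cosine similarity, one common similarity measure for embeddings. The three-dimensional vectors and document names are made up for illustration; real embeddings have hundreds or thousands of dimensions, and a search service would typically use an approximate nearest neighbor index rather than this brute-force comparison.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (invented values, not produced by a real model).
indexed = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-kittens": [0.8, 0.3, 0.1],
    "doc-taxes": [0.0, 0.2, 0.9],
}
query_vector = [0.85, 0.2, 0.05]

# Rank document ids from most to least similar to the query vector.
ranked = sorted(
    indexed,
    key=lambda doc_id: cosine_similarity(query_vector, indexed[doc_id]),
    reverse=True,
)
print(ranked)
```

The two cat-related vectors point in nearly the same direction as the query, so they rank ahead of the unrelated one even though no keywords are compared.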

You can use vector search to power similarity search, multi-modal search, recommendation engines, or applications implementing the Retrieval Augmented Generation (RAG) architecture.


Support for vector search in Azure Cognitive Search is in public preview and available through the 2023-07-01-Preview REST API, the Azure portal, and the more recent beta packages of the Azure SDKs for .NET, Python, and JavaScript.


Vector search conceptual flow

Using vector search in Azure Cognitive Search involves two sets of steps: one for data ingestion and one at query time.

Data ingestion steps

Here is a summary of the steps to prepare and load the data to the Cognitive Search index.

  1. Retrieve source documents from the data source. This can be accomplished by using Azure Cognitive Search built-in pull indexers or by building custom indexers through Azure Functions or Azure Logic Apps.
  2. Chunk your data before vectorizing it, since you need to account for embedding model token input limits and other model limitations.
  3. Since Cognitive Search doesn't generate embeddings at this time, your solution should include calls to an Azure OpenAI embedding model (or other embedding model) to create a vector representation of various content types (e.g., image, audio, text).
  4. Add a vector field in your index definition in Cognitive Search.
  5. Load the index with the document payload containing the chunks' vector embeddings. At this point, your index should be ready for querying.

You can index vector data as fields in documents alongside textual and other types of content.
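The ingestion steps above can be sketched in a few lines. This is only an illustration of the payload shape: toy_embed is a hypothetical stand-in for the call to an embedding model, and the field names (id, content, content_vector) are assumptions that would need to match your own index definition.

```python
import hashlib


def toy_embed(text, dimensions=8):
    """Hypothetical stand-in for an embedding model call.

    Deterministically hashes the text into a fixed-length list of floats.
    It captures no meaning; it only mimics the shape of a real embedding.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dimensions]]


def build_payload(source_id, chunks, embed=toy_embed):
    """Pair each chunk with its vector in one index document."""
    return [
        {
            "id": f"{source_id}-{position}",
            "content": chunk,
            "content_vector": embed(chunk),
        }
        for position, chunk in enumerate(chunks)
    ]


chunks = [
    "Vector search ranks by similarity.",
    "Keyword search matches exact terms.",
]
payload = build_payload("doc1", chunks)
for document in payload:
    print(document["id"], len(document["content_vector"]))
```

Each document carries the original text next to its vector, so the same index can serve keyword, vector, and hybrid queries.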

Query time steps

Just as your solution must call an embedding model to create the embeddings before you save them to an index, it must also call the same embedding model to vectorize the search query before sending it to Cognitive Search.

Vector queries can be issued independently or in combination with other query types, including keyword queries (vector and keyword combination is called hybrid search) and filters in the same search request.

Here is the order you need to follow to perform the pure vector or hybrid search queries.

  1. Once the user submits the query in the client application, call the same Azure OpenAI embedding model (or other embedding model) used to create the vector embeddings that were saved initially in the index.
  2. Submit the vector or hybrid query to your Cognitive Search index.
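As a rough mental model of a vector query combined with a filter, consider the brute-force sketch below. It is not how the service works internally: the tiny in-memory index, the category field, and the choice to apply the filter before ranking are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_query(index, query_vector, k=2, keep=None):
    """Return the ids of the k documents most similar to query_vector.

    `keep` is an optional predicate that plays the role of a filter.
    """
    candidates = [doc for doc in index if keep is None or keep(doc)]
    candidates.sort(
        key=lambda doc: cosine_similarity(query_vector, doc["vector"]),
        reverse=True,
    )
    return [doc["id"] for doc in candidates[:k]]


index = [
    {"id": "a", "vector": [1.0, 0.0], "category": "manual"},
    {"id": "b", "vector": [0.8, 0.6], "category": "faq"},
    {"id": "c", "vector": [0.0, 1.0], "category": "manual"},
]

# In a real solution this vector comes from the same embedding model
# that produced the indexed vectors.
query_vector = [1.0, 0.1]

print(vector_query(index, query_vector, k=2))
print(vector_query(index, query_vector, k=2, keep=lambda doc: doc["category"] == "manual"))
```

The second call shows how a filter narrows the candidate set: document "b" is more similar than "c", but it is excluded because it is not in the "manual" category.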


Search modalities 

Some of the existing search modalities include the traditional full-text search (keyword search), and of course the subject of this article: vector search and hybrid search. You might be wondering when to use each approach, so here's some guidance.

Since vector search retrieves results that are contextually similar to the query, even if the exact keywords are not present in the index, it's ideal for complex and nuanced queries, as well as for situations where synonyms or related terms are used.

On the other hand, full-text search relies on matching specific terms within the query to terms in the indexed documents. This approach is simple, fast, and effective for straightforward queries where the desired results contain the exact terms used in the search, such as product and serial numbers, identifiers and similar terms. However, this traditional keyword search can fall short when it comes to understanding context or identifying semantically similar results.

In many cases, a hybrid search approach that combines the strengths of both vector and keyword search can provide the best results. By utilizing the contextual understanding of vector search and the precision of keyword search, a hybrid system can deliver highly relevant and accurate results across a wide range of query types. Cognitive Search also offers a re-ranker through semantic search that, in many scenarios, returns more relevant results by applying language understanding to the initial search results.
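A hybrid system has to merge a keyword ranking and a vector ranking into a single list. Reciprocal Rank Fusion (RRF) is one common way to do that, and the sketch below shows the idea. Treat it as an illustration rather than the service's implementation: the document ids are invented, and the constant k=60 is a conventional default assumed here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused scores come first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists for the same query.
keyword_ranking = ["sku-123", "manual", "faq"]
vector_ranking = ["faq", "manual", "blog"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Documents that appear in both lists accumulate score from each, so they tend to rise above documents that only one retrieval method found.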

For a comparison table of the search modalities, refer to Announcing Vector Search in Azure Cognitive Search Public Preview.

What is LangChain?

LangChain is a framework for developing applications powered by language models. It allows you to connect a language model to other sources of data, interact with its environment, and create sequences of calls to achieve specific tasks.

You can use LangChain to build applications such as chatbots, question-answering systems, natural language generation systems, and more.

LangChain provides modular components and off-the-shelf chains for working with language models, as well as integrations with other tools and platforms.

The framework provides multiple high-level abstractions such as document loaders, text splitters, and vector stores.


Getting started with Azure Cognitive Search in LangChain

Where does LangChain fit in the Cognitive Search vector search story? The Azure Cognitive Search LangChain integration, built in Python, provides the ability to chunk the documents, seamlessly connect an embedding model for document vectorization, store the vectorized contents in a predefined index, perform similarity search (pure vector), hybrid search and hybrid with semantic search. It also provides configurability to create your own index and apply scoring profiles to achieve better search accuracy. With LangChain, you can combine native workflows (indexing and querying) with non-native workflows (like chunking and embedding) to create an end-to-end similarity search solution.

Here is the minimum set of code samples and commands needed to integrate Cognitive Search vector functionality with LangChain. The following samples are borrowed from the Azure Cognitive Search integration page in the LangChain documentation.

Install an Azure Cognitive Search SDK

pip install azure-search-documents==11.4.0b6
pip install azure-identity


Import the required libraries

import openai
import os
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.azuresearch import AzureSearch


Configure OpenAI settings

Configure the OpenAI settings to use Azure OpenAI or OpenAI:

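os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "YOUR_OPENAI_ENDPOINT"
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
model: str = "text-embedding-ada-002"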

Configure vector store settings

Set up the vector store settings using the Azure Cognitive Search endpoint and admin key. You can retrieve those in the Azure portal:

vector_store_address: str = "YOUR_AZURE_SEARCH_ENDPOINT"
vector_store_password: str = "YOUR_AZURE_SEARCH_ADMIN_KEY"
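Because the admin key grants full access to the search service, you may prefer to read these settings from environment variables instead of hard-coding them. The following is a minimal sketch of that approach; the variable names AZURE_SEARCH_ENDPOINT and AZURE_SEARCH_ADMIN_KEY and the helper load_search_settings are hypothetical, not something the SDK or LangChain defines.

```python
import os


def load_search_settings(environ=os.environ):
    """Read the search endpoint and admin key from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    try:
        endpoint = environ["AZURE_SEARCH_ENDPOINT"]
        admin_key = environ["AZURE_SEARCH_ADMIN_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"missing environment variable: {missing}") from None
    return endpoint, admin_key


# Demo with an explicit mapping so the sketch runs without real credentials.
demo_env = {
    "AZURE_SEARCH_ENDPOINT": "https://example.search.windows.net",
    "AZURE_SEARCH_ADMIN_KEY": "not-a-real-key",
}
vector_store_address, vector_store_password = load_search_settings(demo_env)
print(vector_store_address)
```

In real use you would call load_search_settings() with no argument so that it reads from the process environment.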

Create embeddings and vector store instances

Create instances of the OpenAIEmbeddings and AzureSearch classes:

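embeddings: OpenAIEmbeddings = OpenAIEmbeddings(deployment=model, chunk_size=1)
index_name: str = "langchain-vector-demo"
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=vector_store_address, azure_search_key=vector_store_password,
    index_name=index_name, embedding_function=embeddings.embed_query,
)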

Insert text and embeddings into vector store

Chunk the documents and add the chunks to the vector store, which vectorizes them with the configured embedding function as they are inserted:

from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load the source file and split it into chunks
loader = TextLoader("path_to_your_file", encoding="utf-8")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Embed the chunks and upload them to the index
vector_store.add_documents(documents=docs)
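To build intuition for what chunk_size and chunk_overlap control, here is a simplified fixed-width sketch. It is not how CharacterTextSplitter works internally (that class splits on a separator and then merges the pieces up to the size limit); chunk_text is a hypothetical helper that only illustrates the two parameters.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=0):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=0))
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
```

A larger overlap repeats more context across neighboring chunks, which can help retrieval when an answer straddles a chunk boundary, at the cost of more vectors to embed and store.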

Perform a vector similarity search

Execute a pure vector similarity search using the similarity_search() method:

# Perform a similarity search
docs = vector_store.similarity_search(
    query="What did the president say about Ketanji Brown Jackson",
    k=3,
    search_type="similarity",
)

Perform a hybrid search

Execute a hybrid search by passing the search_type parameter to similarity_search() or by calling the hybrid_search() method:

# Perform a hybrid search
docs = vector_store.similarity_search(
    query="What did the president say about Ketanji Brown Jackson",
    k=3,
    search_type="hybrid",
)


For the full code and more samples for the LangChain and Cognitive Search vector search integration, visit the official Azure Cognitive Search LangChain integration documentation.


This article was originally published by Microsoft's Azure AI Services Blog. You can find the original article here.