Create Your First Visual Fashion Agent Using AOAI and AI Search – Search Product Catalog Images

Search Product Catalog Images Using Azure Search and OpenAI with LangChain

In the ever-evolving landscape of retail, businesses are continually seeking innovative solutions to streamline their operations and enhance customer experiences. One such breakthrough is the implementation of artificial intelligence (AI) to search product catalog images efficiently. This transformative technology not only simplifies the search process but also empowers businesses to provide personalized and seamless shopping experiences for their customers.






The Need for AI in Product Catalog Image Search: Traditional methods of searching through product catalogs involve manual tagging and categorization, which can be time-consuming and prone to human error. As the volume of products in a catalog grows, managing and searching for specific items becomes a daunting task. AI, particularly computer vision, addresses these challenges by automating the recognition and categorization of products in images.

Key Features of AI-Powered Product Catalog Image Search:

  1. Object Recognition and Tagging: AI algorithms can identify and tag objects within images, providing accurate and consistent categorization of products. This reduces the reliance on manual tagging, ensuring that products are correctly labeled in the catalog.
  2. Visual Similarity Search: AI enables visual similarity search, allowing users to find products based on visual attributes rather than relying solely on text-based queries. This feature is especially valuable for customers who may struggle to describe a product in words but can easily recognize it visually.
  3. Enhanced Product Discovery: By understanding the visual characteristics of products, AI facilitates a more sophisticated recommendation system. Customers can discover related or complementary items, leading to increased cross-selling opportunities and a more engaging shopping experience.
  4. Improved Accuracy and Efficiency: AI-powered image recognition is highly accurate and can process large volumes of images in a fraction of the time it would take a human. This efficiency not only reduces operational costs but also enhances the speed at which customers can find and purchase products.
  5. Integration with E-Commerce Platforms: AI-driven image search can seamlessly integrate with existing e-commerce platforms, making it easy for businesses to adopt this technology without major disruptions. This integration allows for a smoother transition and ensures that the AI-enhanced search becomes an integral part of the overall shopping experience.
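To make the visual similarity idea from the list above concrete, here is a minimal sketch that ranks catalog items by cosine similarity between embedding vectors. The three-dimensional vectors and the item names are made up for illustration; real image embeddings have many more dimensions, and a search service would normally do this ranking for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "not similar to anything"
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, catalog, topn=3):
    """Return the topn (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy vectors standing in for real image embeddings (illustrative values only).
toy_catalog = {
    "red_dress": [0.9, 0.1, 0.0],
    "blue_jeans": [0.0, 0.2, 0.9],
    "pink_dress": [0.8, 0.3, 0.1],
}

for name, score in rank_by_similarity([1.0, 0.0, 0.0], toy_catalog, topn=2):
    print(f"{name}: {score:.3f}")
```

The query vector here plays the role of an embedded text prompt or image; the two dresses rank above the jeans because their vectors point in a similar direction.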

Now let's try to implement this with Azure OpenAI.

First, you need to import some libraries.


import azure.cognitiveservices.speech as speechsdk
import base64
import io
import json
import math
import matplotlib.pyplot as plt
import mimetypes
import numpy as np
import openai
import os
import random
import re
import requests
import sys
import time

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
)
from azure.storage.blob import BlobServiceClient, generate_blob_sas, BlobSasPermissions
from azure.cognitiveservices.speech.audio import AudioOutputConfig
from azure.search.documents.models import VectorizedQuery, VectorizableTextQuery

from datetime import datetime, timedelta
from dotenv import load_dotenv
from io import BytesIO
from IPython.display import Audio
from PIL import Image
from tenacity import (
    Retrying,
    retry_if_exception_type,
    stop_after_attempt,
    wait_random_exponential,
)


Initialize the environment variables for your:

  • Azure OpenAI endpoint
  • Azure Cognitive Services (Computer Vision) endpoint
  • Azure Search endpoint
# Load settings from a local .env file, if one exists
load_dotenv()

# Azure OpenAI
openai_api_type = "azure"
openai_api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai_api_version = os.getenv("AZURE_API_VERSION")
openai_api_key = os.getenv("AZURE_OPENAI_KEY")

# Azure Cognitive Search
acs_endpoint = os.getenv("ACS_ENDPOINT")
acs_key = os.getenv("ACS_KEY")

# Azure Computer Vision 4
acv_key = os.getenv("ACV_KEY")
acv_endpoint = os.getenv("ACV_ENDPOINT")

# Azure Blob Storage
blob_connection_string = os.getenv("BLOB_CONNECTION_STRING")
container_name = os.getenv("CONTAINER_NAME")

# Azure Cognitive Search index name to create
index_name = "azure-fashion-demo"

# API version passed to the Computer Vision vectorize endpoints below
api_version = "2023-02-01-preview"
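Because `os.getenv` silently returns `None` for a variable that is not set, a missing setting tends to surface later as a confusing HTTP or authentication error. The following sketch checks the variable names used above up front; the helper name `missing_settings` is hypothetical and not part of any Azure SDK.

```python
import os

# Environment variable names taken from the configuration above.
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_API_VERSION",
    "AZURE_OPENAI_KEY",
    "ACS_ENDPOINT",
    "ACS_KEY",
    "ACV_KEY",
    "ACV_ENDPOINT",
    "BLOB_CONNECTION_STRING",
    "CONTAINER_NAME",
]


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this once before the rest of the notebook makes it clear which setting to fix, instead of discovering it in the middle of an upload or search call.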

Now let's create a function that creates a text embedding using the Vision API.


def text_embedding(prompt):
    """Text embedding using Azure Computer Vision 4.0"""
    version = "?api-version=" + api_version + "&modelVersion=latest"
    vec_txt_url = f"{acv_endpoint}/computervision/retrieval:vectorizeText{version}"
    headers = {"Content-type": "application/json", "Ocp-Apim-Subscription-Key": acv_key}
    payload = {"text": prompt}
    response = requests.post(vec_txt_url, json=payload, headers=headers)

    if response.status_code == 200:
        text_emb = response.json().get("vector")
        return text_emb
    else:
        print(f"Error: {response.status_code} - {response.text}")
        return None


Now let's create a function that creates an image embedding using the Vision API.


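def image_embedding(image_path):
    """Image embedding using Azure Computer Vision 4.0"""
    url = f"{acv_endpoint}/computervision/retrieval:vectorizeImage"
    # Same query parameters as the text embedding call
    params = {"api-version": api_version, "modelVersion": "latest"}
    mime_type, _ = mimetypes.guess_type(image_path)
    headers = {
        "Content-Type": mime_type,
        "Ocp-Apim-Subscription-Key": acv_key,
    }
    # Retry failed requests with randomized exponential backoff.
    # Assumption: the retry condition and the attempt limit are illustrative choices.
    for attempt in Retrying(
        retry=retry_if_exception_type(requests.HTTPError),
        wait=wait_random_exponential(min=15, max=60),
        stop=stop_after_attempt(5),
    ):
        with attempt:
            with open(image_path, 'rb') as image_data:
                response = requests.post(url, params=params, headers=headers, data=image_data)
                if response.status_code != 200:
                    response.raise_for_status()
    vector = response.json()["vector"]
    return vector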
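The retry loop in the image embedding function relies on the tenacity library. To show what randomized exponential backoff does in general, here is a dependency-free sketch using only the standard library. The names `call_with_backoff` and `flaky_request` are hypothetical, and this is an illustration of the technique rather than a reimplementation of tenacity.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait a random time up to a cap that doubles with each failure.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))


# Demo: a stand-in for a flaky HTTP call that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"vector": [0.1, 0.2, 0.3]}


result = call_with_backoff(flaky_request, base_delay=0.01)
print(result["vector"], "after", attempts["count"], "attempts")
```

Randomizing the wait spreads out retries from many clients, which matters when a rate-limited endpoint is the reason for the failures.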


Next, we need a function that takes a text prompt as input and searches Azure Search for the most relevant images. Here the Buy Now link is a dummy link, which can be replaced with the actual product URL.


def prompt_search(prompt, topn=5, disp=False):
    """Azure Cognitive visual search using a prompt"""
    results_list = []
    # Initialize the Azure Cognitive Search client
    search_client = SearchClient(acs_endpoint, index_name, AzureKeyCredential(acs_key))
    blob_service_client = BlobServiceClient.from_connection_string(blob_connection_string)
    container_client = blob_service_client.get_container_client(container_name)
    # Perform vector search
    vector_query = VectorizedQuery(vector=text_embedding(prompt), k_nearest_neighbors=topn, fields="image_vector")
    # Note: top=2 caps the number of returned results, regardless of topn
    response = search_client.search(
        search_text=prompt, vector_queries=[vector_query], select=["description"], top=2
    )
    for nb, result in enumerate(response, 1):
        # Images were uploaded under their description plus the .jpg extension
        blob_name = result["description"] + ".jpg"
        blob_client = container_client.get_blob_client(blob_name)
        image_url = blob_client.url
        # Assumption: a read-only SAS token signed with the storage account key
        sas_token = generate_blob_sas(
            account_name=blob_service_client.account_name,
            container_name=container_name,
            blob_name=blob_name,
            account_key=blob_service_client.credential.account_key,
            permission=BlobSasPermissions(read=True),
            expiry=datetime.utcnow() + timedelta(hours=1),
        )
        sas_url = blob_client.url + "?" + sas_token
        results_list.append({"buy_now_link": sas_url, "price_of_the_product": result["description"], "product_image_url": sas_url})
    return results_list


Let's ingest some product images. The idea is that we have a folder called images that stores all the product images. We create a Blob Storage container and upload all the images from the folder to that container.


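EMBEDDINGS_DIR = "embeddings"
os.makedirs(EMBEDDINGS_DIR, exist_ok=True)
image_directory = os.path.join('images')
embedding_directory = os.path.join('embeddings')
output_json_file = os.path.join(embedding_directory, 'output.jsonl')

# Connect to Blob Storage and create the container if it does not exist yet
blob_service_client = BlobServiceClient.from_connection_string(blob_connection_string)
container_client = blob_service_client.get_container_client(container_name)
if not container_client.exists():
    container_client.create_container()

# Upload every file under the images folder, using its relative path as the blob name
for root, dirs, files in os.walk(image_directory):
    for file in files:
        local_file_path = os.path.join(root, file)
        blob_name = os.path.relpath(local_file_path, image_directory)
        blob_client = container_client.get_blob_client(blob_name)
        with open(local_file_path, "rb") as data:
            blob_client.upload_blob(data, overwrite=True)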


Next, we create the embeddings of the product images and store them locally in the embeddings directory. Note that we use only two metadata fields, id and description. You can extend this with more metadata such as price, Buy Now link, etc.


with open(output_json_file, 'w') as outfile:
    for idx, image_path in enumerate(os.listdir(image_directory)):
        if image_path:
            try:
                vector = image_embedding(os.path.join(image_directory, image_path))
            except Exception as e:
                print(f"Error processing image at index {idx}: {e}")
                vector = None
            filename, _ = os.path.splitext(os.path.basename(image_path))
            result = {
                "id": f'{idx}',
                "image_vector": vector,
                "description": filename
            }
            # Write one JSON document per line (JSON Lines format)
            outfile.write(json.dumps(result))
            outfile.write('\n')

print(f"Results are saved to {output_json_file}")
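As a rough illustration of extending the record beyond id and description, the sketch below merges extra metadata into each record from a lookup table keyed by file name. The `PRODUCT_METADATA` table, the `price_inr` and `buy_now_link` field names, and the example URL are all hypothetical.

```python
import json

# Hypothetical metadata keyed by image file name (without extension).
PRODUCT_METADATA = {
    "summer_dress_01": {
        "price_inr": 1499,
        "buy_now_link": "https://example.com/p/summer_dress_01",
    },
}


def build_record(idx, filename, vector, metadata=PRODUCT_METADATA):
    """Build one JSON Lines record, adding price and link when they are known."""
    extra = metadata.get(filename, {})
    return {
        "id": str(idx),
        "image_vector": vector,
        "description": filename,
        "price_inr": extra.get("price_inr"),
        "buy_now_link": extra.get("buy_now_link"),
    }


record = build_record(0, "summer_dress_01", [0.1, 0.2])
print(json.dumps(record))
```

Any field added this way also needs a matching field definition in the search index before the documents are uploaded.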


Now that we have created the local embedding file, we can upload it to Azure Search. Before that, let's create an index.


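from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration, SearchField, SearchFieldDataType, SearchIndex,
    SimpleField, VectorSearch, VectorSearchProfile,
)
credential = AzureKeyCredential(acs_key)
# Create a search index 
index_client = SearchIndexClient(endpoint=acs_endpoint, credential=credential)  
fields = [  
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),  
    SearchField(name="description", type=SearchFieldDataType.String, sortable=True, filterable=True, facetable=True),  
    # Assumption: 1024 dimensions for the Vision embeddings; profile and algorithm names are placeholders
    SearchField(name="image_vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True, vector_search_dimensions=1024, vector_search_profile_name="myHnswProfile"),
]
# Configure the vector search configuration  
vector_search = VectorSearch(  
    algorithms=[HnswAlgorithmConfiguration(name="myHnsw")],
    profiles=[VectorSearchProfile(name="myHnswProfile", algorithm_configuration_name="myHnsw")],
)
# Create the search index with the vector search configuration  
index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)  
result = index_client.create_or_update_index(index)  
print(f"{result.name} created")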


Once you have created the index, you can upload the locally stored embedding file.


from azure.search.documents import SearchClient
import json

data = []
with open(output_json_file, 'r') as file:
    for line in file:
        # Remove leading/trailing whitespace and parse JSON
        json_data = json.loads(line.strip())
        data.append(json_data)

search_client = SearchClient(endpoint=acs_endpoint, index_name=index_name, credential=credential)
results = search_client.upload_documents(data)
for result in results:
    print(f'Indexed {result.key} with status code {result.status_code}')
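The embedding step stores `None` as the vector when an image fails to process, and a large catalog may be too big to send in a single request. The sketch below filters out records without a vector and splits the rest into batches before upload. The helper names and the default batch size of 500 are assumptions for illustration, not values taken from the Azure SDK.

```python
import json


def load_valid_documents(lines):
    """Parse JSON Lines and keep only records that have a non-empty vector."""
    docs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        if doc.get("image_vector"):
            docs.append(doc)
    return docs


def batched(items, batch_size=500):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Small in-memory sample standing in for the contents of output.jsonl.
sample_lines = [
    '{"id": "0", "image_vector": [0.1, 0.2], "description": "dress"}',
    '{"id": "1", "image_vector": null, "description": "broken_image"}',
    '',
    '{"id": "2", "image_vector": [0.3, 0.4], "description": "shoe"}',
]

documents = load_valid_documents(sample_lines)
for batch in batched(documents, batch_size=1):
    print([doc["id"] for doc in batch])
```

In the real pipeline, each batch would be passed to `search_client.upload_documents` in turn, and the skipped records could be logged for reprocessing.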


Congratulations, you are finally ready to implement your Agent using OpenAI.

Let's create a tool called image search, which will be used by the Agent.


from typing import Optional
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.tools import BaseTool
from util import prompt_search

class ImageSearchResults(BaseTool):
    """Tool that queries the Fashion Image Search API and gets back json."""

    name: str = "image_search_results_json"
    description: str = (
        "A wrapper around Image Search. "
        "Useful for when you need to search for fashion images related to clothes, shoes, etc. "
        "Input should be a search query. Output is a JSON array of the query results."
    )
    num_results: int = 4

    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        return str(prompt_search(prompt=query, topn=self.num_results))
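One detail worth noting: the tool description promises a JSON array, but `str()` on a Python list produces Python's repr (single-quoted strings), which is not valid JSON. The model usually copes with either form, but if a downstream consumer needs to parse the output, `json.dumps` is the safer choice. A small sketch with made-up sample data (the helper name `results_to_json` is hypothetical):

```python
import json


def results_to_json(results):
    """Serialize search results as a real JSON array."""
    return json.dumps(results, ensure_ascii=False)


# Made-up sample shaped like the output of the search function.
sample_results = [
    {
        "buy_now_link": "https://example.com/buy/1",
        "price_of_the_product": "summer_dress_01",
        "product_image_url": "https://example.com/img/1.jpg",
    }
]

as_repr = str(sample_results)              # Python repr, single quotes
as_json = results_to_json(sample_results)  # valid JSON, double quotes

print(as_repr)
print(as_json)

# The JSON form round-trips; the repr form does not parse as JSON.
print(json.loads(as_json) == sample_results)
try:
    json.loads(as_repr)
except json.JSONDecodeError:
    print("str() output is not valid JSON")
```

Swapping `str(...)` for `json.dumps(...)` in the tool's return statement would make the output match its description.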


Here we will use LangChain to implement our Fashion Agent, called Luca.


from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
)
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.runnables import Runnable, RunnableConfig, RunnablePassthrough
from langchain_core.utils.function_calling import convert_to_openai_function
from langchain.agents.output_parsers.openai_functions import (
    OpenAIFunctionsAgentOutputParser,
)
from langchain.agents.format_scratchpad.openai_functions import (
    format_to_openai_function_messages,
)
from langchain.agents import AgentExecutor
from langchain_openai import AzureChatOpenAI

from custom_tool import ImageSearchResults
import openai


Let's initialize our LLM.


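from langchain_openai import AzureChatOpenAI

# Assumption: AZURE_OPENAI_DEPLOYMENT is a placeholder variable name for your chat model deployment
llm = AzureChatOpenAI(
    azure_endpoint=openai_api_base,
    api_key=openai_api_key,
    api_version=openai_api_version,
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
)
# Quick check that the model responds
llm.invoke([HumanMessage(content="Hi")])
prefix = """You are Luca, a helpful Fashion Agent who helps people navigate and buy products online.

 Show prices always in INR.
 Always encourage the user to buy from the Buy Now link provided."""
suffix = ""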


Let's attach the tool we created. Here we are using LCEL (LangChain Expression Language) to implement our agent.


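tools = [ImageSearchResults(num_results=5)]
llm_with_tools = llm.bind(
    functions=[convert_to_openai_function(t) for t in tools]
)
# Assumed message layout: system prompt, user input, then the agent's intermediate tool calls
messages = [
    SystemMessage(content=prefix),
    HumanMessagePromptTemplate.from_template("{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
]
input_variables = ["input", "agent_scratchpad"]
prompt = ChatPromptTemplate(input_variables=input_variables, messages=messages)
agent = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_to_openai_function_messages(x["intermediate_steps"])
    )
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)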


Congratulations! You are ready to test your Agent.


response = agent_executor.invoke(
    {
        "input": "I am looking for some summer dresses as I am travelling to New Delhi",
        "chat_history": [
            HumanMessage(content="hi! my name is bob"),
            AIMessage(content="Hello Bob! How can I assist you today?"),
        ],
    }
)
print(response["output"])



Hurray! You are now ready to deploy this Agent to an enterprise app with a good-looking UI.


Here is the reference GitHub repo with all the code artifacts.





This article was originally published by Microsoft's Azure AI Services Blog. You can find the original article here.