Elevate Your LLM Applications to Production via LLMOps


Today we are announcing the General Availability (GA) of Azure Machine Learning prompt flow, marking the next step in Azure empowering engineers and data scientists to build quality generative applications. Prompt flow was initially offered as a feature in Azure Machine Learning, designed to streamline the prompt engineering process. During the preview, we enhanced prompt flow significantly, in large part based on customer feedback. We have refined it for enterprises, improving the tooling developers use to incorporate evaluation capabilities and to leverage their existing CI/CD paths for applications by formalizing our asset management and code-first experiences. Through prompt tuning, evaluation, experimentation, orchestration, and LLMOps, prompt flow significantly accelerates the journey from development to production, providing a trusted platform for LLM application development and deployment.

In the prompt flow preview blog, we delved into the core functionalities of prompt flow that simplified the iterative process of prompt development, offering a structured workflow from ideation through refinement to deployment. The essence of prompt flow remains intact: providing a robust foundation for developers to harness the full potential of LLMs, tailored to specific business scenarios.

As we unveil the General Availability of prompt flow, we extend the horizon further with enhanced features aimed at operationalizing LLM applications via LLMOps, bolstering enterprise readiness, and fostering an open, collaborative ecosystem. This post walks through the new avenues opened by the GA release, which provide a solid stepping stone for building production-ready LLM applications efficiently and with confidence.


Figure 1 Prompt flow General Availability highlights

Automate LLM App Building Through Code and LLMOps

Building Large Language Model (LLM) applications is significantly easier with a robust development environment. Prompt flow takes a code-first approach, treating flow assets as files, which enables versioning and collaboration via any source code management tool. This foundation leads naturally into LLMOps, which applies DevOps principles to streamline the entire process from development to deployment. LLMOps centers on evaluation and on fostering collaboration among development, data science, and operations teams. This structured approach substantially reduces the time from development to deployment while ensuring the quality and reliability of LLM applications.

The prompt flow SDK and CLI, available as open source for review since September, have now reached General Availability (GA) as v1.0, which brings enhanced stability and backward compatibility. These tools are instrumental in:

Versioning: Manage various versions of your flow assets such as prompts, code, configurations, and environments via a code repo, with the capability to track changes and roll back to previous versions when necessary, making collaboration among team members easier.

Package and Deploy: Leverage the SDK/CLI to package and build the flow, producing artifacts ready for deployment either in Azure or locally.

Workflow Automation: Utilize the SDK/CLI to automate tasks such as connection setup, prompt tuning, flow evaluation, experimentation, and CI/CD, significantly reducing the time and effort required.
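As a rough illustration of the automation described above, the following sketch assembles the prompt flow CLI invocations a CI job might run: a smoke test, a batch run, and a packaging step. It is a minimal sketch, not an official pipeline. The `pf flow test`, `pf run create`, and `pf flow build` subcommands and their flags reflect our understanding of the prompt flow CLI and should be confirmed with `pf --help` for your installed version. The paths, the run name, and the `build_ci_steps` helper are hypothetical.

```python
from typing import List


def build_ci_steps(flow_dir: str, data_file: str, output_dir: str) -> List[List[str]]:
    """Return pf CLI invocations for a test -> batch run -> package sequence.

    The run name and all paths are hypothetical placeholders.
    """
    return [
        # Smoke-test the flow once using its default inputs.
        ["pf", "flow", "test", "--flow", flow_dir],
        # Run the flow over a dataset (one JSON object per line).
        ["pf", "run", "create", "--flow", flow_dir, "--data", data_file,
         "--name", "ci_batch_run"],
        # Package the flow as a Docker build context for deployment.
        ["pf", "flow", "build", "--source", flow_dir, "--output", output_dir,
         "--format", "docker"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch runs anywhere.
    for step in build_ci_steps("flows/chat", "data/eval.jsonl", "dist"):
        print(" ".join(step))
```

In a real pipeline, each step could be passed to `subprocess.run(step, check=True)` so that a failing command stops the job.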


Figure 2 Use prompt flow for LLMOps to automate the LLM application building

To further expedite the LLM application development journey, a solution accelerator template for LLMOps, based on the prompt flow SDK/CLI, has been crafted. This template provides a predefined, customizable workflow blueprint, empowering developers to swiftly kickstart their LLM application projects with embedded best practices. The template encompasses:

Preconfigured Workflows: Ready-made workflows (Azure DevOps pipelines or GitHub Actions) for prompt engineering, evaluation, and deployment, which substantially reduce setup time.

Customization: Tailor the template to align with your specific requirements, organizational goals, and workflow preferences.

Seamless Integration: The template integrates with the prompt flow SDK/CLI, presenting a comprehensive solution for efficient LLM application development and deployment.

For a deeper dive into the LLMOps solution accelerator, feel free to explore here.

Secure Your LLM Apps Development, Experimentation, and Deployment via Enterprise Readiness

Enterprise readiness is paramount for the seamless integration and operation of LLM applications within an organizational ecosystem, encompassing security, compliance, scalability, and manageability.

Security: As a core component of Azure Machine Learning, prompt flow supports comprehensive enterprise security and governance measures including access control, virtual network (VNet) support, and data protection. For a deeper understanding, see "Enterprise security and governance" in the Azure Machine Learning documentation on Microsoft Learn.

Manageability: Facilitates ease of management and maintenance of the flow development and deployment environment for admin and operation teams. Discover more on creating a secure workspace and resources via templates here.

Scalability: Prompt flow provides the tools and infrastructure to scale your LLM applications to meet the evolving demands of your enterprise, from team collaboration on flow development, evaluation, and experimentation to large-scale deployments of flow endpoints backed by Azure Machine Learning.

These capabilities ensure a fortified environment, significantly enhancing the trust and confidence in developing, experimenting, and deploying LLM applications.

Build High Quality LLM Apps via Advanced Evaluation Experience

Building high-quality LLM applications necessitates an encompassing evaluation strategy, of which prompt flow is a cornerstone. Here's how prompt flow aligns with the crucial aspects of evaluation to ensure the development of superior LLM applications:

  1. Evaluation Data Management: Preparation, generation, and efficient management of evaluation data are pivotal.
  2. Evaluation Metrics: Employing a range of metrics like helpfulness, honesty, harmlessness, performance, and cost to measure various facets of LLM applications.
  3. Evaluation Methods and Visualization: Utilization of diverse evaluation methods, and visualization to compare results.
  4. Evaluation Journey: Engaging in in-development, pre-deployment (CI), or post-deployment evaluation (CE) to ensure the helpfulness and robustness of LLM applications.
  5. User Feedback Loop: Incorporating user feedback for continuous improvements.
  6. Automation via LLMOps: Streamlining the evaluation process through automation to enhance efficiency and precision.

Prompt flow simplifies this evaluation process, aligning with an evolving evaluation maturity model derived from extensive customer engagement and feedback. Below are the key aspects of how prompt flow facilitates advanced evaluation experiences to build high-quality LLM applications:

Pre-built Evaluation Flow: Utilize pre-configured evaluation flows and metrics to initiate the evaluation process swiftly.

Customizable Evaluation Flow: Tailor the evaluation flows to meet your application-specific needs and requirements, ensuring your LLM applications adhere to the desired quality, safety, and cost standards.

Evaluation Data and Experimentation Management: Efficiently manage your evaluation data and the experimentation process, maintaining a structured approach to assessment.

Multi-Evaluation Run Comparison: Compare different evaluation runs side by side, with drill-down to single instances, to ascertain the progress and quality of your LLM applications and support continuous improvement.
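To make the idea of run comparison with drill-down concrete, here is a minimal sketch in plain Python. It does not use the prompt flow SDK. It assumes each evaluation run has already been exported as a mapping from a line identifier to a numeric score, and the `compare_runs` helper, the identifiers, and the scores are all hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple


def compare_runs(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.0,
) -> Dict[str, object]:
    """Compare two runs on the lines they share and list per-line regressions."""
    shared = sorted(set(baseline) & set(candidate))
    if not shared:
        raise ValueError("the two runs have no evaluation lines in common")
    # Drill-down: every line whose candidate score fell below the baseline.
    regressions: List[Tuple[str, float, float]] = [
        (line_id, baseline[line_id], candidate[line_id])
        for line_id in shared
        if candidate[line_id] < baseline[line_id] - tolerance
    ]
    return {
        "baseline_mean": mean(baseline[i] for i in shared),
        "candidate_mean": mean(candidate[i] for i in shared),
        "regressions": regressions,
    }


if __name__ == "__main__":
    # Hypothetical per-line scores on a 1-5 scale.
    old_run = {"q1": 4.0, "q2": 5.0, "q3": 3.0}
    new_run = {"q1": 5.0, "q2": 4.0, "q3": 3.0}
    print(compare_runs(old_run, new_run))
```

In this sample data both runs have the same mean score, yet line `q2` regressed. This is the kind of case an aggregate metric hides and a per-instance view reveals.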


Figure 3 Compare and analyze flow evaluation results

Integrating Semantic Kernel's Intelligence with Prompt Flow Infrastructure

Semantic Kernel, an open-source SDK, enables the orchestration of models and plugins, paving the way for sophisticated, knowledge-driven LLM applications. The integration with prompt flow combines robust infrastructure with intelligent capabilities, allowing developers to evaluate and deploy planners and plugins seamlessly while also incorporating deployed flow endpoints as plugins within Semantic Kernel for advanced orchestration.

Automated Evaluation and Deployment: Utilize prompt flow to automate the testing and evaluation of planners and plugins built with Semantic Kernel. Create new flows, run batch tests, and quantifiably measure the accuracy of your planners and plugins. Prompt flow also facilitates prompt engineering to enhance the quality of your planners, thereby ensuring they function as intended even as complexity grows.

Endpoint Deployment for Advanced Orchestration: Develop, evaluate, and deploy flows using prompt flow, and seamlessly integrate these deployed flow endpoints as plugins within Semantic Kernel. This enables the building of LLM applications with advanced orchestration, leveraging the robust evaluation and deployment capabilities of prompt flow alongside the orchestration powers of Semantic Kernel.
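As a hedged sketch of what calling a deployed flow endpoint from another framework could look like, the code below builds an HTTP request for an endpoint that accepts a JSON body of flow inputs and a bearer key, which is the usual pattern for Azure Machine Learning online endpoints. The URL, the key, the input field names, and both helper functions are hypothetical. Registering the resulting callable as a Semantic Kernel plugin is not shown, because that API depends on the Semantic Kernel version in use.

```python
import json
import urllib.request
from typing import Any, Dict


def build_flow_request(
    scoring_url: str, api_key: str, inputs: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the flow inputs as JSON."""
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        scoring_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumes key-based authentication on the endpoint.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def call_flow(scoring_url: str, api_key: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request and decode the JSON response returned by the flow."""
    request = build_flow_request(scoring_url, api_key, inputs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A plain function such as `call_flow` is what an orchestration framework would typically wrap as a plugin or tool. Keeping request construction separate makes it testable without network access.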


Figure 4 Leverage prompt flow to evaluate Semantic Kernel intelligent capabilities

Additionally, prompt flow supports integration with other LLM application frameworks, providing a versatile platform for evaluation, deployment, and monitoring, thereby fostering a broader scope of LLM application development and management.

Expanding Capabilities through Custom Tools and OSS Contributions

Prompt flow's design is inherently extensible, enabling developers to create custom tools tailored to specific use cases. Custom integrations enrich workflow and data flow, while community-developed tools, built following the guidelines on creating and using tool packages, broaden prompt flow's capabilities. Though not officially maintained or endorsed by the prompt flow team, these tools form a pivotal part of the ecosystem, facilitating enhanced connectivity and functionality. For more insights on building your own tools with prompt flow, feel free to explore here.

The open-source nature of prompt flow nurtures a collaborative development environment where community contributions are highly valued. These contributions can take the form of code enhancements, new feature proposals, or bug fixes, which collectively drive the continuous improvement of prompt flow. For more information on contributing to prompt flow OSS, feel free to explore here.

Conclusion: Unveiling a New Horizon for LLM Applications

As we unveil the General Availability of prompt flow, we open a realm of possibilities for AI developers to build, evaluate, and deploy LLM applications efficiently and with confidence. Through robust LLMOps, enterprise readiness, comprehensive evaluation and experimentation, flexible tool development, and seamless integrations such as the one with Semantic Kernel, we invite developers and the community to explore, contribute, and elevate the ecosystem of LLM application development to new heights of innovation and collaboration.

This article was originally published by Microsoft's AI - Machine Learning Blog. You can find the original article here.