AutoML Images and NLP get new brains and become more open

AutoML for Images and AutoML for NLP, part of Azure Machine Learning (AzureML), are solutions that help you seamlessly build models for computer vision and natural language tasks. As a user, you only need to bring your labeled data, and AutoML will find the best model for that dataset. You can opt for a fully automated search or define specific models and hyperparameters for AutoML to explore.
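As a rough illustration of this workflow, configuring such a job with the AzureML Python SDK v2 might look like the sketch below. This is a hedged example, not code from this article: the compute target, data path, and column name are all illustrative placeholders.

```python
# Hedged sketch, assuming the azure-ai-ml (SDK v2) package is installed.
# All names below (compute target, data asset, target column) are placeholders.
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes

image_classification_job = automl.image_classification(
    compute="gpu-cluster",                      # hypothetical compute target
    experiment_name="automl-image-demo",
    training_data=Input(type=AssetTypes.MLTABLE, path="azureml:train-data:1"),
    target_column_name="label",
    primary_metric="accuracy",
)
# Submitting via an MLClient would then kick off the automated model search:
# ml_client.jobs.create_or_update(image_classification_job)
```

With no further configuration, AutoML performs the automated search; the sections below show how to narrow it to specific models instead.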

While the existing array of models in AutoML covers a wide range of both vision and NLP tasks, the AI landscape is moving at an astonishing pace, continually introducing novel architectures and models. Established platforms such as Hugging Face and MMDetection offer extensive collections of models for various tasks, constantly expanding thanks to contributions from the open-source community. Moreover, with the recent release of foundation models in Azure through a unified model catalog, Azure customers can easily access the latest foundation models and use them for fine-tuning and operationalization in their own workloads.

With this release, we are thrilled to bring all these capabilities to AutoML! AutoML users can now use any model from Hugging Face, MMDetection, or the AzureML curated model catalog for the currently supported tasks. They can sweep over any of these models and their hyperparameters, and compare performance against existing models. As soon as a model becomes available in these open-source frameworks, AutoML users can seamlessly incorporate it into their projects.

New end-to-end architecture

In the current architecture, the AutoML service uses a driver script for fine-tuning: essentially a monolithic script that includes all phases of model training, such as model selection, data preprocessing, and fine-tuning, as shown in the figure below:


Figure 1: Runtime based on driver script

In the new architecture, each of these fine-tuning steps is developed as a component. The overall end-to-end workflow is depicted in the figure below:


Figure 2: Runtime based on Pipeline components

This enables a modular training stack that can be composed of different components based on the vertical (Vision, NLP) and task (object detection, text classification, etc.). This modular design has several benefits over the monolithic design:
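As a loose sketch of what composing such a pipeline from components looks like in code, the following uses the AzureML SDK v2 pipeline DSL. The component YAML files, parameter names, and outputs are hypothetical; this is not the actual AutoML implementation, only an illustration of the pattern.

```python
# Hedged sketch (azure-ai-ml SDK v2); component files and I/O names are hypothetical.
from azure.ai.ml import load_component
from azure.ai.ml.dsl import pipeline

model_import = load_component(source="model_import.yaml")        # hypothetical YAML
preprocess = load_component(source="data_preprocessing.yaml")    # hypothetical YAML
finetune = load_component(source="finetune.yaml")                # hypothetical YAML

@pipeline()
def automl_finetune_pipeline(training_data):
    # Each step is an isolated component; outputs flow between them.
    imported = model_import()
    prepped = preprocess(data=training_data)
    trained = finetune(model=imported.outputs.model, data=prepped.outputs.data)
    return {"model": trained.outputs.model}
```

Because each step is a separate component with declared inputs and outputs, the platform can cache, swap, or rerun steps independently, which is what enables the benefits listed below.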

Component reuse: The pipeline architecture enables efficient component reuse, which reduces training time and saves compute cost. For example, when a sweep run is submitted with a range of batch sizes, the outputs of the Model Import and Data Preprocessing components are reused across the sweep trials. This saves time and resources and improves the overall throughput of training runs. Below is a sample pipeline graph for a run that reuses the Model Import and Data Preprocessing components, as indicated by the reuse icon on the component card.


Figure 3: Pipeline runtime with Component reuse
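The reuse behavior above can be illustrated with a small, self-contained Python toy (plain Python, not AzureML code): a cached preprocessing step is computed once and shared by every fine-tuning trial in a batch-size sweep.

```python
# Toy illustration of component reuse (not AzureML code): a cached
# preprocessing step runs once and is reused across all sweep trials.
from functools import lru_cache

preprocess_runs = {"count": 0}

@lru_cache(maxsize=None)
def preprocess(dataset_id: str) -> str:
    preprocess_runs["count"] += 1          # tracks how often real work happens
    return f"features({dataset_id})"

def finetune(features: str, batch_size: int) -> str:
    return f"model[{features}, bs={batch_size}]"

# Sweep over three batch sizes: fine-tuning runs three times,
# but preprocessing executes only once thanks to the cache.
models = [finetune(preprocess("dataset-1"), bs) for bs in (16, 32, 64)]
assert preprocess_runs["count"] == 1 and len(models) == 3
```

The pipeline backend achieves the same effect at the component level, keying the cache on a component's inputs rather than a function's arguments.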

Improved scalability and flexibility: Component-based training pipelines are more flexible because they can easily be reconfigured or modified by adding or removing components. This allows the pipeline to be tailored to the specific needs of a particular training scenario, such as vision object detection or NLP multi-class text classification.

Better debuggability: The new backend, based on Pipeline Components, is easier to debug because it is straightforward to identify and fix problems in a specific component, rather than having to debug the entire training stack. This can also make it easier to upgrade or replace individual components as needed.

Improved reliability: Pipeline Components are more reliable because they are less prone to errors or failures that can cascade throughout the entire training stack. This is because the components are isolated from each other, which can reduce the risk of problems spreading.

New models

One of the main benefits of migrating to a pipeline backend based on AzureML's foundation models is the availability of a plethora of new models. AutoML users can now use models from any of the following sources:

AzureML foundation model catalog: This catalog contains foundation models curated by the AzureML team from open-source repositories. These models are tested extensively by the AzureML team and come with a good set of default hyperparameters that work well across datasets from various domains. They can be used to establish a quick baseline on your dataset before tuning hyperparameters. Models are constantly being added to the catalog and become immediately available for use with AutoML.

Hugging Face: Customers can use any model from Hugging Face for the tasks below. As soon as a new model is available on Hugging Face, it can be used with AutoML, giving immediate access to novel architectures and models.

MMDetection: Similar to Hugging Face models, customers can use any model from MMDetection for the following tasks.

Customers can try out a model simply by specifying the model's name when configuring the AutoML job. This launches a single trial with the default hyperparameters associated with that model.
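For instance, with the AzureML Python SDK v2 this might look like the following sketch (hedged: it assumes an already-configured AutoML image job; `resnet50` is one of the model names mentioned later in this article).

```python
# Hedged sketch (azure-ai-ml SDK v2); assumes `image_classification_job`
# was already created via automl.image_classification(...).
image_classification_job.set_training_parameters(
    model_name="resnet50",  # any supported AzureML-curated, Hugging Face, or MMDetection model name
)
```

Submitting the job then runs one trial for that model with its default hyperparameters.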


In addition, customers can sweep over both the set of models currently supported by AutoML and the new models by configuring the search space appropriately. This launches a hyperparameter sweep over the chosen search space, which can be used to evaluate different models and pick the one with the best performance on a custom dataset.

    # Reconstructed sketch (azure-ai-ml SDK v2); the original snippet was
    # fragmentary, so the surrounding call and imports are assumptions.
    from azure.ai.ml.automl import SearchSpace
    from azure.ai.ml.sweep import Choice, Uniform

    image_classification_job.extend_search_space(
        [
            SearchSpace(
                # model name for this entry was not shown in the original snippet
                learning_rate=Uniform(0.00001, 0.0001),
                number_of_epochs=Choice([10, 15]),
            ),
            SearchSpace(
                model_name=Choice(["seresnext", "resnet50"]),
                learning_rate=Uniform(0.001, 0.01),
            ),
        ]
    )


Get started today with AutoML in Azure Machine Learning



This article was originally published on Microsoft's AI - Machine Learning Blog.