Fine-tuning Florence-2 for VQA (Visual Question Answering) using the Azure ML Python SDK and MLflow

Released by Microsoft in mid-June 2024 under the MIT license, Florence-2 has fewer than 1B parameters (0.23B for the base model and 0.77B for the large model) and efficiently handles vision and vision-language tasks (OCR, captioning, object detection, instance segmentation, and so on). All of Florence-2’s weights are publicly available, so you can […]
