Announcing new Ultralytics YOLO support for automatic annotation via CVAT AI agents.
Powerful computer vision libraries such as Ultralytics YOLO, Detectron2, and MMDetection have made it easier to train high-performing models for a wide variety of tasks. However, using these models for automated annotation often requires custom code, format conversions, and one-off integrations, especially when labeling workflows span multiple tasks. As a result, many teams fall back on manual labeling because they find automation too complex to adopt at scale.
Ultralytics YOLO is one of the most widely used model families in the computer vision community. Until now, CVAT included a single built-in YOLO model for auto-annotation, but expanding beyond that required manual setup.
That's why we're excited to announce our new integration with Ultralytics YOLO via the CVAT AI annotation agent.
Introducing the new Ultralytics YOLO and CVAT integration
With this new integration, you can use native Ultralytics models (YOLOv5, YOLOv8, YOLO11) and third-party YOLO models with Ultralytics compatibility (YOLOv7, YOLOv10, etc.) for automatic image or video annotation for a wide range of computer vision tasks, including:
- Classification
- Object detection
- Instance segmentation
- Oriented object detection
- Pose estimation
Just pick the YOLO model you want to label your dataset with, connect it to CVAT, run the agent, and get fully labeled frames or even entire datasets, complete with the right shapes and attributes, in a fraction of the time.
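For reference, each of these tasks maps to a family of Ultralytics checkpoints. Here is a quick sketch of loading them with the ultralytics package; the checkpoint names below are examples, and any compatible weights, including your own fine-tuned ones, load the same way:

```python
# Illustrative checkpoint names for each supported task type; any compatible
# Ultralytics weights (including fine-tuned ones) load the same way.
from ultralytics import YOLO

classifier = YOLO("yolo11n-cls.pt")   # classification
detector   = YOLO("yolo11n.pt")       # object detection
segmenter  = YOLO("yolo11n-seg.pt")   # instance segmentation
obb_model  = YOLO("yolo11n-obb.pt")   # oriented object detection
pose_model = YOLO("yolo11n-pose.pt")  # pose estimation

results = detector("image.jpg")  # all of them share this predict interface
```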
Annotation possibilities unlocked
This integration opens up multiple workflow optimization and automation opportunities for ML and AI teams. Here are just a few.
(1) Pre-label data using the right model for the task
Connect the YOLO models that match your annotation goals and run them sequentially to pre-label your data. Each model can be triggered individually through the CVAT interface, allowing you to generate different types of labels for the same dataset without custom scripts or external tools. This works for any YOLO model, out-of-the-box or fine-tuned.
(2) Label entire tasks in bulk
Working with a large dataset? You don’t have to annotate each frame manually. Apply a YOLO model to the entire task in one step. Just open the Actions menu in your task and select Automatic annotation. CVAT will send the job to the agent and automatically annotate all frames across all jobs in a task, saving you time and reducing repetitive work.
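The Actions menu covers this in the UI. If you would rather drive bulk annotation from a script, the CVAT SDK also exposes an annotate_task helper. A minimal sketch, assuming the func.py module described under "How it works" below, with placeholder host, credentials, and task ID:

```python
# A minimal sketch of scripted bulk annotation with the CVAT SDK.
# Host, credentials, and the task ID (42) are placeholders.
from cvat_sdk import make_client
import cvat_sdk.auto_annotation as cvataa

import func  # module defining `spec` and `detect` (see "How it works" below)

with make_client(host="https://app.cvat.ai", credentials=("user", "password")) as client:
    # Annotate every frame of the task with the function's predictions.
    cvataa.annotate_task(client, 42, func)
```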
(3) Share models across teams and projects
Register a model once via a native function and agent, and make it instantly available across your organization in CVAT. Team members can use it in their own tasks without any local setup.
(4) Validate model performance on real data
Test your fine-tuned YOLO model directly on annotated datasets and compare its predictions side-by-side with human labels in CVAT. Spot mismatches, edge cases, or underperforming classes, all without leaving your annotation environment.
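The side-by-side review happens in the CVAT UI, but if you also want a quick quantitative check in a script, here is a rough, self-contained sketch; the box lists are hypothetical and use [x1, y1, x2, y2] pixel coordinates:

```python
# A rough, standalone sketch: fraction of ground-truth boxes matched by at
# least one prediction with IoU above a threshold. Boxes: [x1, y1, x2, y2].
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def recall_at_iou(gt_boxes, pred_boxes, thr=0.5):
    matched = sum(any(iou(g, p) >= thr for p in pred_boxes) for g in gt_boxes)
    return matched / len(gt_boxes) if gt_boxes else 1.0
```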
How it works
Here’s what a typical YOLO auto-annotation setup via agents looks like:
For more in-depth information about how to set up automated data annotation with a YOLO or any custom model using a CVAT AI agent, read this article.
Step 1. Write and register the function
Start by implementing a native function: a Python script that loads your YOLO model (e.g., yolov8n, yolo11m-seg) and defines how predictions are generated and returned to CVAT. Then register this function in CVAT using the CLI.
Note: You can reuse the same native function in both CLI-based annotation and agent-based mode.
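For illustration, here is a minimal sketch of such a function for a YOLO detector, following the cvat_sdk.auto_annotation interface. The checkpoint name is an example, and the CLI commands in the comments are indicative; check the linked article for the exact commands in your CLI version:

```python
# func.py -- a minimal sketch of a native detection function, assuming the
# cvat_sdk.auto_annotation interface and the ultralytics package
# (pip install cvat-sdk ultralytics). The checkpoint name is an example.
#
# Registration and agent launch happen via the CVAT CLI; command names may
# vary by version (see the linked article), e.g.:
#   cvat-cli function create-native "YOLOv8n" --function-file=func.py
#   cvat-cli function run-agent <function_id> --function-file=func.py
import PIL.Image

import cvat_sdk.auto_annotation as cvataa
import cvat_sdk.models as models
from ultralytics import YOLO

_model = YOLO("yolov8n.pt")

# Declare the labels this function can produce; CVAT matches them by name
# against the labels defined in your task.
spec = cvataa.DetectionFunctionSpec(
    labels=[cvataa.label_spec(name, id) for id, name in _model.names.items()],
)

def detect(
    context: cvataa.DetectionFunctionContext, image: PIL.Image.Image
) -> list[models.LabeledShapeRequest]:
    results = _model(image)
    return [
        # One rectangle per detected box, in [x1, y1, x2, y2] pixel coordinates.
        cvataa.rectangle(int(box.cls.item()), box.xyxy[0].tolist())
        for box in results[0].boxes
    ]
```

For segmentation or pose models, swap the rectangle conversion for the corresponding shape helpers; the spec/detect structure stays the same.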
Step 2. Start the agent
Once the function is registered, launch an agent from the CLI. This starts a local service that connects to your account in CVAT Online or Enterprise and listens for annotation requests from CVAT. The agent then runs the model (inside your function), generates predictions, and sends them back to CVAT.
Step 3. Create or select a task in CVAT
Log into your CVAT instance and create a new task (or select an existing one). Upload your images or video, and define the labels you want to annotate (e.g., "person", "car", "helmet"). Depending on your use case, you can define different types of labels such as bounding boxes, polygons, or skeletons to match the expected output from your model.
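This step can also be scripted. Here is a sketch using the CVAT SDK's task creation API; the task name, labels, host, credentials, and file paths are placeholders:

```python
# A sketch of scripted task creation with the CVAT SDK; the task name,
# labels, host, credentials, and file paths are placeholders.
from cvat_sdk import make_client, models
from cvat_sdk.core.proxies.tasks import ResourceType

with make_client(host="https://app.cvat.ai", credentials=("user", "password")) as client:
    task_spec = models.TaskWriteRequest(
        name="helmet detection",
        labels=[
            models.PatchedLabelRequest(name="person"),
            models.PatchedLabelRequest(name="car"),
            models.PatchedLabelRequest(name="helmet"),
        ],
    )
    task = client.tasks.create_from_data(
        spec=task_spec,
        resource_type=ResourceType.LOCAL,
        resources=["frames/0001.jpg", "frames/0002.jpg"],
    )
```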
Step 4. Choose the model in the UI
Once your task and its jobs are created and the agent is running, go to the AI Tools panel inside your job. Open the Detector tab and choose the YOLO model you registered earlier.
Step 5. Run AI annotation on selected frames
After you select the model and run it on the frames you want annotated, CVAT sends a request to the running agent. The agent runs the model and returns predictions in the form of shapes (e.g., boxes, polygons, or keypoints), each associated with a label ID.
Get started now
Ready to speed up your annotation workflow with YOLO? Sign in to your CVAT Online account and try it out yourself.
- For more information about Ultralytics YOLO models and the tasks they support, check the Ultralytics documentation page.
- For more information about CVAT AI annotation agents, check this article.