CVAT SDK PyTorch adapter: using CVAT datasets in your ML pipeline

Introduction

CVAT is a visual data annotation tool. Using it, you can take a set of images and mark them up with annotations that either classify each image as a whole, or locate specific objects on the image.

But let's suppose you've already done that. What now?

Datasets are, of course, not annotated just for the fun of it. The eventual goal is to use them in machine learning, for either training an ML model or evaluating its performance. And in order to do that, you have to get the annotations out of CVAT and into your machine learning framework.

Previously, the only way to do that was the following:

1. Export your CVAT project or task in one of the several dataset formats supported by CVAT.
2. Write code to read the annotations in the selected format and convert them into data structures suitable for your ML framework.

This approach is certainly workable, but it does have several drawbacks:

- The third-party dataset formats supported by CVAT cannot necessarily represent all information that CVAT datasets may contain. Therefore, some information can be lost when annotations are exported in such formats. For example, CVAT supports ellipse-shaped objects, while the COCO format does not. So when a dataset is exported into the COCO format, ellipses are converted into masks, and information about the shape is lost.
- Even when a format can store the necessary information, it may not be convenient to deal with. For example, in the COCO format, annotations are saved as JSON files. While it is easy to load a generic JSON file, data loaded in this way will not have static type information, so features like code completion and type checking will not be available.
- Dataset exporting can be a lengthy process, because the server has to convert all annotations (and images, if requested) into the new format. If the server is busy with other tasks, you may end up waiting a long time.
- If the dataset is updated on the server, you have to remember to re-export it. Otherwise, your ML pipeline will operate on stale data.

All of these problems stem from one fundamental source: the use of an intermediate representation. If we could somehow use data directly from the server, they would be eliminated.

So, in CVAT SDK 2.3.0, we introduced a new feature that, for some use cases, implements exactly that. This feature is the cvat_sdk.pytorch module, also informally known as the PyTorch adapter. The functionality in this module allows you to directly use a CVAT project or task as a PyTorch-compatible dataset.

Let's play with it and see how it works.

Setup

First, let's create a Python environment and install CVAT SDK. To use the PyTorch adapter, we'll install the SDK with the pytorch extra, which pulls in PyTorch and torchvision as dependencies. We won't be using GPUs, so we'll get the CPU-only build of PyTorch to save download time.

$ python3 -m venv ./venv
$ ./venv/bin/pip install -U pip
$ ./venv/bin/pip install 'cvat_sdk[pytorch]' \
    --extra-index-url=https://download.pytorch.org/whl/cpu
$ . ./venv/bin/activate

Now we will need a dataset. Normally, you would use the PyTorch adapter with your own annotated dataset that you already have in CVAT, but for demonstration purposes we'll use a small public dataset instead.

To follow along, you will need an account on the public CVAT instance, app.cvat.ai. If you have access to a private CVAT instance, you can use that instead. Save your CVAT credentials in environment variables so that CVAT SDK can authenticate itself:

$ export CVAT_HOST=app.cvat.ai
$ export CVAT_USER='<your username>' CVAT_PASS
$ read -rs CVAT_PASS
<enter your password and hit Enter>

The dataset we'll be using is the Flowers Dataset available in the Harvard Dataverse Repository. This dataset is in an ad-hoc format, so we won't be able to directly import it into CVAT. Instead, we'll upload it using a custom script. We won't need the entire dataset for this demonstration, so the script will also reduce it to a small fraction.

Get that script from our blog repository and run it:

$ python3 upload-flowers.py

The script will create tasks for the train, test and validation subsets, and print their IDs. If you open the Tasks page, you will see that the tasks have indeed been created.

And if you open any of these tasks and click the "Job #XXXX" link near the bottom, you will see that each image has a single annotation associated with it: a tag representing the type of the flower.

Interactive usage

Note: the code snippets from this section are also available as a Jupyter Notebook.

We're now ready to try the PyTorch adapter. Let's start Python and create a CVAT API client:

$ python3
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import logging, os
>>> from cvat_sdk import *
>>> # configure logging to see what the SDK
>>> # is doing behind the scenes
>>> logging.basicConfig(level=logging.INFO, format='%(levelname)s - %(message)s')
>>> client = make_client(os.getenv('CVAT_HOST'), credentials=(
...     os.getenv('CVAT_USER'), os.getenv('CVAT_PASS')))

Now let's create a dataset object corresponding to our training set. To follow along, you will need to substitute the task ID in the first line with the ID of the Flowers-train task that was printed when you ran the upload-flowers.py script.

>>> TRAIN_TASK_ID = 77708
>>> from cvat_sdk.pytorch import *
>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID)
INFO - Fetching task 77708...
INFO - Task 77708 is not yet cached or the cache is corrupted
INFO - Downloading data metadata...
INFO - Downloaded data metadata
INFO - Downloading chunks...
INFO - Downloading chunk #0...
INFO - Downloading chunk #1...
INFO - Downloading chunk #2...
INFO - Downloading chunk #3...
INFO - Downloading chunk #4...
INFO - All chunks downloaded
INFO - Downloading annotations...
INFO - Downloaded annotations

As you can see from the log, the SDK has downloaded the data and annotations for our task from the server. All subsequent operations on train_set will not involve network access.

But what is train_set, anyway? Examining it will reveal that it is a PyTorch Dataset object. Therefore, we can query the number of samples in it and index it to retrieve individual samples.

>>> import torch.utils.data
>>> isinstance(train_set, torch.utils.data.Dataset)
True
>>> len(train_set)
354
>>> train_set[0]
(<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=320x263 at 0x7F4CFE7E52D0>,
 Target(
     annotations=FrameAnnotations(
         tags=[{'attributes': [],
                'frame': 0,
                'group': None,
                'id': 426655,
                'label_id': 431494,
                'source': 'manual'}],
         shapes=[]),
     label_id_to_index=mappingproxy({431492: 0, 431493: 1, 431494: 2, 431495: 3, 431496: 4})))

The sample format is broadly compatible with that used by torchvision datasets. Each sample is a tuple of two elements:

- The first element is a PIL.Image object.
- The second element is a cvat_sdk.pytorch.Target object representing the annotations corresponding to the image, as well as some associated data.

The annotations in the Target object are instances of the LabeledImage and LabeledShape classes from the CVAT SDK, which are direct representations of CVAT's own data structures. This means that any properties you can set on annotations in CVAT, such as attributes and group IDs, are available for use in your code.
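For instance, here's a minimal sketch of our own that works directly with these raw targets: it tallies how many samples carry each label index, using the annotations.tags and label_id_to_index fields visible in the output above. The counts variable and the loop are ours, and the loop touches every sample, so it takes a moment to run (output elided):

>>> from collections import Counter
>>> # tally label indices across the whole training set by reading the
>>> # raw tag annotations from each sample's Target object
>>> counts = Counter()
>>> for i in range(len(train_set)):
...     _, target = train_set[i]
...     for tag in target.annotations.tags:
...         counts[target.label_id_to_index[tag.label_id]] += 1
...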
In this case, though, we don't need all this flexibility. After all, the only information contained in the original dataset is a single class label for each image. To serve such simple scenarios, CVAT SDK provides a couple of transforms that reduce the target part of the sample to a simpler data structure. For this scenario (image classification with one tag per image), the transform is called ExtractSingleLabelIndex. Let's recreate the dataset with this transform applied:

>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID,
...     target_transform=ExtractSingleLabelIndex())
INFO - Fetching task 77708...
INFO - Loaded data metadata from cache
INFO - Downloading chunks...
INFO - All chunks downloaded
INFO - Loaded annotations from cache

Note that the task data was not redownloaded, as it had already been cached. The SDK only made one query to the CVAT server, in order to check whether the task had changed.

Here's what the sample targets look like with the transform configured:

>>> for i in range(3): print(train_set[i])
...
(<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=320x263 at 0x7F4CFE7E5720>, tensor(2))
(<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=320x213 at 0x7F4CFE7E56C0>, tensor(0))
(<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x330 at 0x7F4CFE7E5720>, tensor(4))

Each target is now simply a 0-dimensional PyTorch tensor containing the label index. These indices are automatically assigned by the SDK. You can also use these indices without applying the transform; they are provided by the label_id_to_index field on the Target objects.

ExtractSingleLabelIndex requires each sample to have a single tag. If a sample fails this requirement, the transform will raise an exception when that sample is retrieved.

Our dataset is now almost ready to be used for model training, except that we'll also need to transform the image, as PyTorch cannot directly accept a PIL image as input. torchvision supplies a variety of transforms to convert and postprocess images, which can be applied using the transform argument. For example:

>>> import torchvision.transforms as transforms
>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID,
...     transform=transforms.ToTensor(),
...     target_transform=ExtractSingleLabelIndex())
INFO - Fetching task 77708...
INFO - Loaded data metadata from cache
INFO - Downloading chunks...
INFO - All chunks downloaded
INFO - Loaded annotations from cache
>>> train_set[0]
(tensor([[[0.5294, 0.5412, 0.5569,  ..., 0.6000, 0.6118, 0.5804],
          [0.5255, 0.5373, 0.5529,  ..., 0.6000, 0.6118, 0.5804],
          [0.5216, 0.5333, 0.5529,  ..., 0.6000, 0.6078, 0.5725],
          [...snipped...]
          [0.1020, 0.1020, 0.1020,  ..., 0.4980, 0.4980, 0.4980]]]),
 tensor(2))

Full model training & evaluation example

Equipped with the functionality that we just covered, we can now plug a CVAT dataset into a PyTorch training/evaluation pipeline and have it work the same way it would with any other dataset implementation.
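To illustrate the idea before moving to the full scripts, here's a minimal interactive sketch of our own showing how such a dataset drops into an ordinary PyTorch training loop. The 224x224 resize (needed so that the variously-sized flower images can be batched), the batch size, the ResNet-34 choice, and the optimizer settings are all arbitrary choices of ours:

>>> import torch, torchvision
>>> from torch.utils.data import DataLoader
>>> # resize every image to a fixed shape so the default collate
>>> # function can stack the samples into batches
>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID,
...     transform=transforms.Compose([
...         transforms.Resize((224, 224)),
...         transforms.ToTensor(),
...     ]),
...     target_transform=ExtractSingleLabelIndex())
>>> train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
>>> model = torchvision.models.resnet34(num_classes=5)
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
>>> loss_fn = torch.nn.CrossEntropyLoss()
>>> for images, labels in train_loader:  # one epoch
...     optimizer.zero_grad()
...     loss = loss_fn(model(images), labels)
...     loss.backward()
...     optimizer.step()
...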
Programming an entire training pipeline in interactive mode is a bit cumbersome, so instead we published two example scripts that showcase using a CVAT dataset as part of a simple ML pipeline. You can get these scripts from our blog repository.

The first script trains a neural network (specifically ResNet-34, provided by torchvision) on our sample dataset (or any other dataset with a single tag per image). You run it by passing the training task ID as an argument:

$ python3 train-resnet.py 77708
2023-02-03 16:55:17,268 - INFO - Starting...
2023-02-03 16:55:18,623 - INFO - Created the client
2023-02-03 16:55:18,623 - INFO - Fetching task 77708...
2023-02-03 16:55:18,867 - INFO - Loaded data metadata from cache
2023-02-03 16:55:18,867 - INFO - Downloading chunks...
2023-02-03 16:55:18,869 - INFO - All chunks downloaded
2023-02-03 16:55:18,901 - INFO - Loaded annotations from cache
2023-02-03 16:55:19,103 - INFO - Created the training dataset
2023-02-03 16:55:19,104 - INFO - Created data loader
2023-02-03 16:55:20,407 - INFO - Started Training
2023-02-03 16:55:20,407 - INFO - Starting epoch #0...
2023-02-03 16:55:32,451 - INFO - Starting epoch #1...
2023-02-03 16:55:44,086 - INFO - Finished training

It saves the resulting weights in a file named weights.pth. The evaluation script will read these weights back and evaluate the network on a validation subset, which you, again, specify via a CVAT task ID:

$ # this script uses the torchmetrics library to calculate accuracy
$ pip install torchmetrics
$ python3 eval-resnet.py 77709
2023-02-03 16:58:32,745 - INFO - Starting...
2023-02-03 16:58:33,669 - INFO - Created the client
2023-02-03 16:58:33,669 - INFO - Fetching task 77709...
2023-02-03 16:58:33,887 - INFO - Task 77709 is not yet cached or the cache is corrupted
2023-02-03 16:58:33,889 - INFO - Downloading data metadata...
2023-02-03 16:58:34,107 - INFO - Downloaded data metadata
2023-02-03 16:58:34,108 - INFO - Downloading chunks...
2023-02-03 16:58:34,109 - INFO - Downloading chunk #0...
2023-02-03 16:58:34,873 - INFO - All chunks downloaded
2023-02-03 16:58:34,873 - INFO - Downloading annotations...
2023-02-03 16:58:35,166 - INFO - Downloaded annotations
2023-02-03 16:58:35,362 - INFO - Created the testing dataset
2023-02-03 16:58:35,362 - INFO - Created data loader
2023-02-03 16:58:35,749 - INFO - Started evaluation
2023-02-03 16:58:36,355 - INFO - Finished evaluation
Accuracy of the network: 80.00%

Since training involves randomness, you may end up seeing a slightly different accuracy number.
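If you'd rather compute such a metric interactively, here's a sketch of ours that continues the interactive session from the training sketch above (model, transforms, and DataLoader are already in scope there). VAL_TASK_ID is a hypothetical stand-in for the ID of your Flowers-validation task:

>>> import torchmetrics
>>> val_set = TaskVisionDataset(client, VAL_TASK_ID,
...     transform=transforms.Compose([
...         transforms.Resize((224, 224)),
...         transforms.ToTensor(),
...     ]),
...     target_transform=ExtractSingleLabelIndex())
>>> metric = torchmetrics.classification.MulticlassAccuracy(num_classes=5)
>>> model.eval()
>>> with torch.no_grad():
...     for images, labels in DataLoader(val_set, batch_size=32):
...         # accumulate predictions batch by batch
...         metric.update(model(images).argmax(dim=1), labels)
...
>>> print(f'Accuracy: {metric.compute():.2%}')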
Working with objects

Note: the code snippets from this section are also available as a Jupyter Notebook.

The PyTorch adapter also contains a transform designed to simplify working with object detection datasets. First, let's see how raw CVAT shapes are represented in the CVAT SDK.

Open the Flowers-train task, click on the "Job #XXX" link, open frame #2, and draw rectangles around some sunflowers.

Press "Save". Now, restart Python and reinitialize the client:

>>> import logging, os
>>> from cvat_sdk import *
>>> from cvat_sdk.pytorch import *
>>> logging.basicConfig(level=logging.INFO, format='%(levelname)s - %(message)s')
>>> client = make_client(os.getenv('CVAT_HOST'), credentials=(
...     os.getenv('CVAT_USER'), os.getenv('CVAT_PASS')))
>>> TRAIN_TASK_ID = 77708

Create the dataset again:

>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID)
INFO - Fetching task 77708...
INFO - Task has been updated on the server since it was cached; purging the cache
INFO - Downloading data metadata...
INFO - Downloaded data metadata
INFO - Downloading chunks...
INFO - Downloading chunk #0...
[...snipped...]
INFO - All chunks downloaded
INFO - Downloading annotations...
INFO - Downloaded annotations

Note that since we have changed the task on the server, the SDK has redownloaded it.

Now let's examine the frame that we modified:

>>> train_set[2]
(<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x330 at 0x7F98AD088220>,
 Target(
     annotations=FrameAnnotations(
         tags=[{'attributes': [],
                'frame': 2,
                'group': None,
                'id': 426657,
                'label_id': 431496,
                'source': 'manual'}],
         shapes=[{'attributes': [],
                  'elements': [],
                  'frame': 2,
                  'group': 0,
                  'id': 41000665,
                  'label_id': 431496,
                  'occluded': False,
                  'outside': False,
                  'points': [170.1162758827213,
                             158.9655911445625,
                             349.43134126663244,
                             329.23956079483105],
                  'rotation': 0.0,
                  'source': 'manual',
                  'type': 'rectangle',
                  'z_order': 0},
                 [...snipped...]]),
     label_id_to_index=mappingproxy({431492: 0, 431493: 1, 431494: 2, 431495: 3, 431496: 4})))

You can see the newly added rectangles listed in the shapes field. As before, the values representing the rectangles contain all the properties that are settable via CVAT.
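For instance, here's a small sketch of ours that reads the raw coordinates back out of the frame we modified. It assumes every shape on the frame is a rectangle (as in our example), whose points field holds the two corners as [x1, y1, x2, y2]:

>>> _, target = train_set[2]
>>> for shape in target.annotations.shapes:
...     # each rectangle stores its corners as [x1, y1, x2, y2]
...     x1, y1, x2, y2 = shape.points
...     print(target.label_id_to_index[shape.label_id], (x1, y1), (x2, y2))
...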
Still, if you'd prefer to work with a simpler representation, there's a transform for you: ExtractBoundingBoxes.

>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID,
...     target_transform=ExtractBoundingBoxes(
...         include_shape_types=['rectangle']))
INFO - Fetching task 77708...
INFO - Loaded data metadata from cache
INFO - Downloading chunks...
INFO - All chunks downloaded
INFO - Loaded annotations from cache
>>> train_set[2]
(<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x330 at 0x7F98C414B7F0>,
 {'boxes': tensor([[170.1163, 158.9656, 349.4313, 329.2396],
                   [255.2533,  59.5135, 458.6779, 256.9108],
                   [117.3765, 115.2670, 240.9382, 253.8971]]),
  'labels': tensor([4, 4, 4])})

The output of this transform is a dictionary with keys named 'boxes' and 'labels', and tensor values. The same format is accepted by torchvision's object detection models in training mode, as well as the mAP metric in torchmetrics. So if you want to use those components with CVAT, you can do so without additional conversion.
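For example, here's a sketch of ours that feeds one sample to a torchvision detection model in training mode, assuming a recent torchvision. The model choice is arbitrary, and note one detail not handled by the transform itself: torchvision's detection models reserve label 0 for the background, so we shift the CVAT label indices up by one and size the model for 5 labels plus background:

>>> import torchvision
>>> import torchvision.transforms as transforms
>>> # recreate the dataset so that images come out as tensors
>>> train_set = TaskVisionDataset(client, TRAIN_TASK_ID,
...     transform=transforms.ToTensor(),
...     target_transform=ExtractBoundingBoxes(include_shape_types=['rectangle']))
>>> # randomly initialized model: 5 CVAT labels + 1 background class
>>> model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
...     weights=None, weights_backbone=None, num_classes=6)
>>> image, target = train_set[2]
>>> # in training mode, the model takes a list of images and a list of
>>> # 'boxes'/'labels' dicts, and returns a dict of losses
>>> losses = model([image], [{'boxes': target['boxes'],
...                           'labels': target['labels'] + 1}])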


Closing remarks

The PyTorch adapter is still new, so it has some limitations. Most notably, it does not support track annotations and video-based datasets. Still, we hope that even in its early stages it can be useful to you.

Meanwhile, we are working on extending the functionality of the adapter. The development version of CVAT SDK already features the following additions:

- A ProjectVisionDataset class that lets you combine multiple tasks in a CVAT project into a single dataset (see the sketch after this list).
- The ability to control the cache location.
- The ability to disable network usage (provided that the dataset has already been cached).
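As a teaser, using ProjectVisionDataset might look something like this. Since this is a development feature, the exact signature may differ from our sketch, and 12345 is a made-up project ID:

>>> from cvat_sdk.pytorch import ProjectVisionDataset
>>> # combines the frames of every task in the project into one dataset
>>> dataset = ProjectVisionDataset(client, 12345,
...     target_transform=ExtractSingleLabelIndex())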
If you have suggestions for how the adapter may be improved, you're welcome to create a feature request on CVAT's issue tracker.