The article examines the implementation of Identity and Access Management (IAM) in the Computer Vision Annotation Tool (CVAT), which is a part of the OpenCV ecosystem. CVAT helps to annotate raw images and video files to produce a ready-to-use Computer Vision dataset in popular formats such as MS COCO, PASCAL VOC, YOLO, etc. Here we will cover:
When CVAT was released as an open-source project, it already had multi-user support. Each user could perform one or more roles such as observer, annotator, user, or admin. A primitive form of Attribute-Based Access Control (ABAC) was used to access resources like tasks, jobs, and projects. Thus, the owner or an assignee of the resource could view, modify, or delete it.
The system was poorly designed and inflexible, and users reported many issues with it. The most frequent request was to limit data annotators’ rights: workers should not be able to export datasets, reassign jobs to others, arbitrarily change job statuses, finish jobs, etc. Many CVAT users are AI developers who delegate annotation tasks to other workers but still need to control every stage of the data annotation pipeline. Another part of CVAT’s user base is labeling companies that annotate data for external customers, for whom avoiding data leaks is critical.
The snippet of code has a predicate that checks rights to change a job object and a permission class to enforce these rights. The complete code for the example can be found here. Note that this code presents several roadblocks for maintaining and extending this solution:
An alternative was needed to overcome these problems, and a number of open-source products cover many aspects of IAM. One such product, Keycloak, was found to be too heavy for our needs. Another, pycasbin, offers matchers that suit only simple use cases; anything more complex requires implementing custom functions in Python.
Ultimately, the clear winner was Open Policy Agent (OPA), a lightweight, powerful, and flexible solution. OPA (pronounced “oh-pa”) is an open-source, general-purpose policy engine. For more information, see the OPA website. The rest of this article covers only the features used in the CVAT integration.
For CVAT, OPA is a microservice with a REST API. CVAT sends HTTP requests (Query) to OPA, which in the simplest case replies with “allow” or “deny” (Decision). Those who are acquainted with the fundamentals of authorization might recognize the PDP-PEP pattern. In this case, OPA is the Policy Decision Point (PDP) and makes decisions, while CVAT is the Policy Enforcement Point (PEP) and enforces these decisions by either providing information about the requested resources or responding with an error.
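The query/decision exchange can be sketched from the PEP side in a few lines of Python. This is a minimal illustration, not CVAT's actual client code: it assumes OPA is reachable at a hypothetical local address and that the policy is queried through OPA's Data API, which accepts a POST to /v1/data/<path> with the query wrapped in an "input" field and answers with a "result" field. The function names are made up for this example.

```python
import json
import urllib.request

# Hypothetical address of the OPA microservice; adjust to the real deployment.
OPA_URL = "http://localhost:8181"


def build_opa_request(policy_path: str, query: dict) -> urllib.request.Request:
    """Wrap a query in the {"input": ...} envelope used by OPA's Data API."""
    body = json.dumps({"input": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(raw_response: bytes) -> bool:
    """Read the decision; a missing "result" means the rule is undefined, so deny."""
    return json.loads(raw_response).get("result", False) is True


def is_allowed(policy_path: str, query: dict) -> bool:
    """PEP side: send the query to the PDP and return its decision."""
    request = build_opa_request(policy_path, query)
    with urllib.request.urlopen(request, timeout=5) as response:
        return parse_decision(response.read())
```

A caller would then write something like is_allowed("users/allow", query) and return an error response whenever the result is False. Treating a missing result as a denial keeps the PEP safe by default if a policy is not loaded.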
The question is how the policy engine understands how to respond to our queries correctly. OPA provides a high-level declarative language that lets you specify the policy as code and simple APIs to offload policy decision-making from your software. CVAT controls what queries look like and implements policies to handle them. After a query is processed by OPA in accordance with implemented policies, OPA replies to CVAT with its decision.
For example, suppose that somebody wants to protect information about users. In the case of CVAT, this information can be accessed or modified through the following operations:
GET /api/users - get the list of users
GET /api/users/self - get information about the current user
GET /api/users/1 - get information about the user with id #1
DELETE /api/users/1 - delete the user with id #1
Below is an example of a reply from GET /api/users/self:
Obviously, a newly registered user without admin permissions should not be able to delete another account. Our goal is to implement some policies to protect resources from undesirable actions.
# TODO: check that readers of the article will not be able to delete my account on cvat.ai
For the sake of practice, play with the example below in the Rego Playground. This code is simplified, but the general ideas are the same as those used in CVAT. We’ll start with a query:
Every query is an arbitrary JSON document, but in CVAT it consists of the scope, the resource, and information from the authentication system. In the example above, the scope is view. It means that the query requests permissions to view the specific resource. In general, the scope is an action that a user wants to do with a resource like view, update, create, import:dataset, etc. In this case the resource contains information about a user record which has one available attribute (id #3). The user on whose behalf the request is made has the privilege user and id #3.
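To make the description concrete, a query with these three parts could be assembled as follows. This is only a sketch: the key names (scope, resource, auth, user, privilege) and the make_query helper are assumptions chosen to mirror the text, not necessarily the exact schema CVAT sends.

```python
import json


def make_query(scope: str, resource_id: int, user_id: int, privilege: str) -> dict:
    """Assemble a query document; key names are illustrative assumptions."""
    return {
        "scope": scope,                     # the action being requested
        "resource": {"id": resource_id},    # the user record being accessed
        "auth": {                           # data from the authentication system
            "user": {"id": user_id, "privilege": privilege},
        },
    }


# User #3, who has the "user" privilege, asks to view user record #3.
query = make_query("view", resource_id=3, user_id=3, privilege="user")
print(json.dumps(query, indent=2))
```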
The policies are written in the Rego language and grouped into modules:
Modules consist of a package, optional import statements, and optional rules. The first line defines the package users, which will contain the policy for the user resource. Next, the allow variable is defined with a default value of false. The policy consists of four rules that define the content of Virtual Documents in OPA. When OPA evaluates a rule, we say that OPA generates the content of the document. All expressions inside one allow rule are joined by AND, while different allow rules are joined by OR. Thus, the user policy in the code above means: deny by default, but allow if the user has the admin privilege, OR the query’s scope equals list, OR the scope is one of update, delete, or view AND the identifier of the resource matches the identifier of the user on whose behalf the request is made.
Previously, policies were encoded in Python; now they are written in Rego. Why is Rego better? There are multiple advantages:
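For readers who do not know Rego yet, the same decision logic can be restated in Python. This is an illustrative equivalent of the rules as described in the text, not code from CVAT, and it assumes a hypothetical query layout with scope, resource.id, and auth.user keys. Each if statement plays the role of one allow rule: conditions inside it are ANDed, and the separate statements are ORed.

```python
SELF_SERVICE_SCOPES = {"update", "delete", "view"}


def allow(query: dict) -> bool:
    """Python restatement of the user policy: deny unless some rule matches."""
    user = query["auth"]["user"]
    scope = query["scope"]
    resource = query.get("resource") or {}

    # Rule: admins may do anything.
    if user["privilege"] == "admin":
        return True
    # Rule: anyone may request the list of users.
    if scope == "list":
        return True
    # Rule: users may view, update, or delete only their own record.
    if scope in SELF_SERVICE_SCOPES and resource.get("id") == user["id"]:
        return True
    # Default: deny.
    return False
```

With this function, a regular user #3 is allowed to view record #3 but denied when trying to delete record #1, while an admin is allowed in both cases.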
However, there are also some disadvantages:
Handling Requests for Lists of Objects
There is one more significant and non-obvious advantage of OPA. One of the biggest problems in the PDP-PEP pattern is the handling of lists of objects. Imagine that it is necessary to return a list of users (in this case the scope is list), but the list should contain only objects that the user has the right to view. In other words, the list must be consistent with the decisions that queries with the scope set to view would receive.
The simplest solution is to send one query with the scope set to view for every user in the system, and then return the list of users for which the corresponding requests were approved. Understandably, such a solution is slow.
By the end of March 2022, the CVAT public server had more than 50 thousand registered users. The good news is that OPA can return an arbitrary JSON document. Hence there is another solution: it’s possible to replicate CVAT’s database into OPA as described in the documentation. Rego can then be used to extract the filtered list of users from that data. Unfortunately, this approach leads to a lot of technical difficulties.
There is one more lightweight, elegant solution: instead of a list of users, OPA can return a filter expression to apply on the PEP side. The filter expression will depend on the request. For a user with the “admin” privilege, it will filter nothing. But for a regular user, it will extract a restricted number of objects. A filter expression is a list of Django Q objects and operations on these objects in the reverse Polish notation for building a filter on the CVAT server side. In this case the filter expression contains only one Django Q object:
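How a PEP might evaluate such an expression can be sketched as follows. To stay runnable without Django, this example uses plain dictionaries of field/value pairs as stand-ins for Q objects and applies the filter to in-memory records. The token format (dictionaries for operands, the strings "&", "|", and "~" for operators) is an assumption made for illustration and may differ from what CVAT actually exchanges with OPA.

```python
def matches(record: dict, condition: dict) -> bool:
    """Stand-in for a Q object: every field in the condition must match."""
    return all(record.get(field) == value for field, value in condition.items())


def evaluate_rpn(tokens: list, record: dict) -> bool:
    """Evaluate a filter expression in reverse Polish notation for one record."""
    stack = []
    for token in tokens:
        if token == "~":                      # unary NOT
            stack.append(not stack.pop())
        elif token in ("&", "|"):             # binary AND / OR
            right = stack.pop()
            left = stack.pop()
            if token == "&":
                stack.append(left and right)
            else:
                stack.append(left or right)
        else:                                 # operand: a condition dictionary
            stack.append(matches(record, token))
    if len(stack) != 1:
        raise ValueError("malformed filter expression")
    return stack[0]


def apply_filter(tokens: list, records: list) -> list:
    """An empty expression filters nothing, as for a user with admin privilege."""
    if not tokens:
        return list(records)
    return [record for record in records if evaluate_rpn(tokens, record)]
```

In a Django application, the same stack-based loop could build a single Q object instead of a boolean (Q objects support the &, | and ~ operators), and the result would be passed to a queryset's filter() so that the database does the filtering.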
Now we hope it is clear how OPA is integrated into CVAT. The method described can be used as a guide to integrate this policy engine into other products.
But there is one more important question: how do we describe the implemented IAM system for the end user? The Rego language is a tool for developing and maintaining policies, but for end users the code in .rego files is unfriendly and unreadable. It took us a long time to find a suitable solution. All permissions are now described in a table with a predefined set of columns. In the CVAT repository, each resource has a CSV file that describes all its permissions in a simple and human-readable form.
For example, the users.csv file describes permissions for working with information about users. Every line in the file is a primitive rule that tells us who has the right to perform a specific action. But how is users.csv related to users.rego? Each line in the CSV file corresponds to a rule in the .rego file. This correspondence is checked automatically by tests generated from the CSV files (e.g., users_test.gen.rego). Committing generated files to a repository is generally considered bad practice, and we plan to fix this in the future. The script that was used to generate each *.gen.rego file can be found at the end of the file as a comment.
Let’s look at users.csv and understand what it contains. All CSV files that describe permissions have the same set of columns:
At the moment it is difficult to say how suitable such an IAM system will be for all the needs of our community over time. It looks logical and allows us to solve all the related problems that were submitted by users in the past. If somebody needs to change the default behavior of their own CVAT code, it is possible to modify the policies defined in .rego files and restart the OPA microservice. Additionally, sharing a resource with other users, a popular feature in many applications, can be easily implemented in the future because data and policies in OPA can be changed dynamically using the OPA REST API.
Note: The world is not perfect. The biggest challenge is a query that affects multiple resources at the same time. For example, suppose somebody moves an existing task from project #X to project #Y. In this case it is necessary to update the project field in the task. Obviously, the current user must have permission to both modify the task and update the field. It is less obvious that the user must also have permission to create a task in project #Y. Such cases are handled directly in the CVAT server code. In this specific case, two sequential requests are sent to OPA, and both must be allowed by the policy engine.
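The two-request check can be sketched as below. This is a hedged illustration rather than CVAT's server code: the scope names ("update:project", "create:task"), the query layout, and the can_move_task helper are hypothetical, and decide stands for any function that sends one query to OPA and returns its decision.

```python
from typing import Callable


def can_move_task(task_id: int, target_project_id: int, user: dict,
                  decide: Callable[[dict], bool]) -> bool:
    """Allow the move only if every affected resource permits it."""
    queries = [
        # 1. May the user change the project field of this task?
        {"scope": "update:project",
         "resource": {"kind": "task", "id": task_id},
         "auth": {"user": user}},
        # 2. May the user create a task in the target project?
        {"scope": "create:task",
         "resource": {"kind": "project", "id": target_project_id},
         "auth": {"user": user}},
    ]
    # all() stops at the first denial, so the second request is sent
    # only when the first one has been allowed.
    return all(decide(query) for query in queries)
```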
To summarize: we’ve described how the new identity and access management system is implemented in CVAT. It was demonstrated how Open Policy Agent is integrated in accordance with the PDP-PEP pattern. The pros and cons of the approach were discussed. A solution was provided for how to describe user permissions in a readable tabular format (CSV).
Hopefully, developers and community members who are trying to solve a similar problem got a helpful overview of how the permission management system works in CVAT. Please leave your comments to help improve future articles and tools. Thank you!