Point Cloud Annotation: A Complete Guide to 3D Data Labeling

In computer vision, machine learning, and spatial analysis, 3D point cloud annotation is often used to help convert raw 3D data into structured, meaningful information. This transformation enables algorithms to recognize objects, environments, and spatial relationships, and it powers real-world applications from autonomous navigation to industrial inspection.

But what exactly is a point cloud, which applications use point clouds, and what are the best ways to annotate them? This article explains what point clouds are, where they're used, and how CVAT streamlines the annotation process, from raw scan to labeled dataset.

What is a Point Cloud?

A point cloud is a digital map of an object's surface, made up of individual points captured by a scanning system. These datasets form the foundation for most 3D computer vision tasks and are produced using methods such as LiDAR, photogrammetry, stereo vision, and laser- or structured-light-based systems. There are subtle variations in how each type of 3D scanner works, but fundamentally they all use light to capture surface geometry, resulting in a 3D map.

Depending on the size of the object, the number of points can range from just a few up to trillions. For example, a point cloud of a cube could be created from just 8 points (one for each corner), while the recent 3D scan of the Titanic wreck resulted in a 16-terabyte dataset comprising billions of points. The Titanic scan (made using photogrammetry) is so detailed that individual bolts on the ship, and even a necklace, are visible in the scan data.

Titanic 3D scan (Source)

Urban-planning and planetary LiDAR scans are bigger still and can contain trillions of points.

Once the raw scan data has been acquired by the scanning hardware of choice, it can be processed into a usable format. This is typically achieved by aligning, filtering, and converting the raw data into a structured point cloud for visualization, measurement, modeling… or annotation.
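To make this concrete, here is a minimal sketch of loading and inspecting a point cloud with NumPy; the filename and the plain whitespace-separated "x y z" layout are assumptions for illustration, as .xyz files can also carry extra columns such as color.

```python
import numpy as np

# Hypothetical file: one "x y z" triple per line, whitespace-separated.
points = np.loadtxt("scan.xyz")  # shape: (N, 3)

print(f"{len(points)} points")
print("bounding box min:", points.min(axis=0))
print("bounding box max:", points.max(axis=0))
```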
Applications of Point Cloud Data in Computer Vision

From autonomous vehicles and robotics to drones and geospatial analysis, point clouds enable machines to interpret real-world geometry. Annotated 3D data enables detection, reconstruction, and spatial reasoning across a broad swath of AI applications, such as:

Autonomous Driving: LiDAR-generated point clouds are used for 3D object detection, lane and road mapping, obstacle avoidance, and real-time vehicle localization.

Robotics: Robots rely on point clouds to map their surroundings, avoid collisions, detect objects, and navigate dynamic environments.

Augmented and Virtual Reality (AR/VR): Point clouds are used to reconstruct real-world environments for immersive experiences, enabling realistic interaction with virtual objects.

Industrial Inspection and Quality Control: Precise 3D models of manufactured parts are captured to detect defects, verify tolerances, and ensure conformity with design specifications.

Construction and Architecture: Point clouds are used for as-built documentation, site surveys, clash detection, and the creation of accurate digital twins of buildings and infrastructure.

Geospatial Analysis and Mapping: Point clouds from aerial LiDAR or drones are used for terrain modeling, land classification, flood simulation, and urban planning.

AI and Machine Learning: Researchers use annotated point clouds to train machine learning models for segmentation, object classification, and scene understanding in 3D.

Each application relies on the precision and richness of point cloud data to bridge the gap between raw spatial input and actionable digital insight.

Point Cloud Raw Data

Point clouds are typically represented using Cartesian (XYZ) coordinates, but they aren't always captured that way at the source. Many 3D scanners (LiDAR systems in particular) initially collect data in spherical coordinates, recording each point's distance from the scanner (r), horizontal angle (θ), and vertical angle (φ). In other cases, such as tunnel inspection or pipe mapping, scanners may use cylindrical coordinates. Range imaging systems often store depth as pixel intensity in a 2D grid.

These native coordinate systems reflect the scanner's internal geometry and sensing method, optimized for capturing specific environments. However, for consistency and compatibility (whether in CAD, simulation, or AI pipelines), these formats must be converted. Using trigonometric transformations, spherical and cylindrical data are recalculated into standard XYZ coordinates, where each point is defined by its position along three perpendicular axes. The XYZ format is universally supported by common point cloud file types like .ply, .pcd, .xyz, and .las, making it essential for downstream processing. So, while point clouds may originate in various coordinate systems, they are almost always converted into XYZ for storage, visualization, and further analysis.
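As a minimal sketch of that trigonometric conversion (assuming θ is the azimuth in the horizontal plane and φ the elevation above it; scanner conventions vary, so check your sensor's documentation):

```python
import numpy as np

def spherical_to_xyz(r, theta, phi):
    """Convert (range, azimuth, elevation) to Cartesian XYZ.

    Assumes theta is the horizontal (azimuth) angle and phi the
    vertical (elevation) angle, both in radians.
    """
    x = r * np.cos(phi) * np.cos(theta)
    y = r * np.cos(phi) * np.sin(theta)
    z = r * np.sin(phi)
    return np.column_stack((x, y, z))

# One point 10 m away, 45 degrees to the left, 10 degrees above the horizon.
print(spherical_to_xyz(np.array([10.0]),
                       np.radians([45.0]),
                       np.radians([10.0])))
```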
Readily Available 3D Datasets

2D datasets are laborious enough to obtain manually, so you can imagine how resource-intensive a task gathering training data for 3D applications is. Thankfully, there is a wide range of publicly available point cloud datasets. Many are free and open access, while some require the purchase of a license.

The table below shows a selection of popular datasets, which you may wish to use for your model training.

<table class="table-class">
  <tr> <th>Dataset</th> <th>Application Type</th> <th>Environment</th> <th>Open Access?</th> </tr>
  <tr> <td>KITTI</td> <td>Autonomous Driving, SLAM</td> <td>Outdoor (Urban)</td> <td>Yes</td> </tr>
  <tr> <td>nuScenes</td> <td>Autonomous Driving</td> <td>Outdoor (Urban)</td> <td>Yes</td> </tr>
  <tr> <td>Waymo Open</td> <td>Autonomous Driving</td> <td>Outdoor (Urban/Suburban)</td> <td>Yes (non-commercial use)</td> </tr>
  <tr> <td>ApolloScape</td> <td>3D Scene Parsing</td> <td>Outdoor (Urban)</td> <td>Yes</td> </tr>
  <tr> <td>ScanNet</td> <td>3D Reconstruction, Semantic Segmentation</td> <td>Indoor</td> <td>Yes</td> </tr>
  <tr> <td>ShapeNet</td> <td>Object Classification, Segmentation</td> <td>Object-Level</td> <td>Yes</td> </tr>
  <tr> <td>ObjectNet3D</td> <td>2D-3D Alignment, Pose Estimation</td> <td>Object-Level</td> <td>Yes</td> </tr>
  <tr> <td>Toronto-3D</td> <td>Urban Scene Segmentation</td> <td>Outdoor (Street-Level)</td> <td>Yes</td> </tr>
  <tr> <td>DALES</td> <td>Aerial Mapping, Segmentation</td> <td>Outdoor (Aerial/Suburban)</td> <td>Yes</td> </tr>
  <tr> <td>NPM3D</td> <td>Mobile Mapping, Localization</td> <td>Mixed (Indoor/Outdoor)</td> <td>Yes (CC-BY-SA License)</td> </tr>
</table>
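If you opt for KITTI, for example, its Velodyne scans ship as flat binary files of float32 (x, y, z, reflectance) quadruples, which load in a couple of lines (the file path below is a placeholder):

```python
import numpy as np

# Each KITTI Velodyne scan is a flat float32 array of (x, y, z, reflectance).
scan = np.fromfile("velodyne/000000.bin", dtype=np.float32).reshape(-1, 4)

xyz = scan[:, :3]          # point coordinates in the sensor frame
reflectance = scan[:, 3]   # per-point return intensity
print(xyz.shape, reflectance.mean())
```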
3D Point Cloud Annotation in CVAT

CVAT contains a variety of tools for annotating a range of data types, from static images to moving video. A later addition to the software brought the ability to annotate 3D scan data in the form of point clouds. This is particularly important for those who wish to train their models on three-dimensional data.

Cuboids are used for point cloud annotation in CVAT

While 2D image and video data comes with a large selection of annotation tools (such as skeleton, ellipse, rectangle, and mask), annotation of point cloud data in CVAT is done exclusively with cuboids. Cuboids provide a balance between simplicity and spatial context. Cuboids are:

- Easy to manipulate in 3D space
- Sufficient for common tasks like object detection and tracking
- Compatible with widely used datasets like KITTI and with formats used in frameworks like OpenPCDet and MMDetection3D

Understanding Labeling Tasks and Techniques

Before beginning the annotation workflow in CVAT, it helps to understand how labeling tasks are structured and which techniques can improve accuracy and consistency.

What Is a Labeling Task in CVAT?

In CVAT, a labeling task defines the scope of your annotation project. Each task includes:

- A name and description
- A set of labels or object classes (e.g., “car,” “pedestrian,” “tree”)
- Optional attributes (e.g., “moving,” “occluded”)
- A dataset to be annotated (images, video frames, or 3D point clouds)

For 3D point clouds, tasks support both static scans and sequential frames, allowing for temporal annotation (e.g., tracking objects across time). Each task can contain multiple jobs: individual segments of data assigned to annotators. This allows the annotation process to run in parallel.

Defining Clear Labels and Attributes

Before uploading your dataset, define your label structure carefully according to your business or research requirements. Avoid vague labels, and keep class names consistent. For example, use “vehicle” consistently instead of alternating with “car” or “van.” Add attributes to capture additional information, such as:

- Object state: moving, stationary, partially_visible
- Physical characteristics: damaged, open, closed

Attributes can be marked as mutable (changes over time) or immutable (stays constant), which helps simplify the annotation interface and improve training consistency. (A sketch of what such a label spec can look like appears at the end of this section.)

Techniques for Effective 3D Annotation

To annotate efficiently:

- Use Track mode to maintain object IDs across frames
- Place cuboids in the main 3D viewport and refine them in the Top, Side, and Front orthogonal views. Two projections are usually enough to place a cuboid correctly across all axes.
- Use contextual 2D images (if available) to support difficult annotations
- Apply interpolation for objects in motion across multiple frames
- Flag ambiguous or occluded annotations with appropriate attributes

Proper task setup and labeling discipline not only make the process smoother, they also ensure that the resulting dataset is accurate, structured, and ready for downstream AI training.
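As promised above, here is a sketch of what such a label spec can look like, in the style of the JSON accepted by CVAT's raw label constructor. The class names and attribute values are illustrative, and the exact field names should be checked against your CVAT version:

```python
import json

# Illustrative label spec in the style of CVAT's "raw" label constructor.
labels = [
    {
        "name": "vehicle",
        "attributes": [
            {
                "name": "state",          # mutable: may change frame to frame
                "mutable": True,
                "input_type": "select",
                "default_value": "moving",
                "values": ["moving", "stationary", "partially_visible"],
            },
            {
                "name": "color",          # immutable: fixed for the object
                "mutable": False,
                "input_type": "text",
                "default_value": "",
                "values": [""],
            },
        ],
    },
    {"name": "pedestrian", "attributes": []},
]

print(json.dumps(labels, indent=2))  # paste into the raw label constructor
```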
Annotating Point Clouds in CVAT: Overview

CVAT's 3D point cloud annotation workflow is straightforward. The user simply creates a task, loads the dataset, places and adjusts cuboids, and optionally propagates or interpolates the cuboids across frames. Here's an abbreviated overview of the full process, from start to finish:

1. Create a 3D annotation task.
2. Open the task and explore the interface layout.
3. Navigate the 3D scene using mouse and keyboard controls.
4. Create and adjust cuboids for annotation.
5. Copy and propagate annotations across frames.
6. Interpolate cuboids between frames.
7. Save, export, and integrate annotated data into your pipeline.

Before training begins, it's good practice to run a validation script to check for label inconsistencies, misaligned cuboids, or frame mismatches (a minimal sketch of such a check appears in the best-practices section below). Ensuring clean, well-structured annotation data is just as critical as the model architecture itself.

For a more fleshed-out tutorial on how to annotate 3D point clouds, head on over to our official guide.

Challenges in 3D Point Cloud Annotation

Annotating 3D point clouds comes with a distinct set of challenges (both technical and human) that can significantly affect the quality of your dataset.

One of the biggest issues is occlusion. Since point clouds are generated from specific sensor perspectives, any surfaces not visible to the scanner (the back of an object or areas blocked by other objects, for example) simply don't appear. This missing data can make it difficult to annotate complete geometries with confidence, fully interpret object boundaries, identify shapes, or distinguish between overlapping items. In dense or cluttered scenes, occlusion can lead to under-representation of key objects and introduce ambiguity during annotation.

Example of occluded and non-occluded 3D objects annotated in CVAT

Point density is another problem. Objects close to the sensor may be richly detailed, while distant objects can appear sparse or fragmented. Low-density regions often result in uncertainty when drawing precise cuboids or estimating object boundaries.

Add to that sensor noise, which can result from misfired points, ghosting from reflective surfaces, or jitter from moving elements in a scan, and the result is a lot of visual clutter that annotators must mentally filter out.

Then there's annotation fatigue. Unlike 2D image annotation, working in 3D often involves constant panning, zooming, and adjusting the scene from different angles. This level of interaction increases the mental load and can lead to inconsistency across sessions. To help mitigate this, CVAT allows the use of contextual 2D images alongside point clouds, displayed in separate windows within a 3D annotation task.

Best Practices for High-Quality Point Cloud Data Annotation

Getting 3D point cloud annotation right isn't complicated, but it does require discipline. The most common issues come from inconsistency and lack of structure, both of which are easy to avoid if you put the right systems in place from the start.

Be consistent: Start with label consistency. Stick to a fixed label set. Don't call something a “car” in one frame and a “vehicle” in another. CVAT's label constructor locks this down, so use it. It stops annotators from improvising with naming conventions.

Attribute properly: If you're annotating objects with different states (like “open/closed” or “damaged”), don't create separate labels. Add a mutable attribute. For fixed traits (like color or make), use immutable ones. It keeps the label space clean and keeps your training data flexible.

Establish annotation guidelines: Whether you're working solo or with a team, define clear rules for edge cases, like how to handle partial occlusion, ambiguous shapes, or overlapping objects. A short internal guideline document can eliminate confusion and reduce rework later on.
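The validation pass mentioned in the overview above doesn't have to be elaborate. Here is a minimal sketch that checks exported cuboids, represented as plain dicts, for unknown labels and degenerate dimensions; the field names are assumptions, so adapt them to whatever your export format uses:

```python
ALLOWED_LABELS = {"vehicle", "pedestrian", "tree"}

def validate_cuboids(cuboids):
    """Yield human-readable problems found in a list of cuboid dicts.

    Assumes each dict has 'label' and 'dimensions' (w, l, h) keys;
    adapt the field names to your export format.
    """
    for i, cube in enumerate(cuboids):
        if cube["label"] not in ALLOWED_LABELS:
            yield f"cuboid {i}: unknown label {cube['label']!r}"
        if any(d <= 0 for d in cube["dimensions"]):
            yield f"cuboid {i}: non-positive dimension {cube['dimensions']}"

sample = [
    {"label": "vehicle", "dimensions": (1.8, 4.5, 1.5)},
    {"label": "car", "dimensions": (1.8, 4.5, 0.0)},  # two problems
]
for problem in validate_cuboids(sample):
    print(problem)
```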
Quality assurance: Apply some basic QA. Do a second pass. Spot-check frames. Use your annotation guidelines. If multiple annotators are involved, establish consensus rules for edge cases. You don't need a formal pipeline; you just need to avoid leaving junk labels, floating cuboids, or inconsistent tags in the data.

Don't over-label: Last, but not least, it can be tempting to annotate every object in the scene, but not all data is equally useful. Focus on what your model actually needs to learn. Prioritize annotation quality over quantity, especially when resources are limited.

Clean data is trainable data. Anything else just creates unnecessary work down the line.

CVAT Labeling Service

While point cloud data annotation with CVAT is relatively straightforward, not everybody has the luxury of time or other resources to commit to the data annotation process; it can be time-consuming, after all, particularly when dealing with huge datasets.

If you fit into this category and would rather outsource your data labeling needs, you will be pleased to know that CVAT offers its own services for such tasks. Our professional data annotation services offer expert annotation of your computer vision data at scale, regardless of whether the data is point cloud, image, or video based. Our team of experts ensures high-quality annotations and provides detailed QA reports so you can focus on your core expertise and computer vision algorithms.

Why You Should Consider CVAT for Point Cloud Data Annotation

3D point cloud annotation isn't glamorous, but it's a vital step in building reliable 3D perception systems. Whether you're working on autonomous vehicles, machine learning, robotics, or spatial AI, well-structured annotations make the difference between a model that just runs and a model that performs.

CVAT offers annotation tools such as cuboids, multi-view layouts, contextual 2D images, and export formats compatible with common frameworks like OpenPCDet, TensorFlow, and MMDetection3D. Getting high-quality annotations means doing the basics right: using consistent labels, applying attributes carefully, and maintaining coherence across frames. CVAT's propagation and interpolation tools help speed that up while reducing manual error. And before you push your dataset into training, take the time to review and validate it, because annotation mistakes are a lot cheaper to fix before the model starts learning from them.

In short, clean data leads to cleaner results. The effort you put into annotation shows up later in model accuracy, stability, and generalization. CVAT gives you the foundation to build clean datasets; what you choose to do with the annotated data afterwards is down to your own ingenuity!

To try CVAT for your own workflows, you can sign up for a free account here.
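Finally, if you would rather script the export step than click through the UI, the cvat-sdk Python package can pull annotations directly from a task. This is a hedged sketch: the host, credentials, task ID, and format name below are placeholders, and the exact signatures should be checked against the SDK documentation for your CVAT version.

```python
from cvat_sdk import make_client

# Placeholders: substitute your own host, credentials, and task ID.
with make_client(host="https://app.cvat.ai",
                 credentials=("user", "password")) as client:
    task = client.tasks.retrieve(42)
    # Export annotations in a 3D-capable format (names vary by version).
    task.export_dataset("Kitti Raw Format 1.0",
                        "annotations.zip",
                        include_images=False)
```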

