Top 10 bounding box annotation best practices
A bounding box is an imaginary rectangle that serves as a point of reference for detecting an object and, in some applications, as a collision box for that object. Put simply, it is a rectangle drawn around an object that indicates the object's position in the image by its X and Y coordinates. The main purpose of a bounding box is to make it easier for machine learning algorithms to learn and to find what they are looking for while conserving computing resources. Bounding boxes matter for image annotation because they provide the training and testing data for a model that is expected to perform a computer vision task. Without these annotations, machines cannot learn to detect the objects of interest. Bounding box annotation is one of the most popular image annotation techniques in deep learning. It is often preferred because it is cost-effective and more efficient to annotate than other techniques.
Image augmentation increases the size of a dataset by manipulating existing training data, which helps a model generalize to a wider range of contexts. Bounding box level augmentation, by contrast, creates new training data by altering only the content of a source image's bounding boxes. This gives developers more control over creating training data suited to their problem's conditions. Augmenting bounding boxes involves four steps:
1. Import the required libraries
2. Define an augmentation pipeline
3. Read images and bounding boxes from the disk.
4. Pass an image and bounding boxes to the augmentation pipeline and receive augmented images and boxes.
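The four steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the API of any particular augmentation library. The names `horizontal_flip`, `vertical_flip` and `make_pipeline` are hypothetical, the image is a small in-memory grid standing in for a file read from disk, and boxes are assumed to be stored as `(x_min, y_min, x_max, y_max)` with exclusive maximum edges.

```python
# Step 1: import what the sketch needs (standard library only).
from typing import Callable, List, Tuple

Image = List[List[int]]              # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max), max edges exclusive
Transform = Callable[[Image, List[Box]], Tuple[Image, List[Box]]]


def horizontal_flip(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror the image left-to-right and move every box with it."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    moved = [(width - x_max, y_min, width - x_min, y_max)
             for x_min, y_min, x_max, y_max in boxes]
    return flipped, moved


def vertical_flip(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror the image top-to-bottom and move every box with it."""
    height = len(image)
    flipped = [row[:] for row in reversed(image)]
    moved = [(x_min, height - y_max, x_max, height - y_min)
             for x_min, y_min, x_max, y_max in boxes]
    return flipped, moved


# Step 2: define an augmentation pipeline as an ordered list of transforms.
def make_pipeline(transforms: List[Transform]) -> Transform:
    def run(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
        for transform in transforms:
            image, boxes = transform(image, boxes)
        return image, boxes
    return run


pipeline = make_pipeline([horizontal_flip, vertical_flip])

# Step 3: stand-in for reading an image and its boxes from disk.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
boxes = [(1, 1, 3, 2)]

# Step 4: pass both through the pipeline and receive the augmented pair.
aug_image, aug_boxes = pipeline(image, boxes)
```

The point of the design is that every transform receives and returns the image and its boxes together, so the annotations can never drift out of sync with the pixels.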
Bounding boxes are used in diverse areas to train algorithms to identify patterns. Some typical areas where bounding boxes are used include:
In autonomous driving, bounding box training data helps machines identify objects on roads and streets, such as traffic lights, other vehicles, street signs, pedestrians, and lanes. When the training data covers a wide range of conditions, the machines are better able to recognize obstacles on the streets and to act on the information they perceive.
Bounding boxes also extend to object recognition in robotics and drone imagery. For instance, when trained on precisely annotated data, drones can detect damaged roofs, AC units, and the migration of species. Bounding boxes allow robots and drones to identify physical objects from a distance.
In agriculture, identifying plant diseases early makes it possible to act on them at an early stage. With the advent of smart farming, bounding box annotation is used to collect training data for models that detect plant diseases.
In insurance, bounding boxes help detect damage for claims. Bounding box annotations are used to train models that can quickly identify common incidents or accidents. Damage to a vehicle's body, roof, or front and tail lights, as well as broken window glass, can be identified by computer vision. Bounding box annotations help machines evaluate the extent of the damage so that insurance companies can process claims properly.
Bounding box annotations support better product visualization in retail stores and online shops. When products are well labeled, models can recognize items such as skincare products, fashion items, and pieces of furniture. In retail, bounding box annotations can help address incorrect search results, the continuous digitization process, and chaotically organized supply chains.
Common bounding box level augmentation techniques include:
1. Introducing blur to objects
2. Rotating objects
3. Flipping the orientation of objects
4. Making objects brighter or reducing the brightness
5. Cropping images
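As an illustration of what changing only the content of a box can look like, the sketch below brightens and mirrors the pixels inside one box while leaving the rest of the image untouched. It is a standard-library sketch under stated assumptions: the function names `brighten_box` and `flip_box` are hypothetical, the image is a grayscale grid with values from 0 to 255, and boxes are `(x_min, y_min, x_max, y_max)` with exclusive maximum edges.

```python
from typing import List, Tuple

Image = List[List[int]]            # grayscale image, pixel values 0..255
Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max), max edges exclusive


def brighten_box(image: Image, box: Box, delta: int) -> Image:
    """Return a copy of the image with only the pixels inside the box brightened."""
    x_min, y_min, x_max, y_max = box
    out = [row[:] for row in image]
    for y in range(y_min, y_max):
        for x in range(x_min, x_max):
            # Clamp so the result stays a valid 8-bit pixel value.
            out[y][x] = max(0, min(255, out[y][x] + delta))
    return out


def flip_box(image: Image, box: Box) -> Image:
    """Return a copy of the image with the box content mirrored left-to-right."""
    x_min, y_min, x_max, y_max = box
    out = [row[:] for row in image]
    for y in range(y_min, y_max):
        out[y][x_min:x_max] = out[y][x_min:x_max][::-1]
    return out


image = [
    [10, 10, 10, 10],
    [10, 50, 90, 10],
    [10, 50, 90, 10],
]
box = (1, 1, 3, 3)

brighter = brighten_box(image, box, 200)   # pixels outside the box keep their value
mirrored = flip_box(image, box)            # box coordinates stay the same
```

Because the box itself does not move in either case, the original annotation can be reused for the augmented image.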
Common types of bounding volume include:
1. Surrounding Sphere (SS)
2. Axis-Aligned Bounding Box (AABB)
3. Oriented Bounding Box (OBB)
4. Fixed Direction Hull (FDH)
5. Convex Hull (CH)
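To make the two simplest shapes in this list concrete, here is a small 2D sketch that computes an axis-aligned bounding box and a surrounding circle (the 2D counterpart of a surrounding sphere) for a set of points. The function names are hypothetical, and the circle is simply centred on the box centre, so it encloses every point but is not guaranteed to be the smallest possible one.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def axis_aligned_box(points: List[Point]) -> Tuple[float, float, float, float]:
    """Smallest rectangle with sides parallel to the axes that contains all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def surrounding_circle(points: List[Point]) -> Tuple[float, float, float]:
    """Circle centred on the box centre that contains all points (not minimal)."""
    x_min, y_min, x_max, y_max = axis_aligned_box(points)
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    radius = max(math.hypot(x - cx, y - cy) for x, y in points)
    return (cx, cy, radius)


outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
box = axis_aligned_box(outline)        # (x_min, y_min, x_max, y_max)
circle = surrounding_circle(outline)   # (centre_x, centre_y, radius)
```

The axis-aligned box needs only four numbers and two min/max passes, which is one reason it is the usual choice for image annotation.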
Our experts give you their best practices to get the most accurate annotations possible!
Intersection over Union (IoU) is a metric used in computer vision to measure how well a predicted box matches an annotated bounding box when objects are located in digital images or videos. Bounding box annotations serve as the reference for this localisation.
IoU is the area in which the predicted box and the annotated box overlap, divided by the area of their union. When the overlap is low, the model has not located the targeted object accurately. The greater the overlap, the more accurate the prediction.
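A minimal sketch of how IoU is typically computed for two axis-aligned boxes: the overlap area divided by the union area, giving 1.0 for identical boxes and 0.0 for boxes that do not touch. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)


def area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # The intersection rectangle is bounded by the innermost edges of both boxes.
    inter = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter_area = area(inter)   # zero when the boxes do not overlap
    union_area = area(box_a) + area(box_b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


annotated = (0.0, 0.0, 10.0, 10.0)
predicted = (5.0, 0.0, 15.0, 10.0)
score = iou(annotated, predicted)   # overlap 50, union 150
```

Here the two boxes share half of their area, yet the score is only one third, which shows how quickly IoU drops when a box is shifted.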
In bounding box annotation, it is not advisable to start with diagonally shaped items. A diagonal object takes up only a small part of its bounding box, so most of the box is background. The model may then mistake the background for the targeted object, which defeats the purpose of the annotation.
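A rough calculation shows how much of the box a diagonal object leaves empty. The sketch below assumes an idealised object, a thin rectangle of a given length and width rotated by some angle, and computes what fraction of its axis-aligned bounding box it fills. The function name `fill_ratio` and the example dimensions are illustrative only.

```python
import math


def fill_ratio(length: float, width: float, angle_degrees: float) -> float:
    """Fraction of the axis-aligned bounding box covered by a rotated rectangle."""
    theta = math.radians(angle_degrees)
    # Extents of the rotated rectangle along the image axes.
    box_w = length * abs(math.cos(theta)) + width * abs(math.sin(theta))
    box_h = length * abs(math.sin(theta)) + width * abs(math.cos(theta))
    return (length * width) / (box_w * box_h)


upright = fill_ratio(100, 10, 0)     # object aligned with the image axes
diagonal = fill_ratio(100, 10, 45)   # same object lying diagonally
```

For this 100 by 10 object, the upright box is completely filled, while the diagonal box is about 17% object and 83% background.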
It is advisable to focus on smaller objects, because they leave little room for error. Larger objects tend to be less accurate when used in bounding box datasets.
Every object of interest in an image needs to be labelled, because a model must learn to identify all of them to perform accurately. Labelling one object and leaving another unlabelled is a serious error. Models learn which pixel patterns correspond to an object of interest, so an unlabelled instance teaches the model that those same patterns are background.
Variation in box size should be kept consistent across the dataset. A model tends to underperform on large instances of an object when the same type of object mostly appears small in the training data. Training a model only on objects of the same size will not let it perform well either.
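One way to check box size variation is to count how many boxes fall into each size class. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the default thresholds of 32 x 32 and 96 x 96 pixels are borrowed from the small, medium and large classes used in COCO evaluation, not taken from this article.

```python
from collections import Counter
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def size_bucket(box: Box, small_max: int = 32 * 32, medium_max: int = 96 * 96) -> str:
    """Classify a box by its pixel area (thresholds are an assumption)."""
    x_min, y_min, x_max, y_max = box
    box_area = (x_max - x_min) * (y_max - y_min)
    if box_area < small_max:
        return "small"
    if box_area < medium_max:
        return "medium"
    return "large"


def size_distribution(boxes: List[Box]) -> Counter:
    """Count how many boxes fall into each size class."""
    return Counter(size_bucket(box) for box in boxes)


boxes = [(0, 0, 10, 10), (0, 0, 50, 50), (0, 0, 200, 100), (5, 5, 20, 25)]
distribution = size_distribution(boxes)
```

A distribution that is heavily skewed towards one class for a given label is a hint that the model may struggle with that object at other sizes.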
To keep boxes pixel-perfectly tight, the edges of the bounding box must touch the outermost pixels of the labelled object. Where gaps are left, a model may be unable to give accurate predictions.
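When a pixel mask of the object is available, the tight box can be derived from it directly, which is a useful check on hand-drawn boxes. This is a minimal sketch assuming a binary mask stored as rows of 0 and 1; the function name `tight_box` is hypothetical, and the returned maximum edges are exclusive.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # 1 where the object is, 0 elsewhere
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), max edges exclusive


def tight_box(mask: Mask) -> Optional[Box]:
    """Smallest box whose edges touch the outermost object pixels, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return None
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
box = tight_box(mask)
```

Every edge of the resulting box passes through at least one object pixel, so no gap is left between the box and the object.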
Occluded objects are those that are not in full view because something obstructs them. That does not warrant leaving them partially labelled or completely unlabelled. An occluded object should be annotated just like a fully visible one.
An object of interest should be labelled in full, because partial labelling leaves the model confused about what the full object consists of.
It is best to be specific rather than general when choosing labels. If an error is found, a specific label can be adjusted or re-annotated. If the labelling was general, however, the whole dataset has to be labelled again to gain the needed accuracy.