Create a new Dataset Version with merged labels

Here, we will learn how to create a new Dataset version with fewer labels in order to specialize a model.

Let's do it with an example!

Objectives

We want to build a Dataset in order to train a model to differentiate people from vehicles in an aerial view, but unfortunately we do not have a custom Dataset for this precise task.

Fortunately, we do have a Dataset that could be a great fit with some modifications.

We have a Dataset called VizDrone Dataset (available on our Dataset Hub) that contains 6470 pictures and more than 30k annotated objects, with the label distribution shown here.

As you can see, the label distribution is quite unbalanced, but don't worry, we'll find a solution.
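To put a number on that imbalance, one could count the objects per label from exported annotations. The snippet below is a minimal sketch in plain Python: the `annotations` list and its `label` field are hypothetical stand-ins for however your annotations are exported, and the sample values are toy data, not actual VizDrone counts.

```python
from collections import Counter


def label_distribution(annotations):
    """Return {label: share of all objects} for a list of annotation dicts."""
    counts = Counter(obj["label"] for obj in annotations)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy data for illustration only (hypothetical, not real VizDrone figures).
sample = [{"label": "car"}] * 6 + [{"label": "pedestrian"}] * 3 + [{"label": "bus"}]

distribution = label_distribution(sample)
# Print labels from most to least frequent.
for label, share in sorted(distribution.items(), key=lambda item: -item[1]):
    print(f"{label}: {share:.0%}")
```

A label whose share is far below the others is a good candidate for merging into a broader class.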

Creating the right version of this Dataset

Ok, so let's create a new version of this Dataset in order to perform operations on it.

First, let's go to the VizDrone Dataset overview and select all assets.

Then click on 'Create new version'.

We will call it 'tutorial'. Now click on 'Create blank' to create a dataset version with all the images but no configured labels or annotations.

The version creation can take quite a while, but don't worry, you can switch to another page or grab a cup of tea!

We can see that the 'tutorial' version of our VizDrone dataset has been correctly created. Now click on the 'Settings' button to configure your labels.

Configure your new labels

We can go to the 'Labels' tab in our settings just like below to start the configuration.

Click on 'New Labels'; a list of available tasks should appear below.

Our original dataset is annotated with bounding boxes, so we will select 'Object Detection' as our task.

Now you can enter the new labels. We create only two, 'people' and 'vehicle', because our goal is to merge the classes of the original dataset into these two classes of our new dataset.

You can now click on 'Create labels'.

Merge your annotations

Let's now go back to the settings, this time to the 'Annotations' tab.

We first select the label 'people' as the target label.

We can then select the original labels 'people' and 'pedestrian' in the list.

Next, we select 'vehicle' as the target label and select all the vehicles in the original label list. As you can see in the instructions, 'people' and 'pedestrian' are going to be merged into 'people', and 'car', 'van', 'truck', 'motor' and 'bus' are going to be merged into 'vehicle'. It looks great so far!
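Conceptually, these instructions amount to a mapping from each original label to a target label. The snippet below is a minimal sketch of that idea in plain Python, not the platform's implementation: the annotation dicts, the `bbox` field and the `merge_labels` helper are hypothetical, and dropping labels that appear in no instruction is an assumption of this example rather than documented behavior.

```python
from collections import Counter

# Merge instructions from this tutorial: original label -> target label.
MERGE_MAP = {
    "people": "people",
    "pedestrian": "people",
    "car": "vehicle",
    "van": "vehicle",
    "truck": "vehicle",
    "motor": "vehicle",
    "bus": "vehicle",
}


def merge_labels(annotations, merge_map):
    """Return new annotation dicts with labels remapped to their targets."""
    merged = []
    for obj in annotations:
        target = merge_map.get(obj["label"])
        if target is None:
            # Assumption of this sketch: unmapped labels are left out.
            continue
        merged.append({**obj, "label": target})
    return merged


# Toy annotations for illustration only (hypothetical structure and values).
sample = [
    {"label": "pedestrian", "bbox": [10, 20, 15, 40]},
    {"label": "people", "bbox": [60, 22, 14, 38]},
    {"label": "car", "bbox": [100, 80, 50, 30]},
    {"label": "bus", "bbox": [200, 90, 90, 40]},
    {"label": "other", "bbox": [5, 5, 10, 10]},  # not covered by any instruction
]

merged = merge_labels(sample, MERGE_MAP)
merged_counts = Counter(obj["label"] for obj in merged)
print(merged_counts)
```

Building new dicts instead of editing the originals mirrors the tutorial's approach: the source dataset version stays untouched and the merge only affects the new version.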

Now we can click on 'Execute Instructions' to perform the merge operation, and we just have to wait!

This operation can take a few minutes, so don't worry if the merged annotations do not appear right away.

And here we are: after a few minutes, we can work with a dataset that has only two classes, with a lot of objects in each of them!
