Create a new Dataset version

From the Datalake

To create a new version of a dataset, go to your "Datalake" page and select the assets that you want to add to the new version.

Then click on "Create Dataset Version".

You can then select the dataset version on which to base your new version.

Finally, choose a name for this dataset version.

From a Dataset

If you want to fork an existing version of one of your datasets, this is the simplest way to go.

When you are on the page listing all your datasets, you can click on the name of the one you want to create a new version of.

You will arrive on a page listing all the versions of your dataset in the form of a 'Git graph'. There you can click on the icon in the top-right corner of the version you want to fork.

Then you will have to choose whether you want to:

  • clone the images alone

  • clone the images and the labels

  • clone the images, the labels and all the annotations of the selected version

Enter a name for your new version, then click the one of the three buttons that matches your need.

With Python SDK

First, make sure that the Picsellia Python package is installed:

pip install picsellia

Then initialize the Client with your API token, available on your profile page:

from picsellia.client import Client
clt = Client(api_token="your token")
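Hard-coding the token in a script makes it easy to leak. A minimal sketch, assuming you keep the token in an environment variable, could read it as follows. The variable name PICSELLIA_TOKEN and the helper load_api_token are hypothetical and not part of the SDK.

```python
import os


def load_api_token(var_name="PICSELLIA_TOKEN"):
    """Return the API token stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable to your API token")
    return token


# Usage, with the Client shown above:
# clt = Client(api_token=load_api_token())
```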

You are not obliged to fetch pictures to create a new dataset version. If you don't, the new version will have the same assets as the origin version.

If you want to add new pictures to this version, fetch the assets first, for example by tag:

pictures = clt.datalake.picture.fetch(quantity=1, tags=['tag1'])

Then you can create a new version of the dataset:

clt.datalake.dataset.new_version(
    name='dataset2',
    version='4th',
    from_version='latest',
    pictures=pictures,
)

The from_version parameter lets you select the origin version for your new one. It defaults to 'latest', which corresponds to the most recent version of your dataset.
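The two cases described above (reusing the assets of the origin version, or adding freshly fetched pictures) can be wrapped in one small helper. This is only a sketch: create_version is a hypothetical name, and it assumes that new_version accepts the keyword arguments shown above and that pictures can simply be left out when you do not want to add assets.

```python
def create_version(clt, name, version, from_version="latest", pictures=None):
    """Create a dataset version, passing pictures only when some are given.

    clt is an initialized Client. Leaving out pictures is assumed to keep
    the same assets as the origin version, as described in this section.
    """
    kwargs = {"name": name, "version": version, "from_version": from_version}
    if pictures is not None:
        kwargs["pictures"] = pictures
    return clt.datalake.dataset.new_version(**kwargs)
```

For example, create_version(clt, 'dataset2', '4th') would fork the latest version without adding assets, while passing pictures=pictures reproduces the call above.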
