title: DagsHub Data Engine - Annotating Data
description: Documentation on using Data Engine to annotate data

Annotating your data

Data Engine provides an easy way to annotate data and generate high-quality datasets for training. Using DagsHub's existing integration with Label Studio, you can attach annotations to the relevant data points, visualize these annotations, and use them to train your model.

Before annotating your data, make sure you connect your datasource to the Data Engine.

Sending data points to annotations

There are two ways to send data points for annotation: through the DagsHub client or from the local visualization instance.

Send to annotation using the DagsHub client

To send data points for annotation using the DagsHub client, call the .annotate() method on your datasource:

# Send all the data points in the datasource to annotation
ds.annotate()
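
The snippet above assumes that `ds` already refers to a datasource connected to the Data Engine. A minimal sketch of obtaining one and sending it for annotation might look like the following. The two helper functions are hypothetical, the repository and datasource names are placeholders, and the `datasources.get_datasource` call reflects our understanding of the client, so check it against the version you have installed.

```python
def load_datasource(repo: str, name: str):
    """Return an existing datasource (hypothetical helper).

    Assumes the `dagshub` package is installed and you are authenticated.
    """
    # Imported lazily so the helpers can be defined without the client installed.
    from dagshub.data_engine import datasources

    return datasources.get_datasource(repo, name=name)


def send_all_to_annotation(ds):
    """Send every data point in `ds` to annotation (hypothetical helper)."""
    # Same call as in the snippet above; its return value is passed through.
    return ds.annotate()


# Example usage (placeholder names, requires network access):
#   ds = load_datasource("<user>/<repo>", "<datasource-name>")
#   send_all_to_annotation(ds)
```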

Send to annotation from the visualization instance

To annotate selected data points from the local visualization instance, navigate to the Dagshub tab (if there is no Dagshub tab, click on the ‘+’ button and choose Dagshub) and click on the ‘Annotate selected in LabelStudio’ button:

Send to Annotations from Voxel

After sending data points for annotation, a new window with the DagsHub web platform will open. From here you can either choose the annotation project to add the tasks to, or create a new project. This means you can manage the annotation process with multiple annotators, assigning the right tasks to the relevant annotator.

To add the selected annotation tasks to an existing annotation project, select the first option, Continue with one of the existing projects, and choose an existing one. To create a new annotation project, select the second option, Create new, and specify a name for it.

Add Datapoints to Annotation Project

!!! note "Existing project configurations"
    You can import existing configurations (annotation templates, auto-labelers, etc.) to your new project by checking the **Use project settings of:** option and choosing an existing project.

Click Start, and you will be directed to your project with the relevant tasks.

Annotation Project with Tasks

Saving annotations back to your datasource

You can update your datasource with the new annotations. To do so, annotate a data point (a task) and click the Submit button when finished:

submit_annotation

Once you are done annotating, click the Save button at the top right of the screen:

save_annotations

Each annotation is saved as an enrichment field, named annotation, on the corresponding data point. The field is stored as a blob whose content is in Label Studio JSON format.
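
Because the annotation is stored as Label Studio JSON, you can read it with the standard library once you have the blob's bytes. The sketch below is illustrative only: `extract_labels` is a hypothetical helper, the sample payload is made up, and it assumes the blob holds a Label Studio task with an `annotations` list whose `result` entries carry their labels under a key named after the entry's `type` (as `choices` and `rectanglelabels` results do). The exact content of your blobs may differ.

```python
import json
from typing import List


def extract_labels(annotation_blob: bytes) -> List[str]:
    """Collect label strings from a Label Studio JSON blob (hypothetical helper)."""
    task = json.loads(annotation_blob)
    labels: List[str] = []
    for annotation in task.get("annotations", []):
        for region in annotation.get("result", []):
            # For label-like results, value[type] is a list of label strings,
            # e.g. {"type": "choices", "value": {"choices": ["cat"]}}.
            found = region.get("value", {}).get(region.get("type"), [])
            if isinstance(found, list):
                labels.extend(str(item) for item in found)
    return labels


# Made-up sample payload for illustration, not real project data.
sample = json.dumps(
    {
        "data": {"image": "images/example.jpg"},
        "annotations": [
            {
                "result": [
                    {
                        "from_name": "choice",
                        "to_name": "image",
                        "type": "choices",
                        "value": {"choices": ["cat"]},
                    }
                ]
            }
        ],
    }
).encode("utf-8")

print(extract_labels(sample))  # ['cat']
```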

After you save the annotations, your enrichment fields are updated, and you can display them in your visualization instance. Run the .visualize() command again to refresh the view and show the new annotations:

Display Annotations in Voxel

!!! note "Auto labeler configurations"
    To set up an auto labeler (ML backend) and convert model predictions to annotations in your project, check out the official Label Studio documentation.
