title | description |
---|---|
DagsHub Data Engine - Annotating Data | Documentation on using Data Engine to annotate data |
Data Engine provides an easy way to annotate data and generate high-quality datasets for training. Using DagsHub's existing integration with Label Studio, you can attach annotations to the relevant data points, visualize them, and use them to train your model.
Before annotating your data, make sure you connect your datasource to the Data Engine.
There are two ways to send data for annotation: with the DagsHub client, or from the local visualization instance.
To send data points for annotation with the DagsHub client, use the `.annotate()` function:

    # Send all the data points in the datasource to annotation
    ds.annotate()

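For context, the call above could be wrapped roughly as follows. This is a minimal sketch, not the official workflow: the helper name `annotate_datasource` and its `get_datasource` parameter are hypothetical, the repository and datasource names are placeholders, and it assumes the DagsHub client exposes `datasources.get_datasource(repo, name=...)` for a datasource that was connected earlier.

```python
def annotate_datasource(repo, name, get_datasource=None):
    """Fetch a datasource by name and send all of its data points to annotation.

    `get_datasource` is a hypothetical injection point so the sketch can be
    tried without network access; by default the DagsHub client is used.
    """
    if get_datasource is None:
        # Assumption: the dagshub package is installed and the datasource
        # has already been connected to the Data Engine.
        from dagshub.data_engine import datasources
        get_datasource = datasources.get_datasource

    ds = get_datasource(repo, name=name)
    # Send all the data points in the datasource to annotation
    return ds.annotate()
```

A call such as `annotate_datasource("<user>/<repo>", "<datasource-name>")` would then behave like the one-line `ds.annotate()` call, with the placeholders replaced by your own values.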
To annotate selected data points from the local visualization instance, navigate to the Dagshub tab (if there is no Dagshub tab, click on the ‘+’ button and choose Dagshub) and click on the ‘Annotate selected in LabelStudio’ button:
After sending data points for annotation, a new window with the DagsHub web platform will open. From here you can either choose the annotation project to add the tasks to, or create a new project. This means you can manage the annotation process with multiple annotators, assigning the right tasks to the relevant annotator.
To add the selected annotation tasks to an existing annotation project, select the first option, **Continue with one of the existing projects**, and choose a project. To create a new annotation project, select the second option, **Create new**, and specify a name for it.
!!! note "Existing project configurations"
    You can import existing configurations (annotation templates, auto-labelers, etc.) into your new project by checking the **Use project settings of:** option and choosing an existing project.
Click Start, and you will be directed to your project, which contains the relevant tasks.
You can update your datasource with the new annotations. To do that, annotate a datapoint (a task) and click on the Submit button once finished:
Once you are done annotating, click on the Save button at the top right of the screen:
Each annotation is saved as an enrichment field, named annotation, on the corresponding data point. The field holds a blob whose content is in Label Studio JSON format.
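To illustrate what reading such a blob might involve, here is a small sketch that pulls label values out of a Label Studio task in JSON form. The helper name `extract_labels` and the sample task are hypothetical and hand-written for illustration; the exact content DagsHub stores in the annotation field may differ, so treat the structure as an assumption based on Label Studio's general task format (`annotations` → `result` → `value`).

```python
import json


def extract_labels(annotation_blob):
    """Collect label values from a Label Studio task given as JSON bytes or str."""
    task = json.loads(annotation_blob)
    labels = []
    for annotation in task.get("annotations", []):
        for result in annotation.get("result", []):
            value = result.get("value", {})
            # Classification results store labels under "choices"; region
            # results use a key named after the tag type, e.g. "rectanglelabels".
            for key in ("choices", "rectanglelabels", "polygonlabels", "labels"):
                labels.extend(value.get(key, []))
    return labels


# Hand-written sample task, not real output from DagsHub.
sample = json.dumps({
    "data": {"image": "path/to/image.jpg"},
    "annotations": [{
        "result": [{
            "from_name": "choice",
            "to_name": "image",
            "type": "choices",
            "value": {"choices": ["cat"]},
        }]
    }],
}).encode("utf-8")

print(extract_labels(sample))  # ['cat']
```

Only a few common result types are handled here; other Label Studio tags keep their values under different keys and would need their own entries.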
After saving the annotations, your enrichment fields will be updated. You can then display them within your visualization instance. Run the `.visualize()` command again to update and display your new annotations:
!!! note "Auto labeler configurations"
    To set up an auto labeler (ML backend) and convert model predictions to annotations in your project, check out the official Label Studio Documentation.