DagsHub Annotations

DagsHub Annotations provides a preconfigured labeling workspace with full access to your project files. It is integrated with Data Engine and built on our integration with Label Studio.

DagsHub Annotations provides a labeling flow designed for reproducibility, scalability, and efficient version control of your annotations and data.

Looking for the Old Annotation Flow?

In the past, DagsHub used a git-flow-based annotation system, which will soon be deprecated. See the old annotation docs.

How Does DagsHub Annotations Work?

Every repository on DagsHub is configured with a labeling workspace based on Label Studio.

When you create a Data Engine dataset or datasource, you can send any datapoint for annotation from the Python client, the UI, or our locally supported Voxel51 visualization instance.

The workspace has full access to the project files, making them available to annotate directly from DagsHub's interface. To scale your work, DagsHub Annotations enables you to create multiple labeling projects on the workspace that are isolated from one another.

Once you're done labeling, you can save the annotations to your Data Engine enrichments, which are fully versioned. This enables you to return to previous annotation versions, compare them, and select the best option to train your model on.

In addition to the annotation metadata layer, saving annotations creates a second enrichment layer that holds the annotation classes (called <annotation_enrichment_name>.labels.str). You can use this layer to query the classes per datapoint (e.g. ds["annotation.labels.str"].contains("cat")).
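The class query mentioned above can be wrapped in a small helper. This is a minimal sketch: it uses only the indexing and `contains` expression shown in the text, and it assumes `datasource` is a Data Engine datasource or dataset object you have already loaded. The helper name `label_query` and its parameters are hypothetical, not part of the DagsHub client.

```python
def label_query(datasource, annotation_column: str, label: str):
    """Build a filter for datapoints whose class string mentions `label`.

    `datasource` is assumed to support the expression from the text:
    ds["<annotation_enrichment_name>.labels.str"].contains("<label>")
    """
    # The classes layer is named after the annotation enrichment,
    # with the ".labels.str" suffix described above.
    labels_column = f"{annotation_column}.labels.str"
    return datasource[labels_column].contains(label)


# Hypothetical usage, given an already-loaded datasource `ds`:
#   cats = label_query(ds, "annotation", "cat")
```

Keeping the column-name construction in one place avoids repeating the `.labels.str` suffix across queries.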

Getting Started with Annotations

The easiest way to get started with annotations is through our annotations use case guide.

Annotation Project & Annotator Management

DagsHub Annotations provides an easy way to have multiple annotators work simultaneously. When you send a dataset to be annotated, you can select whether to add the annotation tasks to an existing workspace or create a new one.

Add or create new annotation workspace

When you save annotations, each workspace gets its own dedicated metadata column, enabling you to version annotations and compare different options to get the best-quality annotations for your training data.
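As one illustration of comparing annotation columns, the sketch below computes a simple agreement rate between two sets of labels. It assumes you have already exported each workspace's column into a plain dictionary that maps a datapoint path to a single class. The dictionaries, file names, and the `agreement` function are hypothetical and are not a DagsHub API.

```python
def agreement(labels_a: dict, labels_b: dict) -> float:
    """Fraction of shared datapoints on which both label sets agree."""
    # Only datapoints annotated in both workspaces can be compared.
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        return 0.0
    matches = sum(1 for path in shared if labels_a[path] == labels_b[path])
    return matches / len(shared)


# Hypothetical exports of two workspace columns.
workspace_a = {"img1.jpg": "cat", "img2.jpg": "dog", "img3.jpg": "cat"}
workspace_b = {"img1.jpg": "cat", "img2.jpg": "cat", "img4.jpg": "dog"}

print(agreement(workspace_a, workspace_b))
```

A low agreement rate is a hint to review the labeling instructions or the disputed datapoints before choosing which column to train on.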

Custom Label Configurations

DagsHub Annotations supports all of Label Studio's annotation templates, as well as custom templates. To choose or customize a label template, click the settings button inside your Label Studio project and select "Labeling Interface".

Go to label studio labeling interface

You can also fully customize your labeling interface by clicking the "Code" tab. For the full list of customization options, see the Label Studio documentation.

Custom Labeling Interface
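For reference, a labeling configuration entered in the "Code" tab is an XML document. The sketch below holds a minimal image classification config in Python and checks that every control tag points at an existing object tag. The class names `cat` and `dog` and the helper `control_targets_exist` are illustrative assumptions. The check is a local sanity test only, not Label Studio's own validation.

```python
import xml.etree.ElementTree as ET

# Minimal Label Studio config: one image and a single-choice classification.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="label" toName="image">
    <Choice value="cat"/>
    <Choice value="dog"/>
  </Choices>
</View>
"""


def control_targets_exist(config: str) -> bool:
    """Return True if every toName attribute refers to a declared name."""
    root = ET.fromstring(config.strip())
    names = {el.get("name") for el in root.iter() if el.get("name")}
    targets = {el.get("toName") for el in root.iter() if el.get("toName")}
    return targets <= names


print(control_targets_exist(LABEL_CONFIG))
```

The XML string itself is what you would paste into the "Code" tab. The Python wrapper only catches a mismatched `toName` before you do.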

Auto Labeling with Custom Machine Learning Models

Auto labeling is critical for active learning and can significantly increase the amount of annotated data you have. DagsHub supports connecting custom models to pre-annotate your data. We wrote an entire tutorial about it. Read it now.

Label Studio SDK

DagsHub Annotations is fully compatible with the Label Studio Python SDK.
The SDK is the easiest way to explore advanced Label Studio usage, such as uploading custom tasks, computing metrics on annotations, and running active learning programmatically.
The DagsHub Python client makes it easy to get an authenticated instance of the Label Studio Python client.

Label Studio API

DagsHub Annotations is fully compatible with the Label Studio API. You can use it to hook into your Label Studio project for any of your production annotation needs.