Label Studio is a powerful open-source tool that supports labeling many unstructured and structured data types. It provides an intuitive UI with various templates you can easily customize. DagsHub Annotations, our integration with Label Studio, provides a fully configured labeling workspace with a built-in Label Studio instance that is ready to use.
Every repository on DagsHub is configured with a labeling workspace that has Label Studio installed. The workspace has full access to the project files, so you can annotate them directly from DagsHub's interface. To scale your work, DagsHub Annotations enables you to create multiple labeling projects on the workspace that are isolated from one another.
DagsHub Annotations provides a unique Git-Flow for labeling to ensure full reproducibility, scalability, and efficient version control of the labels and data. When creating a new labeling project, you associate it with the tip of an active branch, which simulates the branching action. DagsHub loads the annotations held on the branch and associates them with their tasks. You can version the annotations using Git and, once the labeling task is complete, create a pull request on DagsHub, where the reviewer can see and comment on every label.
To create a new labeling project for the first time, navigate to the Annotations tab and create a new workspace. This process can take 2-3 minutes as DagsHub spins up the Label Studio machine behind the scenes.
??? illustration "Create Label Studio workspace"
<br/>
<center>
<video autoplay loop muted playsinline width="80%">
<source src="../../tutorial/assets/create-workspace.webm" type="video/webm">
<source src="../../tutorial/assets/create-workspace.mp4" type="video/mp4">
</video>
<sub>Create Label Studio workspace</sub></center>
<br/>
Once the workspace is ready, create a new project and associate it with an active branch. This marks the project's starting point and will make all the files hosted on DagsHub Storage, under the selected branch, available for labeling.
??? illustration "Create a Label Studio project"
<br/>
<center>
<video autoplay loop muted playsinline width="60%">
<source src="../../tutorial/assets/create-project.webm" type="video/webm">
<source src="../../tutorial/assets/create-project.mp4" type="video/mp4">
</video>
<sub>Create Label Studio project</sub></center>
<br/>
When you open the labeling project for the first time, you will have the option to select the files to annotate (AKA tasks). You can choose a specific file or an entire directory by checking the box next to its name.
??? illustration "Choose the files to annotate"
<br/>
<center>
<video autoplay loop muted playsinline width="60%">
<source src="../../tutorial/assets/choose-files.webm" type="video/webm">
<source src="../../tutorial/assets/choose-files.mp4" type="video/mp4">
</video>
<sub>Choose the files to annotate</sub></center>
<br/>
DagsHub lets you version your annotations with Git and commit the changes to a remote branch directly from the UI. When you click the commit button, DagsHub Annotations saves your work in open source formats to a `.labelstudio` directory and provides additional options, such as exporting the annotations in a common format (JSON, COCO, CSV, TSV, etc.).
What is the `.labelstudio` directory?
The `.labelstudio` directory is the source of truth for DagsHub Annotations. DagsHub Annotations saves the annotations of each task to a JSON file under the `.labelstudio` directory. When creating a new labeling project, DagsHub parses the selected branch for this directory and loads the existing annotations to their associated tasks, enabling you to switch between labeling versions easily.
Note: The JSON file name is the SHA1 hash of the task's path in the original project.
When creating a new labeling project, DagsHub parses the selected branch for the `.labelstudio` directory, loads the annotations it holds, and associates them with their tasks.
Note: You can currently load only annotations created by DagsHub Annotations.
Currently, importing labels into Label Studio is a somewhat manual process. Here's an overview of what needs to be done:
1. Create a `.labelstudio` directory at the root of your repo, if one does not already exist.
2. Add a `label_config.xml` file to the `.labelstudio` directory, which includes information about your classes. See example below.
3. Populate the `.labelstudio` directory with JSON files. See example below.
4. Commit the `.labelstudio` directory with Git (NOT DVC) to the repo.
The first time you try this, you might want to consider having Label Studio create templates for you to edit. This will allow you to see the format of the files and get a feel for how you need to structure your script to import your annotations.
To do this:
- Check that the `label_config.xml` file is properly configured.
- Use the resulting JSON files as a template when importing your annotations.
What does the `label_config.xml` file contain?
The `label_config.xml` file describes the classes available for labeling and the settings for the annotation project. For example:
<View>
<Image name="image" value="$image" zoomControl="false" zoom="false"/>
<RectangleLabels name="label" toName="image">
<Label value="Baby-Yoda" background="#FFA39E"/>
<Label value="Mando" background="#0d73d3"/>
</RectangleLabels>
</View>
This defines two classes for an object detection model, Baby-Yoda and Mando, and sets the color of each class's annotations when viewed in Label Studio.
You can find further examples here and here.
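When scripting an import, it can help to read the class names back out of the config so that generated annotations only use labels the project knows about. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `read_classes` helper is a hypothetical name introduced here for illustration, and the embedded config simply repeats the example above.

```python
import xml.etree.ElementTree as ET

# Same configuration as the example above, embedded so the sketch is self-contained.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image" zoomControl="false" zoom="false"/>
  <RectangleLabels name="label" toName="image">
    <Label value="Baby-Yoda" background="#FFA39E"/>
    <Label value="Mando" background="#0d73d3"/>
  </RectangleLabels>
</View>
"""


def read_classes(config_xml: str) -> dict:
    """Return a mapping of class name -> background color from a label config.

    Hypothetical helper: it only looks at <Label> elements and ignores
    every other setting in the config.
    """
    root = ET.fromstring(config_xml.strip())
    return {
        label.get("value"): label.get("background", "")
        for label in root.iter("Label")
    }


classes = read_classes(LABEL_CONFIG)
print(classes)  # {'Baby-Yoda': '#FFA39E', 'Mando': '#0d73d3'}
```

To check a real project, the same function could be given the text of `.labelstudio/label_config.xml` instead of the embedded string.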
What do the JSON files contain?
The JSON files that live in the `.labelstudio` directory describe the annotations for the images. There is one JSON file per image, and each file follows a very specific format.
The name of the file should be the SHA1 hash of the path to the image, relative to the root of the repo.
For example:
import hashlib
image_file = 'data/images/train/backyard_squirrels_000000.jpg'
filename_hash = hashlib.sha1(image_file.encode("utf-8")).hexdigest()
json_file = filename_hash + '.json'
In this example, the `json_file` would be `56f38098ffea4d6937b855e7ec2f01246526ff0e.json`.
You can find this particular JSON file here.
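Because the file name is derived only from the task path, the same rule can be used to check which images already have an annotation file in `.labelstudio`. The sketch below assumes task paths are written relative to the repo root with forward slashes, as in the example above; `annotation_file_for` and `split_by_annotation` are hypothetical helper names, not part of DagsHub or Label Studio.

```python
import hashlib
from pathlib import Path


def annotation_file_for(task_path: str, repo_root: str = ".") -> Path:
    """Return the expected annotation file for a task path.

    Assumption: task_path is relative to the repo root and uses forward
    slashes, matching the hashing example above.
    """
    digest = hashlib.sha1(task_path.encode("utf-8")).hexdigest()
    return Path(repo_root) / ".labelstudio" / f"{digest}.json"


def split_by_annotation(task_paths, repo_root="."):
    """Partition task paths into (annotated, unannotated) lists.

    A task counts as annotated when its expected JSON file exists on disk.
    """
    annotated, unannotated = [], []
    for task in task_paths:
        if annotation_file_for(task, repo_root).exists():
            annotated.append(task)
        else:
            unannotated.append(task)
    return annotated, unannotated
```

For instance, `split_by_annotation(["data/images/train/backyard_squirrels_000000.jpg"])` run from the repo root would report whether that image already has an annotation file.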
A few things to note based on this object detection example:
- The hash in the image reference (`repo://9eabb902f1980a3215cf1d7ec90038b990a88a5d/data/images/train/backyard_squirrels_000000.jpg`) is any commit hash where the image exists in the repo. This means you need to commit your data to DVC before importing annotations into Label Studio.
To see an example of how to generate these JSON files, check out this script for creating Label Studio annotations from existing YOLO-style annotation files.
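As a rough illustration of what such a conversion involves, the sketch below turns YOLO-style lines (`class_id cx cy w h`, all normalized to 0..1) into rectangle results expressed as percentages with a top-left origin, which is how Label Studio describes rectangles. The JSON envelope (`data`, `annotations`, `result`) is an assumption based on Label Studio's general task format and may not match what DagsHub Annotations writes; compare it against a file committed by DagsHub Annotations before relying on it. All function names are hypothetical, and `from_name`/`to_name` default to the `name` values used in the label config example.

```python
import hashlib
import json
from pathlib import Path


def yolo_line_to_result(line, class_names, from_name="label", to_name="image"):
    """Convert one YOLO line into a rectangle result with percent coordinates."""
    class_id, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    return {
        "from_name": from_name,
        "to_name": to_name,
        "type": "rectanglelabels",
        "value": {
            # YOLO stores the box center; shift to the top-left corner.
            "x": (cx - w / 2) * 100,
            "y": (cy - h / 2) * 100,
            "width": w * 100,
            "height": h * 100,
            "rotation": 0,
            "rectanglelabels": [class_names[int(class_id)]],
        },
    }


def build_task(image_path, commit_hash, yolo_text, class_names):
    """Build a task dictionary for one image (ASSUMED envelope, see lead-in)."""
    results = [
        yolo_line_to_result(line, class_names)
        for line in yolo_text.splitlines()
        if line.strip()
    ]
    return {
        # commit_hash must be a commit in which the image exists in the repo.
        "data": {"image": f"repo://{commit_hash}/{image_path}"},
        "annotations": [{"result": results}],
    }


def write_task(repo_root, image_path, task):
    """Write the task to .labelstudio/<sha1 of image path>.json and return the path."""
    out_dir = Path(repo_root) / ".labelstudio"
    out_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha1(image_path.encode("utf-8")).hexdigest() + ".json"
    out_file = out_dir / name
    out_file.write_text(json.dumps(task, indent=2))
    return out_file
```

After writing the files, the `.labelstudio` directory would still need to be committed with Git, as described in the import steps.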
When starting a Label Studio project using the process above, select the directory that contains the images, but not the annotations.
Figure: Selecting only images when starting a project.
To learn more about how to use Label Studio with DagsHub, please follow the end-to-end DagsHub Annotations tutorial.
DagsHub currently supports labeling only in non-mirror repositories, but we might support mirror repositories soon. Please contact us on our Discord server if this is important to you.