
Label Studio

Label Studio is a powerful open-source tool for labeling many structured and unstructured data types. It provides an intuitive UI with various templates that you can easily customize. DagsHub Annotations, our integration with Label Studio, provides a fully configured labeling workspace with a built-in Label Studio instance that is up and running, ready to use.

How does the Label Studio integration work with DagsHub?

Every repository on DagsHub is configured with a labeling workspace that has Label Studio installed. The workspace has full access to the project files, so you can annotate them directly from DagsHub's interface. To scale your work, DagsHub Annotations enables you to create multiple labeling projects in the workspace, each isolated from the others.

DagsHub Annotations provides a unique Git flow for labeling to ensure full reproducibility, scalability, and efficient version control of the labels and data. When you create a new labeling project, you associate it with the tip of an active branch, which simulates the branching action. DagsHub then loads the annotations held on that branch and associates them with their tasks. You can version the annotations using Git and, once the labeling task is complete, create a pull request on DagsHub, where the reviewer can see and comment on every label.

How to create a new project on DagsHub Annotations?

To create your first labeling project, navigate to the Annotations tab and create a new workspace. This can take 2-3 minutes while DagsHub spins up the Label Studio machine behind the scenes.

??? illustration "Create Label Studio workspace"

<br/>
<center>
  <video autoplay loop muted playsinline width="80%">
    <source src="../../tutorial/assets/create-workspace.webm" type="video/webm">
    <source src="../../tutorial/assets/create-workspace.mp4" type="video/mp4">
  </video>

 <sub>Create Label Studio workspace</sub></center>
<br/>

Once the workspace is ready, create a new project and associate it with an active branch. This marks the project's starting point and makes all the files hosted on DagsHub Storage under the selected branch available for labeling.

??? illustration "Create a Label Studio project"

<br/>
<center>
  <video autoplay loop muted playsinline width="60%">
    <source src="../../tutorial/assets/create-project.webm" type="video/webm">
    <source src="../../tutorial/assets/create-project.mp4" type="video/mp4">
  </video>

 <sub>Create Label Studio project</sub></center>
<br/>

How to choose files to label?

When you open the labeling project for the first time, you can select the files to annotate (also known as tasks). To choose a specific file or an entire directory, check the box next to its name.

??? illustration "Choose the files to annotate"

<br/>
<center>
  <video autoplay loop muted playsinline width="60%">
    <source src="../../tutorial/assets/choose-files.webm" type="video/webm">
    <source src="../../tutorial/assets/choose-files.mp4" type="video/mp4">
  </video>

 <sub>Choose the files to annotate</sub></center>
<br/>

How to version a Label Studio project?

DagsHub lets you version your annotations with Git and commit the changes to a remote branch directly from the UI. When you click the commit button, DagsHub Annotations saves your work in open-source formats to a .labelstudio directory and provides the following options:

  • Save the annotations in one of the commonly used formats (JSON, COCO, CSV, TSV, etc.).
  • Commit the changes to the remote branch associated with the labeling project or to a new one.
  • Add a commit message.

![Commit Annotations](assets/commit-file.png)

<sub>Commit Annotations</sub>

What is the .labelstudio directory?

The .labelstudio directory is the source of truth for DagsHub Annotations. DagsHub Annotations saves the annotations of each task to a JSON file under the .labelstudio directory. When creating a new labeling project, DagsHub parses the selected branch for this directory and loads the existing annotations to their associated tasks, enabling you to switch between the labeling versions easily.

Note: The name of each JSON file is the SHA-1 hash of the task's path in the original project.
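To make the naming rule concrete, here is a minimal sketch of how the annotation file for a given task could be located. The function name `annotation_file_for`, the UTF-8 encoding of the path, the use of a repository-relative path as the hash input, and the `.json` extension are all assumptions made for illustration; DagsHub may normalize the path differently before hashing.

```python
import hashlib
from pathlib import PurePosixPath


def annotation_file_for(task_path: str, root: str = ".labelstudio") -> str:
    """Return the presumed annotation file path for a task (hypothetical helper)."""
    # Assumption: the hash input is the task path encoded as UTF-8.
    digest = hashlib.sha1(task_path.encode("utf-8")).hexdigest()
    # Assumption: the file sits directly under the root with a .json extension.
    return str(PurePosixPath(root) / f"{digest}.json")


if __name__ == "__main__":
    # Example task path; replace it with a path from your own repository.
    print(annotation_file_for("data/images/cat.png"))
```

If the printed name does not match any file in your .labelstudio directory, the path form used as the hash input (for example, with or without a leading slash) is the first thing to check.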

How to load labels from different projects?

When creating a new labeling project, DagsHub parses the selected branch for the .labelstudio directory, loads the annotations it holds and associates them with their tasks.

Note: Currently, you can load only annotations that were created by DagsHub Annotations.
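If you want to inspect the annotations on a branch outside the UI, the sketch below reads every JSON file from a local checkout's .labelstudio directory. It assumes the files sit directly under that directory and makes no assumption about the JSON schema, which is returned as parsed. The function name `load_annotations` is hypothetical and not part of any DagsHub API.

```python
import json
from pathlib import Path


def load_annotations(repo_root: str) -> dict:
    """Map each annotation file name (without extension) to its parsed JSON."""
    annotations_dir = Path(repo_root) / ".labelstudio"
    loaded = {}
    # A branch that has never been labeled has no .labelstudio directory.
    if not annotations_dir.is_dir():
        return loaded
    # Assumption: annotation files are stored flat, one JSON file per task.
    for json_file in sorted(annotations_dir.glob("*.json")):
        with json_file.open(encoding="utf-8") as fh:
            loaded[json_file.stem] = json.load(fh)
    return loaded


if __name__ == "__main__":
    annotations = load_annotations(".")
    print(f"Found {len(annotations)} annotation file(s)")
```

Running it after checking out different branches gives a quick view of how the labeling versions differ.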


To learn more about using Label Studio with DagsHub, follow the end-to-end DagsHub Annotations tutorial.
