DagsHub Documentation

Everything you need to know to use DagsHub like a pro.

DagsHub: A Single AI Platform to Manage Multimodal AI Datasets & Models

What is DagsHub?

DagsHub is an AI platform that helps developers and teams manage the entire AI project lifecycle: data collection, dataset curation and annotation, experiment tracking (for both model training and prompt engineering), and model management. DagsHub is built on open source tools and formats (such as Git, DVC, MLflow, and Label Studio), so it should quickly feel familiar.

If you're not sure where to start, check out the Quick Start section, which takes you step by step through the flow below with short video tutorials.

DagsHub Workflow Diagram

DagsHub is designed particularly for unstructured and multimodal data, such as text, images, audio, video, documents, medical imaging, and binary files.

The diagram above shows a simplified ML/AI workflow and where DagsHub fits in. DagsHub integrates with your storage and compute providers. You can host your data on the platform with DagsHub Storage, but we don't currently provide compute resources. Instead, DagsHub relies on the compute infrastructure you already have, whether you run training and deployments locally, in the cloud, or on edge devices.

Not Sure Where to Start?

DagsHub offers many ways to improve your machine learning workflow. If you know what you're doing, feel free to explore. If not, here are two recommended options:

  •  Getting Started

    Check out our step-by-step guides and video tutorials that will take you through the flow above.

  •  Hello World Colab

    A step-by-step computer vision tutorial, zero local setup needed. For other data types, see our tutorials section.

Key Use Cases

DagsHub supports many workflows. Here are some of the most common things users do with the platform:

DagsHub Overview Video