title: Manage Datasets with DagsHub Data Engine
description: Documentation on using Data Engine to create training-ready datasets that you can enrich, query, visualize, annotate and train with.

Curating and Managing Datasets

DagsHub helps you manage large-scale datasets so you can focus on improving your models. To do this, we built Data Engine, which includes tools and APIs for querying, visualizing, and annotating your data, and for generating dataloaders to train, evaluate, and debug your models.

The following use cases guide you through using Data Engine end to end: from connecting and enriching your data, through querying, visualizing, and annotating it, to training and improving your model.

Before using DagsHub Data Engine, make sure you have the DagsHub Client installed:

pip install dagshub

Next, import the datasources module:

from dagshub.data_engine import datasources
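If the import above fails, the usual cause is that the client is not installed in the active environment. The following is a minimal sketch for checking this up front using only the Python standard library. The helper name `module_available` is hypothetical and is not part of the DagsHub client.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if the named module can be located in this environment.

    Hypothetical helper for illustration; not part of the dagshub package.
    """
    # Check the top-level package first, because find_spec on a dotted
    # name imports the parent package and fails if the parent is missing.
    top_level = dotted_name.split(".")[0]
    if importlib.util.find_spec(top_level) is None:
        return False
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when an intermediate parent cannot be imported as a package.
        return False


if __name__ == "__main__":
    if module_available("dagshub.data_engine"):
        print("dagshub.data_engine is importable")
    else:
        print("dagshub.data_engine not found; try: pip install dagshub")
```

Running the check in the same interpreter that will run your training code helps catch cases where the client was installed into a different virtual environment.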

Start using Data Engine by connecting your datasource

  • Connect Datasource

    Connect the data you want to work with

  • Enrich Data

    Add custom metadata, predictions and labels to your data

  • Query Data

    Query your data and generate new subsets to re-train your model

  • Visualize your data

    Visualize data points and their enrichments

  • Annotate your data

    Annotate relevant data points

  • Train a model

    Train and improve your model with new datasets
