Managing Datasets with Data Engine

DagsHub helps you manage large-scale datasets so you can focus on improving your models. To that end, we built Data Engine, a set of tools and APIs for querying, visualizing, and annotating your data, and for generating dataloaders to train, evaluate, and debug your models.

The following use cases walk you through Data Engine end to end: connecting and enriching your data, querying and visualizing it, annotating it, and finally training and improving your model.

Before using DagsHub Data Engine, make sure you have the DagsHub Client installed:

pip install dagshub

Next, import the datasources module:

from dagshub.data_engine import datasources
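If the import above fails, it can help to first confirm that the client is visible to the Python interpreter you are running. The snippet below is a minimal sketch that uses only the standard library; the helper name `is_installed` is hypothetical and is not part of the DagsHub client.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path.

    Hypothetical helper for illustration; it does not import the package,
    it only asks the import system whether the package can be located.
    """
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_installed("dagshub"):
        print("dagshub is installed")
    else:
        print("dagshub was not found; try: pip install dagshub")
```

Pass only a top-level package name such as `dagshub`: for a dotted name, `find_spec` imports the parent package first and raises an error if that parent is missing.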

Start using Data Engine by connecting your datasource