Tutorial Overview

This tutorial covers the basics of tracking experiments with DAGsHub and the DAGsHub Logger.

You will learn about data exploration, tracking experiment parameters and metrics, comparing experiments, and more.

If you want to learn about using DVC and DAGsHub for data version control and pipeline management, we recommend our Data Versioning Tutorial.
If you aren't familiar with all the tools mentioned, we recommend starting with this tutorial and continuing to the DVC tutorial afterwards.

The tools and features on DAGsHub are built in a modular way, meaning you can use whichever ones you like and add others as you need them. As a result, this tutorial stands on its own and is useful even if you don't use DVC.

Creating an awesome project using DAGsHub

In this tutorial, we'll create a model to predict whether a question on the Cross Validated Stack Exchange concerns Machine Learning or not.

This kind of prediction could be used, for example, to suggest that a user add the machine-learning tag to their question, making it more likely that they will get an answer.

We chose this task because it is simple and clean enough for a tutorial, yet leaves room for experimenting with feature engineering, data enrichment, and model selection.
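
To make the prediction target concrete, here is a minimal sketch of how a binary label could be derived from a question's tags. The `Question` type, the `is_machine_learning` helper, and the sample questions are hypothetical and made up for illustration; the tutorial's actual dataset may store tags and labels differently.

```python
from dataclasses import dataclass, field
from typing import List

ML_TAG = "machine-learning"  # the tag named in the text above


@dataclass
class Question:
    """Hypothetical, simplified view of a Cross Validated question."""
    title: str
    tags: List[str] = field(default_factory=list)


def is_machine_learning(question: Question) -> bool:
    """Return True if the question carries the machine-learning tag."""
    return ML_TAG in question.tags


# Made-up examples, not taken from the real dataset.
questions = [
    Question("How do I tune a random forest?", ["machine-learning", "random-forest"]),
    Question("Interpreting a p-value", ["hypothesis-testing"]),
]

# The target a model would learn to predict from the question's text.
labels = [is_machine_learning(q) for q in questions]
```

In a real setting the model would only see the question's title and body, and the tag-derived label would serve as the ground truth during training and evaluation.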

The tutorial is divided into several "levels", each of which demonstrates another workflow improvement. It's designed so that you learn something useful at each "level", even if the level after that is less to your liking and you choose to stop early.

The levels are:

  1. Data Exploration - Getting the data and trying to understand it, otherwise known as doing exploratory data analysis.
  2. Setup - Creating a DAGsHub account and project.
  3. Experimentation - Logging hyperparameters and metrics to DAGsHub to keep track of and compare different experiments.

Screenshot: Delicious statistics 😋 (source: Cross Validated)
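
The experimentation level boils down to recording, for every run, which hyperparameters were used and which metrics came out, so that runs can be compared later. The following is a rough standard-library sketch of that idea only. It is not the DAGsHub Logger API, and the `log_experiment` and `best_run` helpers, the file layout, and the numbers are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def log_experiment(run_dir: Path, params: dict, metrics: dict) -> None:
    """Write one run's hyperparameters and metrics as JSON files."""
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "params.json").write_text(json.dumps(params, indent=2))
    (run_dir / "metrics.json").write_text(json.dumps(metrics, indent=2))


def best_run(root: Path, metric: str) -> str:
    """Return the name of the run directory with the highest value of `metric`."""
    scores = {}
    for run_dir in root.iterdir():
        metrics_file = run_dir / "metrics.json"
        if metrics_file.is_file():
            scores[run_dir.name] = json.loads(metrics_file.read_text())[metric]
    return max(scores, key=scores.get)


# Illustrative values only; these are not results from the tutorial.
root = Path(tempfile.mkdtemp())
log_experiment(root / "run_1", {"max_depth": 3}, {"accuracy": 0.81})
log_experiment(root / "run_2", {"max_depth": 6}, {"accuracy": 0.84})

winner = best_run(root, "accuracy")
```

Keeping parameters and metrics in plain, machine-readable files is what makes side-by-side comparison possible; a tracking tool automates this bookkeeping and adds a UI on top.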

Too slow for you?

Here is a link to the complete code repo. You can go over it or use the code as you wish.

The tutorial will guide you, step-by-step, to create this repo.