This tutorial covers the basics of tracking experiments with DAGsHub and the DAGsHub Logger.
You will learn about data exploration, tracking experiment parameters and metrics, comparing experiments, and more.
If you want to learn about using DVC and DAGsHub for data version control and pipeline management, we recommend you go to our Data Versioning Tutorial.
If you aren't familiar with all the tools mentioned, we recommend you start with this tutorial, and continue to the DVC tutorial afterwards.
The tools and features on DAGsHub are built in a modular way, meaning you can use whatever you like and add other features as you need them. This means that this tutorial is standalone and will already add value even if you don't use DVC.
Creating an awesome project using DAGsHub¶
In this tutorial, we'll create a model to predict whether a question on the Cross Validated Stack Exchange concerns Machine Learning or not.
This kind of prediction could be useful, for example, for recommending that a user add the
machine-learning tag to their question,
making it more likely that they will get an answer.
This task is chosen as simple and clean enough for a tutorial, but leaves room for experimenting with feature engineering, data enrichment, and model selection.
The tutorial is divided into several "levels", each of which demonstrates another workflow improvement. It's designed so that you learn something useful at each "level", even if the level after that is less to your liking and you choose to stop early.
The levels are:
- Data Exploration - Getting the data and trying to understand it, otherwise known as doing exploratory data analysis.
- Setup - Creating a DAGsHub account and project.
- Experimentation - Logging hyperparameters and metrics to DAGsHub to keep track of and compare different experiments.
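To give a feel for what the experimentation level involves: the DAGsHub Logger records an experiment by writing two plain files into the repo — `params.yml` for hyperparameters and `metrics.csv` for metric values — which DAGsHub can then read and compare across commits. Below is a minimal stdlib-only sketch of that idea; the function names and the exact column layout here are illustrative assumptions, not the library's actual API (the real logger is covered later in the tutorial).

```python
import csv
import time

# Hypothetical stand-in for the DAGsHub Logger. Assumption: hyperparameters
# go into a simple YAML-style key: value file, and metrics go into a CSV
# with one row per logged value.
def log_hyperparams(params, path="params.yml"):
    with open(path, "w") as f:
        for name, value in params.items():
            f.write(f"{name}: {value}\n")

def log_metrics(metrics, step=0, path="metrics.csv"):
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if f.tell() == 0:  # fresh file: write the header row first
            writer.writerow(["Name", "Value", "Timestamp", "Step"])
        timestamp = int(time.time() * 1000)
        for name, value in metrics.items():
            writer.writerow([name, value, timestamp, step])

# Example: record one experiment's settings and results
log_hyperparams({"learning_rate": 0.1, "max_depth": 5})
log_metrics({"accuracy": 0.87, "loss": 0.31}, step=1)
```

Because the output is just committed text files, every experiment stays tied to the exact code and data version that produced it — which is what makes comparing experiments on DAGsHub possible.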
Delicious statistics 😋 (source: Cross Validated)
Too slow for you?¶
Here is a link to the complete code repo. You can go over it or use the code as you wish.
The tutorial will guide you, step-by-step, to create this repo.