DagsHub Integrates with Colab: Build And Train ML Models With ZERO MLOps

Google Colab May 15, 2023

DagsHub users can now open notebooks in a Colab environment directly from DagsHub (free GPU included) and also version and commit them back using Git or DVC. Seamlessly build, train, and collaborate on ML models with ZERO MLOps friction.

Check out the example notebook that also uses DagsHub Client for versioning

At DagsHub, we are committed to simplifying the machine learning development cycle by doing the MLOps heavy lifting for you. Today, we are thrilled to announce a big milestone in that journey: DagsHub joins GitHub and Hugging Face in the very small and exclusive group of tools integrated with Google Colab.

Let's explore the exciting new capabilities this integration brings to the table.

What is Google Colab?

Google Colab is a cloud-based platform that offers a free and interactive computing environment for data science and machine learning tasks. It provides users with a Jupyter Notebook-like interface, allowing them to write and execute code, perform data analysis, and create visualizations using popular libraries like TensorFlow and PyTorch.

One of the key advantages of Colab is its cloud infrastructure, which grants users access to powerful GPUs and TPUs for accelerated computations. Additionally, Colab facilitates collaboration by enabling real-time editing, commenting, and sharing of notebooks, making it ideal for team-based projects and knowledge sharing.

DagsHub and Google Colab - closing the ML training lifecycle

With the new integration with Google Colab, DagsHub users can now open notebooks in a Colab environment directly from DagsHub and also version and commit them back using Git or DVC.

These new capabilities close the ML training lifecycle: you can now build a fully reproducible pipeline on DagsHub, train your model in Colab, and commit every project component back to DagsHub, including the data, the trained model, the experiment, and now the Colab notebook itself.

Version control Colab notebooks using Git or DVC on DagsHub

Integrating DagsHub and Colab significantly improves notebook version control, as users can use DVC to version large notebooks that Git handles poorly.

This means that if you're building a very long and complex notebook or using heavy plots, you no longer need to version it "stone-age" style (v1.ipynb, v_final.ipynb, v_final_final.ipynb, ...) on your Google Drive or S3 bucket. Instead, you can use an industry-standard, Git-based workflow.
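
As a rough illustration of the "large notebooks go to DVC" idea, here is a minimal sketch that picks a versioning backend from the notebook's file size. The helper name choose_versioning and the 10 MB cutoff are assumptions made for this example only; they are not a DagsHub, Git, or DVC limit, so pick whatever threshold suits your repository.

```python
import os
import tempfile

# Assumed cutoff for illustration only; not an official Git or DagsHub limit.
SIZE_THRESHOLD_BYTES = 10 * 1024 * 1024


def choose_versioning(notebook_path, threshold=SIZE_THRESHOLD_BYTES):
    """Return 'dvc' for notebooks larger than the threshold, otherwise 'git'."""
    size = os.path.getsize(notebook_path)
    return "dvc" if size > threshold else "git"


# Demo with a tiny stand-in notebook file.
with tempfile.NamedTemporaryFile(suffix=".ipynb", delete=False) as tmp:
    tmp.write(b'{"cells": []}')
    demo_path = tmp.name

small_choice = choose_versioning(demo_path)                # well under 10 MB
forced_choice = choose_versioning(demo_path, threshold=1)  # artificially low cutoff
os.remove(demo_path)

print(small_choice, forced_choice)
```

The returned string matches the two values the versioning argument accepts ('git' or 'dvc'), so it can be passed straight through when saving the notebook.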

Not only that, DagsHub supports notebook diffing and commenting on notebook cells versioned by both Git and DVC. This unlocks collaboration capabilities for ML teams, without moving to third-party platforms or sharing screenshots on Slack or Discord. DagsHub even supports interactive notebook output visualizations, which means that if you’re using popular tools like Pandas Profiling or SweetViz you’ll be able to interact with the HTML outputs after committing them.

How to version Colab notebooks on DagsHub with Git or DVC?

You can version Jupyter and Colab notebooks with DagsHub Client using either Git or DVC. To do that, we’ll use the save_notebook function, which requires one argument:

  • repo (str): repository in the format of user/repo

from dagshub.notebook import save_notebook

save_notebook(repo="nirbarazida/chexnet")

It can also take the following optional arguments:

  • path (str): Where to save the notebook within the repository (including the filename). If the filename is not specified, we'll save it as "notebook-{datetime.now}.ipynb" under the specified folder
  • branch (str): The branch under which the notebook should be saved. Will commit to the default repo branch if not specified
  • commit_message (str): The commit message for the update
  • versioning (str): ['git'|'dvc'] The VCS used to version the notebook
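
To show how the optional arguments fit together, here is a minimal sketch that assembles them before calling save_notebook. The keyword names (repo, path, branch, commit_message, versioning) come from the list above; the helper names notebook_save_args and save_from_colab, the "notebooks" folder, the file name, and the commit message are hypothetical choices made for this example.

```python
def notebook_save_args(repo, filename, folder="notebooks", branch=None, versioning="git"):
    """Build keyword arguments for save_notebook (hypothetical helper)."""
    if versioning not in ("git", "dvc"):
        raise ValueError("versioning must be 'git' or 'dvc'")
    owner, sep, name = repo.partition("/")
    if not (owner and sep and name) or "/" in name:
        raise ValueError("repo must be in the format user/repo")

    args = {
        "repo": repo,
        # path includes the filename, as described in the argument list
        "path": f"{folder}/{filename}",
        "commit_message": f"Update {filename} from Colab",
        "versioning": versioning,
    }
    # Leave branch out so the default repo branch is used when none is given.
    if branch is not None:
        args["branch"] = branch
    return args


def save_from_colab(repo, filename, **options):
    """Save the current notebook; needs the DagsHub client and a notebook session."""
    # Imported lazily so the helper above works without the client installed.
    from dagshub.notebook import save_notebook

    save_notebook(**notebook_save_args(repo, filename, **options))


args = notebook_save_args("nirbarazida/chexnet", "training.ipynb", versioning="dvc")
print(args)
```

Inside a Colab session you would call save_from_colab("nirbarazida/chexnet", "training.ipynb", versioning="dvc") instead of only building the arguments.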



Nir Barazida

MLOps Team Lead @ DagsHub
