Log and track PyCaret experiments with DagsHub

Log and track PyCaret experiments to DagsHub with minimal code changes, enabling collaboration, reproducibility, data-driven decisions, and more.

PyCaret is an open-source, low-code machine learning library in Python that simplifies the process of training and deploying machine learning models. It offers a wide range of functions and features that make it easy to go from preparing your data to deploying your model within seconds.

With DagsHub, you can log the experiments you run with PyCaret to a remote server with minimal changes to your code.

This includes versioning raw and processed data with DVC, as well as logging experiment metrics, parameters, and trained models with MLflow. This integration enables you to continue using the familiar MLflow interface, while also facilitating collaboration with others, comparing results from different runs, and making data-driven decisions with ease.


How does PyCaret work with DagsHub?

When you set DagsHub as the experiment logger, PyCaret authenticates your DagsHub user and uses MLflow and the DagsHub client to log the experiment's information to your DagsHub repository. Built-in PyCaret callbacks log the metrics and parameters of every run using MLflow, and the artifacts (the data and the trained model) using either MLflow or DVC. You can find the source code of the logger in the PyCaret repository.
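To make that division of labor concrete, here is a rough sketch of the kind of MLflow calls involved in logging one run's parameters and metrics to a remote tracking server. It is not the logger's actual source: `log_run` and its arguments are hypothetical names chosen for illustration, and only the `mlflow` functions it calls (`set_tracking_uri`, `start_run`, `log_params`, `log_metrics`) are real API.

```python
def log_run(tracking_uri, params, metrics, run_name=None):
    """Log one run's parameters and metrics to an MLflow tracking server.

    Hypothetical helper for illustration only; the DagsHub logger in
    PyCaret performs the equivalent steps for you.
    """
    # Imported lazily so this module can be loaded without mlflow installed.
    import mlflow

    # Point MLflow at the remote server (for DagsHub, the repository's MLflow remote).
    mlflow.set_tracking_uri(tracking_uri)

    # Everything logged inside the context manager belongs to a single run.
    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(params)    # e.g. {"model": "lr"}
        mlflow.log_metrics(metrics)  # e.g. {"Accuracy": 0.8}
```

A call would look like `log_run("<enter-your-MLflow-remote-DagsHub>", {"model": "lr"}, {"Accuracy": 0.8})`, where the parameter and metric values are placeholders rather than real results.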

How to log PyCaret Experiments on DagsHub?

Configurations

  • We will start by installing PyCaret, DagsHub, and MLflow by running the following command from the CLI:

    === "Mac, Linux, Windows"
        ```bash
        pip install pycaret dagshub mlflow
        ```

  • Configure DagsHub [optional] - To avoid being prompted to authenticate with DagsHub's servers on every run, you can use one of the following options:

    1. Log in using the dagshub client and set the MLflow tracking URI.

       === "Mac, Linux, Windows"
           ```bash
           dagshub login
           # On Windows (cmd), use `set` instead of `export`.
           export MLFLOW_TRACKING_URI="<enter-your-MLflow-remote-DagsHub>"
           ```
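If you prefer to set the tracking URI from Python instead of the shell, a minimal sketch could look like the following. The helper names are hypothetical, and the URI pattern `https://dagshub.com/<repo_owner>/<repo_name>.mlflow` is an assumption here; copy the exact MLflow remote from your repository page if it differs.

```python
import os


def dagshub_mlflow_uri(repo_owner, repo_name):
    """Build the MLflow tracking URI for a DagsHub repository.

    Assumes the pattern https://dagshub.com/<owner>/<repo>.mlflow.
    """
    if not repo_owner or not repo_name:
        raise ValueError("both repo_owner and repo_name are required")
    return f"https://dagshub.com/{repo_owner}/{repo_name}.mlflow"


def configure_tracking(repo_owner, repo_name, env=os.environ):
    """Store the tracking URI in MLFLOW_TRACKING_URI and return it.

    `env` defaults to the current process environment; pass a plain
    dict to try the helper without touching real settings.
    """
    uri = dagshub_mlflow_uri(repo_owner, repo_name)
    env["MLFLOW_TRACKING_URI"] = uri
    return uri
```

Calling `configure_tracking("<repo_owner>", "<repo_name>")` before `setup(...)` affects only the current Python process, unlike `export`, which applies to the whole shell session.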

Run an Experiment

  • Choose any one of PyCaret's many Machine Learning models and set DagsHub as the logger during initialization.

    === "Mac, Linux, Windows"
        ```python
        from pycaret.classification import *
        s = setup(..... , log_experiment="dagshub" , ....)
        ```

!!! Note "Authentication"

  If the DagsHub Logger is not already authenticated on your local machine, the terminal will prompt you to enter the `repo_owner/repo_name` and provide an authentication link. The repository and remote MLflow server will then be automatically initialized in the background.

Congratulations, you’re all set to track your PyCaret experiments using DagsHub!

PyCaret automatically detects that the integration is enabled and adds the DagsHub hook to your pipeline. Now, when you run your code, new runs appear in the experiment table, with their status and origin.
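Putting the steps together, an end-to-end run might look like the sketch below. The dataset (`juice` from `pycaret.datasets`), its `Purchase` target, the experiment name, and the helper names are all choices made for this illustration and do not come from the integration itself; only `log_experiment="dagshub"` is the setting described above.

```python
def dagshub_setup_kwargs(target, experiment_name, session_id=123):
    """Collect the setup() arguments used in this example (hypothetical helper)."""
    return {
        "target": target,
        "log_experiment": "dagshub",         # route experiment logging to DagsHub
        "experiment_name": experiment_name,  # name shown for the experiment
        "session_id": session_id,            # fixed seed for reproducibility
    }


def run_experiment():
    """Train and compare classifiers on a sample dataset, logging to DagsHub."""
    # Imported here so the helper above works without PyCaret installed.
    from pycaret.classification import compare_models, setup
    from pycaret.datasets import get_data

    data = get_data("juice")  # sample dataset bundled with PyCaret
    setup(data=data, **dagshub_setup_kwargs("Purchase", "juice-demo"))
    return compare_models()   # each compared model is logged as a run
```

Calling `run_experiment()` triggers the authentication prompt described in the note above if the machine is not yet logged in.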

Additional Resources

  • DagsHub x PyCaret - a full tutorial that showcases how to use DagsHub with PyCaret.
  • Example notebook - create your own transformer model and track your experiments.

Known Issues, Limitations & Restrictions

If you do not set the MLFLOW_TRACKING_URI environment variable, you will be prompted to enter the repo_owner/repo_name every time you run your experiment.

The dagshub.init function, a recent addition to the dagshub client that configures your repository for MLflow, does not set this variable, so using it will still trigger the prompt.
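In non-interactive environments such as CI jobs, where nobody can answer that prompt, one option is to fail fast when `MLFLOW_TRACKING_URI` is missing. The guard below is a hypothetical helper, not part of PyCaret or the dagshub client:

```python
import os


def require_tracking_uri(env=os.environ):
    """Return the configured MLflow tracking URI or raise if it is missing.

    Intended to be called before setup(..., log_experiment="dagshub") so a
    job stops with a clear error instead of waiting on an interactive prompt.
    """
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    if not uri:
        raise RuntimeError(
            "MLFLOW_TRACKING_URI is not set; export it before running the experiment"
        )
    return uri
```

Call `require_tracking_uri()` at the top of your training script, before `setup(...)`.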
