Real-time logging provides valuable information and visibility while running a data science experiment. It lets users monitor the progress of the training process and take action if necessary. To enable DAGsHub users to log their experiments in real time, DAGsHub provides an MLflow Tracking integration. This means that parameters and metrics are displayed while the process is running, and that more than one experiment can be monitored as it executes.
MLflow Tracking{target=_blank} is an open-source API for live logging of parameters, metrics, and metadata when running machine learning code. To make MLflow Tracking output accessible outside your local machine, you need to host it on a remote tracking server. To connect all data science project components in one place, we automatically connect an MLflow server to your DAGsHub repository and integrate it seamlessly with the Experiment Tab.
When you create a repository on DAGsHub, an MLflow server will be automatically created and connected to the repository. Your project's MLflow tracking server will be located at:
`https://dagshub.com/<DAGsHub-user-name>/<repository-name>.mlflow`
The server endpoint can also be found under the ‘Remote’ button:
When you define DAGsHub's MLflow server as the remote server, the output of the run will be added to the Experiment Tab.

!!! note
    Only a repository contributor can log experiments.
Start by installing the MLflow Python package{target=_blank} in your virtual environment using pip:
=== "Mac-os, Linux, Windows"
bash pip install mlflow
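One quick way to confirm the installation worked is to import the package and print its version; the exact version number will depend on what pip installed:

```python
# Sanity check: confirm mlflow is importable and see which version was installed.
import mlflow

print(mlflow.__version__)
```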
Then, import MLflow into your Python module using `import mlflow`.
You can set the MLflow server URI by adding the following line to your code:
```python
mlflow.set_tracking_uri("https://dagshub.com/<DAGsHub-user-name>/<repository-name>.mlflow")
```
??? info "Set the MLflow server URI using an environment variable"
    You can also define your MLflow server URI using the `MLFLOW_TRACKING_URI` environment variable.

    **We don't recommend this approach**, since you might forget to reset the environment variable when
    switching between different projects. This might result in logging experiments to the wrong repository.

    If you still prefer using the environment variable, we recommend setting it only for the current
    command, like the following:

    === "macOS, Linux, Windows"

        ```bash
        MLFLOW_TRACKING_URI=https://dagshub.com/<username>/<repo>.mlflow python <file-name>.py
        ```
The DAGsHub MLflow server has built-in access controls. Only a repository contributor can log experiments (someone who can `git push` to the repository).
To use basic authentication with MLflow, you need to set the following environment variables:

- `MLFLOW_TRACKING_USERNAME` - Your DAGsHub username
- `MLFLOW_TRACKING_PASSWORD` - Your DAGsHub password or, preferably, an access token

You can set these by typing in your terminal:
=== "Mac-os, Linux, Windows"
bash export MLFLOW_TRACKING_USERNAME=<username/token> export MLFLOW_TRACKING_PASSWORD=<password>
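If you prefer to keep everything in Python (for example, in a notebook), the same variables can be set from code before the first logging call. This is only a sketch; the placeholder values are assumptions you should replace, and real tokens shouldn't be committed to the repository:

```python
import os

# Placeholder credentials - replace with your DAGsHub username and access token.
# Avoid hard-coding real tokens in files you commit to the repository.
os.environ["MLFLOW_TRACKING_USERNAME"] = "<username/token>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<password>"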
Congratulations, you are ready to start logging experiments. Now, when you run your code, you will see new runs appear in the experiment tables, with their status and origin:
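To illustrate, here is a minimal, hypothetical script that logs one parameter and one metric to the DAGsHub server. The parameter and metric names are made up for the example, and it assumes the authentication variables above are already set:

```python
import mlflow

# Point MLflow at your repository's tracking server (replace the placeholders).
mlflow.set_tracking_uri("https://dagshub.com/<DAGsHub-user-name>/<repository-name>.mlflow")

with mlflow.start_run():
    # Hypothetical parameter and metric, just to demonstrate the logging calls.
    mlflow.log_param("learning_rate", 0.01)
    for epoch in range(3):
        mlflow.log_metric("loss", 1.0 / (epoch + 1), step=epoch)
```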
This document does not cover the usage of MLflow Tracking, but a tutorial will soon be available. In the meantime, refer to the official MLflow docs{target=_blank}. If you have any further questions about this feature or anything else on DAGsHub, please visit our Discord channel{target=_blank}.
DAGsHub currently doesn't support artifacts, but we might soon. Please contact us on our Discord channel{target=_blank} if this is important to you.