Real-time logging provides valuable information and visibility while running a data science experiment. It lets users monitor the progress of the training process and take action if necessary. To enable DAGsHub users to log their experiments in real time, DAGsHub provides an MLflow Tracking integration. This means that parameters and metrics are displayed while the process is running, and that more than one experiment can be monitored as it executes.
MLflow Tracking{target=_blank} is an open-source API for live logging of parameters, metrics, and metadata when running machine learning code. To make MLflow Tracking output accessible outside your local machine, you need to host it on a remote tracking server. To connect all data science project components in one place, we automatically connect an MLflow server to your DAGsHub repository and integrate it seamlessly with the Experiment Tab.
When you create a repository on DAGsHub, an MLflow server will be automatically created and connected to the repository. Your project's MLflow tracking server will be located at:
`https://dagshub.com/<DAGsHub-user-name>/<repository-name>.mlflow`
The server endpoint can also be found under the ‘Remote’ button:
When you define DAGsHub's MLflow server as the remote server, the output of the run will be added to the Experiment Tab.

!!! note
    Only a repository contributor can log experiments.
Start by installing the MLflow Python package{target=_blank} in your virtual environment using pip:
=== "Mac-os, Linux, Windows"
bash pip install mlflow
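One quick way to confirm the installation worked is to import the package and print its version; the exact version number will depend on what pip installed:

```python
# Sanity check: confirm mlflow is importable and see which version was installed.
import mlflow

print(mlflow.__version__)
```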
Then, import MLflow into your Python module using `import mlflow`.
You can set the MLflow server URI by adding the following line to your code:
```python
mlflow.set_tracking_uri("https://dagshub.com/<DAGsHub-user-name>/<repository-name>.mlflow")
```
??? info "Set the MLflow server URI using an environment variable"
    You can also define your MLflow server URI using the `MLFLOW_TRACKING_URI` environment variable.

    **We don't recommend this approach**, since you might forget to reset the environment variable when
    switching between different projects. This might result in logging experiments to the wrong repository.

    If you still prefer using the environment variable, we recommend setting it only for the current
    command, like the following:

    === "macOS, Linux, Windows"

        ```bash
        MLFLOW_TRACKING_URI=https://dagshub.com/<username>/<repo>.mlflow python <file-name>.py
        ```
The DAGsHub MLflow server has built-in access controls. Only a repository contributor can log experiments (someone who can `git push` to the repository).
To use basic authentication with MLflow, you need to set the following environment variables:

- `MLFLOW_TRACKING_USERNAME` - Your DAGsHub username
- `MLFLOW_TRACKING_PASSWORD` - Your DAGsHub password or, preferably, an access token

You can set these by typing in your terminal:
=== "Mac-os, Linux, Windows"
bash export MLFLOW_TRACKING_USERNAME=<username/token> export MLFLOW_TRACKING_PASSWORD=<password>
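If you prefer to keep everything in Python (for example, in a notebook), the same variables can be set from code before the first logging call. This is only a sketch; the placeholder values are assumptions you should replace, and real tokens shouldn't be committed to the repository:

```python
import os

# Placeholder credentials - replace with your DAGsHub username and access token.
# Avoid hard-coding real tokens in files you commit to the repository.
os.environ["MLFLOW_TRACKING_USERNAME"] = "<username/token>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<password>"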
Congratulations, you are ready to start logging experiments. Now, when you run your code, you will see new runs appear in the experiment tables, with their status and origin:
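To illustrate, here is a minimal, hypothetical script that logs one parameter and one metric to the DAGsHub server. The parameter and metric names are made up for the example, and it assumes the authentication variables above are already set:

```python
import mlflow

# Point MLflow at your repository's tracking server (replace the placeholders).
mlflow.set_tracking_uri("https://dagshub.com/<DAGsHub-user-name>/<repository-name>.mlflow")

with mlflow.start_run():
    # Hypothetical parameter and metric, just to demonstrate the logging calls.
    mlflow.log_param("learning_rate", 0.01)
    for epoch in range(3):
        mlflow.log_metric("loss", 1.0 / (epoch + 1), step=epoch)
```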
This document does not cover the usage of MLflow Tracking, but a tutorial will soon be available. In the meantime, refer to the official MLflow docs{target=_blank}. If you have any further questions about this feature or anything else on DAGsHub, please visit our Discord channel{target=_blank}.
DAGsHub currently doesn't support artifacts, but we might soon. Please contact us on our Discord channel{target=_blank} if this is important to you.