|
@@ -4,12 +4,12 @@ description: Free remote MLflow server with team-based access control. Log exper
|
|
---
|
|
---
|
|
# MLflow Tracking
|
|
# MLflow Tracking
|
|
|
|
|
|
-[MLflow](https://mlflow.org/){target=_blank} is an open-source tool to manage the machine learning lifecycle. It supports
|
|
|
|
-live logging of parameters, metrics, metadata, and artifacts when running a machine learning experiment. To manage the
|
|
|
|
-post training stage, it provides a model registry with deployment functionality to custom serving tools.
|
|
|
|
|
|
+[MLflow](https://mlflow.org/){target=_blank} is an open-source tool to manage the machine learning lifecycle. It supports
|
|
|
|
+live logging of parameters, metrics, metadata, and artifacts when running a machine learning experiment. To manage the
|
|
|
|
+post training stage, it provides a model registry with deployment functionality to custom serving tools.
|
|
|
|
|
|
-DagsHub provides a free hosted MLflow server with team-based access control for every repository. You can log experiments with MLflow to it, view its information
|
|
|
|
-under the [experiment tab](../feature_guide/experiment_tracking.md), and manage your trained models from the full-fledged
|
|
|
|
|
|
+DagsHub provides a free hosted MLflow server with team-based access control for every repository. You can log experiments with MLflow to it, view its information
|
|
|
|
+under the [experiment tab](../feature_guide/experiment_tracking.md), and manage your trained models from the full-fledged
|
|
MLflow UI built into your DagsHub project.
|
|
MLflow UI built into your DagsHub project.
|
|
|
|
|
|
<style>
|
|
<style>
|
|
@@ -61,7 +61,7 @@ The server endpoint can also be found under the ‘Remote’ button:
|
|
- Only a repository contributor can log experiments and access the DagsHub MLflow UI.
|
|
- Only a repository contributor can log experiments and access the DagsHub MLflow UI.
|
|
|
|
|
|
|
|
|
|
-## How to set DagsHub as the remote MLflow server?
|
|
|
|
|
|
+## How to set DagsHub as the remote MLflow server?
|
|
|
|
|
|
### 1. Install and import MLflow
|
|
### 1. Install and import MLflow
|
|
|
|
|
|
@@ -78,7 +78,7 @@ your virtual environment using pip:
|
|
[MLflow logging functions](https://www.mlflow.org/docs/latest/tracking.html#logging-functions){target=_blank}.
|
|
[MLflow logging functions](https://www.mlflow.org/docs/latest/tracking.html#logging-functions){target=_blank}.
|
|
.
|
|
.
|
|
|
|
|
|
-### 2. Set DagsHub as the remote URI
|
|
|
|
|
|
+### 2. Set DagsHub as the remote URI
|
|
|
|
|
|
You can set the MLflow server URI by adding the following line to our code:
|
|
You can set the MLflow server URI by adding the following line to our code:
|
|
|
|
|
|
@@ -118,7 +118,7 @@ You can set these by typing in the terminal:
|
|
export MLFLOW_TRACKING_USERNAME=<username>
|
|
export MLFLOW_TRACKING_USERNAME=<username>
|
|
export MLFLOW_TRACKING_PASSWORD=<password/token>
|
|
export MLFLOW_TRACKING_PASSWORD=<password/token>
|
|
```
|
|
```
|
|
-
|
|
|
|
|
|
+
|
|
You can also use your token as username; in this case the password is not needed:
|
|
You can also use your token as username; in this case the password is not needed:
|
|
=== "Mac, Linux, Windows"
|
|
=== "Mac, Linux, Windows"
|
|
```bash
|
|
```bash
|
|
@@ -149,7 +149,7 @@ but uploading and downloading was done using the client's local credentials and
|
|
packages (i.e `boto3` or `google-cloud-storage`).
|
|
packages (i.e `boto3` or `google-cloud-storage`).
|
|
Support for proxying upload and download requests through the tracking server was added in MLflow 1.24.0.
|
|
Support for proxying upload and download requests through the tracking server was added in MLflow 1.24.0.
|
|
|
|
|
|
-DagsHub lets you leverage this capability by directly hosting your artifacts by default.
|
|
|
|
|
|
+DagsHub lets you leverage this capability by directly hosting your artifacts by default.
|
|
For every newly created repository or MLflow experiment,
|
|
For every newly created repository or MLflow experiment,
|
|
DagsHub will generate a dedicated artifact location similar to `mlflow-artifacts:/<UUID>`.
|
|
DagsHub will generate a dedicated artifact location similar to `mlflow-artifacts:/<UUID>`.
|
|
|
|
|
|
@@ -273,6 +273,123 @@ We shared two examples of experiment logging to DagsHub’s MLflow server in a C
|
|
- [Using MLflow with Tensorflow](https://colab.research.google.com/drive/1TrN7YEgiIzt7EelvshJPx2n4j-Qa6LBf?usp=sharing){target=_blank}
|
|
- [Using MLflow with Tensorflow](https://colab.research.google.com/drive/1TrN7YEgiIzt7EelvshJPx2n4j-Qa6LBf?usp=sharing){target=_blank}
|
|
- [Using MLflow with fast.ai](https://colab.research.google.com/drive/1DhHzI5blVbniFwx98EKXYSi0z_Icm07t?usp=sharing){target=_blank}
|
|
- [Using MLflow with fast.ai](https://colab.research.google.com/drive/1DhHzI5blVbniFwx98EKXYSi0z_Icm07t?usp=sharing){target=_blank}
|
|
|
|
|
|
|
|
+## How to import local MLflow objects into the DagsHub MLflow remote?
|
|
|
|
+
|
|
|
|
+Generally, you can use [`mlflow-export-import`](https://github.com/mlflow/mlflow-export-import) to export MLflow experiments, runs and models from one server to another.
|
|
|
|
+
|
|
|
|
+The following example shows how to bulk-export all objects created locally and then bulk-import them into the DagsHub remote tracking server.
|
|
|
|
+
|
|
|
|
+### 1. Install `mlflow-export-import`
|
|
|
|
+
|
|
|
|
+- In the same environment where you originally installed `mlflow`, install `mlflow-export-import` with:
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ pip install mlflow-export-import
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+- If you need the latest version from the GitHub source, run this instead:
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ pip install git+https://github.com/mlflow/mlflow-export-import
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+### 2. Export all local objects
|
|
|
|
+
|
|
|
|
+- In one terminal, start the local `mlflow` server, for example with:
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ mlflow server --host 0.0.0.0 --port 8888
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+- In **another** terminal, in the same virtual environment, export all objects to a folder called `mlflow-export` with:
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ # note: the port must match the one the server was started with in the other terminal
|
|
|
|
+ MLFLOW_TRACKING_URI=http://localhost:8888 \
|
|
|
|
+ export-all --output-dir mlflow-export
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+- If the export succeeds, you should see a report saying so, for example:
|
|
|
|
+
|
|
|
|
+ ```text
|
|
|
|
+ 3 experiments exported
|
|
|
|
+ 37/37 runs succesfully exported
|
|
|
|
+ Duration for experiments export: 10.6 seconds
|
|
|
|
+ Duration for entire tracking server export: 10.8 seconds
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+- At this point, you can stop the local server in the first terminal.
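If `export-all` in Step 2 fails to connect, it can help to confirm that the local server is reachable first. The snippet below is a minimal sketch, not part of `mlflow-export-import`: it assumes the MLflow server exposes a `/health` endpoint (recent MLflow versions do) and that the server runs on port `8888` as in the example above. The helper name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    # Assumption: the MLflow tracking server serves a /health route.
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    # The port must match the one passed to `mlflow server`.
    print(server_is_up("http://localhost:8888"))
```

A `False` result usually means the server is not running or the port differs from the one in `MLFLOW_TRACKING_URI`.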
|
|
|
|
+
|
|
|
|
+### 3. Import to DagsHub server
|
|
|
|
+
|
|
|
|
+- Find your DagsHub repository's MLflow remote variables, for example by clicking the `Remote` button in your repository and then the `Experiments` tab.
|
|
|
|
+- Run the following in a terminal:
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ MLFLOW_TRACKING_URI=https://dagshub.com/<USER>/<REPO>.mlflow \
|
|
|
|
+ MLFLOW_TRACKING_USERNAME=<USER> \
|
|
|
|
+ MLFLOW_TRACKING_PASSWORD=<PASSWORD_OR_TOKEN> \
|
|
|
|
+ import-all --input-dir mlflow-export
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+- If the import succeeds, launch the local server again and open `https://dagshub.com/<USER>/<REPO>.mlflow` to compare the two. Check for discrepancies in the logged data, artifacts, models, runs, and experiments, for example anything that is missing.
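Part of this comparison can be scripted. The following is a rough sketch rather than an official tool: it assumes MLflow 2.x (for `MlflowClient.search_experiments`), that experiment names are preserved by the import, that the local server runs on port `8888`, and that `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` are set in the environment for the DagsHub remote. The function names are hypothetical, and only run counts are compared, not metrics or artifacts.

```python
from typing import Dict, List


def count_runs_per_experiment(tracking_uri: str) -> Dict[str, int]:
    """Return {experiment name: number of active runs} for one tracking server."""
    # Imported lazily so diff_counts() can be used without mlflow installed.
    from mlflow.tracking import MlflowClient

    client = MlflowClient(tracking_uri=tracking_uri)
    counts: Dict[str, int] = {}
    # Note: search_experiments() returns only the first page of experiments.
    for exp in client.search_experiments():
        total, token = 0, None
        while True:
            page = client.search_runs([exp.experiment_id], page_token=token)
            total += len(page)
            token = page.token
            if not token:
                break
        counts[exp.name] = total
    return counts


def diff_counts(local: Dict[str, int], remote: Dict[str, int]) -> List[str]:
    """List experiments that are missing remotely or have a different run count."""
    problems = []
    for name, n_local in sorted(local.items()):
        if name not in remote:
            problems.append(f"missing on remote: {name}")
        elif remote[name] != n_local:
            problems.append(
                f"run count differs for {name}: local={n_local}, remote={remote[name]}"
            )
    return problems


if __name__ == "__main__":
    local = count_runs_per_experiment("http://localhost:8888")
    # Replace <USER> and <REPO> with your own values.
    remote = count_runs_per_experiment("https://dagshub.com/<USER>/<REPO>.mlflow")
    for line in diff_counts(local, remote) or ["run counts match"]:
        print(line)
```

Matching run counts do not prove that the import is complete, so a manual look at a few runs in both UIs is still worthwhile.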
|
|
|
|
+
|
|
|
|
+### Importing issues & workarounds
|
|
|
|
+
|
|
|
|
+There may be some issues with importing in **Step 3**. Below are some potential workarounds to try, which essentially involve editing the package source code after installation.
|
|
|
|
+
|
|
|
|
+!!! warning
|
|
|
|
+ - These workarounds were used with `mlflow-export-import 1.2.0`. The underlying issues may be patched in future versions.
|
|
|
|
+ - These workarounds may work for only certain cases and issues.
|
|
|
|
+ - You should back up the `mlflow-export` directory just in case.
|
|
|
|
+ - Always try to compare the local server and DagsHub remote tracking server after every `import-all` step to ensure they are the same.
|
|
|
|
+
|
|
|
|
+- First, find the source code directory by inspecting the `Location` field in the output of `pip show`, for example:
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ pip show mlflow-export-import
|
|
|
|
+ ...
|
|
|
|
+ Location: .venv/lib/python3.10/site-packages/mlflow_export_import
|
|
|
|
+ ...
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+ ??? info "Alternative to editing in `site-packages`"
|
|
|
|
+ An alternative to editing the source code in `site-packages` is to install `mlflow-export-import` in editable mode.
|
|
|
|
+
|
|
|
|
+ ```bash
|
|
|
|
+ # clone the repository
|
|
|
|
+ git clone https://github.com/mlflow/mlflow-export-import
|
|
|
|
+
|
|
|
|
+ # install in editable mode
|
|
|
|
+ pip install -e ./mlflow-export-import/
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+ Then you can edit files in the local `mlflow-export-import/mlflow_export_import` directory instead of your environment's `site-packages` directory.
|
|
|
|
+
|
|
|
|
+- If you see `{'error_code': 'BAD_REQUEST'}` in the output of `import-all`, comment out or delete the following lines in the `common/mlflow_utils.py` file, under the `set_experiment` function:
|
|
|
|
+
|
|
|
|
+ ```python
|
|
|
|
+ if ex.error_code != "RESOURCE_ALREADY_EXISTS":
|
|
|
|
+ raise MlflowExportImportException(ex, f"Cannot create experiment '{exp_name}'")
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+- Re-run **Step 3** with the same `import-all` command again.
|
|
|
|
+
|
|
|
|
+- If you still encounter issues, it could be because your local experiments did not log any data inputs. If none of your experiments did, try commenting out or deleting the following line in the `run/import_run.py` source file, under the `import_run` function, inside the `try` block:
|
|
|
|
+
|
|
|
|
+ ```python
|
|
|
|
+ _import_inputs(http_client, src_run_dct, run_id)
|
|
|
|
+ ```
|
|
|
|
+
|
|
|
|
+ !!! warning
|
|
|
|
+ Please note that this workaround assumes **none** of your experiments or runs log any inputs.
|
|
|
|
+
|
|
|
|
+ If some log inputs and some do not, you will need to modify the logic of `_import_inputs` and/or the code surrounding this line to accommodate that.
|
|
|
|
+
|
|
|
|
+- Re-run **Step 3** with the same `import-all` command again.
|
|
|
|
+
|
|
|
|
+If there are still issues, or if there are discrepancies between the local server and the remote DagsHub server, please open a ticket on DagsHub and/or the `mlflow-export-import` GitHub repository.
|
|
|
|
+
|
|
## Known Issues, Limitations & Restrictions
|
|
## Known Issues, Limitations & Restrictions
|
|
|
|
|
|
The MLflow UI provided by DagsHub currently doesn't support displaying artifacts pushed to an external storage like S3.
|
|
The MLflow UI provided by DagsHub currently doesn't support displaying artifacts pushed to an external storage like S3.
|