Simon/mlflow-tracking-example

1 Branches

MLproject

9bed896504

added more epochs

3 years ago

README.md

41300fd214

init

3 years ago

conda.yaml

41300fd214

init

3 years ago

mnist_autolog_example1.py

41300fd214

init

3 years ago

DagsHub Storage

You have to be logged in to leave a comment.

MNIST example with MLFlow

In this example, we train a Pytorch Lightning model to predict handwritten digits, leveraging early stopping. The code, adapted from this repository, is almost entirely dedicated to model training, with the addition of a single mlflow.pytorch.autolog() call to enable automatic logging of params, metrics, and models, including the best model from early stopping.

Running the code

To run the example via MLflow, navigate to the mlflow/examples/pytorch/MNIST/example1 directory and run the command

mlflow run .

This will run mnist_autolog_example1.py with the default set of parameters such as --max_epochs=5. You can see the default value in the MLproject file.

In order to run the file with custom parameters, run the command

mlflow run . -P max_epochs=X

where X is your desired value for max_epochs.

If you have the required modules for the file and would like to skip the creation of a conda environment, add the argument --no-conda.

mlflow run . --no-conda

Viewing results in the MLflow UI

Once the code is finished executing, you can view the run's metrics, parameters, and details by running the command

mlflow ui

and navigating to http://localhost:5000.

For more details on MLflow tracking, see the docs.

Passing custom training parameters

The parameters can be overridden via the command line:

max_epochs - Number of epochs to train model. Training can be interrupted early via Ctrl+C
gpus - Number of GPUs
accelerator - Accelerator backend (e.g. "ddp" for the Distributed Data Parallel backend) to use for training. By default, no accelerator is used.
batch_size - Input batch size for training
num_workers - Number of worker threads to load training data
lr - Learning rate
patience -parameter of early stopping
mode - parameter of early stopping
monitor - parameter of early stopping 10.verbose - parameter of early stopping

For example:

mlflow run . -P max_epochs=5 -P gpus=1 -P batch_size=32 -P num_workers=2 -P learning_rate=0.01 -P accelerator="ddp" -P patience=5 -P mode="min" -P monitor="val_loss" -P verbose=True

Or to run the training script directly with custom parameters:

python mnist_autolog_example1.py \
    --max_epochs 5 \
    --gpus 1 \
    --accelerator "ddp" \
    --batch_size 64 \
    --num_workers 3 \
    --lr 0.001 \
    --es_patience 5 \
    --es_mode "min" \
    --es_monitor "val_loss" \
    --es_verbose True

Logging to a custom tracking server

To configure MLflow to log to a custom (non-default) tracking location, set the MLFLOW_TRACKING_URI environment variable, e.g. via export MLFLOW_TRACKING_URI=http://localhost:5000/. For more details, see the docs.

Tip!

Press p or to see the previous file or, n or to see the next file

README.md

MNIST example with MLFlow

Running the code

Viewing results in the MLflow UI

Passing custom training parameters

Logging to a custom tracking server

Comments

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

Simon / mlflow-tracking-example

README.md

MNIST example with MLFlow

Running the code

Viewing results in the MLflow UI

Passing custom training parameters

Logging to a custom tracking server

Comments

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

Simon
/
mlflow-tracking-example