Deploy ML Models to a Cloud Platform¶
Get early access to new Deployment feature
We're working on a new and improved deployment experience on DagsHub. If deploying models automatically onto your infrastructure is something you need, contact us using this form to get early access.
Using the MLflow server integrated with each DagsHub repo, you can deploy ML models to a cloud platform.
MLflow’s Model Registry allows us to store models alongside experiments and runs. It also includes model versioning and stage transitions. Put together, it simplifies determining which model we need to deploy to which environment. The whole process becomes less prone to error.
Once we’ve logged and registered models to our Model Registry, we can then load and deploy them.
Automatic Model Registration¶
MLflow contains functions to log and register a model in one shot. These functions are available for several popular frameworks, which can be found in their official documentation.
The general call for logging a model is `mlflow.<framework>.log_model()`. For instance, to log a Keras model, we can do something like:
```python
import os

import mlflow

# Set MLflow server URI
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI"))

# Training code
# ...

# Log model
mlflow.keras.log_model(keras_model=model,
                       artifact_path=MODELS_DIR,
                       registered_model_name="Super Cool Model")
```
The `MLFLOW_TRACKING_URI` is the same as the URL for your DagsHub repo, with `.mlflow` appended to the end. For example:

```
https://dagshub.com/yonomitt/BetterSquirrelDetector.mlflow
```
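Since the tracking server sits behind your DagsHub account, private repos also need credentials. MLflow reads them from the standard `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` environment variables; a minimal shell setup might look like the following, where the placeholder values are assumptions you should replace with your own repo URL, username, and access token:

```shell
# Point MLflow at the repo's tracking server (repo URL + .mlflow)
export MLFLOW_TRACKING_URI="https://dagshub.com/<user>/<repo>.mlflow"

# Credentials for a private repo (placeholders -- use your own username/token)
export MLFLOW_TRACKING_USERNAME="<your-dagshub-username>"
export MLFLOW_TRACKING_PASSWORD="<your-dagshub-token>"
```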
Manual Model Registration¶
Sometimes, however, our framework either isn’t supported, or we need more control over how the model is logged. In this instance, MLflow provides us with the ability to log any Python function.
To support the logging of generic Python functions, we need to:

- Create a wrapper class that inherits from `mlflow.pyfunc.PythonModel`.
- Write a `predict` method, which takes a `context` and the `model_input`.
- Optionally, write a `load_context` method to set up our model.
Let’s say we wanted to log and register a custom YOLOv5 model. It could look something like this:

```python
import mlflow
import torch

class SquirrelDetectorWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the custom YOLOv5 weights saved with the model artifacts
        self.model = torch.hub.load('ultralytics/yolov5', 'custom',
                                    path=context.artifacts['path'])

    def predict(self, context, img):
        # Run detection and return boxes as [x, y, w, h, confidence, class]
        objs = self.model(img).xywh[0]
        return objs.numpy()
```
We then need to provide a dictionary describing our dependencies:
```python
from sys import version_info

import cloudpickle

PYTHON_VERSION = "{major}.{minor}.1".format(major=version_info.major,
                                            minor=version_info.minor)

conda_env = {
    'channels': ['defaults'],
    'dependencies': [
        'python~={}'.format(PYTHON_VERSION),
        'pip',
        {
            'pip': [
                'mlflow',
                'pillow',
                'cloudpickle=={}'.format(cloudpickle.__version__),
                'torch>=1.12.0'
            ],
        },
    ],
    'name': 'squirrel_env'
}
```
We can then log and register the model with the following code:
```python
import os

import mlflow

# Set MLflow server URI
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI"))

with mlflow.start_run(experiment_id=exp_id):
    mlflow.pyfunc.log_model(
        'mymodel',
        python_model=SquirrelDetectorWrapper(),
        conda_env=conda_env,
        artifacts={'path': '<path/to/saved/model.pt>'},
        registered_model_name='Ultra Cool Model'
    )
```
Loading a Model¶
Once a model has been logged and registered, we can load it using `mlflow.<framework>.load_model()`:
```python
model_uri = f'models:/<model name>/<version>'
model = mlflow.<framework>.load_model(model_uri)
```
The `version` of the model can be either a version number, which is an auto-incremented integer, or the stage of the model (`Staging`, `Production`).
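Since both a version number and a stage slot into the same `models:/` URI scheme, building the URI is just string formatting. Here is a small illustrative sketch; the `model_uri` helper below is our own, not part of MLflow:

```python
def model_uri(name, version_or_stage):
    # Illustrative helper: MLflow itself just takes the final string
    return f"models:/{name}/{version_or_stage}"

# Load a pinned version number...
print(model_uri("Ultra Cool Model", 1))
# ...or whatever version currently holds a stage
print(model_uri("Ultra Cool Model", "Production"))
```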
One easy way to get the latest registered version of your model is to use the following code:
```python
import mlflow

client = mlflow.MlflowClient()

name = "Ultra Cool Model"
version = client.get_latest_versions(name=name)[0].version

model_uri = f'models:/{name}/{version}'
model = mlflow.keras.load_model(model_uri)
```
Deploying to Amazon SageMaker¶
Before we can start with Amazon SageMaker, we need to create and set up our AWS account properly:
- How to Create an AWS Account (you will need a credit card)
- Set up IAM (Identity and Access Management) role(s)
- Install the AWS CLI using `pip install awscli`
- Add your credentials to `~/.aws/credentials`
The AWS credentials file should look like this:
```
[default]
aws_access_key_id = ***
aws_secret_access_key = ***
```
The following command only needs to be run once per AWS account. It builds the Docker image that MLflow uses to serve the model and pushes it to Amazon ECR (Elastic Container Registry), where SageMaker can pull it from. Running it more than once does no harm, however.
```bash
mlflow sagemaker build-and-push-container
```
In order to deploy, we need to know:
- The model URI
- Our SageMaker execution role
- The AWS region we are going to deploy the endpoint to
```bash
mlflow sagemaker deploy --app-name mario \
    --model-uri "models:/Ultra Cool Model/1" \
    -e arn:aws:iam::***:role/SageMakerRole \
    --region-name us-east-2
```
Creating Docker Images to Deploy Anywhere¶
We can also use MLflow to create Docker images that we can deploy anywhere we want.
```bash
mlflow models build-docker --name mario \
    --model-uri "models:/Ultra Cool Model/1"
```
This creates a Docker image tagged `mario:latest`. To spin up a container from this image, we can run:
```bash
docker run -d -p 8080:8080 mario:latest
```
Running Inference on our Deployed Model¶
We can run inference against our Docker container with a `curl` command.
The image input needs to be a base64-encoded image. However, we can't just send the raw base64 string; we need to wrap it in a JSON dictionary with this format:
```json
{
    "inputs": "base64-string-representation-of-image..."
}
```
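The same payload can also be built in Python with just the standard library. Here is a small sketch; the stand-in bytes below take the place of reading a real image file:

```python
import base64
import json

def build_payload(image_bytes):
    # base64-encode the raw image bytes and wrap them in the expected dict
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"inputs": encoded})

# In practice: image_bytes = open("mario_0.jpg", "rb").read()
payload = build_payload(b"\xff\xd8\xff")  # stand-in bytes, not a real image
print(payload)
```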
Here is our `bash` command, which uses the built-in `base64` command to convert our image into a string:
```bash
curl http://localhost:8080/invocations \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "'$(base64 -w 0 mario_0.jpg)'"}'
```
For more information, see our MLflow Crash Course Workshop, which includes a recording and a Colab notebook.