Deploy ML Models to a Cloud Platform¶
Get early access to new Deployment feature
We're working on a new and improved deployment experience on DagsHub. If deploying models automatically onto your infrastructure is something you need, contact us using this form to get early access.
Using the MLflow server integrated with each DagsHub repo, you can deploy ML models to a cloud platform.
MLflow’s Model Registry allows us to store models alongside experiments and runs. It also includes model versioning and stage transitions. Put together, it simplifies determining which model we need to deploy to which environment. The whole process becomes less prone to error.
Once we’ve logged and registered models to our Model Registry, we can then load and deploy them.
Automatic Model Registration¶
MLflow contains functions to log and register a model in one shot. These functions are available for several popular frameworks, which can be found in their official documentation.
The general call for logging a model is `mlflow.<framework>.log_model()`. For instance, to log a Keras model, we can do something like:
```python
import os

import mlflow

# Set MLflow server URI
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI"))

# Training code
# ...

# Log model
mlflow.keras.log_model(keras_model=model,
                       artifact_path=MODELS_DIR,
                       registered_model_name="Super Cool Model")
```
The `MLFLOW_TRACKING_URI` is the same as the URL for your DagsHub repo, with `.mlflow` appended to the end. For example:

```
https://dagshub.com/yonomitt/BetterSquirrelDetector.mlflow
```
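Since the tracking server sits behind your DagsHub account, private repos also need credentials. MLflow reads them from the standard `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` environment variables; a minimal shell setup might look like the following, where the placeholder values are assumptions you should replace with your own repo URL, username, and access token:

```shell
# Point MLflow at the repo's tracking server (repo URL + .mlflow)
export MLFLOW_TRACKING_URI="https://dagshub.com/<user>/<repo>.mlflow"

# Credentials for a private repo (placeholders -- use your own username/token)
export MLFLOW_TRACKING_USERNAME="<your-dagshub-username>"
export MLFLOW_TRACKING_PASSWORD="<your-dagshub-token>"
```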
Manual Model Registration¶
Sometimes, however, our framework either isn’t supported, or we need more control over how the model is logged. In this instance, MLflow provides us with the ability to log any Python function.
To support the logging of generic Python functions, we need to:

- Create a wrapper class that inherits from `mlflow.pyfunc.PythonModel`.
- Write a `predict` method, which takes a `context` and the `model_input`.
- Optionally, write a `load_context` method to set up our model.
Let’s say we wanted to log and register a custom YOLOv5 model. It could look something like this:

```python
import mlflow
import torch

class SquirrelDetectorWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the custom YOLOv5 weights saved with the model artifacts
        self.model = torch.hub.load('ultralytics/yolov5', 'custom',
                                    path=context.artifacts['path'])

    def predict(self, context, img):
        # Run detection and return boxes as [x, y, w, h, confidence, class]
        objs = self.model(img).xywh[0]
        return objs.numpy()
```
We then need to provide a dictionary describing our dependencies:
```python
from sys import version_info

import cloudpickle

PYTHON_VERSION = "{major}.{minor}.1".format(major=version_info.major,
                                            minor=version_info.minor)

conda_env = {
    'channels': ['defaults'],
    'dependencies': [
        'python~={}'.format(PYTHON_VERSION),
        'pip',
        {
            'pip': [
                'mlflow',
                'pillow',
                'cloudpickle=={}'.format(cloudpickle.__version__),
                'torch>=1.12.0'
            ],
        },
    ],
    'name': 'squirrel_env'
}
```
We can then log and register the model with the following code:
```python
import os

import mlflow

# Set MLflow server URI
mlflow.set_tracking_uri(os.getenv("MLFLOW_TRACKING_URI"))

with mlflow.start_run(experiment_id=exp_id):
    mlflow.pyfunc.log_model(
        'mymodel',
        python_model=SquirrelDetectorWrapper(),
        conda_env=conda_env,
        artifacts={'path': '<path/to/saved/model.pt>'},
        registered_model_name='Ultra Cool Model'
    )
```
Loading a Model¶
Once a model has been logged and registered, we can load it using `mlflow.<framework>.load_model()`:
```python
model_uri = f'models:/<model name>/<version>'
model = mlflow.<framework>.load_model(model_uri)
```
The `version` of the model can be either a version number, which is an auto-incremented integer, or the stage of the model (`Staging`, `Production`).
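Since both a version number and a stage slot into the same `models:/` URI scheme, building the URI is just string formatting. Here is a small illustrative sketch; the `model_uri` helper below is our own, not part of MLflow:

```python
def model_uri(name, version_or_stage):
    # Illustrative helper: MLflow itself just takes the final string
    return f"models:/{name}/{version_or_stage}"

# Load a pinned version number...
print(model_uri("Ultra Cool Model", 1))
# ...or whatever version currently holds a stage
print(model_uri("Ultra Cool Model", "Production"))
```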
One easy way to get the latest registered version of your model is to use the following code:
```python
import mlflow

client = mlflow.MlflowClient()

name = "Ultra Cool Model"
version = client.get_latest_versions(name=name)[0].version

model_uri = f'models:/{name}/{version}'
model = mlflow.keras.load_model(model_uri)
```
Deploying to Amazon SageMaker¶
Before we can start with Amazon SageMaker, we need to create and set up our AWS account properly:
- How to Create an AWS Account (you will need a credit card)
- Set up IAM (Identity and Access Management) role(s)
- Install the AWS CLI using `pip install awscli`
- Add your credentials to `~/.aws/credentials`
The AWS credentials file should look like this:
```
[default]
aws_access_key_id = ***
aws_secret_access_key = ***
```
The following command only needs to be run once per AWS account. It builds the Docker image that MLflow uses to serve the model and pushes it to Amazon ECR (Elastic Container Registry), where SageMaker can pull it from. Running it more than once does no harm, however.
```bash
mlflow sagemaker build-and-push-container
```
In order to deploy, we need to know:
- The model URI
- Our SageMaker execution role
- The AWS region we are going to deploy the endpoint to
```bash
mlflow sagemaker deploy --app-name mario \
    --model-uri "models:/Ultra Cool Model/1" \
    -e arn:aws:iam::***:role/SageMakerRole \
    --region-name us-east-2
```
Creating Docker Images to Deploy Anywhere¶
We can also use MLflow to create Docker images that we can deploy anywhere we want.
```bash
mlflow models build-docker --name mario \
    --model-uri "models:/Ultra Cool Model/1"
```
This creates a Docker image tagged `mario:latest`. To spin up a container from this image, we can run:
```bash
docker run -d -p 8080:8080 mario:latest
```
Running Inference on our Deployed Model¶
We can run inference against our Docker container with a `curl` command.
The image input needs to be a base64-encoded image. However, we can't just send the raw base64 string; we need to wrap it in a JSON dictionary with this format:
```json
{
    "inputs": "base64-string-representation-of-image..."
}
```
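The same payload can also be built in Python with just the standard library. Here is a small sketch; the stand-in bytes below take the place of reading a real image file:

```python
import base64
import json

def build_payload(image_bytes):
    # base64-encode the raw image bytes and wrap them in the expected dict
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"inputs": encoded})

# In practice: image_bytes = open("mario_0.jpg", "rb").read()
payload = build_payload(b"\xff\xd8\xff")  # stand-in bytes, not a real image
print(payload)
```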
Here is our `bash` command, which uses the built-in `base64` command to convert our image into a string:
```bash
curl http://localhost:8080/invocations \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "'$(base64 -w 0 mario_0.jpg)'"}'
```
For more information, see our MLflow Crash Course Workshop, which includes a recording and a Colab notebook.