# Model Management with Python
## Overview
Model management is a workflow within the overall model lifecycle for managing multiple versions of deployed models in production. RStudio helps you use Python tooling to develop, deploy, and manage models in enterprise production environments.
Different components of a model deployment pipeline can be developed using RStudio professional products to:
- Deploy multiple versions of a model as REST APIs using Flask
- Retain a history of model revisions for traceability using Jupyter Notebooks
- Route API traffic between deployed models for A/B testing using Flask
- Monitor models in production using Plotly Dash
RStudio Workbench can be used with Jupyter Notebooks and machine learning packages to develop, train, and score models during development. RStudio Connect can be used to publish Jupyter Notebooks, REST APIs with Flask, and interactive apps with Plotly Dash.
## Example: A/B Testing Multiple Credit Payment Risk Models
The following example demonstrates a full model lifecycle for different versions of a model that were developed in RStudio Workbench and deployed to RStudio Connect.
The data set used in this example contains demographic information and payment history for a set of customers, along with whether each customer defaulted on or missed a payment on their credit account.
The goals of this model deployment pipeline are to:
- Train multiple classification models to predict the probability of a new customer defaulting on their credit payment
- Serve predictions via a REST API
- Route API traffic between two different models as part of an A/B testing framework
- Run interactive diagnostics to verify that the model routing is performing as expected
The tasks described in the following sections will walk through each stage of the model lifecycle in this example.
### Training a Model
You can train a model in a Jupyter Notebook and retain all of the information used to develop the trained model in a published notebook for reproducibility and traceability.
In this example, the published notebook includes a record of the library, algorithm, and parameters used to train the model. Refer to the RStudio Connect User Guide for more information on publishing Jupyter Notebooks.
You can also schedule model training notebooks in RStudio Connect to retrain the model on a recurring basis (e.g., daily or weekly).
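As an illustration, the sketch below trains a single classification model with scikit-learn. It is a minimal, hypothetical version of the training notebook: the file name credit_data.csv, the default column, and the feature columns are assumptions, and the actual notebook in the example repository may use a different library or algorithm.

```python
# Minimal training sketch (not the exact notebook from the example).
# Assumes a CSV of customer records with a binary `default` column;
# the file name and feature columns are placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

credit = pd.read_csv("credit_data.csv")   # hypothetical path
X = credit.drop(columns=["default"])      # demographic + payment history features
y = credit["default"]                     # 1 = defaulted/missed a payment

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

model_a = RandomForestClassifier(n_estimators=200, max_depth=5, random_state=42)
model_a.fit(X_train, y_train)

# Record held-out performance in the notebook output for traceability
auc = roc_auc_score(y_test, model_a.predict_proba(X_test)[:, 1])
print(f"Model A test AUC: {auc:.3f}")
```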
### Serving Model Predictions
Once you've trained a model, you can serialize it to a file that contains the fitted model object and its trained parameters. You can then serve model predictions via a REST API using Flask.
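For example, a scikit-learn model can be serialized with joblib; the file name below is a placeholder, and the example repository may use a different serialization format (such as pickle).

```python
# Serialize the trained model to a file that is deployed alongside the API code.
# `model_a` comes from the training sketch above; the file name is a placeholder.
import joblib

joblib.dump(model_a, "model_a.joblib")

# Later, for example inside the Flask app, reload the fitted model:
model_a = joblib.load("model_a.joblib")
```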
In this example, we deployed the model as a REST API and used a custom content URL such that the model is served at the /model-management-python/model-a-predict endpoint. Refer to the RStudio Connect User Guide for more information on deploying REST APIs with Flask.
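A minimal Flask app for serving predictions might look like the sketch below. The route, payload format, and model file name are assumptions for illustration; the code for the deployed example lives in the repository linked at the end of this page.

```python
# Minimal Flask prediction API sketch (app.py); the endpoint and payload
# format are illustrative, not the exact code from the example repository.
import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model_a.joblib")  # serialized model from the training step

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON payload with one record (or a list of records) of feature values
    records = request.get_json()
    features = pd.DataFrame(records if isinstance(records, list) else [records])
    probs = model.predict_proba(features)[:, 1]
    return jsonify({"model": "model-a", "default_probability": probs.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```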
### Tuning a Model
Once you've trained a model, you can tune it based on feedback from the model evaluation stage or from the model's performance in production. You can select and train a different model by changing the library, algorithm, or parameters.
In this example, we changed one of the model parameters to observe the impact on the feature importances of the model.
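A sketch of this tuning step, continuing the hypothetical training code above: the tuned parameter (tree depth) is an assumption, and the comparison simply places the two models' feature importances side by side.

```python
# Tuning sketch: change a single parameter (tree depth is assumed here as
# the tuned parameter) and compare feature importances between the models.
# Continues the training sketch above (uses X_train, y_train, model_a).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

model_b = RandomForestClassifier(n_estimators=200, max_depth=10, random_state=42)
model_b.fit(X_train, y_train)

importances = pd.DataFrame({
    "feature": X_train.columns,
    "model_a": model_a.feature_importances_,
    "model_b": model_b.feature_importances_,
}).sort_values("model_a", ascending=False)
print(importances)
```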
### Deploying a New Version of a Model
After you've developed a new version of a model, you can deploy it as a separate application and REST API.
In this example, we deployed a second version of the model as a separate REST API and used a custom content URL such that the second version of the model is served at the /model-management-python/model-b-predict endpoint.
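The second deployment can be a near copy of the first app that loads the second serialized model; a minimal sketch is shown below, where the file name and model label are placeholders.

```python
# Second deployment sketch (model B app.py): same structure as the first
# API, but it loads the second serialized model. File name and label are
# placeholders for illustration.
import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model_b.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    records = request.get_json()
    features = pd.DataFrame(records if isinstance(records, list) else [records])
    probs = model.predict_proba(features)[:, 1]
    return jsonify({"model": "model-b", "default_probability": probs.tolist()})
```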
### Managing Multiple Versions of a Model
You can manage multiple versions of a model using different methods in RStudio Connect:
- **Versioned API deployments**: Each time you deploy an updated version of a REST API, a new application bundle is created. You can access a history of application bundles for each deployment, and you can roll back to any previously deployed version.
- **Custom API URLs**: You can use custom content URLs to create custom URLs for deployed REST APIs. Custom content URLs can be configured and/or swapped between models.
- **Separate API deployments**: You can also deploy a REST API as a separate application with a separate version history and API URL.
### A/B Testing Different Models
You can implement an additional REST API endpoint that routes API traffic between deployed models to support different testing strategies.
In this example, we've deployed an API router that splits traffic between the two deployed models, which can be used as a framework for A/B testing.
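A minimal sketch of such a router is shown below: it forwards each request to one of the two deployed endpoints with a fixed split. The Connect server URL, the full endpoint paths, and the 50/50 split are placeholders, not the exact configuration of the deployed example.

```python
# Minimal A/B router sketch: forwards each request to model A or model B
# with a fixed split. The server URL and endpoint paths below are
# placeholders standing in for the deployed custom content URLs.
import random
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

ENDPOINTS = {
    "model-a": "https://connect.example.com/model-management-python/model-a-predict/predict",
    "model-b": "https://connect.example.com/model-management-python/model-b-predict/predict",
}
SPLIT = 0.5  # fraction of traffic sent to model A

@app.route("/predict", methods=["POST"])
def route_prediction():
    chosen = "model-a" if random.random() < SPLIT else "model-b"
    resp = requests.post(ENDPOINTS[chosen], json=request.get_json(), timeout=10)
    result = resp.json()
    result["routed_to"] = chosen  # record which model served the request
    return jsonify(result)
```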
You can also change the logic in the model router to implement different routing schemes such as champion-challenger testing.
### Verifying Model Predictions
Once the models, REST API endpoints, and API routers are deployed to production, they are ready to receive traffic and serve predictions.
You can use a deployed application to simulate API traffic, load test your API endpoints, verify the behavior of the API router, and compare the resulting model predictions.
In this example, we deployed an interactive dashboard to simulate continuous API traffic, verify the behavior of the API router, and compare the resulting model predictions.
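The deployed example uses an interactive Plotly Dash app for this; the simplified script below illustrates the same idea by sending repeated requests through the router and tallying which model served each prediction. The router URL and the sample payload fields are hypothetical.

```python
# Simplified traffic-simulation sketch (the example itself uses an
# interactive Plotly Dash app). Sends repeated requests through the router
# and tallies which model served each one. URL and payload are placeholders.
from collections import Counter
import requests

ROUTER_URL = "https://connect.example.com/model-management-python/model-router/predict"
sample_customer = {"age": 35, "credit_limit": 12000, "months_delinquent": 1}  # hypothetical features

counts = Counter()
for _ in range(200):
    resp = requests.post(ROUTER_URL, json=sample_customer, timeout=10)
    counts[resp.json()["routed_to"]] += 1

print(counts)  # e.g. roughly equal counts for a 50/50 split
```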
## Additional Considerations
The purpose of this example is to demonstrate how model management and various stages of the model lifecycle can be mapped to functionality in RStudio Connect.
This example is simplified and can be used as a starting point for model management. There are additional considerations when deploying and managing multiple versions of models:
- Saving models and parameters on external persistent storage systems
- Encapsulating models with packages such as mlflow (see the sketch below)
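For instance, a trained model and its parameters could be logged with MLflow as in the sketch below. This continues the hypothetical training code above; the run name, parameters, and metric are placeholders rather than part of the deployed example.

```python
# Minimal MLflow sketch for encapsulating a trained model with its
# parameters (continues the training sketch above; names are placeholders).
import mlflow
import mlflow.sklearn

with mlflow.start_run(run_name="credit-default-model-a"):
    mlflow.log_param("n_estimators", 200)
    mlflow.log_param("max_depth", 5)
    mlflow.log_metric("test_auc", auc)
    mlflow.sklearn.log_model(model_a, "model")
```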
## Example Code
The source code for all of the model management components described here is available in the sol-eng/model-management-python repository on GitHub.