If you work with any type of Machine Learning nowadays and you’ve ever thought about putting the AI you’re working on into production, this article is for you. Whether you’re a (lead) Data Scientist, Data Engineer, Product owner or anyone else involved in Machine Learning, I might be reading your mind on what your next challenge is.
Let’s say your team has trained an amazing model, optimized all its hyper-parameters, and is producing some impressive predictions, if you do say so yourself! The time has come to let it break free from the chains of the office server, out of Beta and into the real world. Still, there’s one thing that keeps you awake at night: how are you going to make sure your ML pipeline will run smoothly in production all the time? Should you set up your own server or is it better to leave it to an external party? With so many vendors out there, how can you determine which service is right for you? What are the things that you absolutely need to be thinking about?
The thing you should be thinking about is called MLOps: this emerging field focuses entirely on the infrastructure that encapsulates ML models, making sure releasing to production goes smoothly and without headaches. MLOps is also described as the intersection of ML, DevOps and Data Engineering.
Because this field is still relatively new, there is not yet a clear ‘best practice’ that applies at every scale. However, some requirements are evidently necessary for a robust deployment of ML. To be able to design an architecture for your ML, it’s important to first know what it should be capable of and where it can make your life easier. Whether the ML application consists of a complicated, multifaceted ML architecture or a single line of code, the requirements for the infrastructure of the ML pipelines remain the same.
This article gives you an overview of the most important requirements for a smooth ML infrastructure, in other words:
The 10 commandments of MLOps
1. Thou shalt always scale
This might be, or should be, a no-brainer, but still: Your architecture should allow you to control resources in a flexible way. You never know how many (or few) resources your pipeline will take up in the future and therefore the infrastructure should always scale (automatically) with it appropriately. This allows your pipeline to handle fluctuating amounts of data/number of requests and will also prevent you from having to adjust it in the future to be more efficient or scale better. Moreover, as unused resources are shut down automatically, there’s no need to worry about paying for resources that are not used.
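To make the idea of scaling with demand concrete, here is a minimal sketch of the kind of rule an autoscaler could apply. The function name, the per-replica capacity and the replica limits are assumptions made up for this illustration; real platforms use their own metrics and policies.

```python
import math


def desired_replicas(requests_per_second: float,
                     capacity_per_replica: float,
                     min_replicas: int = 0,
                     max_replicas: int = 10) -> int:
    """Return how many copies of the pipeline should be running.

    capacity_per_replica is an assumed figure: the number of requests
    per second a single copy can comfortably handle.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    # Round up so that a partially loaded replica still gets provisioned.
    needed = math.ceil(requests_per_second / capacity_per_replica)
    # Clamp to the configured limits; min_replicas=0 allows scale-to-zero.
    return max(min_replicas, min(max_replicas, needed))
```

With `min_replicas=0`, an idle pipeline scales down to nothing, which is the "not paying for unused resources" part. The upper limit protects you from a runaway bill when traffic spikes.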
2. Thou shalt be secure
A very important characteristic of proper ML infrastructure is the way security is handled. Some companies would even consider this to be the key feature they base their selection on, as privacy regulations have become strict and leaks of (sensitive) data do not look good on your company’s resume. Consider the security of the data (uploading data to the cloud vs. data not leaving company servers), but certainly also the security of your precious ML intel.
3. Thou shalt aim to standardize
Maybe one of the most overlooked aspects of ML infrastructure is that it offers you the chance to standardize deployment of pipelines, not unlike ‘regular’ software deployments.
In software engineering, teams follow design patterns for how code is written, and a deployment forces the team to come together to thoroughly test and check the code, which catches mistakes early and improves overall quality. An ML deployment doesn’t have to be different: it provides opportunities to test and benchmark the new pipeline, and standardizing deployments brings structure to how the ML code is written and deployed. The ML infrastructure offers your team a framework to write code in a fixed way, making it consistent across applications, which greatly increases quality and improves workflow. Lastly, standardized deployments pave the way for automatic deployments: once a new pipeline is uploaded and passes its tests, it is deployed into production automatically. This comes in handy for systems that always have to be (re-)trained on the latest data, saving you the hassle of manually deploying a pipeline every day at 5:00 am.
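The "deploy automatically after passing tests" step can be sketched as a simple gate. Everything here (the `BenchmarkResult` fields, the accuracy metric, the `min_improvement` threshold) is a hypothetical example; your own gate would use whichever tests and benchmarks matter for your pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Outcome of the standard checks for one pipeline version (illustrative)."""
    accuracy: float
    tests_passed: bool


def should_deploy(candidate: BenchmarkResult,
                  production: BenchmarkResult,
                  min_improvement: float = 0.0) -> bool:
    """Decide whether a candidate pipeline may replace the production one."""
    # A failing test suite blocks the release regardless of the metrics.
    if not candidate.tests_passed:
        return False
    # Otherwise require the candidate to be at least as good as production.
    return candidate.accuracy >= production.accuracy + min_improvement
```

Because the same gate runs for every release, a nightly retrained model goes through exactly the same checks as a hand-built one.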
4. Honor thy data integrations
The infrastructure should integrate effortlessly with your existing data pipelines and storage. In most cases, the fewer data migrations there are to and from a pipeline, the better. Not only does it save time and money to have your data stored in the place where it’s being processed, it minimizes the risk of data leaks as well. If keeping data and processing in one place is simply unattainable, the infrastructure should at least integrate easily into your existing data engineering pipelines and provide enough options to connect to your storage.
5. Thou shalt not underestimate the importance of monitoring
Just as it was important when the model/pipeline was being trained, monitoring its performance during production remains vital to keep results in the green. I’m not only talking about ML-metrics like accuracy and F1-score, but also infrastructure performance metrics, like latency (the time that is added by the infrastructure when running a request) and errors: Are all my pipelines available or are there errors? If the time it takes to handle a request suddenly rises from 0.01 to 52 seconds, that’s an indication something is going wrong. Monitoring of the pipeline should be accessible to the Data Science team and the IT-department, as both have a role to play in maintaining a healthy system.
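As a rough illustration of the latency check described above, the sketch below flags a request whose duration is far above the recent median. The class name, the window size and the factor of 10 are all assumptions chosen for this example, not recommended values.

```python
from collections import deque
from statistics import median


class LatencyMonitor:
    """Flags requests that are much slower than recent ones (illustrative)."""

    def __init__(self, window: int = 100, factor: float = 10.0,
                 min_samples: int = 5) -> None:
        self._samples = deque(maxlen=window)  # most recent latencies, in seconds
        self._factor = factor
        self._min_samples = min_samples

    def record(self, seconds: float) -> bool:
        """Store a latency sample and return True if it looks anomalous."""
        # Only judge once there is enough history to compute a baseline.
        is_anomaly = (len(self._samples) >= self._min_samples
                      and seconds > self._factor * median(self._samples))
        # The sample is kept either way, so a sustained shift in latency
        # eventually becomes the new baseline.
        self._samples.append(seconds)
        return is_anomaly
```

Fed a steady stream of 0.01-second requests followed by a 52-second one, `record` returns `False` for the former and `True` for the latter. In practice you would send that signal to a dashboard or alerting channel that both the Data Science team and IT can see.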
6. Thou shalt be flexible in choice of software and programming language
If you choose to take on the service provided by an infrastructure provider, make sure to check out what software they support. Nothing is as frustrating as being locked in by a vendor only to find out that the package or language of your choice is not supported. Your code should determine the infrastructure, not the other way around. Is the infrastructure very rigid, allowing only packages pre-selected by the vendor? Or is switching between a Python model and an R-based model as simple as the click of a button?
7. Thou shalt have zero-downtime releasing
Doesn’t deploying ML-pipelines without any downtime of your application sound like a dream? With a good MLOps infrastructure, it is within reach. This can be achieved if your infrastructure allows for multiple identical environments to run in parallel. The production pipeline will run in the production environment, while tests and adjustments can be made to models running in the testing environment. Then, when deadline day arrives, deploying a new version is just a matter of switching environments with minimal to no downtime at all. The same applies to retrained models within a pipeline: The infrastructure should allow you to only swap out the old model for the retrained one and keep the rest in place as it was, without interrupting the flow of data continuously going through your pipeline.
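The environment switch described above is often called a blue-green deployment. Here is a minimal in-memory sketch of the idea; the `BlueGreenRouter` class and its method names are invented for this illustration, and a real setup would switch traffic at the load balancer or platform level rather than inside one Python object.

```python
from typing import Callable, Dict

Model = Callable[[float], float]


class BlueGreenRouter:
    """Routes requests to one of two identical environments (illustrative)."""

    def __init__(self, blue: Model, green: Model) -> None:
        self._environments: Dict[str, Model] = {"blue": blue, "green": green}
        self._live = "blue"

    @property
    def live(self) -> str:
        return self._live

    def _idle(self) -> str:
        return "green" if self._live == "blue" else "blue"

    def predict(self, x: float) -> float:
        # All production traffic goes to whichever environment is live.
        return self._environments[self._live](x)

    def deploy_to_idle(self, model: Model) -> None:
        # New versions land in the idle environment, where they can be tested
        # without touching production traffic.
        self._environments[self._idle()] = model

    def switch(self) -> None:
        # Releasing is just flipping which environment is live.
        self._live = self._idle()
```

Rolling back works the same way: the previous version is still sitting in the other environment, so another `switch()` restores it.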
8. Thou shalt have version control and data lineage tracing
This point relates to the monitoring of your pipeline (commandment 5), but there’s a difference: monitoring allows you to see if and when performance is going down, while version control and data lineage tracing allow you to pinpoint exactly which model version or batch of data caused it. Since ML applications often intertwine code, model and data, it’s important that all three are tracked. Moreover, it will make comparing versions of models side-by-side much easier.
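One simple way to tie a prediction run back to its exact code, model and data is to store a fingerprint of each. The sketch below uses SHA-256 hashes from Python’s standard library; the record layout and the `lineage_record` name are assumptions for illustration, and dedicated tools offer far more than this.

```python
import hashlib
from typing import Dict


def fingerprint(content: bytes) -> str:
    """Return a stable identifier for a blob of bytes."""
    return hashlib.sha256(content).hexdigest()


def lineage_record(code_version: str,
                   model_bytes: bytes,
                   data_bytes: bytes) -> Dict[str, str]:
    """Describe which code, model and data batch produced a result."""
    return {
        "code_version": code_version,  # e.g. a git tag or commit hash
        "model_sha256": fingerprint(model_bytes),
        "data_sha256": fingerprint(data_bytes),
    }
```

If two runs share a `model_sha256` but differ in `data_sha256`, you know the model did not change and the data batch is the first place to look.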
9. Consider thy complexity/transparency
An indicator of excellent MLOps infrastructure is whether it abstracts away complex (cloud) infrastructure, so that even people without much knowledge of the infrastructure can easily deploy models. However, it should not be such a ‘black box’ that when errors occur there is no way to tell what is going wrong and why. Insight into the internal processes of the infrastructure is necessary to debug the problem. The right balance between transparency and complexity is therefore vital. If you depend on an external service provider for your ML infrastructure, make sure to check that there’s a logging system in place that gives an indication of what is happening internally.
10. Thou shalt have low latency
This one depends on your use case, but if real-time predictions are necessary, one cannot have an infrastructure-related latency of 15 seconds. The latency of infrastructure usually depends on how models/pipelines are stored and spun up internally, how data is retrieved and also, how the results are given back again. Be sure to check the added latency of an infrastructure on your application, as it can differ greatly between services.
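To check the added latency yourself, you can time the model on its own and compare that with the end-to-end time of a request through the infrastructure. The helper names below are hypothetical, and the sketch assumes you are able to call the model directly as well as through the service.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return its result plus the elapsed seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def infrastructure_overhead(end_to_end_seconds: float,
                            model_seconds: float) -> float:
    """Estimate the time added by the infrastructure around the model."""
    # Measurement noise can make the difference slightly negative,
    # so the estimate is floored at zero.
    return max(0.0, end_to_end_seconds - model_seconds)
```

A single measurement says little, so repeat the timing over many requests and look at the spread, not just the average.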
You might be reading this as a Data Scientist and wondering: “Why should I care about MLOps? I already made the model, isn’t bringing it to production the work of the IT team?” This may be true for bigger companies with separate IT and Data Science departments, but for most companies starting out with Data Science, the reality is that the people designing the algorithm will also be the ones deploying it, which makes good knowledge of MLOps invaluable. Moreover, it could very well be that the other party doesn’t know as much as you do about your pipeline and how to deploy it to production. Bringing them up to speed would take a lot of effort, while designing a good MLOps architecture together from the start will save both departments time and possibly a lot of frustration.
Is your infrastructure working for you?
There is a lot to consider when designing or choosing an MLOps infrastructure and it might be overwhelming at first. It’s also good to keep in mind that no business case is the same and emphasis can be on different aspects of the deployment cycle. Maybe it’s more important for your application to run fast than to be always available? Or maybe the infrastructure doesn’t have to be transparent as long as automated deployments are offered?
Whether your team is still in the start-up phase of productizing ML applications or there is already an infrastructure in place, it’s always good to consider if your infrastructure is still making your life easy (or at least not harder than it is). The optimal MLOps infrastructure adapts to your needs; you should not adapt your needs to the infrastructure. The market for infrastructure service providers is still expanding and it’s always good to look further when yours does not fit the bill (anymore).