MLOps – The Missing Piece In The Enterprise AI Puzzle

Enterprise CXOs are getting serious about machine learning (ML) and artificial intelligence (AI). Machine learning is finding its place in the big data and business intelligence initiatives within enterprises.

According to Forrester, over 50 percent of enterprise technology decision-makers have already implemented or are in the process of implementing ML and AI.

Despite the investments and commitment from the leadership team, organizations are yet to realize the full potential of AI. In the 2020 State of Enterprise Machine Learning report from Algorithmia, 55 percent of the surveyed companies had yet to deploy a model, and 22 percent of companies had had models in production for 1-2 years. About 8 percent of organizations were running sophisticated models in production.

One of the barriers to operationalizing ML models is the long, drawn-out process and timeline involved in deployment. Organizations derive value from ML only when the models are deployed and fully integrated with existing business processes.

According to Algorithmia’s survey, 22 percent of the respondents take one to three months to deploy a new ML model into production, and another 18 percent said that deployment takes more than three months.

Delays in deployment contribute to the failure of ML projects in the enterprise. A survey conducted by IDC in June 2020 showed that 28 percent of machine learning projects fail in organizations.

A more in-depth analysis of this trend highlights gaps such as a fragmented toolchain and a lack of expertise, ML-ready data, integrated development environments, and collaboration among developers, data scientists, and the DevOps team.

The challenges closely resemble what developers and system administrators faced during the last decade. Lack of collaboration between developers and administrators led to a disjointed development and deployment strategy, which proved to be expensive for organizations. The DevOps paradigm addressed this through the introduction of new tools and processes. With an effective DevOps strategy, organizations could increase the velocity of shipping new features and services to end-users.

The machine learning ecosystem needs a robust DevOps-like framework, toolchain, and process that brings the developers, data scientists, and operators together. MLOps, the marriage of ML with DevOps, aims to bring some of the proven capabilities of agile software engineering to machine learning and artificial intelligence. 

Similar to software development, building machine learning models involves a variety of tools and frameworks. Each data scientist in the team prefers to use her favorite tool for data preparation, exploratory analysis, and feature extraction. An ML engineer uses a language and framework of his choice to train the model. The operations team is expected to package, deploy and scale the model for inference where existing business applications consume it. 

In a typical software development environment, developers use an IDE to build and test code, while the operations team relies on containers and container orchestration engines to deploy and scale it. A data scientist, in contrast, uses JupyterLab as the development environment to train models. Kubernetes has become the de facto deployment platform for both applications and machine learning models.

The accuracy of machine learning models may deteriorate with time. Changing business environments and external factors are the key influences on the precision and accuracy of models. By bringing the agility and velocity of software development and deployment to ML, new models can be trained and deployed automatically with no manual intervention.

MLOps includes the detection of model drift – a term used to describe the widening gap between the predictions and expected results – which can trigger the process of automated training, evaluation, and deployment of the model. This workflow is comparable to the continuous integration (CI) and continuous deployment (CD) pipelines of modern software development. MLOps adds continuous training, which is triggered by a continuous evaluation process that keeps comparing the model's predictions against the expected results.
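
To make the idea concrete, here is a minimal Python sketch of drift detection triggering retraining. It is an illustration under stated assumptions, not the way any particular MLOps platform implements it: the `DriftMonitor` class, its parameters, and the `on_drift` callback are hypothetical names, and drift is approximated simply as rolling accuracy falling more than a set tolerance below a baseline.

```python
from collections import deque
from typing import Any, Callable, Deque


class DriftMonitor:
    """Hypothetical helper: flags drift when rolling accuracy over the
    last `window` predictions falls more than `tolerance` below a baseline."""

    def __init__(
        self,
        baseline_accuracy: float,
        tolerance: float,
        window: int,
        on_drift: Callable[[float], None],
    ) -> None:
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        self.on_drift = on_drift  # assumed hook, e.g. kicks off a retraining pipeline
        self.outcomes: Deque[bool] = deque(maxlen=window)

    def record(self, prediction: Any, actual: Any) -> bool:
        """Record one prediction and its ground truth; return True if drift fired."""
        self.outcomes.append(prediction == actual)
        # Wait until the window is full so early noise does not trigger retraining.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        rolling_accuracy = sum(self.outcomes) / len(self.outcomes)
        if self.baseline_accuracy - rolling_accuracy > self.tolerance:
            self.on_drift(rolling_accuracy)
            self.outcomes.clear()  # start a fresh window for the retrained model
            return True
        return False


if __name__ == "__main__":
    def request_retraining(accuracy: float) -> None:
        # Placeholder for a real trigger (a pipeline run, a queue message, and so on).
        print(f"Drift detected at rolling accuracy {accuracy:.2f}; retraining requested")

    monitor = DriftMonitor(
        baseline_accuracy=0.9, tolerance=0.15, window=10, on_drift=request_retraining
    )
    for _ in range(10):
        monitor.record(prediction=1, actual=1)  # model still agrees with reality
    for _ in range(3):
        monitor.record(prediction=1, actual=0)  # predictions start to miss
```

In a real pipeline the callback would start the automated training, evaluation, and deployment steps, and the drift signal would more likely come from statistical tests on input data or prediction distributions, since ground-truth labels often arrive late.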

ML Platform as a Service (PaaS) offerings such as Amazon SageMaker, Azure ML, and Google Cloud AI have integrated pipelines and MLOps capabilities. But those tools are tightly coupled with the underlying platform components such as the data ingestion service, object storage, data lake, distributed training environment, container orchestrator, model registry, and model serving.

Customers looking for a consistent set of tools and platforms can use open source projects such as Kubeflow, MLflow, and Seldon, which work in both on-premises and cloud environments.

With a mature DevOps framework, organizations can deploy software at a rapid pace. The rise of SaaS and mobile applications led to a non-disruptive delivery model where users get the new version of the software without any intervention. MLOps promises a similar approach to managing the lifecycle of machine learning models.

It’s time for enterprises to pay attention to MLOps and adopt it as the framework for implementing machine learning projects. 

MLOps combines the iterative development involved in training machine learning models with scalable, manageable model deployment. It is currently the missing piece of the puzzle in the enterprise AI strategy.