5 ways machine learning uses CI/CD in production – Analytics India Magazine

  • Lauren
  • January 30, 2022

Continuous integration (CI) is the practice of developers merging their code changes into a central repository many times a day. Continuous delivery (CD) is a fully automated software release process. The two terms are not interchangeable, but both belong to the DevOps methodology and are usually discussed together as CI/CD. A CI/CD pipeline is a system that automates software delivery: whenever the software changes, it builds the code, runs tests, and delivers a new product version. Integration and testing happen in the CI phase, where code changes are merged and tested automatically. Deployment is the pipeline’s CD element, since it continually delivers software at scale. This article looks at five ways machine learning uses CI/CD in production.

CI/CD with Azure DevOps

A business uses Azure DevOps Pipelines to set up build and release pipelines that automate the dev-to-production cycle. The build pipeline serialises the model (mostly to ONNX) and generates model artefacts from the candidate source code. The release pipelines then deploy the artefacts to infrastructure targets. Once the artefacts have been tested in the development environment, the release pipelines move them to the quality assurance (QA) stage.

Model testing occurs during the QA stage, when the team runs A/B tests and stress tests on the model service to ensure that it is ready for deployment to the production environment.

A human validator, generally the product owner, confirms that the model has passed these tests before approving it for release.
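
The promotion flow described above (development, then QA, then a human sign-off before production) can be sketched in plain Python. This is only an illustration of the gating logic, not Azure DevOps configuration: the `Artefact` record, the test names, and the `promote` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage order taken from the text: development, then QA, then production.
STAGES = ["dev", "qa", "production"]


@dataclass
class Artefact:
    """Hypothetical record for a serialised model moving through release stages."""
    name: str
    stage: str = "dev"
    test_results: dict = field(default_factory=dict)
    approved_by: Optional[str] = None


def record_test(artefact, test_name, passed):
    """Store the outcome of one named test against the artefact."""
    artefact.test_results[test_name] = passed


def promote(artefact, required_tests=(), approver=None):
    """Move the artefact one stage forward if its gate conditions hold."""
    current = STAGES.index(artefact.stage)
    if current == len(STAGES) - 1:
        raise ValueError("artefact is already in production")

    # Every required test must have been recorded as passed.
    missing = [t for t in required_tests if not artefact.test_results.get(t)]
    if missing:
        raise ValueError(f"tests not passed: {missing}")

    next_stage = STAGES[current + 1]
    # The final step needs a named human approver, e.g. the product owner.
    if next_stage == "production":
        if approver is None:
            raise PermissionError("production release needs a human approver")
        artefact.approved_by = approver

    artefact.stage = next_stage
    return artefact
```

In this sketch a model cannot reach production unless the A/B and stress tests are recorded as passed and an approver is named, which mirrors the two gates in the text.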

CI/CD with GitHub Actions

GitHub Actions is a workflow automation tool built into GitHub repositories. Automating processes inside GitHub gives teams an integrated way to improve their development productivity. Workflows are defined in YAML files in the .github/workflows folder at the root of the project. Events in the repository, such as pushes, pull requests, and releases, act as triggers that start workflows, which can orchestrate various tasks. For example, a team can have GitHub Actions run its suite of tests on every new commit to the project, maintaining confidence in the model’s capability.
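
As a rough idea of what such a test suite might contain, here is a small Python sketch that a workflow step could run on each commit. The `predict` function, the holdout examples, and the 0.75 accuracy floor are all invented for illustration; a real project would test its own model against its own data.

```python
import unittest


def predict(amount, threshold=1000.0):
    """Hypothetical stand-in model: flag an amount above a fixed threshold."""
    return 1 if amount > threshold else 0


# Tiny labelled holdout set, invented for illustration: (amount, expected label).
HOLDOUT = [(50.0, 0), (999.0, 0), (1500.0, 1), (20000.0, 1), (1200.0, 0)]


def accuracy(examples):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(1 for amount, label in examples if predict(amount) == label)
    return correct / len(examples)


class ModelQualityTests(unittest.TestCase):
    def test_output_is_binary(self):
        # The model must only ever return one of the two class labels.
        for amount, _ in HOLDOUT:
            self.assertIn(predict(amount), (0, 1))

    def test_accuracy_does_not_regress(self):
        # Assumed minimum accuracy; a real team would derive it from past runs.
        self.assertGreaterEqual(accuracy(HOLDOUT), 0.75)
```

Saved as a test module, this can be executed with `python -m unittest`, which a workflow step would typically invoke after checking out the commit.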

CI/CD with GitOps

To implement CI/CD, the business uses GitOps with Jenkins to run code quality checks and smoke tests in the test environment using production-like runs. Every pull request for model code goes through code review and automated unit tests in the team’s single pipeline. Pull requests are also subjected to automated smoke tests, which train models, make predictions, and execute the complete end-to-end pipeline on a small sample of real data to check that everything works as planned. For continuous delivery of models, a model quality report is generated after each model is trained. A domain expert inspects the report manually, and once the model is validated it is deployed manually.
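
A smoke test of this kind might look like the following Python sketch. It trains a deliberately trivial model (a one-feature least-squares line) on a small sample, makes predictions, and writes out a quality report for a reviewer. The data, the function names, and the report fields are assumptions made for the example, not the team’s actual pipeline.

```python
import json
import math
from statistics import mean

# Invented sample rows standing in for "a small sample of real data": (x, y).
SAMPLE = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
HOLDOUT = [(5.0, 10.0), (6.0, 12.1)]


def train(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    variance = sum((x - x_bar) ** 2 for x in xs)
    if variance == 0:
        raise ValueError("need at least two distinct x values to fit a line")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in rows) / variance
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def predict(model, x):
    return model["slope"] * x + model["intercept"]


def quality_report(model, rows):
    """Summarise prediction error so a domain expert can inspect it."""
    errors = [abs(predict(model, x) - y) for x, y in rows]
    return {
        "n_rows": len(rows),
        "mean_abs_error": mean(errors),
        "max_abs_error": max(errors),
    }


def smoke_test(sample, holdout):
    """Run train -> predict -> report end to end and fail on broken output."""
    model = train(sample)
    report = quality_report(model, holdout)
    # A smoke test checks that the pipeline runs and yields sane numbers,
    # not that the model is good enough to ship.
    if not all(math.isfinite(v) for v in report.values()):
        raise RuntimeError("pipeline produced non-finite metrics")
    return report


report = smoke_test(SAMPLE, HOLDOUT)
print(json.dumps(report, indent=2))
```

The printed report is what a reviewer would read; the decision to deploy stays with the domain expert, as in the text.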

CI/CD with AWS 

AWS Cloud offers managed workflow services such as AWS CodePipeline and AWS Step Functions to help machine learning developers with continuous integration and delivery. For continuous integration, the business uses git to commit to AWS CodeCommit, which triggers a build step in CodePipeline (via an AWS CodeBuild job), with AWS Step Functions orchestrating the workflow for each CodePipeline action. The workflow orchestration in Step Functions lets the business manage the complexity of running numerous models and pipelines with CodePipeline. Because each pipeline job in CodePipeline focuses on one process, the team’s multi-model deployments are easier to manage and update, and builds are easier to deliver and troubleshoot.
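
To give a feel for the orchestration side, the sketch below builds a Step Functions state machine definition (Amazon States Language, expressed as a Python dictionary) that chains tasks in sequence, one definition per model. It only produces the JSON document and does not call AWS. The step names and the resource ARNs are placeholders invented for this example.

```python
import json


def build_state_machine(model_name, steps):
    """Chain (state_name, resource_arn) pairs into a sequential workflow."""
    if not steps:
        raise ValueError("a workflow needs at least one step")

    states = {}
    for index, (name, resource) in enumerate(steps):
        state = {"Type": "Task", "Resource": resource}
        if index + 1 < len(steps):
            # Hand over to the following step once this task succeeds.
            state["Next"] = steps[index + 1][0]
        else:
            # The last task ends the execution.
            state["End"] = True
        states[name] = state

    return {
        "Comment": f"Workflow for {model_name}",
        "StartAt": steps[0][0],
        "States": states,
    }


# Placeholder ARNs: REGION and ACCOUNT_ID are not real values.
STEPS = [
    ("Train", "arn:aws:lambda:REGION:ACCOUNT_ID:function:train"),
    ("Evaluate", "arn:aws:lambda:REGION:ACCOUNT_ID:function:evaluate"),
    ("Deploy", "arn:aws:lambda:REGION:ACCOUNT_ID:function:deploy"),
]

definition = build_state_machine("example-model", STEPS)
print(json.dumps(definition, indent=2))
```

Generating one small definition per model keeps each workflow focused on a single process, which is the property the section credits for easier multi-model management.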

CI/CD with Vertex AI

Businesses can use the managed Vertex AI Pipelines product and TensorFlow Extended, both running on Google Cloud infrastructure, to orchestrate and manage continuous integration, delivery, and deployment for their machine learning pipelines. Using an ML-native pipeline tool instead of traditional CI/CD technologies lets the business keep model quality consistent and ensure that every model goes through the standard steps of feature engineering, model scoring, model analysis, model validation, and model monitoring in one unified pipeline.
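
The idea of one unified pipeline can be illustrated without any cloud SDK. The plain-Python sketch below does not use Vertex AI Pipelines or TensorFlow Extended; it only shows stages that share a context and a validation step that blocks a model whose error is too high. The stage bodies, the context keys, and the error threshold are assumptions for the example, and monitoring is left out because it happens after deployment.

```python
def feature_engineering(ctx):
    # Derive two features per raw value: the value and its square.
    ctx["features"] = [[x, x * x] for x in ctx["raw"]]
    return ctx


def model_scoring(ctx):
    # Score each row with a fixed linear model supplied in the context.
    w = ctx["weights"]
    ctx["scores"] = [w[0] * f[0] + w[1] * f[1] for f in ctx["features"]]
    return ctx


def model_analysis(ctx):
    # Compare scores with labels and record the mean absolute error.
    errors = [abs(s - y) for s, y in zip(ctx["scores"], ctx["labels"])]
    ctx["mean_abs_error"] = sum(errors) / len(errors)
    return ctx


def model_validation(ctx):
    # Stop the pipeline if the error exceeds the agreed limit.
    if ctx["mean_abs_error"] > ctx["max_error"]:
        raise ValueError("model failed validation")
    ctx["validated"] = True
    return ctx


# Every model passes through the same ordered list of stages.
PIPELINE = [feature_engineering, model_scoring, model_analysis, model_validation]


def run_pipeline(ctx):
    completed = []
    for stage in PIPELINE:
        ctx = stage(ctx)
        completed.append(stage.__name__)
    ctx["completed"] = completed
    return ctx
```

Because every model runs through the same `PIPELINE` list, no stage can be skipped for one model and kept for another, which is the consistency benefit the section describes.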

Source: https://analyticsindiamag.com/5-ways-machine-learning-uses-ci-cd-in-production/