
AI and ML: The new frontier for data center innovation and optimization

This article is part of a VB special issue. Read the full series here: Data centers in 2023: How to do more with less.

As the demand for data processing and storage continues to surge, data centers are grappling with the challenge of evolving and expanding. The changing landscape of platforms, equipment design, topologies, power density requirements and cooling demands all underscore the pressing need for new architectural designs. 

Data center infrastructures often struggle to align current and projected IT loads with their critical infrastructure, resulting in a mismatch that threatens their ability to meet escalating requirements. Against this backdrop, traditional data center approaches must be revised.

Data centers are now integrating artificial intelligence (AI) and machine learning (ML) technologies into their infrastructure to remain competitive. By implementing an AI-driven layer within traditional data center architectures, companies can create autonomous data centers that can optimize and perform generic data engineering tasks without human intervention.

Turbocharging traditional architectures with AI

The proliferation of AI and ML technologies within data centers has been notable in recent years. AI is driving efficiency and performance across various use cases.



“AI-driven data centers can help organizations gain a competitive advantage by optimizing application performance and availability, which in turn helps increase customer satisfaction and loyalty,” said Sajid Mohamedy, EVP of Silicon Valley-based technology consulting firm Nisum. “Adding AI to the mix aids optimized resource allocation, which improves data center efficiency and reduces costs.”

Fast failure detection and prediction, root cause analysis, power usage optimization and resource capacity allocation optimization are just a few examples where data and algorithm-driven technologies are being deployed to maximize data center efficiency.

Incorporating AI into the data center is becoming increasingly necessary for every data-driven business, as outages are becoming more frequent and expensive. AI-driven data centers offer an array of benefits, chief among them the potential to slash downtime and enhance overall system reliability, ultimately translating into massive cost savings for organizations.

Increased fault detection and prediction abilities

According to Ellen Campana, leader of enterprise AI at KPMG U.S., AI has historically been employed to enhance data storage optimization, energy utilization and accessibility. However, in recent years, there has been a discernible trend in expanding AI’s utility to encompass fault detection and prediction, which can trigger self-healing mechanisms.

“The key to streamlining automated detection is providing the AI with a window into the details of hardware and software operations, including network traffic,” Campana told VentureBeat. “If traffic within a certain node is slowing, AI can detect that pattern and trigger restart to a process or the entire node.” 
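As a rough illustration of the pattern Campana describes, the sketch below flags a node whose throughput drops well below its recent baseline. It is a minimal statistical stand-in under stated assumptions, not any vendor's method: the `SlowdownDetector` class, the window size, the threshold and the sample values are all hypothetical, and a production system would likely use a trained model and a real orchestration hook for the restart.

```python
from collections import deque
from statistics import mean, stdev


class SlowdownDetector:
    """Flags a node whose throughput falls well below its recent baseline."""

    def __init__(self, window=5, threshold=3.0):
        self.window = window          # number of past samples used as the baseline
        self.threshold = threshold    # how many standard deviations count as "slow"
        self.history = deque(maxlen=window)

    def observe(self, throughput_mbps):
        """Return True if this sample is anomalously slow versus the baseline."""
        slow = False
        if len(self.history) == self.window:
            baseline = mean(self.history)
            spread = stdev(self.history)
            # Guard against a perfectly flat baseline (spread == 0).
            floor = baseline - self.threshold * max(spread, 1e-9)
            slow = throughput_mbps < floor
        if not slow:
            # Only healthy samples update the baseline, so a slowdown
            # does not gradually become the new "normal".
            self.history.append(throughput_mbps)
        return slow


def check_node(detector, samples):
    """Return the index of the first slow sample, or None if none is found."""
    for i, sample in enumerate(samples):
        if detector.observe(sample):
            return i
    return None


# Hypothetical throughput readings (Mbps) for one node; the last one collapses.
samples = [100, 102, 98, 101, 99, 100, 40]
first_slow = check_node(SlowdownDetector(), samples)
```

In this sketch, the caller would treat a non-`None` result as the signal to restart a process or the whole node, which is the self-healing step the article refers to.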

Pratik Gupta, chief technology officer at IBM Automation, posits that AI has transformative potential across the data center and hybrid cloud environments. By bolstering user experiences in applications, streamlining operations, and empowering CIOs and business decision-makers to glean insights from an array of data, AI catalyzes innovation and optimization.

A clear picture of app resourcing levels

IBM expects data center energy consumption to increase by 12% (or more) by 2030, due to the expiration of Moore’s Law and an explosion of data volume, velocity and energy-intensive workloads, said Gupta.

“Simply put, AI can reduce the amount of hardware to purchase, maintain, manage and monitor,” he said.

Data center managers must maintain a clear picture of their organization’s application resourcing levels, allowing for nimble scaling to meet demand in real-time, said Gupta. AI-powered automation can play a key role in this process, mitigating the risk of resource congestion and latency while ensuring that hardware workloads remain safe and performance standards are upheld.

IBM’s Turbonomic, for instance, can automatically optimize application resourcing levels and scale with business needs.

“This enables IT managers to have a single dashboard to oversee resourcing levels, make decisions in real-time and brings efficiency as they ensure none of their apps get over-provisioned,” said Gupta.
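To make the idea of spotting over-provisioned apps concrete, here is a small sketch that compares each app's reserved CPUs with its observed peak and suggests a smaller allocation where the gap is large. It does not represent Turbonomic or its API; the `AppUsage` type, the `rightsize` function, the 25% headroom and the example apps are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class AppUsage:
    name: str
    allocated_cpus: float   # CPUs currently reserved for the app
    peak_cpus: float        # highest CPU use observed over the review period


def rightsize(apps, headroom=0.25, min_saving=1.0):
    """Suggest smaller allocations for apps that are clearly over-provisioned.

    headroom: fraction of spare capacity kept above the observed peak.
    min_saving: ignore suggestions that would free less than this many CPUs.
    """
    suggestions = {}
    for app in apps:
        target = app.peak_cpus * (1 + headroom)
        if app.allocated_cpus - target >= min_saving:
            suggestions[app.name] = round(target, 2)
    return suggestions


# Hypothetical apps: "checkout" reserves far more than it uses, "search" does not.
apps = [
    AppUsage("checkout", allocated_cpus=16, peak_cpus=4),
    AppUsage("search", allocated_cpus=8, peak_cpus=6),
]
suggestions = rightsize(apps)
```

Keeping headroom above the observed peak is a deliberate choice in this sketch: it trades a little efficiency for protection against the congestion and latency risks mentioned above.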

Maximizing the benefits of AI-driven data centers

AI and ML use cases in data centers continue to grow, but organizations must consider some key factors before implementing them. While pre-packaged AI and ML solutions are increasingly available, they still require integration beyond individual point solutions. DIY AI deployments are possible but require investment in sensors to collect data and expertise to convert that data into usable insights. 

“Many organizations choose to implement their own data centers precisely because they can be sure that data will not be pooled with others’ data or used in ways they cannot control,” said KPMG’s Campana. “While this is true, organizations must then accept the responsibility of maintaining security and privacy.” 

With the right resources, data centers can become smarter and more efficient, but achieving this goal requires optimal planning.

“Planning should be a key pillar of implementing AI-driven data centers,” said IBM’s Gupta. “Successful deployments don’t happen overnight, and need a significant amount of iteration and thought before being rolled out. IT leaders need to consider factors such as understanding what hardware they can and should keep and what workloads they need to move to the cloud.”

Flexibility critical

The key to success for AI-driven data centers is to take a strategic approach. This means identifying the right use cases for AI and ML, investing in the necessary infrastructure and tools and developing a skilled workforce to effectively manage and maintain systems.

“Companies often maintain sprawling infrastructure, from distributed data center locations to various cloud deployments,” said Gupta. “IT leaders need to consider whether they need to build a lake for all data sources to converge…or bring the data preparation, ML and AI tools to each location. As companies transform their IT infrastructure, they must not only consider the value being delivered but also the vulnerabilities being created.”

Gupta noted that best-laid plans can go awry. “The same can be true for technology rollouts, and the nimble organization that can adjust course quickly will be more successful,” he added.

Four emerging strategies for improving IT and data center performance

AIOps, MLOps, DevOps and SecOps each have unique strengths. When combined, they are optimizing data center operations and broader IT performance, reducing costs and enabling service improvements.

AIOps automates and scales corporate-wide data center and IT workflows

AIOps is becoming core to enterprises’ sustainability and carbon reduction efforts in data centers and has proved effective in identifying why performance gaps occur. Core to this technology is its ability to interpret and suggest actions based on real-time performance data (causal analysis).

For example, Walmart is using AIOps to streamline e-commerce operations. AIOps relies on a combination of ML models and Natural Language Processing (NLP) to discover new process workflows that can improve the accuracy, cost-effectiveness and efficiency of data center operations. Retailers also use AIOps to detect and resolve inefficient and disconnected processes in real-time while also automating tech stacks and broader infrastructure management.

AIOps enables more accurate real-time anomaly detection within e-commerce platforms. The technology also excels at correlating data from all available sources across a data center to provide a 360-degree view of operations and identify where availability, cost control and performance can be improved.

Retailers rely on DevOps to accelerate app development

Retailers rely on DevOps to stay competitive and shorten time-to-market for new apps and features. DevOps is a software development methodology that emphasizes collaboration and communication between software developers and IT operations teams. It has proven effective in streamlining software development and delivery for new mobile apps, website features and customer experience-based enhancements.

Amazon, Target, Nordstrom, Walmart and other leading retailers have adopted DevOps as their main software development process. Retail CIOs tell VentureBeat that the higher the quality of the DevOps code base, the more efficiently data centers run when delivering the latest app release to customers worldwide.

MLOps offers a lifecycle-based approach

As retailers recruit more data scientists, MLOps becomes just as important as DevOps for keeping models current and usable. MLOps applies DevOps principles to ML models and algorithms. Leading retailers use MLOps to design, test and release new models to improve customer segmentation, demand forecasting and inventory management.

MLOps is proving effective in solving the most costly and challenging problems in retail, starting with inventory management and optimization. Supply chain uncertainty, chronic labor shortages and spiraling inflationary costs are making inventory management a make-or-break area for retailers.

Macy’s, Walmart and others are using MLOps to optimize pricing and inventory management, helping retailers make decisions that reduce costs and protect themselves from the downside risk of holding too much inventory. 

SecOps relies on AI and ML to secure every identity and threat surface

SecOps ensures data centers and the broader IT infrastructure stay secure and compliant. Zero trust security, which assumes no user or device can be trusted and every identity must be verified, is the foundation of any successful SecOps implementation. The goal is to reduce the attack surface and the risk posed by increasingly sophisticated cyberattacks.

SecOps optimizes data center security by combining the most proven techniques for reducing intrusions and breaches. Adopting zero trust security measures helps retailers protect the identities of their customers, employees, and suppliers, and microsegmentation can limit the blast radius of any attack. 

Leading retailers rely on AIOps, MLOps, DevOps and SecOps to gain greater efficiency, security and performance in their data centers. Source: VentureBeat analysis of leading retailers’ use of AIOps, MLOps, DevOps and SecOps in data centers.

AI and the future of data center technology

Edge computing is emerging as one of the most promising technologies for developing AI-driven data centers. By processing data closer to the source, edge computing reduces latency and improves overall performance. When combined with AI, this technology offers the potential to achieve real-time analysis and decision-making capabilities, making data centers capable of handling mission-critical applications in the future.

“The move to 5G was a major step in this transition and is fueling a wave of innovation in AI-based software infrastructure,” said KPMG’s Campana. “For businesses beginning new data centers, it is worthwhile to consider their timeline for adopting 5G and making other updates of end-user hardware.”

For his part, IBM’s Gupta sees data intelligent automation as a way to continue making inroads into heavily regulated industries, as AI and data center tools will be designed to automatically meet compliance requirements. 

“As AI and automation get embedded further into data centers, they will be able to meet the most stringent compliance protocols,” he said.
