Powering intelligence - how AI training and inferencing are reshaping data centre demands

     


By: Canninah Dladla - Cluster President for English-speaking Africa at Schneider Electric

The IT industry is undergoing one of its most defining shifts to date, driven by the explosive growth of generative AI. These powerful language models are pushing the limits of traditional data centre infrastructure. The upgrades operators prioritise will depend largely on whether they’re handling AI training or inference workloads.

Training an AI model consumes enormous amounts of power, often at densities exceeding 100kW per rack, and demands advanced cooling and electrical designs.

Inference, while less power-intensive per server, is evolving rapidly and spreading across environments — from cloud and colocation facilities to on-premises data centres and even the edge. It is at this stage that AI delivers real business value, making optimised infrastructure critical.

Together, training and inference are redefining what data centres must handle in terms of scale, density, flexibility, and resilience. Operators must therefore understand these shifting workloads and proactively adapt to meet a new era of intelligent computing.

AI training vs. inference: the infrastructure divide

Training involves teaching AI models to identify patterns within vast datasets using high-performance GPU clusters. These operate as unified virtual compute nodes, often across multiple racks, where power densities regularly exceed 100kW per rack. To manage the resulting thermal loads, direct-to-chip liquid cooling and rear door heat exchangers are becoming standard.

In contrast, inference is when trained models make predictions or decisions based on new data in real time. Though traditionally less demanding, inference workloads are becoming more complex — spanning chatbots, healthcare analysis, retail automation, and autonomous systems. Depending on deployment, rack densities can range from under 40kW to as high as 80kW for advanced use cases.

By 2030, the data centre market will likely reflect this diversity:

~25% of new builds: <40kW/rack (primarily inference)

~50%: 40–80kW/rack (mixed inference and training)

~25%: >100kW/rack (dedicated training clusters)
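
One way to read that mix is as a blended average density. A rough, illustrative calculation in Python follows; the midpoint densities per segment are assumptions for illustration, not figures from this article.

    # Back-of-the-envelope blended rack density implied by the projected 2030 mix.
    # The representative kW figures per segment are assumptions for illustration.
    segments = [
        # (share of new builds, assumed representative kW per rack)
        (0.25, 30),    # <40kW/rack, primarily inference
        (0.50, 60),    # 40-80kW/rack, mixed inference and training
        (0.25, 110),   # >100kW/rack, dedicated training clusters
    ]

    blended_kw = sum(share * kw for share, kw in segments)
    print(f"Blended average density: ~{blended_kw:.0f}kW per rack")  # ~65kW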

Where inference happens — and why it matters

1. Public cloud
The public cloud remains the dominant environment for inference due to its flexibility, scalability, and mature ecosystem. GPU-accelerated servers, high-throughput networking, and intelligent workload orchestration are key. As inference workloads grow in complexity, energy-efficient architectures and advanced thermal management are vital to sustain performance and sustainability.

2. Colocation & on-premises
As use cases mature, organisations seek greater control over latency, costs, and data sovereignty. In regulated sectors like healthcare, finance, and manufacturing, localised AI models enable real-time insights without moving sensitive data. Here, infrastructure must be future-proofed — racks supporting 20kW today may soon need to handle double that. Cooling, power distribution, and backup systems must therefore be scalable.
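
To illustrate that future-proofing point, the short sketch below checks whether a hypothetical colocation row designed around 20kW racks could absorb a doubling to 40kW; every figure in it is an assumed placeholder, not a measured value.

    # Hypothetical future-proofing check: can a row sized for ~20kW racks
    # absorb the "double that" scenario? All numbers are illustrative assumptions.
    racks = 12                    # racks in the row
    future_kw_per_rack = 40.0     # doubled design density
    row_power_budget_kw = 400.0   # assumed upstream power capacity for the row
    row_cooling_kw = 380.0        # assumed cooling capacity for the row

    future_load_kw = racks * future_kw_per_rack
    print(f"Projected row load: {future_load_kw:.0f}kW")
    print(f"Power headroom:   {row_power_budget_kw - future_load_kw:+.0f}kW")
    print(f"Cooling headroom: {row_cooling_kw - future_load_kw:+.0f}kW")
    # Negative headroom means power distribution, cooling or backup systems
    # must be scaled before rack densities are allowed to rise.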

3. Edge computing
Dynamic AI applications — from smart retail to telecom infrastructure — require inference at the edge, where minimising latency is crucial. Edge sites face tight space and power constraints, demanding compact, high-efficiency designs capable of delivering performance within a small footprint while maintaining thermal stability.
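
A simple latency budget shows why proximity matters here. The sketch below compares end-to-end response times for the same model served from different locations; the distances, overheads, and model execution time are assumed values for illustration only.

    # Illustrative latency budget for the same inference request served from
    # different locations. All figures are assumptions, not measurements.
    sites = [
        # (location, one-way distance in km, assumed network overhead in ms)
        ("regional cloud", 800, 10.0),
        ("metro colocation", 50, 3.0),
        ("on-site edge", 0.1, 0.5),
    ]
    inference_ms = 15.0  # assumed model execution time

    for name, km, overhead_ms in sites:
        propagation_ms = 2 * km * 0.005  # ~5 microseconds per km each way in fibre
        print(f"{name:>16}: ~{propagation_ms + overhead_ms + inference_ms:.1f} ms end to end")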

Designing infrastructure for AI workloads

As inference accelerates across these environments, infrastructure design must be flexible, modular, and optimised for density and efficiency.

Supporting training: ultra-high-density strategies:

  • Support 100kW or more per rack
  • Direct-to-chip and rear door heat exchanger cooling
  • Modular, scalable electrical architectures
  • High-performance interconnects for GPU cluster synchronisation

As accelerators evolve, their thermal design power (TDP) will increase — making scalable cooling systems essential.
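
To see how quickly accelerator TDP translates into rack-level load, here is a minimal back-of-the-envelope sketch; the server configuration and per-component figures are assumptions for illustration, not vendor specifications.

    # Rough per-rack power estimate for a training configuration, driven by
    # accelerator TDP. Every figure below is an illustrative assumption.
    accelerators_per_server = 8
    servers_per_rack = 10
    accelerator_tdp_kw = 1.2   # assumed TDP per accelerator
    host_overhead_kw = 2.0     # assumed CPUs, memory, NICs and fans per server
    networking_kw = 3.0        # assumed in-rack switching

    rack_kw = servers_per_rack * (accelerators_per_server * accelerator_tdp_kw
                                  + host_overhead_kw) + networking_kw
    print(f"Estimated rack load: ~{rack_kw:.0f}kW")
    # ~119kW here, consistent with the >100kW per rack figure above; as TDP
    # climbs, the same configuration pushes cooling requirements higher still.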

Supporting inference: medium-to-high-density strategies:

  • Support at least 40kW per rack
  • Employ compressed, optimised models for efficiency
  • Use hot/cold aisle containment for <40kW, upgrading to liquid cooling as density rises
  • Deploy intelligent PDUs for dynamic load scaling

Ultimately, rack power and cooling strategies must align with the deployment environment, future-proofing for increasing workload complexity.
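
As a rough illustration of how those density thresholds might steer the cooling choice, the sketch below maps a design density onto an approach; the cut-offs echo the ranges discussed above but are simplifications, not a design rule.

    # Simplified mapping from design rack density to a cooling approach,
    # echoing the thresholds discussed above. Real designs weigh many more
    # factors (facility water, floor loading, accelerator roadmap).
    def cooling_approach(rack_kw: float) -> str:
        if rack_kw < 40:
            return "hot/cold aisle containment (air)"
        if rack_kw <= 80:
            return "rear door heat exchangers, plan the transition to liquid"
        return "direct-to-chip liquid cooling"

    for density in (25, 60, 120):
        print(f"{density:>3}kW/rack -> {cooling_approach(density)}")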

The importance of software

For both training and inference, monitoring, automation, and lifecycle support must be built into the infrastructure.

Tools such as Data Centre Infrastructure Management (DCIM), Electrical Power Monitoring Systems (EPMS), Building Management Systems (BMS), and digital design software are foundational. 

Also, managing mixed environments with both air- and liquid-cooled servers requires real-time monitoring, capacity planning, and automated responses — providing a frontline defence against operational risks.
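
To give a feel for what that automation looks like, here is a minimal sketch of the kind of threshold check such software performs; the telemetry structure, limits, and rack names are hypothetical and not tied to any particular DCIM, EPMS, or BMS product.

    # Minimal sketch of an automated threshold check over rack telemetry.
    # The readings, limits and rack identifiers are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class RackReading:
        rack_id: str
        power_kw: float
        inlet_temp_c: float
        cooling: str  # "air" or "liquid"

    # Assumed (power kW, inlet temperature C) limits per cooling type
    LIMITS = {"air": (40.0, 27.0), "liquid": (100.0, 32.0)}

    def check(reading: RackReading) -> list:
        max_kw, max_temp = LIMITS[reading.cooling]
        alerts = []
        if reading.power_kw > max_kw:
            alerts.append(f"{reading.rack_id}: load {reading.power_kw}kW exceeds {max_kw}kW budget")
        if reading.inlet_temp_c > max_temp:
            alerts.append(f"{reading.rack_id}: inlet {reading.inlet_temp_c}C above {max_temp}C limit")
        return alerts

    print(check(RackReading("R07", power_kw=46.5, inlet_temp_c=28.1, cooling="air")))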

What’s next? Preparing for evolving inference demands

As generative AI matures, inference workloads will become more complex and pervasive. Three trends are shaping this future:

1. More complex models - larger, multimodal (text + image + video) models with wider context windows will push power densities higher, even outside training environments.

2. Closer to the data source - low-latency, on-site decision-making will expand, driving more high-density inference to edge and colocation facilities.

3. AI-as-a-Service growth - managed AI services will further decentralise inference, requiring modular infrastructure to support diverse customer workloads.

The result: data centres must support a broad spectrum of power and cooling configurations, from lightweight edge servers to dense racks running inference and training side by side.

Whether training massive models in hyperscale facilities or deploying optimised inference at the edge, AI workloads demand new thinking around power, cooling, and energy efficiency. Every deployment is unique, shaped by workload type, GPU power needs, and cluster size.

Lastly, operators looking to stay ahead should dive deeper into how AI is reshaping data centre design. Understanding the unique demands of training and inference is the first step; building infrastructure that can evolve alongside them is the real challenge.

 

Edited by Creamer Media Reporter
