Designing Data Centers for AI: Cooling, Power and Electrical Distribution Patterns for High-Density GPU Pods

Practical design patterns for powering and cooling high-density GPU pods in 2026 — transformer sizing, PDUs, busways, liquid cooling and PUE strategies.

The immediate challenge: how to host AI at scale without blowing fuses or your PUE

AI clusters in 2026 are no longer incremental workloads — they're concentrated, bursty and extremely power-dense. Operators and IT managers tell us the same things: capacity upgrades take months, utility upgrades are costly and politically visible, and cooling limits are the bottleneck for pod scale. If you are planning GPU pods or expanding existing clusters, the hardware and electrical patterns you choose now will determine your uptime, PUE and total cost of ownership for years.

Executive summary: what to do first

  • Capacity-plan from transformer to rack: start at the utility interface and size transformers for IT load + harmonics + future growth; don’t back into sizing from breakers.
  • Choose the right electrical topology: MV/LV transformer placement, busway distribution, and dual A/B feeds reduce outage surface area and speed deployment.
  • Re-think cooling as part of the electrical design: warm-water and direct liquid cooling drastically reduce CRAH load and PUE — design piping, plate exchangers and pumps with electrical draw in your models.
  • Plan for BESS and on-site resilience: in 2026, policy and grid constraints are pushing data centres to internalize more power costs; batteries and on-site generation help manage demand charges and reduce peak grid draw.

Through late 2025 and early 2026, regulators and grid operators have focused increasingly on AI-driven load growth. Policy changes (for example, proposals shifting a greater share of interconnection and generation costs to large users) are making on-site power economics and peak management central to data-centre financial models. Operationally, industry deployments show that rack power densities of 20–80 kW and GPU pods of 200–1,000+ kW are common for training clusters. That changes everything: transformer inrush, harmonic heating, cooling ΔT strategies and even site layout.

Example: recent policy moves in 2026 highlight that data centres may be expected to internalize some grid capacity costs as AI demand grows — plan designs that reduce peak grid draw and justify on-site investments.

Electrical design patterns: from transformers to PDUs

Transformer sizing and placement — rules of thumb and worked example

Best practice is to size transformers from a bottom-up IT load model (not top-down). The steps:

  1. Quantify sustained IT load (kW) for the pod and include facility overheads (lights, pumps, controls).
  2. Apply power factor & harmonic corrections (assume PF 0.9 nominal for heavily rectified GPU loads).
  3. Add a growth/contingency factor (25–50%) depending on planned upgrade cadence.
  4. Choose standard transformer sizes and evaluate parallel transformer configurations for redundancy.

Worked example: a 10-rack GPU pod with 40 kW per rack sustained IT = 400 kW. Add 10% facility overhead → 440 kW. With PF = 0.9, apparent power S = 440 / 0.9 ≈ 489 kVA. Add 25% growth margin → 611 kVA → select a 750 kVA transformer (or two 500 kVA transformers paralleled for redundancy). Where possible place transformers in an MV yard or separate room with oil containment; prefer pad-mounted units for modular growth and shorter MV cable runs.
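The same arithmetic is easy to capture in a short script. Below is a minimal Python sketch that reproduces the worked example; the function name, default factors and the 10-rack numbers are illustrative assumptions, not a standard.

```python
def transformer_kva(it_load_kw, overhead_frac=0.10, power_factor=0.9, growth_frac=0.25):
    """Bottom-up transformer sizing: IT load -> facility load -> kVA -> growth margin."""
    facility_kw = it_load_kw * (1 + overhead_frac)    # add lights, pumps, controls
    apparent_kva = facility_kw / power_factor         # kVA = kW / PF
    required_kva = apparent_kva * (1 + growth_frac)   # contingency for planned growth
    return facility_kw, apparent_kva, required_kva

facility_kw, apparent_kva, required_kva = transformer_kva(400)
print(f"Facility load: {facility_kw:.0f} kW")       # 440 kW
print(f"Apparent power: {apparent_kva:.0f} kVA")    # ~489 kVA
print(f"With growth margin: {required_kva:.0f} kVA -> next standard size, e.g. 750 kVA")
```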

Transformer placement considerations

  • Indoor dry-type vs outdoor oil-filled: dry-type reduces fire risk but can be larger and warmer; oil-filled has higher efficiency but requires spill containment and NFPA-compliant rooms.
  • Locating transformers close to the building entrance reduces LV feeder length and copper cost but requires careful short-circuit and protection coordination.
  • Parallel transformers enable staged upgrades — design switchgear and protection for paralleling from day one.

PDUs, breaker sizing and rack feeds — practical math

High-density racks push conventional PDU approaches beyond their limits. You should design rack feeds on a per-rack power model and size PDUs to provide both capacity and redundancy.

Use the 3-phase power formula: P (W) = 1.732 × V × I × PF, i.e. kW = 1.732 × V × I × PF / 1,000. Rearranged: I (A) = kW × 1,000 / (1.732 × V × PF).

Example: a 30 kW rack on 208 V 3-phase with PF 0.9 draws I ≈ 30,000 / (1.732 × 208 × 0.9) ≈ 93 A. A single 100 A 3-phase breaker could nominally serve that rack, but continuous-load derating (the 80% rule) and redundancy both argue for dual A/B 100 A feeds, or multiple 60 A breakers per phase, depending on your chosen PDU topology.
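A quick sketch of the rearranged formula, assuming the same 30 kW rack on 208 V 3-phase at PF 0.9 (the helper name and defaults are illustrative):

```python
import math

def rack_feed_current(rack_kw, line_voltage=208, power_factor=0.9):
    """Line current (A) for a 3-phase rack feed: I = W / (sqrt(3) * V_LL * PF)."""
    return rack_kw * 1000 / (math.sqrt(3) * line_voltage * power_factor)

i = rack_feed_current(30)        # 30 kW rack on 208 V 3-phase
print(f"{i:.0f} A per feed")     # ~93 A -> dual A/B 100 A feeds for redundancy
```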

Recommendations:

  • Prefer 3-phase 208V or 400/415V distributions to reduce current and conductor sizes.
  • Use dual A/B fed rack PDUs where uptime is critical. Size each leg to meet expected rack draw while leaving headroom for GPU burn-in.
  • Consider high-density power shelves (e.g., 400 A bus or high-current PDUs) with local distribution to racks to reduce breaker proliferation.

Busways and modular distribution

Busways accelerate deployment and reduce LV copper and conduit clutter. They allow in-aisle tap-offs and straightforward redistribution as rack densities change.

Design tips: use segmented busway with insulated tap-offs rated for your maximum fault-current and temperature. Separate busways for A and B feeds and locate busways above the aisle to limit mechanical interference with cooling plumbing.

UPS, BESS and peak management — grid realities in 2026

Two trends make UPS and BESS planning urgent: higher base loads from AI, and policy shifts under which utilities and planners ask heavy users to pay for grid upgrades. The result is a growing incentive to manage and shape peaks rather than rely solely on utility upgrades.

  • UPS topology: modular, containerised UPS or distributed in-row UPS (IRU) both work — choose based on serviceability and space. For GPU bursts, ensure UPS can support high inrush and fast load changes.
  • BESS: battery energy storage systems are now a practical tool to shave peaks, participate in demand response and reduce interconnection upgrade costs. Design BESS integration into the electrical one-line early to avoid retrofits (a first-pass sizing sketch follows this list).
  • On-site generation: consider gas gensets or fuel cells as long-duration resilience — but model emissions and utility incentives when doing so.
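As a rough starting point for BESS sizing, the sketch below estimates the power and energy needed to cap site demand at a target peak, given an interval load profile. The profile, the 750 kW cap and the function name are hypothetical, and the result ignores round-trip losses and recharge constraints.

```python
def size_bess_for_peak_shaving(load_kw, interval_h, target_peak_kw):
    """Estimate BESS power (kW) and energy (kWh) needed to cap demand at target_peak_kw.

    load_kw: list of average site load per interval (kW)
    interval_h: interval length in hours (e.g. 0.25 for 15-minute demand intervals)
    """
    excess = [max(0.0, kw - target_peak_kw) for kw in load_kw]
    power_kw = max(excess)                             # worst single-interval shortfall
    energy_kwh = sum(e * interval_h for e in excess)   # total energy above the cap, assuming no recharge
    return power_kw, energy_kwh

# Hypothetical 15-minute profile around a training-job ramp (kW)
profile = [650, 700, 900, 950, 900, 800, 700]
p, e = size_bess_for_peak_shaving(profile, 0.25, 750)
print(f"BESS power >= {p:.0f} kW, usable energy >= {e:.0f} kWh")
```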

Cooling architectures for high-density GPU pods

Air cooling cannot reasonably support sustained rack densities >30–40 kW if you want predictable, low-PUE operation. Liquid cooling (direct-to-chip, rear-door, or immersion) is the standard for modern GPU pods. Each has trade-offs in electrical and facility design.

Direct-to-chip (D2C) and rear-door heat exchangers

D2C uses cold plates on the CPU/GPU and a secondary facility loop to transfer heat to chillers or heat reuse. Rear-door heat exchangers are a lower-risk intermediate step, returning warmer air to the room and reducing CRAH wattage.

Electrical implications:

  • Pumps add steady-state electrical load — model pump power into PUE and site electrical capacity (a first-order estimate follows this list).
  • Pumps should be supplied via redundant electrical paths and included in service-level agreements for availability.
  • Plate heat exchangers, valves and controls require space and must be accessible for maintenance.
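To put that pump load into the model, a first-order estimate follows from the heat balance Q = ṁ × cp × ΔT and the hydraulic power formula P = ρ·g·H·Q/η. The sketch below assumes a water-like coolant, a 10°C loop ΔT, 20 m of head and 60% pump efficiency; all of these are placeholders to replace with your own loop design.

```python
def loop_flow_lps(heat_kw, delta_t_c, cp_kj_per_kg_k=4.186):
    """Coolant flow (L/s) needed to remove heat_kw at a given loop delta-T.
    Assumes a water-like coolant (~1 kg per litre)."""
    return heat_kw / (cp_kj_per_kg_k * delta_t_c)

def pump_power_kw(flow_lps, head_m, pump_efficiency=0.6, density=1000.0, g=9.81):
    """Pump shaft power (kW): P = rho * g * H * Q / eta."""
    flow_m3s = flow_lps / 1000.0
    return density * g * head_m * flow_m3s / pump_efficiency / 1000.0

flow = loop_flow_lps(400, delta_t_c=10)     # 400 kW pod, 10 C rise on the IT loop
print(f"Flow: {flow:.1f} L/s")              # ~9.6 L/s
print(f"Pump power: {pump_power_kw(flow, head_m=20):.1f} kW")  # ~3.1 kW at 20 m head, 60% efficiency
```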

Immersion cooling (single-phase and two-phase)

Immersion provides the highest thermal density and lowest PUE but requires close collaboration with equipment vendors. Two-phase immersion (boiling dielectric) removes the need for chillers in many climates; single-phase often uses a pumped dielectric and external heat rejection.

Key design points:

  • Design electrical rooms and piping for dielectric fluid handling and spill containment.
  • Plan for rack-level containment and convenient serviceability — immersion service procedures differ from air-cooled racks and require trained staff.
  • Estimate PUE improvements conservatively — factor in additional pump and filtration loads.

Facility loop design: decouple IT and plant loops

Always design an IT (server) loop and a facility (chiller/cooling tower) loop with a plate heat exchanger between them. Benefits:

  • Protects server-side fluid quality.
  • Allows use of different fluids or temperatures on either side.
  • Makes maintenance and redundancy simpler.

Also design for fluid temperatures that enable free cooling and heat reuse. In modern D2C designs, return-water temperatures of 40–60°C are achievable and monetizable when partnered with district heating or industrial reuse.

PUE, WUE and efficiency modeling — realistic targets for 2026

Design-stage PUE modeling must include transformer, PDU and pump losses plus UPS and power conversion inefficiencies. Typical targets:

  • Air-cooled high-density pods: design PUE 1.2–1.4 (variable with climate).
  • Warm-water or D2C: design PUE 1.05–1.12.
  • Immersion: achievable PUE 1.03–1.08 in many deployments.

Include these losses explicitly in your one-line and energy model: transformer no-load and load losses, PDU efficiency (include rectifier and distribution losses), UPS conversion (or bypass in some architectures), and pumps/chillers. Also model part-load performance, not just nameplate.
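A minimal design-stage PUE model along those lines might look like the sketch below. The loss figures are illustrative assumptions, not vendor data, and the structure (UPS and distribution efficiency applied upstream of the IT load, plant loads added on top) is one common simplification.

```python
def design_pue(it_kw, transformer_loss_kw, ups_efficiency, distribution_efficiency,
               pump_kw, chiller_kw, misc_kw=0.0):
    """Design-stage PUE: total facility power divided by IT power.
    Conversion losses are applied upstream of the IT load; plant loads are added on top."""
    power_through_conversion = it_kw / (ups_efficiency * distribution_efficiency)
    total_kw = power_through_conversion + transformer_loss_kw + pump_kw + chiller_kw + misc_kw
    return total_kw / it_kw

# Illustrative numbers for a 400 kW D2C pod (assumed losses, not vendor data)
pue = design_pue(it_kw=400, transformer_loss_kw=4, ups_efficiency=0.97,
                 distribution_efficiency=0.99, pump_kw=6, chiller_kw=12)
print(f"Design PUE ~ {pue:.2f}")   # ~1.10 at this load point; repeat at part load
```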

Reliability, harmonics and protection coordination

Rectified GPU power supplies and high-frequency switching create harmonic currents that increase transformer and conductor heating. Design for this:

  • Specify transformers and neutrals rated for expected harmonic currents (a K-factor sketch follows this list).
  • Install harmonic filters where needed — active filters can limit K-factor heating.
  • Oversize neutrals in 208/120 systems supporting variable loads.
  • Coordinate protective relays to allow safe paralleling and fast transfer for maintenance operations.
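If you want to sanity-check a transformer specification against a measured or estimated harmonic spectrum, the IEEE C57.110 K-factor is a useful single number: K = Σ (Ih,pu)² × h², with each harmonic current expressed per-unit of total RMS current. The spectrum below is an assumed example, not measured data.

```python
import math

def k_factor(harmonic_currents):
    """Transformer K-factor per IEEE C57.110: K = sum(Ih_pu^2 * h^2),
    with each harmonic current expressed per-unit of total RMS current.

    harmonic_currents: dict of {harmonic_order: current in A}
    """
    i_rms = math.sqrt(sum(i ** 2 for i in harmonic_currents.values()))
    return sum((i / i_rms) ** 2 * h ** 2 for h, i in harmonic_currents.items())

# Illustrative spectrum for a rectifier-heavy load (amps per harmonic, assumed values)
spectrum = {1: 100.0, 3: 30.0, 5: 20.0, 7: 12.0, 9: 8.0, 11: 6.0}
print(f"K-factor ~ {k_factor(spectrum):.1f}")   # compare against K-4 / K-13 rated transformers
```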

Safety, codes and serviceability

Liquid cooling demands re-evaluation of fire suppression and electrical room design. Points to cover in design review:

  • NFPA and local code requirements for liquid-filled equipment and spill containment.
  • Fire suppression compatible with electronics — inert gas or pre-action systems are common.
  • Maintenance access: ensure you can isolate and remove a rack or heat exchanger without shutting down an entire pod.

Case study: Designing a 400 kW GPU pod (practical walk-through)

Scenario: 10 racks at 40 kW each → 400 kW IT load. Steps and decisions:

  1. Transformer: compute apparent power: 400 kW + 10% overhead = 440 kW. With PF 0.9 → 489 kVA. Add 25% growth → 611 kVA → specify two 500 kVA transformers in parallel or a single 750 kVA (prefer dual 500 kVA for staged growth and redundancy).
  2. Distribution: implement A/B busways feeding each rack, with tap-off PDUs on a 3-phase 208 V or 400/415 V supply. Note that a 40 kW rack draws roughly 124 A at 208 V but only about 62 A at 400/415 V, so dual 100 A feeds are comfortable at the higher voltage; at 208 V, specify higher-current feeds or split the rack across more circuits, depending on your selected PDU hardware.
  3. UPS & BESS: use modular UPS to cover short ride-through; integrate a BESS sized for peak shaving (e.g., 300 kW for 15–30 minutes) to reduce demand charges during training peaks.
  4. Cooling: select D2C with warm-water target supply 30–40°C and a facility loop that can free-cool most of the year. Model pump power into PUE and specify redundancy for pumps and heat exchangers.
  5. PUE outcome: with D2C and good system integration you can expect a design PUE ≈ 1.06–1.09; with immersion, target 1.04–1.06.
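As a compact recap, the snippet below pulls the case-study numbers together, reusing the assumptions from the sketches above; it is a back-of-envelope summary, not a full design worksheet.

```python
import math

# Case-study numbers for the 400 kW pod (10 racks x 40 kW), reusing earlier assumptions
it_kw, overhead, pf, growth = 400, 0.10, 0.9, 0.25
kva = it_kw * (1 + overhead) / pf * (1 + growth)      # ~611 kVA -> one 750 kVA or two 500 kVA units
amps_208 = 40_000 / (math.sqrt(3) * 208 * pf)         # ~124 A per 40 kW rack at 208 V
amps_415 = 40_000 / (math.sqrt(3) * 415 * pf)         # ~62 A per 40 kW rack at 415 V
bess_kwh = 300 * 0.5                                  # 300 kW shaved for up to 30 minutes
print(f"Transformer: {kva:.0f} kVA | Rack: {amps_208:.0f} A @208V / {amps_415:.0f} A @415V | BESS: {bess_kwh:.0f} kWh")
```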

Actionable checklist for your next design

  • Build a bottom-up IT load model per rack and pod, including transient peaks.
  • Size transformers from apparent power (kVA) and include harmonics and growth.
  • Plan A/B distribution with busways and rack PDUs that support your target redundancy.
  • Choose liquid-cooling architecture early — integrate pumps, loops and plate exchangers into electrical one-line diagrams.
  • Model PUE including plant, transformer and distribution inefficiencies at multiple load points.
  • Include BESS and/or demand-management strategies in the business case to reduce utility upgrade costs.
  • Specify harmonics mitigation and oversize neutrals where rectification loads dominate.
  • Design for maintainability: isolation valves, service aisles, quick disconnects and clear safety zones.

Future predictions — what to expect through 2028

Expect more regulatory pressure on large power users to internalize grid upgrade costs and more widespread adoption of BESS and demand-response integration. Liquid cooling and immersion will become the dominant pattern for training clusters as systems vendors deliver liquid-ready hardware and developers demand predictable thermal performance. Thermal reuse markets will expand, turning waste heat into revenue streams where local infrastructure exists.

Closing thoughts — balancing reliability, cost and sustainability

Designing data centres for AI is an exercise in systems engineering: electrical distribution, cooling, and operational policy must be designed together. The marginal cost of doing this right in 2026 is far lower than the operational cost and downtime risk of retrofitting later. Prioritize transformer and distribution capacity, adopt liquid cooling where density demands it, and bake peak-management (BESS) into your financial model given changing grid economics.

Key takeaways

  • Start at the utility interface: transformer sizing and interconnect decisions drive capital and schedule.
  • Design electrical and cooling in lockstep: every kW of pump or chiller load reduces your headroom and raises PUE.
  • Use modular busways and PDUs: they reduce deployment time and provide scalable capacity for evolving GPU densities.
  • Incorporate BESS and demand strategies: they mitigate utility upgrade costs and enable better economics for AI clusters.

For technical teams: if you want a ready-to-use worksheet that walks through transformer kVA, PDU breaker sizing, pump power and a PUE model specific to GPU pod topologies, download our GPU Pod Design Checklist or contact datacentres.online for a design review.

Schedule a free 30-minute architecture review with our Design & Hardware experts to validate your transformer sizing, PDU configuration and cooling topology before procurement. Protect uptime, reduce PUE and control long-term operating costs.
