Data Center Commissioning
An Overview of Mission Critical Cx
What is commissioning (Cx) and why is it important?
Without a robust Cx program, data center builders and operators risk the resiliency and reliability of the equipment meant to keep their data center running with maximum uptime. The industry standards organization ASHRAE, in its Guideline 0-2019, defines commissioning as “… a quality-focused process for enhancing the delivery of a project by achieving, validating, and documenting the performance of facility elements in meeting the objectives and criteria of the Owner. Cx extends through all phases of new or major renovation projects, from predesign to Owner occupancy and operation, with tasks during each phase to ensure verification of design, construction, and operator training.” Whether a data center is training an LLM, storing vital records, or supporting critical systems of any kind, keeping the compute online cannot be ensured without successful commissioning.
So what should a Cx program look like?
A proper Cx program begins at the inception of the project, during the planning and design phases. It is during this time that the project leaders, A&E team, and other experts think critically about the equipment being spec’d out for the build and what types of testing will be required to ensure the entire infrastructure operates as intended. The Cx program must also align with and uphold the owner’s project requirements (OPR) for the facility. Another early, relevant aspect is the submittal review phase, which ensures product compliance with the OPR.
The Cx program should be comprehensively documented, reviewed, and distributed across all involved stakeholders who will play a part in the project’s success.
Keep reading for an overview of data center commissioning.
Levels of Cx
There are typically five levels of Cx implemented across data center builds, L1 through L5.
Level 1 - Factory Acceptance Testing
The purpose of Factory Acceptance Testing (FAT) is to verify that individual components meet design specifications before arriving at the site by testing critical equipment and systems to ensure their ratings, performance, and safety align with project requirements.
For example, a 2MW generator may be tested under simulated load conditions at the manufacturer’s facility to confirm it meets runtime and efficiency standards before shipping.
Level 2 - Site Acceptance / Installation Verification
L2 Cx is performed to ensure equipment is correctly delivered, installed, and ready for testing by checking things like proper grounding, electrical connections, piping, airflow, and overall physical integrity. Additionally, design drawings, submittals, and manufacturer installation instructions should be reviewed.
For example, this process involves confirming that chillers and cooling towers are installed according to mechanical design drawings and ensuring that electrical switchgear is properly grounded before power is applied.
Level 3 - Startup / Pre-Functional Testing
L3 startup testing verifies that individual pieces of equipment power up, operate, and respond as expected. This can include energizing UPS systems, generators, cooling units, and switchgear to ensure proper startup; running standalone tests on HVAC controls, battery charging systems, and fuel systems; and validating communications between BMS, EPMS, security, and telecom systems.
For example, this process involves starting a UPS system with batteries, confirming it maintains voltage under no-load conditions, and testing alarm conditions within the EPMS.
Level 4 - Functional Performance Testing
L4 performance testing validates system performance under real-world operational loads through load testing and system integration. Typical activities include load bank testing for UPS systems, generators, and electrical distribution; chilled water and HVAC system flow testing with simulated IT loads; and testing failover scenarios such as automatic transfer switch (ATS) operation and redundant power feeds.
For example, a 100% IT load bank test may be run, followed by a full building power transfer, to confirm uninterrupted UPS operation and generator startup within the required time limits.
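To make the arithmetic behind a stepped load bank test concrete, here is a minimal sketch. The 25/50/75/100% steps, the 2,000 kW design load, and the 10-second start limit are illustrative assumptions of mine, not values taken from any standard or from a real project; the actual steps and limits come from the OPR and the test scripts.

```python
def load_bank_steps(design_load_kw, step_percents=(25, 50, 75, 100)):
    """Return (percent, kW) pairs for a stepped load bank test.

    step_percents is an assumed schedule; real test scripts define their own.
    """
    if design_load_kw <= 0:
        raise ValueError("design_load_kw must be positive")
    return [(pct, design_load_kw * pct / 100) for pct in step_percents]


def within_time_limit(measured_seconds, limit_seconds):
    """True if a measured event (e.g. generator start) met its required limit."""
    return measured_seconds <= limit_seconds


# Hypothetical 2,000 kW design IT load.
for pct, kw in load_bank_steps(2000):
    print(f"{pct:>3}% step -> {kw:,.0f} kW on the load banks")

# Hypothetical measurement: generator started in 8.2 s against an assumed 10 s limit.
print("Generator start within limit:", within_time_limit(8.2, 10.0))
```

In practice each step would be held for a set duration while voltage, frequency, and temperatures are recorded, which this sketch leaves out.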
Level 5 - Integrated Systems Testing
Lastly, integrated systems testing ensures all systems work together under dynamic real-world conditions by simulating data center failure events such as power loss, cooling failure, or fire alarm activation, verifying the automatic responses of BMS, EPMS, security, and fire suppression systems, and confirming failover between redundant UPS systems, generators, network systems, and cooling infrastructure.
For example, a main utility power failure may be simulated to verify the sequence of operations: the UPS maintains the IT load, generators start, the ATS transfers the load, cooling systems continue operating, and monitoring systems record the events correctly.
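The sequence of operations in that example can be checked against the event log that the monitoring systems produce. Below is a rough sketch of the idea, assuming the log can be exported as (timestamp, event name) pairs. The event names are hypothetical labels I made up for illustration; a real BMS/EPMS export will use its own point names and format.

```python
# Expected order for a simulated utility failure, following the example above.
# These event names are hypothetical, not from any real BMS/EPMS.
EXPECTED_SEQUENCE = [
    "utility_power_lost",
    "ups_carrying_it_load",
    "generators_started",
    "ats_transferred_load",
    "cooling_confirmed_running",
]


def check_sequence(events, expected=EXPECTED_SEQUENCE):
    """Compare recorded events against the expected sequence of operations.

    events: iterable of (timestamp_seconds, event_name) tuples.
    Returns a list of problem descriptions; an empty list means every
    expected event was recorded and in the expected order.
    """
    first_seen = {}
    for timestamp, name in sorted(events):
        first_seen.setdefault(name, timestamp)  # keep earliest occurrence

    problems = [f"missing event: {name}" for name in expected if name not in first_seen]

    recorded = [name for name in expected if name in first_seen]
    for earlier, later in zip(recorded, recorded[1:]):
        if first_seen[earlier] > first_seen[later]:
            problems.append(f"out of order: {later} recorded before {earlier}")
    return problems


log = [
    (0.0, "utility_power_lost"),
    (0.1, "ups_carrying_it_load"),
    (8.0, "generators_started"),
    (12.0, "ats_transferred_load"),
    (15.0, "cooling_confirmed_running"),
]
print(check_sequence(log) or "sequence of operations verified")
```

A missing event is flagged separately from an out-of-order one, since the first points at the monitoring system and the second at the power or controls sequence.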
Types of Cx Testing
Across a data center build there exists a wide range of equipment, systems, and building materials that all require specific testing procedures to verify compliance and optimal operations. Whether it be electrical equipment like switchgear, transformers, UPS’s and generators; mechanical equipment like piping, pumps, coolant distribution units and air handling units; networking and ICT equipment like switches and data cabling; life safety equipment like fire suppression systems and emergency alarms; security equipment like access control and CCTV cameras; or other systems like BMS and EPMS, there is a multitude of pieces in the puzzle that must all fit together perfectly. On top of that, large server deployments will undergo their own form of acceptance testing when first installed in a new data center. This could be seen as similar to Cx or simply as performance testing of the compute itself.
L1 Testing Examples i.e. component verification performed at manufacturer facilities before shipment.
Electrical:
Transformer dielectric and insulation resistance tests
Switchgear breaker trip curve verification
UPS runtime and load-bank validation
Mechanical:
Chiller performance under rated tonnage
CRAH/CRAC airflow, coil pressure drop, and vibration testing
Pump curve validation
Fire Protection / Life Safety:
Factory acceptance of fire alarm panels
FM200 / Novec suppression cylinder weight and discharge valve checks
Security:
Card reader functional demo by manufacturer
CCTV camera resolution/zoom factory quality test
Networking:
Switch throughput and latency benchmarking
Patch panel compliance to TIA-568 standards
L2 Testing Examples i.e. ensures equipment is installed per design and spec.
Electrical:
Visual inspection of bus bar torqueing
Cable megger testing for insulation integrity
Grounding and bonding continuity checks
Mechanical:
Piping hydrostatic pressure test
Verification of valve positions and labeling
Refrigerant line leak testing
Fire Protection / Life Safety:
Fire sprinkler piping hydrostatic test
Smoke detector placement vs. drawings
Egress pathway clearance check
Security:
Card readers properly mounted and wired
Camera coverage angle field-of-view verification
Networking:
Rack elevations per design
Copper/fiber cable labeling and pathway inspection
L3 Testing Examples i.e. verifies that equipment can operate independently.
Electrical:
UPS start-up and battery discharge testing
Generator crank/start sequence and governor control
ATS open/close transfer operation
Mechanical:
Chiller start-up, oil pressure verification
CRAH airflow CFM measurement
Pump motor rotation check
Fire Protection / Life Safety:
Fire alarm panel input/output response checks
Emergency lighting and exit signs functional tests
Security:
Card swipe grants/denies entry correctly
Biometric scanner calibration
Networking:
Switch power-on and firmware validation
Basic VLAN configuration test
L4 Testing Examples i.e. tests each system under simulated load or operating conditions.
Electrical:
UPS full load-bank test with battery discharge/recharge
Generator load acceptance and rejection tests
STS/ATS automatic transfer during simulated outage
Mechanical:
Hot aisle / cold aisle thermal mapping under simulated IT load
Redundancy failover (N+1 chiller auto-sequence test)
CRAH/CRAC control loop verification
Fire Protection / Life Safety:
Integration of fire alarm with HVAC shutdown
Suppression system time delay and release sequence
Security:
Access control system alarm reporting to BMS/PSIM
Intrusion detection alarms under forced entry simulation
Networking:
Link aggregation throughput testing
Redundant router failover and spanning tree operation
L5 Testing Examples i.e. end-to-end scenarios simulating real data center events.
Electrical:
Black-start of facility from utility outage through generator/UPS recovery
Cascade failure test of redundant UPS modules
Mechanical:
Simultaneous generator + chiller plant switchover under load
Thermal stress test with IT loads ramping to 100%
Fire Protection / Life Safety:
Fire alarm triggers: HVAC shuts down, doors release, suppression discharges
Emergency power lights, alarms, and evacuation drill
Security:
Multi-factor authentication test during fire scenario (doors unlock as designed)
Loss of power to security system with UPS/generator failover
Networking:
Dual fiber path failover test to ISP carriers
Core router + switch cutover validation under full traffic load
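One way to picture how the examples above fit together is a simple tracker that records test results per asset and per level, and only treats an asset as ready for the next level once every earlier level has passed. This is a minimal sketch under my own assumptions: the strict level-by-level gating, the class and method names, and the asset name "UPS-A" are all hypothetical, and it is not how any particular Cx platform works.

```python
from dataclasses import dataclass, field

LEVELS = {
    1: "Factory Acceptance Testing",
    2: "Site Acceptance / Installation Verification",
    3: "Startup / Pre-Functional Testing",
    4: "Functional Performance Testing",
    5: "Integrated Systems Testing",
}


@dataclass
class Asset:
    name: str
    discipline: str
    # Maps level number -> {test name: passed?}
    tests: dict = field(default_factory=dict)

    def record(self, level, test, passed):
        if level not in LEVELS:
            raise ValueError(f"unknown Cx level: {level}")
        self.tests.setdefault(level, {})[test] = passed

    def level_complete(self, level):
        """A level counts as complete only if it has results and all passed."""
        results = self.tests.get(level, {})
        return bool(results) and all(results.values())

    def highest_completed_level(self):
        """Highest N such that levels 1..N are all complete (0 if none)."""
        done = 0
        for level in sorted(LEVELS):
            if not self.level_complete(level):
                break
            done = level
        return done

    def ready_for(self, level):
        """Assumed gating rule: every earlier level must be complete."""
        return self.highest_completed_level() >= level - 1


ups = Asset("UPS-A", "Electrical")  # hypothetical asset
ups.record(1, "UPS runtime and load-bank validation", True)
ups.record(2, "Grounding and bonding continuity checks", True)
ups.record(3, "UPS start-up and battery discharge testing", False)

done = ups.highest_completed_level()
print(f"{ups.name}: completed through L{done} ({LEVELS[done]})")
print("Ready for L4:", ups.ready_for(4))
```

Here the failed L3 test keeps the UPS out of L4 until it is retested and passes, which is roughly the behavior a Cx tracking log is meant to enforce.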
GPU Deployment
Obviously, the entire purpose of building a data center is to support the valuable compute, the GPU cluster, that will run within the facility and that also makes up the largest share of CapEx. Though not classically considered “Cx”, there are certainly levels of testing practiced when deploying large GPU clusters at scale. One reference I found insightful was the SemiAnalysis article linked here; see the Cluster Deployment and Acceptance Test section.
If you’re seeking deeper, more technical information about cluster deployment I recommend this post from together.ai.
Cx Program In Practice
So now that we’ve covered what commissioning entails in terms of testing sequence, along with examples of testing for various systems, how does a project team implement a Cx program in practice?
Successful Cx programs boil down to three general concepts: competence, coordination, and communication. At risk of being too reductive, a short explanation is provided below:
Competence i.e. having the right people with technical backgrounds and experience.
Commissioning a data center successfully requires input and involvement from key stakeholders: CxA’s, engineers, integrators, energy marshals, and inspectors. From the design phase through the construction and Cx phases, engineers with technical acumen in MEP and IT fields are needed to properly evaluate plans and requirements to ensure the Cx program is adequate. Additionally, many equipment/systems vendors (Eaton, for example) will send out integrators or commissioning agents (CxA’s) to assist on-site or otherwise lead aspects of the testing, given their intimate knowledge of the vendor’s equipment/systems. The owner and/or general contractor will also employ full-time CxA’s to lead coordination of the overarching Cx program, and oftentimes the GC or EC will employ energy marshals to help coordinate Cx-related safety SOP’s like lockout-tagout (LOTO). On top of general job site safety, electrical safety is critical when a data center is being energized, especially when the energization occurs in segmented phases across the building or campus, with certain equipment/systems becoming “hot” while others nearby remain de-energized or are still being installed. Lastly, data centers must comply with any/all regulations of the authority having jurisdiction (AHJ), which typically sends inspectors to the site on a periodic basis to review construction systems and key infrastructure components. Though AHJ inspections relate more to the larger quality plan and requirements of a data center build, they can still be relevant to the Cx process at times.
Coordination i.e. maintaining critical path.
Coordination is an obvious factor through all phases of the Cx program. With the scaling wars of late between hyperscalers and other cloud and AI companies, each of these players is racing to energize its compute clusters as quickly as possible, sometimes sparing no expense. As with any large capital project, the project schedule and its critical path are key drivers of success or failure. Contractors are obligated to meet or beat schedule milestones set by owners/clients, and the Cx/energization timeline is a crucial component of the schedule’s critical path. Between the project team, CxA’s, schedulers, and other relevant stakeholders, intense coordination is required to ensure that expectations are aligned, risks are mitigated, and Cx standards are maintained. The importance of Cx documentation cannot be overstated either, as this practice not only provides sufficient records for the owner at turnover but also serves as a verifiable history of any and all Cx performed throughout the project.
Communication i.e. keeping everyone on the same page.
Large data center builds often employ hundreds of tradespeople at any given time, with multiple crews working across different phases of the campus to complete the project as quickly as possible. Thus, communication is important to keep everyone safe, on-track, and informed. As mentioned earlier, energy marshals play a large role in maintaining the energy control program (ECP), a practice that ties directly to Cx efforts. CxA’s also must communicate extensively with vendors and other partners to work through any actions or issues that arise. A common issue faced in data center construction is supply chain bottlenecks, which sometimes delay equipment delivery and in turn put the project schedule at risk. As with construction generally, cost and schedule risks must be mitigated to the utmost extent, and this requires clear, constructive communication to solve issues and maintain stakeholder satisfaction. Communication covers the entire Cx process, from initially communicating owner project requirements (OPR’s) and the Cx program standards, to day-to-day management, to completing comprehensive Cx documentation throughout all phases of delivery, installation, and testing.
Coordination and communication are essentially intertwined and can include a number of mediums from frequent, recurring meetings with team members, to email updates and other job site communications, to implementing tagging systems with colored stickers on equipment to denote the Cx stage.
Platforms commonly used to document and track the Cx process include Procore, Autodesk Construction Cloud, CxAlloy, and Bluerithm.
Cx for Operational Data Centers
Beyond the initial L1 through L5 stages of Cx when actively constructing, energizing and deploying a facility, other types of Cx are commonly performed on data centers that are actively operational, recently retrofitted, or in the later stages of their facility lifetime.
These other forms of Cx could include Enhanced or Ongoing Cx (typically relevant for LEED programs), Retro-Commissioning (RCx), and Re-Commissioning (Re-Cx), as well as related analyses such as Failure Mode and Effects Analysis (FMEA).
Also worth noting is that during the initial Cx stages of a build-out, CxA’s or other qualified persons should provide training to the owner and the incoming operations team to ensure the team managing the facility following acceptance/turnover understands how to properly maintain and troubleshoot the equipment and systems within their new facility.
Summary
Commissioning is one of the most crucial aspects, if not the most crucial, of successfully bringing a new data center online. An effective Cx program requires extensive planning, constant communication and documentation, and strong leadership to ensure standards are upheld, safety is maintained, and clients are satisfied with outcomes.
There are a number of companies that specialize in data center commissioning and a handful of relevant industry standards and best practices to reference. One document I have found helpful for both general knowledge and for purposes of writing this post is the ASHRAE Guideline 0-2019, available here in full. In addition, ANSI/BICSI 002-2024 includes an extensive chapter (ch.16) on data center commissioning.
Thanks for reading! Let me know what you think are the most critical aspects to Cx or maybe something I overlooked while writing this.

