Asking the Tough Questions When Your Data Center Provider Incurs an Outage

By: Ken Carter
Executive Vice President of Data Operations and Infrastructure
AIS Data Centers

The worst has happened. An outage has occurred and you feel like your data center service provider has failed you. It’s quite possible that they have. To find out, you’ll have to ask some tough questions. Data centers are complex facilities, with sophisticated equipment, procedures and engineered componentry making up the heart of their operations. Essentially, you want to know what caused the service outage and why, the scale of the incident, and a matrix of impacted services and components, and you want to feel confident that it won’t happen again. However, you’ll need to put on your engineering cap to really get to the truth.

If you’re aware of an outage that has occurred within your data center support infrastructure, ask your provider:

  • What is your history of outages?
  • Has a root cause analysis of your most recent outage been completed? (If so, be sure to request a copy of the report.)
  • Did the data center provider identify any cause, issue, fault or failure contributing to the outage that was the result of a third party, such as an equipment provider or a service provider (e.g. utility service)? If so, be sure to obtain the root cause analysis from each of the third parties and/or service providers identified, and don’t rely solely on the data center provider’s consolidated report.
  • Who performed the root cause analysis?
  • What did the remediation (solution to the identified problem) consist of? 
  • Has the remediation or solution been completed, tested and commissioned?
  • Who performed the commissioning and certification?
  • What was the name of the engineering firm and engineer of record involved in remediation?
  • Can you provide the preventative maintenance, scheduled maintenance and unscheduled maintenance logs and reports for each component involved?


These are the technical questions you’ll need to ask to find out what occurred, why it occurred and whether it has been fixed. Your data center service provider should be willing to provide you with all of this information, including:

  • A definitive root cause analysis for each fault identified with detailed documentation, drawings and information.
  • The engineered solution for remediation (what they’ve put in place to solve the problem).
  • Evidence that the problem has in fact been solved, complete with the name of the engineering firm that performed the analysis and commissioning.

 

Once you’ve collected this information, you’ll likely want to discuss it and fact-check it with industry experts. Unless you are an experienced engineer, that’s the only way you’ll know for sure that the information is complete and accurate. What you’re looking for is verification that your provider has identified the real problem that caused the outage and that it has been thoroughly resolved using industry best practices.
