One of the main reasons many companies outsource Web hosting is to ensure consistent availability of their websites. Data centers are able to use economies of scale to provide reliable and secure power supplies and Internet connectivity to multiple sites, with a variety of backup power sources that are simply unavailable to the average customer.
Over the years, the strategy of building n+1 redundancy into every system within the data center has been refined to effectively eliminate single points of failure; that is, single pieces of equipment whose failure could bring down the whole system. In fact, with multiple power supplies and Internet connections, the data center is designed so that nothing short of a natural disaster should cause a loss of service.
N+1 redundancy means that if a system requires n units of a component to ensure continual service, the data center will have at least n+1, so that if one component fails, the others take over seamlessly, allowing for repairs without affecting performance.
Here we’ll take a look at this in practice in a data center with regard to the two most vital aspects of provision: power supply and Internet access.
- The average PC will have an internal power supply unit with a single ATX module. Computers in a data center will be powered by ATX units with n+1 hot-swappable modules. In practice, this means that if the machine requires a single module to power it, it will house two. Hot-swappable means that if one of the modules fails and the other takes over, the failed module can be unplugged and replaced without switching off the machine.
- The data center will be connected to at least two independent power supplies, drawn from different substations, so that in the event of a power outage in one area, power can still be drawn from an unaffected area. In practice power may be drawn from several external sources, to limit the strain placed on the remaining lines should one fail. For example, Hostway’s Chicago data center has no fewer than seven dedicated electrical feeds connected to five substations.
- Despite these seemingly infallible power supplies, data centers still have contingency plans in case there is a blanket power outage across multiple substations. Uninterruptible power supply (UPS) units are installed to power the plant for short periods in the event of a major power failure. Even this redundant system will have built-in n+1 redundancy, so that if the power supply fails and a UPS unit fails, there will still be enough capacity to power equipment long enough to bring alternative power online. So, for example, if the plant requires 1200 kVA of power and the data center uses 300 kVA UPS units, four units are needed to carry the load and a fifth is added as the spare, for a total of 1500 kVA of UPS equipment.
- The data center will also house n+1 generators to power the equipment, which will be capable of being brought online in the time provided by the UPS. According to www.sans.org guidelines for data center physical security, “There SHOULD be diesel generators on site with 24 hours of fuel also on site. A contract SHOULD be in place to get up to a week of fuel to the facility.”
- Each server that is connected to the data center network should have redundant connection methods. This might mean it is connected to the LAN via both wireless and wired means so that if the network card fails, the server can still interact wirelessly.
- The data center will not rely on a single telecommunications provider, or on a single fiber optic cable, to connect it to the Internet, but will have multiple avenues of access. This built-in redundancy not only provides emergency provision, but also allows short bursts of activity outside the normal parameters of operation without degrading the service.
- Every connecting piece of equipment between the server and the Internet will be duplicated, so that the routers and switches that direct the flow of information can’t create a bottleneck or bring the system to a halt in the event of failure.
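The UPS sizing arithmetic from the power supply points above can be sketched in a few lines of Python. This is only an illustration of the n+1 rule under the figures quoted in the example (a 1200 kVA load and 300 kVA units); the function names `n_plus_one_units` and `survives_single_failure` are hypothetical and are not taken from any data center tooling.

```python
import math


def n_plus_one_units(load_kva: float, unit_kva: float) -> int:
    """Return how many units to install under an n+1 policy.

    n is the smallest number of units whose combined capacity
    covers the load; one extra unit is added as the spare.
    """
    if load_kva <= 0 or unit_kva <= 0:
        raise ValueError("load and unit capacity must be positive")
    n = math.ceil(load_kva / unit_kva)
    return n + 1


def survives_single_failure(load_kva: float, unit_kva: float,
                            installed_units: int) -> bool:
    """Check that the load is still covered after losing one unit."""
    return (installed_units - 1) * unit_kva >= load_kva


# Figures from the example above: 1200 kVA load, 300 kVA UPS units.
units = n_plus_one_units(1200, 300)
print(units, "units,", units * 300, "kVA installed")
```

With those figures, four units carry the load and the fifth is the spare, which matches the 1500 kVA total quoted above. The second function makes the point of the spare explicit: with only four units installed, a single failure would leave 900 kVA, which is short of the required load.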
This attention to detail isn’t restricted to the power supplies and Internet connections, though. It permeates every aspect of data center design and management, from physical and data security to cooling and fire suppression systems.