Networks can be configured to be so incredibly redundant now - for reasonable prices - that there is no excuse for a data center not to achieve five nines (99.999%) of network availability.
What about the servers and applications though? For a site or application that has high uptime requirements, why spend so much time configuring the network to make sure it doesn't fail, and then deploy an application to a single server?
Sure, there are ways to configure servers with some redundancy to minimize some failures - things like RAID (redundant array of inexpensive disks), which will protect against a disk drive failure (I highly recommend RAID for all production servers, preferably hardware RAID rather than software RAID). But what happens if a recent configuration change brings the application down? Or a newly released patch conflicts with other settings and causes problems? Or a part other than a drive - RAM, CPU, NIC, motherboard, etc. - goes bad? In these situations the server will go down and the application(s) hosted on that server will be offline until the problem is resolved.
A good monitoring and alerting process will allow the system administrator to detect and address these issues quickly, but there will still be some level of downtime. Depending on the type of issue, even the best system administrator might not be able to fix it immediately - it may take time to troubleshoot and resolve. During that time your application is unavailable and you may be losing business due to the site interruption.
So, what can you do?
A great option - and one that has recently become more affordable - is to host your application on a webfarm. A webfarm consists of two or more web servers that have the same configuration and serve up the same content. There are special switches and processes involved that allow each of these servers to respond to requests sent to a single address. For example, say we have two servers - svr1.orcsweb.com and svr2.orcsweb.com - that have exactly the same configuration and content. We could configure a special switch* to handle traffic that is sent to www.orcsweb.com and direct it to either of these nodes depending on some routing logic. Clients visiting the main URL (in this case www.orcsweb.com) have no idea whether this is a single server - or ten servers! The balancing between nodes is seamless and transparent.
[*note: There is also software that could handle the routing process but experience and tests have shown that these types of solutions are generally not as scalable, fast, or efficient as the hardware switching solutions]
The routing logic can be a number of different options - most common are:
- Round-robin: Each node gets a request sent to it "in turn". So, node1 gets a request, then node2, then node1 again, then node2 again.
- Least Active: Whichever node has the lowest number of current connections gets new connections sent to it. This helps keep the load balanced between the server nodes.
- Fastest Reply: Whichever node replies faster is the one that gets new requests. This is also a good option - especially if there are nodes that might not be "equal" in performance. If one performs much better than the other, why not send more requests there?
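To make the three options above a bit more concrete, here is a minimal Python sketch of each selection rule. This is only an illustration, not how any particular switch implements it: the Node class, its fields, and the connection counts and reply times are all made up for the example.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Node:
    # Hypothetical record of what a balancer might track per node.
    name: str
    active_connections: int = 0   # connections currently open to this node
    last_reply_ms: float = 0.0    # most recent measured response time


def round_robin(nodes):
    """Return an iterator that yields the nodes in turn, forever."""
    return cycle(nodes)


def least_active(nodes):
    """Pick the node with the fewest open connections."""
    return min(nodes, key=lambda n: n.active_connections)


def fastest_reply(nodes):
    """Pick the node with the lowest recent response time."""
    return min(nodes, key=lambda n: n.last_reply_ms)


# Illustrative numbers only; they are not measurements.
nodes = [
    Node("svr1.orcsweb.com", active_connections=12, last_reply_ms=40.0),
    Node("svr2.orcsweb.com", active_connections=7, last_reply_ms=55.0),
]

rr = round_robin(nodes)
rr_order = [next(rr).name for _ in range(4)]  # svr1, svr2, svr1, svr2
```

With these made-up numbers, least_active(nodes) picks svr2 (fewer open connections) while fastest_reply(nodes) picks svr1 (quicker last reply), which shows that the choice of routing logic can send the same request to different nodes.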
In any of these scenarios the switch will also detect if a node were to fail. So, if svr1.orcsweb.com was taken offline for maintenance - or it had a critical failure - the switch would detect that and only send traffic to svr2.orcsweb.com. And since the clients always access the site via the main URL (not the node names) they have no idea that one of the nodes is down - the application continues to serve client requests seamlessly.
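The failure handling described above can be sketched the same way. This toy Python example assumes a round-robin balancer that keeps a simple healthy/unhealthy flag per node; the Farm class and its mark and route methods are hypothetical names for illustration, and a real switch would set the flag from its own periodic health probes rather than by hand.

```python
from itertools import cycle


class Farm:
    """Toy round-robin balancer that skips nodes marked unhealthy."""

    def __init__(self, node_names):
        self.health = {name: True for name in node_names}
        self._ring = cycle(node_names)
        self._size = len(node_names)

    def mark(self, name, healthy):
        # Stand-in for the result of a health probe (an assumption here).
        self.health[name] = healthy

    def route(self):
        # Try each node at most once per request.
        for _ in range(self._size):
            name = next(self._ring)
            if self.health[name]:
                return name
        raise RuntimeError("no healthy nodes available")


farm = Farm(["svr1.orcsweb.com", "svr2.orcsweb.com"])

farm.mark("svr1.orcsweb.com", False)   # simulate maintenance or a failure
during_outage = [farm.route() for _ in range(3)]

farm.mark("svr1.orcsweb.com", True)    # node returns to service
after_recovery = {farm.route() for _ in range(2)}
```

While svr1 is marked down, every request lands on svr2; once it is marked healthy again, both nodes go back into the rotation. The client only ever asked for the main URL, so nothing changes from its point of view.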
Besides high-availability (continuing to satisfy requests during a failure), a webfarm also gives an application a higher level of scalability - the ability to handle more and more load. If load on the application increases to the point where performance starts to degrade, more nodes can be added to the webfarm (again, without clients noticing), giving the ability to handle potentially unlimited levels of traffic (just keep adding nodes!).
Of course there are a lot of factors surrounding the proper support of a webfarm - the switches, failover between switches (don't let the switch be a single point-of-failure!), replication of content, synchronization of server changes, synchronization of application changes, etc. But a good system administrator (or experienced hosting company) can help address all of these issues for you.
By the way, at ORCS Web we have a shared webfarm hosting plan that gives all the benefits of a front-end webfarm without the higher costs of fully dedicated systems. This is a great product for sites that want both scalability and high-availability for their site/application - at a reasonable monthly rate. It is also a great stepping-stone to future dedicated webfarm services as the application's traffic grows over time.
Hopefully this has been a good introduction to webfarms for you, and hopefully I've properly communicated enough of the benefits for you to consider this as a hosting option for yourself. With the rates now down to affordable levels - why not get this additional layer of protection?
Happy hosting!
~Brad