
Brad Kingsley's Blog

Things Break

I almost titled this "Mean Time Between Failures (MTBF)", which is a commonly understood term when dealing with just about any technical device. MTBF means just what it says - the average time a device runs between one failure and the next - so not much explanation is needed there.

What it also means, without actually saying it, is "things break". Even things without any moving parts have the potential to fail. I wish it weren't true, but it is. Some things break more often than others - hence the need to understand the MTBF for various devices.

It seems that many hosts are afraid to say it out loud - as if the general population believed that things never fail (I don't think they do, but perhaps some believe it). It is an unfortunate fact of the business, though. That is why tens of millions of dollars are spent on data center infrastructure with layers of redundancy across components. It's why we perform regular tests of fail-over processes to gauge the impact (or lack thereof) on services when devices fail.

Technologies have advanced quite a bit in recent years, and dealing with a single device failure is quite reasonable. What is really frustrating, even though it has a low chance of occurring, is when a quality redundant solution is in place but two (or more) things fail at once. Every component has an MTBF, so it is possible for really bad luck to strike and for a backup device to fail at the same time as the primary device. It has happened to every host I know of at some point.
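To put a rough number on that kind of bad luck, here is a minimal sketch of the arithmetic. It is an illustration only, not how any particular host models risk: it assumes failures are independent and follow a constant failure rate (an exponential model), and the MTBF and repair-time figures are hypothetical values picked for the example.

```python
import math

HOURS_PER_YEAR = 8760


def prob_failure_within(hours, mtbf_hours):
    """Chance a device fails at least once within `hours`.

    Assumes a constant failure rate (exponential model), which is a
    simplification; real devices often fail more when new or very old.
    """
    return 1.0 - math.exp(-hours / mtbf_hours)


def expected_double_failures_per_year(mtbf_hours, repair_hours):
    """Rough expected count of 'backup fails while primary is down' per year.

    Assumes primary and backup fail independently and share the same MTBF.
    """
    # Approximate number of primary failures in a year.
    primary_failures = HOURS_PER_YEAR / mtbf_hours
    # Chance the backup also fails before the primary is repaired.
    backup_fails_during_repair = prob_failure_within(repair_hours, mtbf_hours)
    return primary_failures * backup_fails_during_repair


if __name__ == "__main__":
    # Hypothetical figures: 100,000-hour MTBF and a 4-hour repair window.
    mtbf = 100_000
    repair = 4
    print(f"Chance one device fails in a year: {prob_failure_within(HOURS_PER_YEAR, mtbf):.2%}")
    print(f"Expected double failures per year: {expected_double_failures_per_year(mtbf, repair):.2e}")
```

The per-pair number comes out tiny, but it is never zero, and a host runs a great many redundant pairs for a great many years. Shared causes (same power feed, same firmware, same batch of drives) also break the independence assumption and push the real odds higher.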

Those situations are stressful and frustrating for sure (because of all the up-front time and investment in the redundant solution). That's really when a managed hosting company succeeds or fails, though. What matters is how the hosting provider handles the situation technically, how they handle customer service, how they handle updates, and how they handle testing and getting things back to their initial state. Years of experience with highly available systems, top-notch expert staff, comprehensive systems and tools, and tested processes become hyper-critical in these already critical situations.

Some related ORCS Web specific information:
ORCS Web has been providing managed hosting solutions on Windows Server platforms for 12 years now. Before founding the company, I was in the Advanced Technology and Integration department at NASDAQ*, where a large focus was on building highly scalable and highly available solutions on Microsoft platforms. This is not new stuff to us - this is what we do.

Our Webteam support group is loaded with Microsoft-certified people who themselves have years of experience. Several of the team speak at conferences and technology events; several have written books; a couple have been recognized as technology MVPs by Microsoft. This isn't just a job - our team members love what they do and are the best in the industry.

*Things even fail at NASDAQ - they have two geographically separate data centers and at times have had to utilize each.

Published Tuesday, April 08, 2008 4:12 PM by Brad

Comments

 

Mike said:

> Things even fail at NASDAQ - they have two geographically separate data centers and at times have had to utilize each.

Does ORCSWEB maintain redundant datacenters?

April 15, 2008 8:41 PM
 

Brad said:

Yes Mike, ORCS Web does have equipment in multiple data centers that can be utilized for customers who would like geographically redundant solutions.

April 16, 2008 5:42 AM