This post is a primer for a series of posts I have planned on the performance, availability, and scalability of disk IO solutions.
RAID stands for Redundant Array of Inexpensive Disks. (While some of the high-end SCSI and fiber disks are far from "inexpensive" in my opinion, they are indeed usable in RAID configurations.)
There are various levels of RAID, and each generally tries to address one of two challenges. Some configurations try to address both.
The first challenge is redundancy. Disks spin at very high speeds, disk heads move back and forth inside the disk casing, and disk media is fragile. Because of the constantly moving parts and the fragility of the materials, disks tend to fail more often than other components. They don't constantly fail - some run for years with no problems - but if something is going to fail, there is a higher likelihood of it being a disk drive, simply because of the mechanics involved in spinning disk media. (Some newer drives use solid-state storage and so don't spin. See my other blog posting about that.)
Some RAID levels address this challenge by double-writing all the data. So, for example, in a RAID1 configuration - otherwise known as "mirroring" - all the data is written to two drives. If one fails, there is another copy of that data still sitting there and usable to keep the server online.
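To make the mirroring idea concrete, here is a minimal Python sketch. It is a toy model only, not how a real RAID controller works: the class name MirroredPair and its methods are made up for illustration, and each "drive" is just a dictionary in memory.

```python
class MirroredPair:
    """Toy model of RAID1: every block is written to both drives."""

    def __init__(self):
        # One dict per drive, mapping block number -> data (an assumption
        # for illustration; real drives are not dictionaries).
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block_no, data):
        # Write the same data to every drive that is still healthy.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block_no] = data

    def fail(self, drive_index):
        # Simulate a drive failure: its contents are gone.
        self.failed[drive_index] = True
        self.drives[drive_index].clear()

    def read(self, block_no):
        # Read from the first healthy drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block_no]
        raise IOError("both drives have failed")


pair = MirroredPair()
pair.write(0, b"hello")
pair.fail(0)                     # the first drive dies
surviving_copy = pair.read(0)    # the data is still served from the second drive
```

The trade-off shows up in the write method: every write happens twice, so two drives give you the usable capacity of one.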
Other RAID levels address this by writing a parity block along with the data across multiple drives. The parity block takes up less space than a full second copy of the data. If a drive fails, the remaining good data combined with the parity block allows the missing data to be reconstructed.
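The reconstruction step can look like magic, so here is a small Python sketch of it. It assumes the parity is computed with XOR, which is the usual approach in RAID5, and it uses two tiny byte strings to stand in for the data on two drives. The helper name xor_blocks is made up for illustration.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Data blocks stored on two drives, plus a parity block stored on a third.
drive_a = b"RAID"
drive_b = b"disk"
parity = xor_blocks(drive_a, drive_b)

# Suppose drive_b fails. XOR-ing the surviving data with the parity
# block gives back exactly what was lost.
rebuilt_b = xor_blocks(drive_a, parity)
```

Here one parity block protects two data blocks, so only a third of the raw space goes to redundancy instead of half as with mirroring.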
The second challenge is speed. Reading from a single drive means the server data access speed is limited to the physical capabilities of that one drive. By using certain RAID configurations the server can actually read from multiple drives at the same time. So, in a simplified example, a certain RAID configuration with three drives might be able to read data three times as fast as from a single drive.
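The simplified three-drive example can be sketched in a few lines of Python. This is a deliberately idealized model: it assumes blocks are spread round-robin across the drives, that every drive delivers one block per "round", and that all drives work fully in parallel. The function names stripe and read_rounds are made up for illustration.

```python
def stripe(blocks, drive_count):
    """Distribute blocks round-robin across drive_count drives."""
    drives = [[] for _ in range(drive_count)]
    for i, block in enumerate(blocks):
        drives[i % drive_count].append(block)
    return drives


def read_rounds(drives):
    """Rounds needed to read everything if each drive delivers
    one block per round and all drives read at the same time."""
    return max(len(drive) for drive in drives)


blocks = [f"block{i}" for i in range(12)]

single = read_rounds(stripe(blocks, 1))    # one drive reads all 12 blocks itself
striped = read_rounds(stripe(blocks, 3))   # three drives read 4 blocks each
```

Real arrays rarely hit the full multiple because of controller overhead, access patterns, and bus limits, but the model shows where the speedup comes from.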
Understanding some of the challenges of data access IO and the various RAID options can be very helpful in properly analyzing system bottlenecks and architecting ideal configurations for project solutions. Working with customers through these types of topics is one of the many things that we do as part of our managed hosting services at ORCS Web. This type of interaction is one of the many reasons that our customers continually tell us that we are #1 in technical service and customer support.
Look for future posts with more specifics and details on various points surrounding data access and the topics started here.
Happy hosting!
~Brad
Follow-up posts:
RAID1
RAID5