RAID stands for redundant array of independent disks. RAID is a data storage mechanism that uses multiple physical disks that work together as a single virtual drive. Data is spread across the physical disks. There are several RAID configurations, called RAID levels. Standard RAID levels are 0 through 6. Elsewhere we have compared other RAID levels, such as RAID 0 vs. RAID 1 and RAID 5 vs. RAID 10.
This comparison looks at RAID 5 and RAID 6 in detail to understand their similarities and differences.
| ||RAID 5||RAID 6|
|Key feature||Striping with parity||Striping with double parity|
|Fault tolerance level||A RAID 5 configuration can tolerate the failure of one of its physical disks. If more than one disk fails, data is not recoverable.||A RAID 6 configuration can tolerate the failure of up to 2 of its physical disks. If more than two disks fail, data is not recoverable.|
|Striping||Yes; data is striped (or split) evenly across all disks in the RAID 5 setup. In addition to data, one parity block is stored for each stripe so that data can be recovered if one of the drives fails.||Yes; data is striped (or split) evenly across all disks in the RAID 6 setup. In addition to data, two independent parity blocks are stored for each stripe, on two separate disks, so that data can be recovered even if two of the drives fail.|
|Minimum number of physical disks required||3||4|
|Mirroring, redundancy and fault tolerance||No mirroring; redundancy and fault tolerance are achieved by calculating and storing parity information. Can tolerate the failure of 1 physical disk.||No mirroring; redundancy and fault tolerance are achieved by calculating and storing parity information. Can tolerate the failure of 2 physical disks.|
|Performance||Fast reads because of striping (data distributed across many physical disks). Writes are a little slower because parity information needs to be calculated. But since parity is distributed, 1 disk doesn't become a bottleneck (like it does in RAID 4).||Fast reads because of striping (data distributed across many physical disks). Writes are a little slower because parity information needs to be calculated and stored twice. But since parity is distributed, 1 disk doesn't become a bottleneck.|
|Applications||Good balance of efficient storage, decent performance and failure resistance. RAID 5 is ideal for file and application servers that have a limited number of data drives.||Good balance of efficient storage, decent performance and failure resistance. RAID 6 is better suited than RAID 5 in setups with many large drives.|
|Storage efficiency||(n-1)/n, for a RAID 5 system with n physical disks.||(n-2)/n, for a RAID 6 system with n physical disks.|
|Parity disk?||No dedicated parity disk. Parity information is distributed among all physical disks in the RAID. If one of the disks fails, parity info is used to recover data that was stored on that drive.||No dedicated parity disk. Parity information is distributed among all physical disks in the RAID. If up to two of the disks fail, parity info is used to recover data that was stored on those drives. Each stripe has two independent parity blocks, stored on two separate drives.|
|Advantages||Fast reads; inexpensive redundancy and fault tolerance; data can be accessed (albeit at a slower rate) even while a failed drive is in the process of being rebuilt.||Fault tolerant even with 2 of the drives failing. Fast reads; inexpensive redundancy and fault tolerance; data can be accessed (albeit at a slower rate) even while a failed drive is in the process of being rebuilt.|
|Disadvantages||Recovery from failure is slow because of parity calculations involved in restoring data and rebuilding the replacement drive. It is possible to read from the RAID while this is going on but read operations during that time will be quite slow.||Recovery from failure is slow because of parity calculations involved in restoring data and rebuilding the replacement drive. It is possible to read from the RAID while this is going on but read operations during that time will be quite slow.|
RAID 5 configuration
The definition of a RAID 5 storage system, according to the Storage Networking Industry Association (SNIA), is:
A placement policy using parity-based protection for storing stripes of 'n' logical blocks of data and one logical block of parity across a set of 'n+1' independent storage devices where the parity and data blocks are interleaved across the storage devices. Data stored using this form of RAID is able to survive a single storage device failure without data loss.
In a RAID 5 configuration, data is striped — i.e., split and stored across multiple physical disks. In addition, a special parity block is used for redundancy. For each combination of data blocks in RAID 5, a parity block is calculated and stored. Each individual parity block resides on only one disk; however, parity blocks are stored in a round-robin fashion, distributed equally across all the physical disks.
Considering that data blocks are striped across at least two disks and the parity block is written on a separate disk, we can see that a RAID 5 configuration requires at least 3 physical drives.
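The round-robin placement of parity blocks can be sketched in a few lines of Python. This is only an illustration: the function name `raid5_layout` and the particular rotation order (parity starting on the last disk and moving one disk to the left per stripe) are assumptions made here, and real controllers may rotate parity differently.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row per stripe; each row lists the block held by every disk.

    Assumed rotation: parity starts on the last disk and moves one disk
    to the left for each following stripe.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    layout = []
    next_data_block = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")  # parity block for this stripe
            else:
                row.append(f"D{next_data_block}")  # next data block
                next_data_block += 1
        layout.append(row)
    return layout


# Print the layout for the minimum 3-disk array, one stripe per line.
for row in raid5_layout(3, 3):
    print(row)
```

Each row contains exactly one parity block, and over three stripes every disk holds parity exactly once, so no single disk carries all the parity traffic.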
RAID 6 configuration
According to the SNIA, RAID 6 is defined as:
A RAID 6 or RAID Level 6 storage system is a placement policy using parity-based protection that allows stored data to survive any two storage device failures without data loss.
A RAID 6 configuration is similar to RAID 5 in that it uses striping and parity blocks. The difference is that it stores two parity blocks, allowing for extra redundancy so that even if two of the disks fail, information is still recoverable.
The video below explains the differences in RAID 5 and RAID 6 levels.
What are RAID 5-5, RAID 5-9, RAID 6-6 and RAID 6-10?
RAID 5-5 means there are 5 physical disks in a RAID 5 configuration. Similarly, RAID 5-9 means data is striped across 9 physical disks in a RAID 5 configuration.
RAID 6-6 means data is striped across 6 disks in a RAID 6 system. There are 4 data segments and 2 parity segments for each stripe. Similarly, RAID 6-10 uses 10 physical disks; there are 8 data segments and 2 parity segments for each stripe.
Redundancy, Fault Tolerance and Parity Blocks
Both RAID 5 and RAID 6 are fault-tolerant systems, i.e., data is not lost even when one of the physical disks fails. RAID 5 can tolerate the failure of any one of its physical disks, while RAID 6 can survive two concurrent disk failures.
What's more, both RAID 5 and RAID 6 can continue being used when the failed disk is being replaced. This is called hot-swapping.
Understanding striping in RAID
Striping is a technique to split data into blocks that are stored across different physical disks. A good example of this is RAID 0, which uses striping.
The advantage of striping is that reads and writes are very fast because they happen on multiple physical disks in parallel. This is especially advantageous with hard disk drives (HDDs) because they use mechanical components for reading and writing.
The disadvantage of simple striping is that if one of the disks fails, all the data is lost. There is no way to reconstruct the information if certain data blocks are missing.
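As a rough illustration of simple striping, the following Python sketch splits a byte string into fixed-size blocks and deals them out to disks in turn. The names `stripe` and `unstripe`, and the tiny 2-byte block size, are hypothetical choices for readability, not how any particular controller is implemented.

```python
def stripe(data, num_disks, block_size):
    """Split data into blocks and assign them to disks round-robin."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def unstripe(disks):
    """Reassemble the original data by reading the disks row by row."""
    rows = max(len(disk) for disk in disks)
    blocks = []
    for row in range(rows):
        for disk in disks:
            if row < len(disk):  # the last row may be incomplete
                blocks.append(disk[row])
    return b"".join(blocks)


disks = stripe(b"ABCDEFGHIJ", num_disks=3, block_size=2)
print(disks)
print(unstripe(disks))
```

With no parity or copies, every disk holds blocks that exist nowhere else, which is why losing any one disk makes the original data unrecoverable.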
Fault Tolerance in RAID
To make a RAID system fault tolerant, it needs to store information in a redundant fashion, i.e., enough extra information must be stored that the contents of a failed disk are still present on, or recoverable from, the surviving disks.
There are several ways to implement redundancy. A simple way would be to store a copy of each block of data on two physical disks. That is how RAID 1 is structured.
Another way to make a RAID configuration redundant is to use parity information. This is what both RAID 5 and RAID 6 use for more efficient redundancy.
RAID 5 Fault Tolerance
RAID 5 can tolerate the failure of 1 disk. Data and parity information stored on the failed disk can be recalculated using the data stored on the remaining disks.
Technical details on how parity works are outside the scope of this comparison. But put simply, a parity block is computed from all the individual data blocks. If there are n physical disks in the RAID, there will be n-1 data blocks and 1 parity block. If any of the n-1 data blocks goes missing (e.g., if the physical disk that it is stored on fails), all the information of that data block can still be reconstructed using the other n-2 data blocks plus the parity block. If the disk containing the parity block fails, it can be recomputed using all the n-1 data blocks.
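For readers who want a concrete picture, here is a minimal Python sketch of the idea, assuming the parity block is the byte-wise XOR of the data blocks (the usual choice for single parity). The helper name `xor_blocks` and the sample byte values are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-disk array: n - 1 = 3 data blocks plus 1 parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails. XOR-ing the other n - 2 data
# blocks with the parity block yields the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# If the disk holding the parity block fails instead, the parity is simply
# recomputed from all n - 1 data blocks.
print(xor_blocks(data) == parity)
```

This works because XOR-ing a value with itself cancels out: combining every surviving block of the stripe leaves only the contribution of the missing one.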
What happens when a disk fails in RAID 5?
Not only can data be recovered when one of the disks fails, but the RAID 5 system also remains operational throughout: data is accessible and reads are possible even when one of the drives has failed and is being rebuilt. However, such reads will be slow because part of the data (the part that was on the failed drive) is computed in real time using the parity block, rather than simply being read from disk.
Fault Tolerance in RAID 6
RAID 6 has better fault tolerance than RAID 5 because RAID 6 can survive the simultaneous failure of 2 of its disks. This comes at the cost of lower storage efficiency. Since two parity blocks are needed for each data stripe, the storage capacity of two disks is spent on fault tolerance.
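This comparison does not specify how the second parity block is computed. The Python sketch below assumes one common approach: a plain XOR parity (called `p` here) plus a second, Reed-Solomon style parity (`q`) computed in the finite field GF(2^8) with reducing polynomial 0x11d and generator 2. All function names are hypothetical, and the code only illustrates why two independent parity blocks are enough to rebuild two lost data blocks.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Multiplicative inverse: a^254, since a^255 == 1 for nonzero a."""
    return gf_pow(a, 254)


def pq_parity(data):
    """Return (p, q): p is the XOR of all blocks, q weights block i by 2^i."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data, x, y, p, q):
    """Rebuild data blocks x and y (given as None in data) from p and q."""
    # Zero blocks contribute nothing to either parity, so they stand in
    # for the two missing blocks when computing the survivors' parity.
    survivors = [b if b is not None else bytes(len(p)) for b in data]
    p_surv, q_surv = pq_parity(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for j in range(len(p)):
        p_delta = p[j] ^ p_surv[j]  # equals D_x ^ D_y
        q_delta = q[j] ^ q_surv[j]  # equals 2^x * D_x ^ 2^y * D_y
        dy[j] = gf_mul(gf_mul(gx, p_delta) ^ q_delta, denom_inv)
        dx[j] = p_delta ^ dy[j]
    return bytes(dx), bytes(dy)


blocks = [b"RAID", b"five", b"six!", b"demo"]
p, q = pq_parity(blocks)
damaged = [blocks[0], None, blocks[2], None]  # two data disks have failed
print(recover_two(damaged, 1, 3, p, q))
```

The two parity blocks give two independent equations per byte, which is exactly what is needed to solve for two unknown data bytes. A single XOR parity supplies only one equation, which is why RAID 5 cannot survive a second failure.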
Space Efficiency in RAID 5 vs. RAID 6
The capacity efficiency of a RAID system is the fraction of the physical storage capacity that can be productively used to store data. It is calculated by dividing the number of disks' worth of capacity that is not used for parity or mirroring by the total number of disks in the set.
For a RAID 5 system with n disks, the storage efficiency is (n-1)/n because one disk's worth of storage is taken up by parity blocks, leaving n-1 disks for data storage.
For a RAID 6 system with n disks, the storage efficiency is (n-2)/n because two disks' worth of storage is taken up by parity blocks, leaving n-2 disks for data storage.
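As a quick worked example, the two formulas can be evaluated for the configurations discussed earlier (RAID 5-5, RAID 5-9, RAID 6-6 and RAID 6-10). The function name `storage_efficiency` is just a label chosen for this sketch.

```python
from fractions import Fraction

PARITY_DISKS = {5: 1, 6: 2}  # disks' worth of parity per RAID level
MINIMUM_DISKS = {5: 3, 6: 4}  # minimum array size per RAID level


def storage_efficiency(level, n):
    """Return the usable fraction of raw capacity for RAID 5 or RAID 6."""
    if n < MINIMUM_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MINIMUM_DISKS[level]} disks")
    return Fraction(n - PARITY_DISKS[level], n)


for level, n in [(5, 5), (5, 9), (6, 6), (6, 10)]:
    eff = storage_efficiency(level, n)
    print(f"RAID {level}-{n}: {eff} ({float(eff):.1%})")
```

Note that RAID 6 with 10 disks reaches the same 80% efficiency as RAID 5 with 5 disks: the fixed cost of the extra parity disk matters less as the array grows.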
The picture below compares the storage efficiency of RAID 5 with either 5 or 9 disks, and RAID 6 with either 6 or 10 disks.
RAID 5 and RAID 6 both offer fast reads because of striping. Data is read from multiple disks in parallel, which speeds up reads. Write performance is slow, however, due to the overhead of calculating parity information. RAID 6 is a little slower than RAID 5 for write performance.
Pros and Cons
Both RAID 5 and RAID 6 offer fast reads and are hot-swappable, i.e., the system is functional and continues to support reads even when a failed disk is being replaced.
RAID 5 is more common than RAID 6. The advantages of RAID 5 over RAID 6 include:
- RAID 5 offers a good balance of many features: fault-tolerance (single disk), performance, cost and storage efficiency.
- Writes are slow with RAID 5 but not as slow as with RAID 6.
- RAID 5 provides higher storage efficiency compared with RAID 6.
- Potentially faster recovery from failure compared to RAID 6 because of only one parity block.
The disadvantages of RAID 5 are:
- RAID 6 supports two concurrent disk failures while RAID 5 can only survive a single disk failure at a time.
RAID 5 provides a healthy balance of efficient storage, decent performance and failure resistance. It is the most popular RAID configuration for enterprise NAS devices and business servers. RAID 5 is ideal for file and application servers that have a limited number of data drives. If the number of physical disks in the RAID is very large, the probability of at least one of them failing is higher. RAID 6 is a better option in such cases where it is important to have a higher degree of fault tolerance.
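The effect of array size on failure risk can be illustrated with a small calculation. The sketch assumes that disks fail independently and uses a made-up per-disk failure probability of 2% over some period; both are simplifying assumptions for illustration, not measured figures.

```python
def p_at_least_one_failure(n, p):
    """Probability that at least one of n independent disks fails,
    given that each disk fails with probability p."""
    return 1 - (1 - p) ** n


HYPOTHETICAL_P = 0.02  # assumed per-disk failure probability, not real data

for n in (3, 5, 9, 10, 24):
    print(f"{n:2d} disks: {p_at_least_one_failure(n, HYPOTHETICAL_P):.1%}")
```

Under these assumptions the chance of seeing at least one failure grows steadily with the number of disks, which is the reasoning behind preferring RAID 6 for larger arrays.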
- Standard RAID levels - Wikipedia
- Nested RAID levels - Wikipedia
- Parity in computing - Wikipedia
- Common RAID Disk Data Format (DDF) - Storage Networking Industry Association
- Solving Data Loss in Massive Storage Systems - Storage Networking Industry Association
- Don't be afraid of RAID