RAID 5 vs RAID 6 - Comparing Fault Tolerance, Performance, Pros and Cons

RAID stands for redundant array of independent disks. RAID is a data storage mechanism in which multiple physical storage disks work together as a single virtual drive. Data is spread across all the physical disks. There are several RAID configurations, called RAID levels. Standard RAID levels are 0 through 6. Elsewhere we have compared other RAID levels such as RAID 0 vs. RAID 1 and RAID 5 vs. RAID 10.

This comparison looks at RAID 5 and RAID 6 in detail to understand their similarities and differences.

Comparison chart

RAID 5 versus RAID 6 comparison chart
	RAID 5	RAID 6
Key feature	Striping with parity	Striping with double parity
Fault tolerance level	A RAID 5 configuration can tolerate the failure of one of its physical disks. If more than one disk fails, data is not recoverable.	A RAID 6 configuration can tolerate the failure of up to 2 of its physical disks. If more than two disks fail, data is not recoverable.
Striping	Yes; data is striped (or split) evenly across all disks in the RAID 5 setup. In addition to data, one parity block is stored per stripe so that data can be recovered if one of the drives fails.	Yes; data is striped (or split) evenly across all disks in the RAID 6 setup. In addition to data, two parity blocks are stored per stripe, on two separate disks, so that data can be recovered even if two of the drives fail.
Minimum number of physical disks required	3	4
Mirroring, redundancy and fault tolerance	No mirroring; redundancy and fault tolerance are achieved by calculating and storing parity information. Can tolerate the failure of 1 physical disk.	No mirroring; redundancy and fault tolerance are achieved by calculating and storing parity information. Can tolerate the failure of 2 physical disks.
Performance	Fast reads because of striping (data distributed across many physical disks). Writes are a little slower because parity information needs to be calculated. But since parity is distributed, 1 disk doesn't become a bottleneck (like it does in RAID 4).	Fast reads because of striping (data distributed across many physical disks). Writes are a little slower because parity information needs to be calculated and stored twice. But since parity is distributed, 1 disk doesn't become a bottleneck.
Applications	Good balance of efficient storage, decent performance and failure resistance. RAID 5 is ideal for file and application servers that have a limited number of data drives.	Good balance of efficient storage, decent performance and failure resistance. RAID 6 is better suited than RAID 5 in setups with many large drives.
Storage efficiency	(n-1)/n, for a RAID 5 system with n physical disks.	(n-2)/n, for a RAID 6 system with n physical disks.
Parity disk?	No dedicated parity disk. Parity information is distributed among all physical disks in the RAID. If one of the disks fails, parity info is used to recover data that was stored on that drive.	No dedicated parity disk. Parity information is distributed among all physical disks in the RAID. If up to two of the disks fail, parity info is used to recover data that was stored on those drives. Each stripe has two parity blocks, stored on two separate drives.
Advantages	Fast reads; inexpensive redundancy and fault tolerance; data can be accessed (albeit at a slower rate) even while a failed drive is in the process of being rebuilt.	Fault tolerant even with 2 of the drives failing. Fast reads; inexpensive redundancy and fault tolerance; data can be accessed (albeit at a slower rate) even while a failed drive is in the process of being rebuilt.
Disadvantages	Recovery from failure is slow because of parity calculations involved in restoring data and rebuilding the replacement drive. It is possible to read from the RAID while this is going on but read operations during that time will be quite slow.	Recovery from failure is slow because of parity calculations involved in restoring data and rebuilding the replacement drive. It is possible to read from the RAID while this is going on but read operations during that time will be quite slow.

Configuration

RAID 5 configuration

The definition of a RAID 5 storage system, according to the Storage Networking Industry Association (SNIA), is:

A placement policy using parity-based protection for storing stripes of 'n' logical blocks of data and one logical block of parity across a set of 'n+1' independent storage devices where the parity and data blocks are interleaved across the storage devices. Data stored using this form of RAID is able to survive a single storage device failure without data loss.

In a RAID 5 configuration, data is striped — i.e., split and stored across multiple physical disks. In addition, a special parity block is used for redundancy. For each combination of data blocks in RAID 5, a parity block is calculated and stored. Each individual parity block resides on only one disk; however, parity blocks are stored in a round-robin fashion, distributed equally across all the physical disks.

Example of a RAID 5 configuration. Data and parity blocks are grouped by color to easily identify which parity block is associated with which data blocks.

Considering that data blocks are striped across at least two disks and the parity block is written on a separate disk, we can see that a RAID 5 configuration requires at least 3 physical drives.
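The round-robin parity placement described above can be sketched in a few lines of Python. This is only an illustration: the function name `raid5_layout` and the rotation rule (parity starts on the last disk and moves one disk to the left on each stripe) are assumptions made for the example, and real controllers may rotate parity differently.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row of block labels per stripe for a RAID 5 set.

    Assumption for this sketch: parity starts on the last disk and moves
    one disk to the left on each successive stripe.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    rows = []
    data_index = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")      # the stripe's single parity block
            else:
                row.append(f"D{data_index}")  # next data block in sequence
                data_index += 1
        rows.append(row)
    return rows


# One column per disk, one row per stripe.
layout = raid5_layout(num_disks=4, num_stripes=4)
for row in layout:
    print("  ".join(f"{label:>3}" for label in row))
```

With 4 disks and 4 stripes, each disk ends up holding exactly one parity block, so no single disk carries all the parity traffic.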

RAID 6 configuration

According to the SNIA, RAID 6 is defined as:

A RAID 6 or RAID Level 6 storage system is a placement policy using parity-based protection that allows stored data to survive any two storage device failures without data loss.

A RAID 6 configuration is similar to RAID 5 in that it uses striping and parity blocks. The difference is that it stores two parity blocks, allowing for extra redundancy so that even if two of the disks fail, information is still recoverable.

Example of a RAID 6 configuration. Data and parity blocks are color-coded.

What are RAID 5-5, RAID 5-9, RAID 6-6 and RAID 6-10?

RAID 5-5 means there are 5 physical disks in a RAID 5 configuration. Similarly, RAID 5-9 means data is striped across 9 physical disks in a RAID 5 configuration.

RAID 6-6 means data is striped across 6 disks in a RAID 6 system. There are 4 data segments and 2 parity segments for each stripe. Similarly, RAID 6-10 uses 10 physical disks; there are 8 data segments and 2 parity segments for each stripe.

Redundancy, Fault Tolerance and Parity Blocks

Both RAID 5 and RAID 6 are fault-tolerant systems, i.e., data is not lost even when one of the physical disks fails. RAID 5 can tolerate the failure of any one of its physical disks while RAID 6 can survive two concurrent disk failures.

What's more, both RAID 5 and RAID 6 arrays can remain in use while the failed disk is being replaced. Replacing a disk without shutting the system down is called hot-swapping.

Understanding striping in RAID

In RAID 0, data is split into blocks stored across multiple disks.

Striping is a technique to split data into blocks that are stored across different physical disks. A good example of this is RAID 0, which uses striping.

The advantage of striping is that reads and writes are very fast because they happen on multiple physical disks in parallel. This is especially advantageous with hard disk drives (HDDs) because they use mechanical components for reading and writing.

The disadvantage of simple striping is that if one of the disks fails, all the data is lost. There is no way to reconstruct the information if certain data blocks are missing.
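As a rough illustration of simple striping, the sketch below splits a byte string into fixed-size blocks and deals them out to disks in round-robin order. The function names, the tiny block size and the use of Python lists to stand in for disks are all assumptions made for the example.

```python
def stripe(data, num_disks, block_size):
    """Split data into blocks and deal them out to disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        disks[(offset // block_size) % num_disks].append(block)
    return disks


def unstripe(disks):
    """Reassemble the original data by reading one block from each disk in turn."""
    blocks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


data = b"ABCDEFGHIJ"
disks = stripe(data, num_disks=3, block_size=2)
assert unstripe(disks) == data
```

Every disk holds a different slice of the data and nothing is stored twice, which is why losing any one disk leaves gaps that cannot be filled.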

Fault Tolerance in RAID

To make a RAID system fault tolerant, it needs to store information in a redundant fashion, i.e., enough extra information must be kept on other disks that the contents of a failed disk are not lost. So if one of the disks fails, data is still present on, or recoverable from, the surviving disks.

There are several ways to implement redundancy. A simple way would be to store a copy of each block of data on two physical disks. That is how RAID 1 is structured.

In a RAID 1 setup, redundancy is achieved by storing multiple copies of the data on different physical disks.

Another way to make a RAID configuration redundant is to use parity information. This is what both RAID 5 and RAID 6 use for more efficient redundancy.

RAID 5 Fault Tolerance

RAID 5 can tolerate the failure of 1 disk. Data and parity information stored on the failed disk can be recalculated using the data stored on the remaining disks.

Technical details on how parity works are outside the scope of this comparison. Put simply, a parity block is computed from all the individual data blocks in a stripe. If there are n physical disks in the RAID, each stripe has n-1 data blocks and 1 parity block. If any of the n-1 data blocks goes missing (e.g., if the physical disk that it is stored on fails), all the information of that data block can still be reconstructed using the other n-2 data blocks plus the parity block. If the disk containing the parity block fails, the parity block can be recomputed from all the n-1 data blocks.
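For readers who want to see the idea in action, here is a minimal sketch that assumes the parity block is a bytewise XOR of the data blocks, which is the usual choice for single parity. The helper name `xor_blocks` and the sample byte values are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-disk RAID 5: n-1 = 3 data blocks plus 1 parity block.
data_blocks = [b"\x0f\xa0", b"\x33\x55", b"\xc1\x02"]
parity = xor_blocks(data_blocks)

# Pretend the disk holding data_blocks[1] has failed: XOR the other
# n-2 data blocks with the parity block to get the lost block back.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data_blocks[1]
```

The same function covers both cases described above: XOR-ing the surviving blocks with the parity block rebuilds a lost data block, and XOR-ing all the data blocks recomputes a lost parity block.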

What happens when a disk fails in RAID 5?

Not only can data be recovered when one of the disks fails, but the RAID 5 system also remains operational throughout: data is accessible and reads are possible even when one of the drives has failed and is being rebuilt. However, such reads will be slow because part of the data (the part that was on the failed drive) gets computed in real time using the parity block, rather than simply being read from disk.

Fault Tolerance in RAID 6

RAID 6 has better fault tolerance than RAID 5 because RAID 6 can survive the simultaneous failure of 2 of its disks. This comes at the cost of more storage overhead. Since two parity blocks are needed for each data stripe, the storage capacity of two disks is spent on fault tolerance.

Space Efficiency in RAID 5 vs. RAID 6

The capacity efficiency of a RAID system is the fraction of the physical storage capacity that can be productively used to store data. It is calculated by dividing the number of disks' worth of capacity not used for parity or mirroring by the total number of disks in the set.

For a RAID 5 system with n disks, the storage efficiency is (n-1)/n because 1 disk worth of storage is taken up by parity blocks, leaving n-1 disks for data storage.

For a RAID 6 system with n disks, the storage efficiency is (n-2)/n because 2 disks' worth of storage is taken up by parity blocks, leaving n-2 disks for data storage.

The picture below compares the storage efficiency of RAID 5 with either 5 or 9 disks, and RAID 6 with either 6 or 10 disks.

A comparison of the storage efficiency of some RAID 5 and RAID 6 configurations with RAID 10. Chart from Dell.
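The two formulas above are easy to check for the configurations mentioned earlier. The following sketch assumes equal-sized disks; the function name `storage_efficiency` is made up for the example.

```python
from fractions import Fraction


def storage_efficiency(level, num_disks):
    """Fraction of raw capacity usable for data, assuming equal-sized disks."""
    parity_disks = {5: 1, 6: 2}[level]  # disks' worth of parity per level
    min_disks = {5: 3, 6: 4}[level]     # minimum disks per level
    if num_disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    return Fraction(num_disks - parity_disks, num_disks)


for level, n in [(5, 5), (5, 9), (6, 6), (6, 10)]:
    eff = storage_efficiency(level, n)
    print(f"RAID {level}-{n}: {eff} = {float(eff):.1%}")
```

RAID 5-5 and RAID 6-10 both work out to 4/5 (80%), RAID 5-9 to 8/9 (about 88.9%) and RAID 6-6 to 2/3 (about 66.7%), so the cost of the second parity block shrinks as the number of disks grows.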

Performance

RAID 5 and RAID 6 both offer fast reads because of striping. Data is read from multiple disks in parallel, which speeds up reads. Write performance is slow, however, due to the overhead of calculating parity information. RAID 6 is a little slower than RAID 5 for write performance.

Pros and Cons

Both RAID 5 and RAID 6 offer fast reads and are hot-swappable, i.e., the system is functional and continues to support reads even when a failed disk is being replaced.

RAID 5 is more common than RAID 6. The advantages of RAID 5 over RAID 6 include:

RAID 5 offers a good balance of many features: fault-tolerance (single disk), performance, cost and storage efficiency.
Writes are slow with RAID 5 but not as slow as with RAID 6.
RAID 5 provides higher storage efficiency compared with RAID 6.
Potentially faster recovery from failure compared to RAID 6 because of only one parity block.

The main disadvantage of RAID 5 is its lower fault tolerance: RAID 5 can survive only a single disk failure at a time, while RAID 6 tolerates two concurrent disk failures.

Applications

RAID 5 provides a healthy balance of efficient storage, decent performance and failure resistance. It is the most popular RAID configuration for enterprise NAS devices and business servers. RAID 5 is ideal for file and application servers that have a limited number of data drives. If the number of physical disks in the RAID is very large, the probability of at least one of them failing is higher. RAID 6 is a better option in such cases, where it is important to have a higher degree of fault tolerance.
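To see why more disks mean a higher chance of at least one failure, consider a simplified model in which every disk fails independently with the same probability p over some period. Under that assumption, the chance that at least one of n disks fails is 1 - (1 - p)^n. The 2% figure below is a hypothetical value chosen only for illustration, not a measured failure rate.

```python
def prob_at_least_one_failure(num_disks, p_disk):
    """Chance that at least one disk fails, assuming independent failures."""
    return 1 - (1 - p_disk) ** num_disks


P_DISK = 0.02  # hypothetical per-disk failure probability, for illustration only

for n in (3, 4, 8, 12, 24):
    risk = prob_at_least_one_failure(n, P_DISK)
    print(f"{n:>2} disks: {risk:.1%} chance of at least one failure")
```

Real disks in the same array do not fail fully independently (they often share age, batch and workload), so this model is only a rough guide to the trend.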

About the Author

Nick Jasuja has over 15 years of technology industry experience, including at Amazon in Seattle. He is an expert at building websites, developing software programs in PHP and JavaScript, maintaining MySQL and PostgreSQL databases, and running Linux servers for serving high-traffic websites. He has a bachelor's degree in Computer Science & Engineering.