RAID 4 and RAID 6

Historically, FAS devices have used RAID 4 for data protection. This offers several advantages over the more commonly used RAID 5 layout.

A new disk that contains all zeroes can be added to a RAID 4 group without affecting the parity on the parity drive, because each block on the parity drive is simply the XOR of the corresponding blocks on the data drives, and P xor 0 == P. By contrast, adding disks to a RAID 5 group requires the parity blocks to be restriped across all the drives in the group.
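This property is easy to verify in a few lines of Python (an illustrative sketch: blocks are modeled as integers rather than real disk sectors, and the 4-disk group size is an assumption):

```python
import random

def parity(blocks):
    """RAID 4 parity for one stripe: the XOR of its data blocks."""
    p = 0
    for b in blocks:
        p ^= b
    return p

# One 32-bit "block" per data disk in a hypothetical 4-disk group.
stripe = [random.getrandbits(32) for _ in range(4)]
p = parity(stripe)

# A freshly zeroed disk contributes nothing to the XOR,
# so the existing parity block remains valid: P xor 0 == P.
assert parity(stripe + [0]) == p
```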

Because of this, larger RAID groups become manageable. The cost of adding a disk to a RAID 5 group is linear in the number of disks in the group, whereas with RAID 4 it is effectively zero. As a result, the larger a RAID 5 group gets, the more resistant to change its administrator becomes; with RAID 4, she retains complete flexibility.

The traditional knock against RAID 4 is that the parity drive becomes a bottleneck for writes: for every block written to a data disk, the corresponding block must also be written to parity. This is overcome by the design of the Write Anywhere File Layout (WAFL) file system, which integrates RAID management into the file system instead of isolating it in a separate subsystem. In addition, WAFL by design writes data to new locations instead of overwriting data in place. This allows it to coalesce writes into "write episodes", and also to lay out writes efficiently in tetrises, which use the same physical blocks across multiple disks in the RAID group and therefore require fewer writes to parity. Consider the following:

In the preceding diagram, 8 data blocks are being written, and a total of ten blocks are written including parity. No reads of existing data or parity are necessary. A RAID 5 system, however, must first read each of the 8 existing data blocks and the corresponding 8 parity blocks in order to subtract the old data's contribution from the parity. It then writes 8 data blocks and 8 blocks of new parity, for a total of 32 reads and writes versus WAFL's ten. RAID 6 requires a further 8 reads and 8 writes. Below is another diagram, which shows that files may be striped and interleaved in order to maintain the parity management properties that the integrated RAID/WAFL design offers.

RAID 4 was an effective data protection technique for the first decade of the Unified Storage Device's (USD) life. However, disk sizes have increased, with consequently larger physical volume sizes and longer reconstruct times, so it is no longer uncommon for a second failure to occur while a RAID group is reconstructing. This is because a typical bit error rate (BER) on a drive is on the order of 1 bit in 10^14 bits, while a terabyte drive contains almost 10^13 bits. The probability of a bit error, and therefore a failure, while reading an entire drive during a reconstruct is significant. Factor in that every drive in the RAID group must be read during reconstruction, and one begins to have cause for alarm. In traditional RAID 4 and RAID 5, any such failure is guaranteed to cause at least some data loss.
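Plugging the figures from the text into the standard independent-trials model makes the risk concrete (the 8-disk group size is an assumption for illustration):

```python
# Figures from the text: a BER of ~1 bit in 10^14 read, and a
# one-terabyte drive holding about 10^13 (precisely 8e12) bits.
ber = 1e-14
bits_per_drive = 8e12

# Probability of at least one unrecoverable error while reading
# one whole surviving drive during a reconstruct:
p_err = 1 - (1 - ber) ** bits_per_drive          # roughly 7.7%

# Every surviving drive must be read; with 7 survivors (a
# hypothetical 8-disk group) the chance of hitting an error is:
survivors = 7
p_any = 1 - (1 - ber) ** (bits_per_drive * survivors)   # roughly 43%
```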

To counteract this trend, Data ONTAP 6.5 introduced an innovative double-parity scheme called RAID DP. This scheme uses a second parity disk and a mathematically optimal algorithm to achieve parity protection that can survive the loss of any two disks at once, making it a form of RAID 6. The algorithm is much faster than Reed-Solomon or other Hamming-code-based schemes that achieve the same protection level. Current estimates are that RAID DP incurs a performance penalty of around 3%, depending on workload.
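To give a flavor of how two XOR-based parities can survive a double failure, the sketch below implements a generic row-plus-diagonal parity layout of the same family (p = 5: four data disks, one row-parity disk, one diagonal-parity disk) and rebuilds two lost data disks with a simple peeling decoder. This is an illustrative toy, not NetApp's actual RAID DP implementation:

```python
import random

P = 5                         # a small prime: 4 data disks, 4 rows per stripe
NDATA = ROWS = P - 1

def build_parity(data):
    """data[c][r] = block (modeled as an int) on data disk c, row r."""
    row_p = [0] * ROWS
    for r in range(ROWS):
        for c in range(NDATA):
            row_p[r] ^= data[c][r]
    # Diagonal d collects blocks with (r + c) % P == d, over the data
    # disks AND the row-parity disk; diagonal P-1 is deliberately unstored.
    cols = data + [row_p]
    diag_p = [0] * ROWS
    for c in range(P):
        for r in range(ROWS):
            d = (r + c) % P
            if d != P - 1:
                diag_p[d] ^= cols[c][r]
    return row_p, diag_p

def diagonal_cells(d):
    return [(c, (d - c) % P) for c in range(P) if (d - c) % P < ROWS]

def recover(data, row_p, diag_p, fx, fy):
    """Rebuild failed data disks fx and fy by 'peeling': repeatedly
    repair any row or stored diagonal missing exactly one block."""
    cols = [list(col) for col in data] + [list(row_p)]
    for c in (fx, fy):
        for r in range(ROWS):
            cols[c][r] = None
    missing = {(c, r) for c in (fx, fy) for r in range(ROWS)}
    while missing:
        fixed_one = False
        for r in range(ROWS):                      # rows with one gap
            gaps = [c for c in range(P) if cols[c][r] is None]
            if len(gaps) == 1:
                v = 0
                for c in range(P):
                    if c != gaps[0]:
                        v ^= cols[c][r]
                cols[gaps[0]][r] = v
                missing.discard((gaps[0], r))
                fixed_one = True
        for d in range(ROWS):                      # diagonals with one gap
            gaps = [(c, r) for c, r in diagonal_cells(d) if cols[c][r] is None]
            if len(gaps) == 1:
                v = diag_p[d]
                for c, r in diagonal_cells(d):
                    if (c, r) != gaps[0]:
                        v ^= cols[c][r]
                cols[gaps[0][0]][gaps[0][1]] = v
                missing.discard(gaps[0])
                fixed_one = True
        assert fixed_one, "decoder stalled"
    return [cols[c] for c in range(NDATA)]

data = [[random.getrandbits(32) for _ in range(ROWS)] for _ in range(NDATA)]
row_p, diag_p = build_parity(data)
assert recover(data, row_p, diag_p, 0, 2) == data   # survives losing disks 0 and 2
```

The decoder works because any two failed disks always leave some row or stored diagonal with exactly one hole; filling it opens another, and the chain eventually covers every lost block.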

All file systems are optimized for some things and less well optimized for others. The workload that a write-anywhere file system has trouble with is usually called "random write, sequential read". Because WAFL repositions data on each write, a data set, such as a large file, that an application laid out sequentially will no longer be sequential on disk after repeated small writes, and sequential read performance may suffer as a result.
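The effect can be seen with a toy write-anywhere allocator (purely illustrative; real WAFL allocation policy is far more sophisticated): every write, including an overwrite, lands at the next free physical block.

```python
next_free = 0
l2p = {}                          # logical block number -> physical block

def write(lba):
    """Toy write-anywhere policy: never overwrite a block in place."""
    global next_free
    l2p[lba] = next_free
    next_free += 1

for lba in range(8):              # a file written sequentially...
    write(lba)
assert [l2p[i] for i in range(8)] == list(range(8))   # ...is physically contiguous

for lba in (5, 2, 7, 0):          # a few small in-file overwrites
    write(lba)
phys = [l2p[i] for i in range(8)]
assert phys == [11, 1, 9, 3, 4, 8, 6, 10]   # logical order now scattered on disk
```

A sequential read of the file now requires seeking among non-adjacent physical blocks, which is exactly the penalty described above.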

Related Topics