Tuesday 25 October 2016

ECC Memory - ECC RAM




ECC Memory



ECC stands for "Error Correcting Code" (also written "Error Correction Code") and is a method used to detect and correct errors introduced during the storage or transmission of data. Certain kinds of RAM modules implement this technique to correct data errors and are known as ECC Memory.
ECC Memory is used predominantly in servers rather than in client computers. The likelihood of a memory error grows with the amount of RAM in a computer and with how long the machine stays in operation. Since servers typically contain many gigabytes of RAM and run 24 hours a day, errors are comparatively likely to crop up in their memory, which is why they need ECC Memory.

Memory errors are of two types: hard and soft. Hard errors are caused by physical defects in the memory chip, such as fabrication faults, and the underlying fault cannot be repaired once they start appearing. Soft errors, on the other hand, are transient and are caused predominantly by electrical disturbances.
Memory errors that are not corrected immediately can eventually crash a computer. This again matters more for a server than for a client computer in an office or home environment. When a client crashes, it normally does not affect other computers even when it is connected to a network, but when a server crashes, every client that depends on it can be disrupted. Hence ECC Memory is treated as essential for servers but optional for clients, unless they are used for mission-critical applications.
ECC Memory typically uses a Hamming code or, in some designs, Triple Modular Redundancy as the method of error detection and correction. These are known as FEC codes, or Forward Error Correction codes, because they manage error correction on their own instead of going back and asking the data source to resend the original data. These codes can correct single-bit errors occurring in data, and common ECC Memory implementations can also detect, though not correct, double-bit errors. Multi-bit errors are very rare and hence do not pose much of a threat to memory systems.
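
To make the two schemes named above more concrete, here is a minimal Python sketch of a Hamming(7,4) code and of a bitwise majority vote as used in Triple Modular Redundancy. It is a toy illustration only: real ECC modules do this in hardware and protect much wider words, and the function names (hamming74_encode, hamming74_decode, tmr_vote) are made up for this example rather than taken from any library.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword layout by 1-indexed position: p1 p2 d1 p3 d2 d3 d4
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (data_bits, error_position).

    error_position is the 1-indexed position that was repaired, or 0 if the
    parity checks found nothing wrong.
    """
    c = list(codeword)
    # Recompute each parity check; together they spell out the bad position.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], error_position


def tmr_vote(a, b, c):
    """Bitwise majority vote over three stored copies of the same word."""
    return (a & b) | (a & c) | (b & c)


# Simulate a soft error: store a word, flip one bit, then read it back.
word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # single-bit upset at position 5
recovered, position = hamming74_decode(stored)
print(recovered, position)

# With TMR, one corrupted copy out of three is simply outvoted.
print(bin(tmr_vote(0b1011, 0b1011, 0b0011)))
```

In both cases the correction happens at read time using only the stored redundancy, which is the "forward" part of Forward Error Correction. The trade-off is storage: this Hamming code spends 3 extra bits per 4 data bits, while TMR stores every word three times.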

