ECC Memory
ECC stands for "Error-Correcting Code", a method used to detect and correct
errors introduced during the storage or transmission of data. RAM modules that
implement this technique to correct data errors are known as ECC Memory.
ECC Memory chips are predominantly used in
servers rather than in client computers. The number of memory errors grows with
both the amount of RAM in a computer and the length of time it runs. Since servers
typically contain several gigabytes of RAM and operate 24 hours a day,
the likelihood of errors cropping up in their memory chips is comparatively
high, and hence they require ECC Memory.
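The scaling described above can be sketched in a few lines of Python. This is only an illustration: the function name, the per-gigabyte error rate and the two machine configurations are hypothetical placeholders, not measured figures. The point is the ratio between the two machines, which does not depend on the rate chosen.

```python
def expected_errors(rate_per_gb_hour, ram_gb, hours):
    """Expected error count, assuming errors scale linearly with RAM size and uptime."""
    return rate_per_gb_hour * ram_gb * hours

# Hypothetical placeholder rate; real rates vary widely by hardware and environment.
RATE = 1e-6

# Assumed example machines: a client used 8 hours a day, a server running all day.
client_per_day = expected_errors(RATE, ram_gb=4, hours=8)
server_per_day = expected_errors(RATE, ram_gb=64, hours=24)

print("server/client ratio:", server_per_day / client_per_day)
```

Under these assumptions the server expects 48 times as many errors per day as the client (64 x 24 against 4 x 8), whatever the underlying rate is.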
Memory errors are of two types, namely hard
and soft. Hard errors are caused by physical defects in the memory chip, such as
fabrication faults, and are permanent: the faulty cell cannot be repaired once
the errors start appearing. Soft errors, on the other hand, are transient and are
caused predominantly by electrical disturbances.
Memory errors that are not corrected
immediately can eventually crash a computer. This again matters more for a
server than for a client computer in an office or home environment. When a client
crashes, it normally does not affect other computers even when it is connected
to a network, but when a server crashes, every client that depends on it is
affected. Hence ECC memory is considered essential for servers but optional for
clients unless they are used for mission-critical applications.
ECC Memory chips mostly use Hamming Code or
Triple Modular Redundancy as the method of error detection and correction.
These are known as Forward Error Correction (FEC) codes: they store redundant
bits alongside the data so that errors can be corrected on the spot, instead of
going back and asking the data source to resend the original data. These codes
can correct single-bit errors occurring in data. Multi-bit errors are very rare
and hence do not pose much of a threat to memory systems.
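To make the two techniques concrete, here is a minimal Python sketch of a Hamming(7,4) code and of Triple Modular Redundancy voting. It is a teaching example, not how any particular memory controller is built: real ECC modules work in hardware on much wider words, and the function names used here are made up for illustration.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    # Positions 1..7 hold: p1, p2, d1, p3, d2, d3, d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Fix at most one flipped bit; return (data_bits, error_position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the bad bit.
    error_pos = s1 + 2 * s2 + 4 * s3
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], error_pos


def tmr_vote(a, b, c):
    """Triple Modular Redundancy: bitwise majority vote over three stored copies."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    damaged = list(word)
    damaged[4] ^= 1  # simulate a soft error in position 5
    print(hamming74_correct(damaged))
    print(bin(tmr_vote(0b1011, 0b1111, 0b1011)))
```

The Hamming code spends 3 extra bits per 4 data bits, while Triple Modular Redundancy stores every bit three times. Both recover from a single flipped bit without re-reading the data from its source, which is what makes them forward error correction.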