A story of data corruption caused by using non-ECC memory



Among the types of memory installed in PCs is 'ECC memory', which has an error correction function. ECC memory is mainly used in servers, but software consultant Robert Elder, drawing on his own experience with a broken PC, argues that 'ECC memory should be adopted even in personal PCs.'

Non-ECC Memory Corrupted My Hard Drive Image - This Is Why ECC Memory Is So Important - YouTube


Mr. Elder discovered the problem with non-ECC memory as follows. First, in order to repair a broken PC, he tried to migrate the data stored on its HDD to another HDD.



Mr. Elder used the command-line tool 'ddrescue' to migrate the data. Unlike the data copy tool 'dd', ddrescue is designed so that data can be copied even from storage with bad sectors. However, when Mr. Elder checked the hash values of the source HDD and the destination HDD, they did not match. In other words, although ddrescue should have copied 100% of the data, the data that was saved differed from the original.
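The kind of check described above can be sketched in a few lines of Python. This is only an illustration: the article does not say which hash algorithm or tool Mr. Elder used, so SHA-256 and the function names `file_sha256` and `copies_match` are assumptions made here.

```python
import hashlib

def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file or disk image, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so that large images never need to fit in RAM.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copies_match(source_path, dest_path):
    """True if both files hash to the same value (hypothetical helper)."""
    return file_sha256(source_path) == file_sha256(dest_path)
```

If `copies_match` returns False for a copy that the tool reported as complete, the data was altered somewhere between reading the source and writing the destination, which is the situation Mr. Elder ran into.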



Analysis of the data on the HDDs showed that errors had occurred in which some bits had changed from '0' to '1'.
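One way to locate such flipped bits is to XOR the two copies byte by byte and report every bit that differs. The sketch below is an assumption about how such an analysis could be done, not the method used in the video; the function name `find_bit_flips` is hypothetical, and it assumes both files are the same size.

```python
def find_bit_flips(source_path, dest_path, chunk_size=1024 * 1024):
    """Yield (byte_offset, bit_index, source_bit, dest_bit) for each differing bit.

    Assumes both files have the same length; bit_index 0 is the least
    significant bit of the byte.
    """
    offset = 0
    with open(source_path, "rb") as src, open(dest_path, "rb") as dst:
        while True:
            a = src.read(chunk_size)
            b = dst.read(chunk_size)
            if not a and not b:
                break
            if a != b:  # only scan chunks that actually differ
                for i, (x, y) in enumerate(zip(a, b)):
                    diff = x ^ y  # set bits mark the positions that differ
                    bit = 0
                    while diff:
                        if diff & 1:
                            yield (offset + i, bit, (x >> bit) & 1, (y >> bit) & 1)
                        diff >>= 1
                        bit += 1
            offset += max(len(a), len(b))
```

A result such as `(1, 0, 0, 1)` would mean that in the byte at offset 1, the lowest bit was '0' in the source and '1' in the copy.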



Errors like this can be caused by various components, such as the memory, motherboard, or power supply. Mr. Elder therefore first replaced the power supply and attempted the migration again, but a similar error occurred and the attempt failed.



Next, Mr. Elder ran a memory health check, which revealed that the memory was not working properly.



Mr. Elder therefore obtained memory of a standard compatible with the PC and swapped it in. However, because the memory was so old, the PC would not even boot.



Mr. Elder then bought different memory and replaced it once more.



This time, the PC booted normally and passed the memory health check, and Mr. Elder was able to complete the HDD data migration.



Although Mr. Elder was eventually able to migrate the data, it took a great deal of time to identify the cause of the errors and to find suitable memory. Mr. Elder points out, 'If the PC had ECC memory, error correction could have prevented this.'
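To give a rough idea of how error correction can undo a single flipped bit, here is a toy Hamming(7,4) code in Python. This is only an illustration of the general principle and is not taken from the video: real ECC memory does the equivalent in hardware, typically with wider codes over 64-bit words, and the function names below are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code_bits):
    """Correct at most one flipped bit; return (data_bits, error_position).

    error_position is 1-based, or 0 if no error was detected.
    """
    c = list(code_bits)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3  # the syndrome points at the flipped bit
    if error_pos:
        c[error_pos - 1] ^= 1  # flip it back
    return [c[2], c[4], c[5], c[6]], error_pos
```

For example, encoding `[1, 0, 1, 1]`, flipping any one of the seven stored bits, and decoding returns the original four data bits together with the position of the bit that was corrected.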

However, in order to use ECC memory in a system, the motherboard, CPU, and other components must also support it. Emphasizing the need for ECC memory, Mr. Elder called on memory makers and CPU makers: 'Please declare that the production of non-ECC memory and CPUs that do not support ECC memory will be permanently discontinued.'

in Hardware,   Video, Posted by log1o_hf