AMD CPU cores may hang after about 1044 days of continuous operation, and AMD has no plans to fix it



The revision guide for AMD's 2nd-generation EPYC processors, published in April 2023, was found to contain an erratum stating that "the core may hang after about 1044 days." AMD has stated that it has no plans to address this issue.

Revision Guide for AMD Family 17h Models 30h-3Fh Processors
(PDF file) https://www.amd.com/system/files/TechDocs/56323-PUB_1.01.pdf






According to the erratum, a core can fail to exit the CC6 power state approximately 1044 days after the last system reset, which can eventually lead to a core hang.

The recommended workaround is to disable CC6 or to reboot the system before the issue is expected to occur.

AMD states that a fix for this issue is 'not planned'.
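
As a rough illustration of the reboot workaround, the following sketch checks how close a Linux machine is to the roughly 1044-day mark. It is only a minimal example under stated assumptions: it reads uptime from /proc/uptime, the 30-day safety margin is an arbitrary illustrative value, and the function names are hypothetical. None of this comes from AMD's revision guide.

```python
from pathlib import Path

# Approximate figure from the erratum as reported in this article.
HANG_DAYS = 1044.0
# Hypothetical safety margin; choose whatever fits your maintenance windows.
MARGIN_DAYS = 30.0


def uptime_days(proc_uptime_text: str) -> float:
    """Convert the contents of /proc/uptime into days of uptime.

    The first field of /proc/uptime is the number of seconds since boot.
    """
    return float(proc_uptime_text.split()[0]) / 86400.0


def days_until_reboot_deadline(up_days: float,
                               hang_days: float = HANG_DAYS,
                               margin_days: float = MARGIN_DAYS) -> float:
    """Days left before the self-imposed reboot deadline (negative if overdue)."""
    return hang_days - margin_days - up_days


def reboot_due(up_days: float,
               hang_days: float = HANG_DAYS,
               margin_days: float = MARGIN_DAYS) -> bool:
    """True once uptime has reached the deadline (hang time minus margin)."""
    return days_until_reboot_deadline(up_days, hang_days, margin_days) <= 0


def read_uptime_days(path: str = "/proc/uptime") -> float:
    """Read the current uptime in days on a Linux system."""
    return uptime_days(Path(path).read_text())
```

On a Linux host, calling reboot_due(read_uptime_days()) from a scheduled job would flag machines that should be rebooted during the next maintenance window.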

A post on Reddit points out that the problem actually manifests 1042 days and about 12 hours after startup.

PSA: EPYC 7002 CPUs may hang after 1042 days of uptime
by u/acid_migrain in sysadmin
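
This article does not explain where the figure of 1042 days and about 12 hours comes from. As a purely arithmetic observation, a counter incremented every 10 ns (100 MHz) takes almost exactly that long to count 2^53 ticks. The sketch below checks that calculation. Both the 100 MHz tick rate and the 53-bit width are assumptions made for illustration only, not something stated in AMD's revision guide.

```python
from datetime import timedelta


def time_to_count(bits: int, tick_hz: int) -> timedelta:
    """Time for a counter ticking at tick_hz to count 2**bits ticks.

    Integer arithmetic in microseconds is used to avoid floating-point
    rounding; any fraction of a microsecond is truncated.
    """
    microseconds = (2**bits * 10**6) // tick_hz
    return timedelta(microseconds=microseconds)


# Assumed values: a 53-bit count at 100 MHz (one tick every 10 ns).
elapsed = time_to_count(bits=53, tick_hz=100_000_000)
hours, remainder = divmod(elapsed.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{elapsed.days} days, {hours} h {minutes} min {seconds} s")
```

Under these assumptions the result falls a few seconds short of 1042 days and 12 hours, which is consistent with the figure quoted in the Reddit post.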



Because most systems are restarted from time to time to apply security patches, few should ever reach that much continuous uptime. Even so, Reddit user deafpolygon, who used to work at GE, commented, "Old UNIX machines have been running for eight years in a row."

Comment by u/deafpolygon from the discussion "PSA: EPYC 7002 CPUs may hang after 1042 days of uptime" in sysadmin



in Hardware, Posted by logc_nt