Developer tracks down a bug that occurs once in 1,000 boots by booting Linux 292,612 times
Richard W.M. Jones, a developer at Red Hat, noticed a bug that caused Linux v6.4 to hang at boot, and tested it by booting Linux 292,612 times.
I booted Linux 292,612 times
https://rwmj.wordpress.com/2023/06/14/i-booted-linux-292612-times/
Dev Boots Linux 292,612 Times to Find Intel, AMD Kernel Bug | Tom's Hardware
https://www.tomshardware.com/news/dev-boots-linux-292612-times-for-1-in-1000-kernel-bug
Jones first suspected a boot hang while testing nbdkit, server software for the NBD (Network Block Device) protocol, which is used to access block devices over the network. He noticed random hangs when it was used together with libguestfs, a library for accessing virtual machine disk images. Jones then set out to pin the bug down and prove that he had really found one.
According to Jones, booting under the open source processor emulator QEMU showed that the hang always occurred at the same stage of the boot process.
Linux kernel hangs rarely when booting on the latest qemu (#1696) · Issues · QEMU / QEMU · GitLab
https://gitlab.com/qemu-project/qemu/-/issues/1696#note_1428829389
Jones therefore booted Linux 292,612 times and checked whether the bug occurred. The result: the boot hung roughly once in every 1,000 boots. The 292,612-boot test reportedly took 21 hours in total, and Jones said that the boot testing up to that point had taken days.
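The numbers above explain why so many boots were needed. The following Python sketch works through the arithmetic; it assumes each boot hangs independently with the same probability, and the 95% confidence level is an illustrative choice, not a figure from Jones's post.

```python
import math

HANG_RATE = 1 / 1000       # roughly one hang per 1,000 boots (from the article)
TOTAL_BOOTS = 292_612      # boots performed (from the article)
TOTAL_HOURS = 21           # reported duration of the test run


def prob_at_least_one_hang(p: float, boots: int) -> float:
    """Chance of seeing one or more hangs in `boots` independent boots."""
    return 1.0 - (1.0 - p) ** boots


def boots_needed(p: float, confidence: float) -> int:
    """Smallest number of boots that shows at least one hang with the
    given confidence, assuming a per-boot hang probability of `p`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    print("P(hang seen in 1,000 boots):", prob_at_least_one_hang(HANG_RATE, 1000))
    print("Boots for 95% confidence:   ", boots_needed(HANG_RATE, 0.95))
    print("Expected hangs in full run: ", TOTAL_BOOTS * HANG_RATE)
    print("Boots per second:           ", TOTAL_BOOTS / (TOTAL_HOURS * 3600))
```

Under these assumptions, a thousand clean boots are far from proof that a kernel is good, which is why a test for a bug this rare has to run thousands of boots per kernel.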
Jones then ran guestfish, a command-line tool for inspecting a virtual machine's filesystem, 10,000 times in a loop, with many instances running in parallel, and parsed the output to find the cause. The culprit behind the rare boot hang turned out to be a regression involving printk.time, the setting that displays timestamps on the kernel console.
According to Jones, comparing Linux v6.0 with v6.4-rc6 allowed him to narrow down the change responsible for the boot hang. Jones says that reverting the printk.time commit fixes the problem.
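A loop of this kind can be sketched in Python as follows. This is a minimal illustration, not Jones's actual script: the command, the timeout, and the worker count are placeholders, and a run is counted as a hang simply because it exceeds the timeout.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, timeout_s):
    """Run `cmd` once and classify the result as 'ok', 'fail' or 'hang'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "ok" if result.returncode == 0 else "fail"


def stress(cmd, runs, workers, timeout_s):
    """Run `cmd` `runs` times, `workers` at a time, and count the outcomes."""
    counts = {"ok": 0, "fail": 0, "hang": 0}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes = pool.map(lambda _: run_once(cmd, timeout_s), range(runs))
        for outcome in outcomes:
            counts[outcome] += 1
    return counts


if __name__ == "__main__":
    # Placeholder command: replace with the real boot test
    # (for example a guestfish invocation that boots the appliance).
    cmd = [sys.executable, "-c", "pass"]
    print(stress(cmd, runs=8, workers=4, timeout_s=30))
```

Threads are sufficient here because each worker spends its time waiting on a child process. The timeout should be well above a normal boot time so that slow boots are not miscounted as hangs.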
LKML: 'Richard WM Jones': printk.time causes rare kernel boot hangs
https://lkml.org/lkml/2023/6/13/733
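One common way to narrow a regression down between a known-good and a known-bad version is bisection, the idea behind `git bisect`. The article does not spell out Jones's exact procedure, so the sketch below is only an illustration of the technique; the commit names and the `is_bad` test are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit by binary search.

    `commits` is ordered oldest to newest; commits[0] must be known good
    and commits[-1] known bad. `is_bad` builds and tests one commit.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history: ten commits, regression introduced at "c6".
    history = [f"c{i}" for i in range(10)]
    print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

For a bug that appears about once in 1,000 boots, `is_bad` would have to boot each candidate kernel thousands of times before calling it good, which is where most of the testing time goes.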
According to Jones, for some reason the boot hang occurred less often on machines with Intel CPUs than on machines with AMD CPUs.