Case found where .NET programs run up to 50% slower on Intel's Skylake-X processors



Alois Kraus has published a blog post reporting that .NET applications run significantly slower on Intel's Skylake-X generation Xeon processors than on previous models.

Why Skylake CPUs Are Sometimes 50% Slower - How Intel Has Broken Existing Code - Alois Kraus
https://aloiskraus.wordpress.com/2018/06/16/why-skylakex-cpus-are-sometimes-50-slower-how-intel-has-broken-existing-code/

Mr. Kraus used a machine with a Skylake-X generation "Xeon Gold 6126" and found that its performance was significantly lower than that of an older machine. In his comparison chart, the new machine is shown in yellow and the old machine in blue, and smaller values mean better performance. Most problems of this kind turn out to be caused by Windows or BIOS settings, but Mr. Kraus was reportedly unable to resolve this one by tuning.


Mr. Kraus investigated the cause in detail and found that the delay came from the "pause" instruction executed in .NET Framework code.


Measuring how long the CPU stalls on the pause instruction showed that the Skylake-X generation Xeon takes about ten times longer than the older-generation Xeon (E5 1620). Of this roughly tenfold delay, Mr. Kraus says "It seemed like a bug".


Searching the web for information about the pause instruction, Mr. Kraus found a relevant description in the Intel manual below.

Intel 64 and IA-32 Architectures Optimization Reference Manual
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

According to this description, Intel extended the latency of the pause instruction from 10 cycles in previous generations to 140 cycles, and the manual itself points out that the increased latency can degrade performance.


Mr. Kraus also lists the pause latency, in CPU cycles, for each earlier generation of Core processor. The actual execution time is obtained by dividing the number of CPU cycles by the CPU frequency, but there is little doubt that the delay on Skylake-X differs by an order of magnitude.
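As a rough illustration of that conversion, the sketch below turns a pause latency in cycles into nanoseconds. The 10 and 140 cycle figures are the ones quoted from the Intel manual; the 3.0 GHz clock is an assumed round number chosen for illustration, not the measured frequency of either machine.

```python
def pause_latency_ns(cycles: int, frequency_ghz: float) -> float:
    """Convert a latency in CPU cycles to nanoseconds."""
    # 1 GHz is one cycle per nanosecond, so ns = cycles / GHz.
    return cycles / frequency_ghz


# Assumption: an illustrative clock speed, not taken from the article.
ASSUMED_FREQUENCY_GHZ = 3.0

older = pause_latency_ns(10, ASSUMED_FREQUENCY_GHZ)
skylake = pause_latency_ns(140, ASSUMED_FREQUENCY_GHZ)

print(f"older generation: {older:.1f} ns per pause")
print(f"Skylake-X:        {skylake:.1f} ns per pause")
print(f"ratio:            {skylake / older:.0f}x")
```

Because both machines run at broadly similar clock speeds, the ratio is dominated by the cycle count rather than by the frequency.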


The problem that processing speed can drop sharply when heavily multithreaded .NET applications run on machines with Skylake-X generation CPUs had reportedly already been pointed out by Xiangyang Guo in August 2017.

Spin wait tuning · Issue #13388 · dotnet/coreclr · GitHub
https://github.com/dotnet/coreclr/issues/13388

This problem has been fixed in .NET Core 2.1 and in the preview of .NET Framework 4.8, the next version of the .NET Framework, but it will persist until the official release in 2019. Mr. Kraus also points out that the problem is not specific to .NET on Skylake-X generation CPUs: it affects every spinlock implementation that uses the pause instruction.
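To see why any spin loop built on pause is affected, consider a simple model: a spin-wait that backs off exponentially issues a fixed number of pause instructions no matter how long each one takes, so the total time spent spinning scales directly with the pause latency. The sketch below is only a model under stated assumptions. The back-off schedule (`initial_pauses`, `rounds`) is hypothetical and is not the schedule used by .NET or any other runtime.

```python
def spin_wait_cycles(pause_cycles: int, initial_pauses: int = 4, rounds: int = 8) -> int:
    """Total cycles spent in pause for an exponentially backing-off spin-wait.

    Hypothetical schedule: round i issues initial_pauses * 2**i pause
    instructions before the lock is checked again.
    """
    total_pauses = sum(initial_pauses * 2 ** i for i in range(rounds))
    return total_pauses * pause_cycles


# Pause latencies in cycles, as quoted from the Intel manual.
older = spin_wait_cycles(pause_cycles=10)
skylake = spin_wait_cycles(pause_cycles=140)

print(f"older generation: {older} cycles spent spinning")
print(f"Skylake-X:        {skylake} cycles spent spinning")
```

In this model the number of pause instructions is the same on both CPUs, which is why a loop tuned for the shorter latency ends up spinning far longer than intended on the newer one.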

On Reddit, commenters discussing the problem Mr. Kraus described suggest that Microsoft and Intel each concluded that the pause instruction was too fast, and that the unfortunate result came from both of them applying countermeasures at the same time.

Skylake-X CPUs have 140-cycle "pause" latency with negative effects on .NET Spinlocks: hardware
https://www.reddit.com/r/hardware/comments/8s011f/skylakex_cpus_have_140cycle_pause_latency_with/

in Software, Hardware, Posted by darkhorse_log