Multi-core CPUs are said to be held back by 'POSIX'
POSIX, which defines the function calls and interfaces common to UNIX-like OSs, is a standard that guarantees that 'it will work in any environment as long as it complies with POSIX.' POSIX has supported the development of portable applications for many years, but system administrator Charles Fisher has pointed out that 'POSIX is a factor limiting the power of multi-core CPUs,' and gives a concrete explanation using the 'xargs' command as an example.
Parallel shells with xargs: Utilize all your cpu cores on UNIX and Windows | Linux Journal
https://www.linuxjournal.com/content/parallel-shells-xargs-utilize-all-your-cpu-cores-unix-and-windows
POSIX.2 compliant shells have neither a way to signal when resources become available nor integrated task scheduling. Fisher points out that this situation stems from the fact that the industry at the time made conservative decisions that did not presuppose symmetric multiprocessing when establishing POSIX.2.
To bring out the performance of a multi-core CPU on a UNIX-like OS, a multi-threading technology called 'POSIX threads' is generally used. To manage tasks on a process-by-process basis in a POSIX-compliant shell, on the other hand, it is necessary to manage processes manually using the 'bg', 'fg', and 'jobs' commands. According to Fisher, the GNU version of the 'xargs' command allows parallel processing in units of processes, which isolate memory space, rather than threads, which share it, and can therefore go beyond the structural limits of POSIX.
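As a rough illustration of what managing processes by hand looks like in a script, the sketch below starts several background jobs with `&` and waits for them with `wait`, the non-interactive counterparts of the job-control commands mentioned above. The task names, the `run_task` function, and the scratch directory are hypothetical and exist only for this example; it is not taken from Fisher's article.

```shell
#!/bin/sh
# Scratch directory for this illustration only.
workdir=$(mktemp -d)

# Hypothetical task: record that the named task ran.
run_task() {
    echo "done: $1" > "$workdir/$1.out"
}

# Start three tasks in the background. The shell itself does not cap
# how many run at once or schedule them against the available cores.
for name in a b c; do
    run_task "$name" &
done

# Block until every background job started above has exited.
wait

cat "$workdir"/*.out
```

Note that nothing here limits the number of simultaneous jobs; with hundreds of inputs the script would have to add its own bookkeeping.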
xargs is a command that reads items from standard input and builds and runs command lines from them, and it is often used when processing a large number of files.
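A minimal sketch of this behavior, using `echo` as a stand-in for a real command: by default xargs packs the input items onto one command line, while `-L 1` runs the command once per input line. The variable names are made up for the example.

```shell
# By default xargs packs as many input items as fit onto one command line.
batched=$(printf '%s\n' a b c | xargs echo)

# With -L 1, xargs runs the command once for each input line instead.
per_line=$(printf '%s\n' a b c | xargs -L 1 echo)

printf 'batched:\n%s\n' "$batched"
printf 'per line:\n%s\n' "$per_line"
```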
The GNU version of xargs has a '-P' option that specifies the maximum number of processes that can be executed concurrently, and an '-L' option that specifies the maximum number of input lines that can be used on a single command line. By using these options, it is possible to process each input in its own process. Fisher compares the performance of file compression with gzip launched through xargs against file compression with pigz, a multi-threaded implementation of gzip.
Compressing files with pigz launches a single process.
You can use xargs to start a process for each file.
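The per-file approach might look like the following sketch. The scratch directory and sample files are hypothetical, and the limit of four processes is an arbitrary choice for illustration. It uses `-n 1` with NUL-delimited input (`find -print0` and `xargs -0`) instead of the `-L` option described above, so that unusual file names survive; this is a substitution made for the example, not necessarily the exact command line Fisher used.

```shell
# Create a scratch directory with a few hypothetical sample files.
dir=$(mktemp -d)
for i in 1 2 3 4; do
    printf 'sample data %s\n' "$i" > "$dir/sample$i.txt"
done

# -print0 and -0 pass file names NUL-delimited; -n 1 hands one file to
# each gzip invocation, and -P 4 lets up to four gzip processes run at once.
find "$dir" -type f -name '*.txt' -print0 |
    xargs -0 -n 1 -P 4 gzip

ls "$dir"
```

On systems with GNU coreutils, `-P "$(nproc)"` is one way to match the process limit to the number of available cores.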
Measured from program start to finish, compression took 45 minutes 51.904 seconds when pigz was used on its own and 44 minutes 42.107 seconds when multiple processes were started with xargs, so the xargs approach was faster. In this test, splitting the work into separate processes with xargs made better use of the cores than multi-threaded processing with POSIX threads. Regarding this result, Fisher commented, 'The fact that the number of files is divisible by the number of CPU cores has led to improved processing performance by xargs.'
Fisher points out that being restricted by POSIX in parallel processing is 'foreign in modern times.' POSIX is considered an absolute standard in many areas, but new technologies such as SELinux and systemd raise hopes that it may overcome the limitations of previous generations. 'While there may be claims that portability outweighs functionality, innovation also needs to ultimately outperform tradition,' Fisher said.
in Software, Posted by darkhorse_log