What are the benchmark results for Intel's 'Arc A770' GPU?



On September 27, 2022, Intel announced the 'Arc A770' GPU, with performance said to be equivalent to the RTX 3060 Ti. Chips and Cheese, a gadget blog, explains the history of the GPU-market attempts by Intel, a company that primarily makes CPUs, and presents its microbenchmark results for the Arc A770.

Microbenchmarking Intel's Arc A770 – Chips and Cheese
https://chipsandcheese.com/2022/10/20/microbenchmarking-intels-arc-a770/

The Arc A770 is Intel's third attempt at the dedicated GPU market. The first was the 'Intel 740 (i740)' graphics chip, which tried to store textures in system memory using the then-new AGP interface. In theory, this would have allowed less on-board memory and simpler programming, as textures would not need to be copied to VRAM.

However, this was a fiasco, and Chips and Cheese points out that Intel learned that high-performance GPUs cannot rely heavily on the host connection. For its second attempt, Intel developed 'Larrabee', but it lacked much of the hardware needed for graphics processing and was never released. Chips and Cheese explains that Intel came to realize how high the barrier to entry into the GPU market is.

Building on these valuable lessons, Intel is making a fresh attempt at high-performance GPUs with 'Xe-HPG', one of its Xe graphics architectures.

The Arc A770 has 512 EUs (execution units) with a total of 4096 FP32 lanes. This is several times larger than Intel's integrated GPUs, and signals Intel's clear intention to take on the mid-range GPU market.
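
The lane count above comes down to simple arithmetic, sketched here in Python. The 512 EUs and 4096 FP32 lanes are the article's figures, and the 8 lanes per EU follows from them. The clock speed and the factor of 2 FLOPs per lane per clock (one fused multiply-add) are assumptions for illustration only, not values reported by Chips and Cheese.

```python
# Figures quoted in the article
EU_COUNT = 512
FP32_LANES = 4096

# Derived from the two figures above: FP32 lanes per EU
lanes_per_eu = FP32_LANES // EU_COUNT


def peak_fp32_tflops(lanes: int, clock_ghz: float,
                     flops_per_lane_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes every lane retires one fused multiply-add (2 FLOPs) per clock,
    which is an idealized upper bound, not a measured result.
    """
    return lanes * flops_per_lane_per_clock * clock_ghz / 1000.0


# Hypothetical clock, chosen only to make the arithmetic concrete
ASSUMED_CLOCK_GHZ = 2.0

print("FP32 lanes per EU:", lanes_per_eu)
print("Peak FP32 TFLOPS at the assumed clock:",
      peak_fp32_tflops(FP32_LANES, ASSUMED_CLOCK_GHZ))
```

A theoretical peak like this says nothing about how well the memory hierarchy can feed those lanes, which is what the microbenchmarks measure.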

Chips and Cheese then ran various microbenchmarks on the Arc A770 and shared their results.

The memory latency results are as follows. The Intel Arc A770 is shown as a blue line, the NVIDIA RTX 3060 Ti as a green line, the NVIDIA RTX 3070 as a green dashed line, the AMD Radeon 6600 XT as a red line, and the AMD Radeon 6700 XT as a red dashed line. While the test size (horizontal axis) is small enough to fit in cache, the latency of the Arc A770 (vertical axis) is stable, but once the test size grows larger, the latency increases significantly.
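
Latency tests of this kind are commonly built on pointer chasing: each load depends on the result of the previous one, so the time per step approximates the access latency for a given test size. The following is a minimal Python sketch of that idea, not the code Chips and Cheese used. The function names are hypothetical, and it runs on the CPU, where interpreter overhead dominates the timings, so only the structure of the test carries over.

```python
import random
import time


def build_chain(n: int, seed: int = 0) -> list[int]:
    """Return a random single-cycle permutation (Sattolo's algorithm).

    chain[i] is the index to visit after i. Because the permutation is one
    cycle, a chase touches every element before it repeats.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what guarantees a single cycle
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain: list[int], steps: int) -> int:
    """Follow the chain for `steps` dependent loads and return the final index."""
    idx = 0
    for _ in range(steps):
        idx = chain[idx]  # the next load cannot start before this one finishes
    return idx


def ns_per_access(n: int, steps: int = 200_000) -> float:
    """Average time per dependent access for a chain of n elements."""
    chain = build_chain(n)
    start = time.perf_counter()
    chase(chain, steps)
    return (time.perf_counter() - start) * 1e9 / steps


if __name__ == "__main__":
    # Sweep the test size, as on the horizontal axis of a latency plot
    for n in (1 << 10, 1 << 14, 1 << 18):
        print(f"{n:>8} elements: {ns_per_access(n):.1f} ns per access")
```

The random order matters: a sequential walk would let prefetchers hide the latency that the test is trying to expose.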



Intel and NVIDIA, like most GPUs released in the last decade, use a traditional two-level cache hierarchy. Intel's distinctive strategy here is to adopt larger caches than its competitors. For L1 cache, the Arc A770 offers at least 192KB in exchange for some latency, while an NVIDIA Ampere SM shares a 128KB block between local memory and L1D cache and appears to allocate 96KB of it as L1D cache, Chips and Cheese points out. 'L1D latency should be lower, but a high-occupancy GPU shouldn't have a hard time hiding such latency. If the L1 hit rate is high, average access latency should also improve, as fewer requests must be processed by L2 and beyond.'
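
The point about L1 hit rate can be illustrated with a weighted-average latency calculation. This is a simplified sketch: it treats everything past L1 as a single latency, and all of the hit rates and latencies below are hypothetical values chosen for illustration, not measurements from the article.

```python
def average_latency_ns(l1_hit_rate: float, l1_latency_ns: float,
                       beyond_l1_latency_ns: float) -> float:
    """Average access latency when a fraction of accesses hit in L1.

    Hits cost the L1 latency; misses cost the (simplified, single-valued)
    latency of L2 and beyond.
    """
    if not 0.0 <= l1_hit_rate <= 1.0:
        raise ValueError("l1_hit_rate must be between 0 and 1")
    return (l1_hit_rate * l1_latency_ns
            + (1.0 - l1_hit_rate) * beyond_l1_latency_ns)


# Hypothetical: a small, fast L1 with a modest hit rate
small_fast = average_latency_ns(0.5, 30.0, 200.0)

# Hypothetical: a larger, slower L1 that catches more accesses
large_slow = average_latency_ns(0.8, 50.0, 200.0)

print(f"small, fast L1: {small_fast:.1f} ns on average")
print(f"large, slow L1: {large_slow:.1f} ns on average")
```

With these made-up numbers the larger L1 comes out ahead despite its higher hit latency, which is the trade-off the quote describes. Whether it holds in practice depends on the real hit rates of a given workload.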

The bandwidth results are as follows. The bandwidth of an Arc A770 Xe Core (blue line) is noticeably lower than the others, and Chips and Cheese points out that it 'cannot extract much bandwidth across the memory hierarchy, especially VRAM bandwidth.'



Chips and Cheese said, 'Latency is definitely to blame, but it's not the only factor. A single CU of AMD's older HD 7950 can squeeze several times more bandwidth out of VRAM at the same VRAM latency. The cache bandwidth is also not very good, suggesting that the Arc A770 Xe Core has limited memory-level parallelism compared to other GPUs.'
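
The link between latency, memory-level parallelism, and bandwidth can be sketched with Little's law: sustained bandwidth equals the amount of data in flight divided by latency. The request size, request counts, and latency below are hypothetical values for illustration, not figures from Chips and Cheese.

```python
def bandwidth_gb_per_s(requests_in_flight: int, bytes_per_request: int,
                       latency_ns: float) -> float:
    """Little's law: throughput = data in flight / latency.

    Bytes per nanosecond is numerically equal to GB/s (10^9 bytes per second).
    """
    return requests_in_flight * bytes_per_request / latency_ns


def requests_needed(target_gb_per_s: float, bytes_per_request: int,
                    latency_ns: float) -> float:
    """Outstanding requests required to sustain a target bandwidth."""
    return target_gb_per_s * latency_ns / bytes_per_request


# Hypothetical: two cores see the same 400 ns latency with 64-byte requests,
# but one keeps far more requests in flight than the other.
LATENCY_NS = 400.0
REQUEST_BYTES = 64

low_mlp = bandwidth_gb_per_s(16, REQUEST_BYTES, LATENCY_NS)
high_mlp = bandwidth_gb_per_s(128, REQUEST_BYTES, LATENCY_NS)

print(f"16 requests in flight:  {low_mlp:.2f} GB/s")
print(f"128 requests in flight: {high_mlp:.2f} GB/s")
```

At equal latency, bandwidth scales with the number of requests in flight, which is why a large bandwidth gap at the same VRAM latency points to limited memory-level parallelism rather than latency alone.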

After running the benchmarks, Chips and Cheese said, 'Intel has put a lot of effort into moving its graphics architecture into a higher performance segment. The current Arc series is the result of aiming to create a solid discrete GPU that outperforms AMD and NVIDIA's low-end cards.'

'The A770 probably ended up landing a bit lower than Intel intended. I think a lot of work needs to be done for the A770 to really shine. It's all about high-resolution frames. Intel will probably do well when handling operations that are performed across pixels, but it probably won't do well when pixel work and other high-occupancy work make up only a small part of rendering a frame.'

in Hardware, Posted by log1p_kr