Cerebras announces the world's fastest AI chip 'WSE-3' equipped with 4 trillion transistors



Cerebras, a startup that builds computer systems for AI, has announced the WSE-3 (third-generation 5nm Wafer Scale Engine), a chip it calls the world's fastest for AI, packing 4 trillion transistors. The company also announced that, to improve cost performance, a bottleneck in AI operations, it will use Qualcomm's AI 100 Ultra for inference, and that it will build a new AI supercomputer, Condor Galaxy 3 (CG-3), in collaboration with G42.

Cerebras Systems Unveils World's Fastest AI Chip with Whopping 4 Trillion Transistors - Cerebras
https://www.cerebras.net/press-release/cerebras-announces-third-generation-wafer-scale-engine

Cerebras Selects Qualcomm to Deliver Unprecedented Performance in AI Inference - Cerebras
https://www.cerebras.net/press-release/cerebras-qualcomm-announce-10x-inference-performance

Cerebras and G42 Break Ground on Condor Galaxy 3, an 8 exaFLOPs AI Supercomputer - Cerebras
https://www.cerebras.net/press-release/cerebras-g42-announce-condor-galaxy-3

AI startup Cerebras unveils the WSE-3, the largest chip yet for generative AI | ZDNET
https://www.zdnet.com/article/ai-startup-cerebras-unveils-the-largest-chip-yet-for-generative-ai/

Cerebras WSE-3: Third Generation Superchip for AI - IEEE Spectrum
https://spectrum.ieee.org/cerebras-chip-cs3?share_id=8148767

Cerebras Systems Sets New Benchmark in AI Innovation with Launch of the Fastest AI Chip Ever - Unite.AI
https://www.unite.ai/cerebras-systems-sets-new-benchmark-in-ai-innovation-with-launch-of-the-fastest-ai-chip-ever/

On March 13, 2024 local time, Cerebras, which develops hardware for generative AI, announced the WSE-3. Compared to the WSE-2, previously the fastest AI chip available, the WSE-3 doubles performance while keeping power consumption and price the same.

The main specifications of WSE-3 are as follows.

Number of transistors: 4 trillion
Number of AI cores: 900,000
Peak AI performance: 125 PFLOPS
On-chip SRAM: 44 GB
Memory bandwidth: 21 PB/s
Network fabric bandwidth: 214 Pbit/s
Process node: TSMC 5nm
External memory: 1.5 TB, 12 TB, or 1.2 PB
Trainable model size: up to 24 trillion parameters
Cluster size: up to 2,048 CS-3 systems (the computing system built around the WSE-3)

The WSE-3 is a square measuring 21.5 cm on a side and is billed as the world's largest single chip.



With up to 1.2 PB of external memory, the CS-3 is designed to train next-generation frontier models 10 times larger than GPT-4 and Gemini. A 24-trillion-parameter model can be stored in a single logical memory space without partitioning or refactoring, greatly simplifying training workflows and increasing developer productivity.
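As a back-of-envelope check that 1.2 PB is a plausible budget for a 24-trillion-parameter model, the sketch below assumes roughly 16 bytes of training state per parameter, a common figure for mixed-precision Adam training (fp16 weights and gradients plus fp32 master weights and two optimizer moments). Cerebras' actual memory layout is not described in the article, so these numbers are illustrative only.

```python
# Illustrative arithmetic; the 16 bytes/parameter figure is an assumption,
# not a Cerebras specification.
PARAMS = 24e12            # 24 trillion parameters (from the article)
BYTES_PER_PARAM = 16      # assumed mixed-precision Adam training state
MEMORY_BYTES = 1.2e15     # 1.2 PB external memory (from the article)

state_bytes = PARAMS * BYTES_PER_PARAM
print(f"training state: {state_bytes / 1e15:.2f} PB of 1.2 PB available")
print(f"memory budget per parameter: {MEMORY_BYTES / PARAMS:.0f} bytes")
```

Under that assumption the training state fits with room to spare, which is consistent with the claim that no partitioning is needed.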

Cerebras says that training a 1-trillion-parameter model on the CS-3 will be as easy as training a 1-billion-parameter model on a GPU. A full-scale CS-3 cluster can reportedly train Llama 70B, Meta's 70-billion-parameter large language model, from scratch in a single day.

According to Cerebras, the real bottleneck in deploying AI lies in inference, the execution of trained neural network models. By the company's estimate, if everyone on the planet used OpenAI's ChatGPT, it would cost about $1 trillion (approximately 150 trillion yen) annually and consume an enormous amount of energy.
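To put the $1 trillion estimate in per-user terms, dividing by an assumed world population of roughly 8 billion (a figure not given in the article) yields a rough annual cost per person:

```python
# Illustration only; the population figure is an assumption for scale.
ANNUAL_COST = 1e12        # $1 trillion per year (Cerebras' estimate)
POPULATION = 8e9          # ~8 billion people (assumption)

print(f"~${ANNUAL_COST / POPULATION:.0f} per person per year")
```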

To address this, Cerebras announced that it will use Qualcomm's AI 100 Ultra to deliver unprecedented performance in AI inference, which it says can cut inference costs to as little as one-tenth.

'This initiative aims to usher in a new era of high-performance, low-cost AI inference, and the timing couldn't be better,' said Andrew Feldman, co-founder and CEO of Cerebras. 'Cerebras is focused on training the highest-quality, state-of-the-art models that don't break the bank at inference time. By leveraging Qualcomm's AI 100 Ultra, we are significantly reducing inference costs without sacrificing model quality, leading to the most efficient deployment available today.'



In addition, Cerebras announced plans to build an AI supercomputer, the Condor Galaxy 3 (CG-3), in collaboration with G42, a major technology holding group. CG-3 comprises 64 CS-3 systems built around the WSE-3, the industry's fastest AI chip, giving it some 58 million AI-optimized cores and a peak performance of 8 EFLOPS (exaFLOPS).

CG-3 is the third AI supercomputer to come out of the partnership between Cerebras and G42, which previously co-developed Condor Galaxy 1 and Condor Galaxy 2. With all three machines, the total processing speed of the Condor Galaxy network will reach 16 EFLOPS.
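The headline figures can be reproduced from the per-chip specs listed earlier. Note that the 4-EFLOPS figures assumed below for Condor Galaxy 1 and 2 come from Cerebras' earlier announcements, not from this article:

```python
# Sanity-checking CG-3's headline numbers from the WSE-3/CS-3 specs.
SYSTEMS = 64
CORES_PER_CHIP = 900_000       # AI cores per WSE-3
PFLOPS_PER_SYSTEM = 125        # peak AI performance per CS-3

total_cores = SYSTEMS * CORES_PER_CHIP            # 57,600,000 (~58 million)
cg3_eflops = SYSTEMS * PFLOPS_PER_SYSTEM / 1000   # 8.0 EFLOPS
# CG-1 and CG-2 at 4 EFLOPS each (assumption from prior announcements):
network_eflops = 4 + 4 + cg3_eflops               # 16.0 EFLOPS total

print(total_cores, cg3_eflops, network_eflops)
```

The 64-system count thus accounts for both the ~58 million cores and the 8 EFLOPS quoted above.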



Condor Galaxy, the AI supercomputer network developed by Cerebras and G42, has been used to train industry-leading generative AI models such as Jais-30B, Med42, Crystal-Coder-7B, and BTLM-3B-8K.

Condor Galaxy 3 is expected to come online in the second quarter of 2024.

in Hardware, Posted by logu_ii