NVIDIA's next-generation chip, Blackwell, faces engineering challenges as it packs two processors onto one chip



The GPU architecture 'Blackwell', announced by NVIDIA in March 2024, incorporates several new technologies and packs 208 billion transistors. However, Blackwell is reportedly running into problems such as heat generation as a result of the extreme performance it targets.

Nvidia's Future Relies on Chips That Push Technology's Limits - WSJ
https://www.wsj.com/tech/nvidias-future-relies-on-chips-that-push-technologys-limits-bd3839fc



The Blackwell chip has a significantly different design from previous AI chips, combining two processors and multiple memory components into one chip made of silicon, metal, and plastic. This makes the chip twice the size of the previous generation, 'Hopper', and at 208 billion its transistor count is 2.6 times that of the previous generation, which is expected to further improve performance.

NVIDIA announces GPU architecture 'Blackwell' and new GPU 'B200' to realize AI models with trillions of parameters - GIGAZINE



NVIDIA CEO Jensen Huang said, 'Demand for Blackwell chips is growing at a tremendous pace,' but according to insiders, Blackwell chips face multiple engineering challenges. Manufacturing AI chips demands perfection, and a defect in a single part can cause serious problems. It has been pointed out that Blackwell chips are more susceptible to quality issues because they contain far more components, including an enormous number of transistors, than conventional products.

In addition, the heat generated by the many components risks damaging various parts and materials inside the package, and in the worst-case scenario, a Blackwell chip, which costs $40,000, could stop working altogether. Andrew Feldman, founder of chip manufacturing startup Cerebras Systems, pointed out that 'it's hard to develop the technology to integrate two chips into one, and even harder to double them.'

'The new approach required to achieve Blackwell's performance came with hurdles, including increased manufacturing complexity and warping that could affect reliability and performance,' said analysts at investment bank UBS. 'These were key factors that made Blackwell difficult to deploy, but the upcoming fixes should enable NVIDIA to begin producing chips on schedule for 2025 shipments.'



In response to these concerns, Huang announced on August 28, 2024, 'We have made changes to the Blackwell design to improve the reliability of the chip.' The design change did not require any changes to functionality. CFO Colette Kress also said, 'NVIDIA is on track to scale up Blackwell production, which will lead to billions of dollars in revenue in the fourth quarter of fiscal 2025.'

The Wall Street Journal, an overseas media outlet, pointed out that 'NVIDIA has in recent years begun releasing its next-generation chips every year instead of every other year, which has put the company under increasing pressure to quickly resolve manufacturing issues.' NVIDIA itself said, 'The increased frequency and complexity of new product introductions could lead to quality or production issues that could result in higher costs and shipping delays.'

in Hardware, Posted by log1r_ut