What breakthroughs could solve the DRAM heat problem that is hurting smartphone and PC performance?



Dynamic Random Access Memory (DRAM) is a type of memory used mainly as a computer's main memory. It offers large capacity at low cost, at the price of constantly consuming power. However, the heat problem facing DRAM has reached a critical point and is affecting the performance of PCs and smartphones, according to the technology publication Semiconductor Engineering.

DRAM Thermal Issues Reach Crisis Point
https://semiengineering.com/dram-thermal-issues-reach-crisis-point/

DRAM is a storage element built from semiconductor memory. It holds information by storing electric charge in tiny capacitors on the chip. Because this charge gradually leaks away through leakage current inside the device, DRAM needs a retention operation (refresh) that replenishes the charge every few tens of milliseconds. The trade-off is that DRAM constantly consumes power for refreshing but provides large capacity at low cost, and at the time of writing it is used in a wide range of high-performance devices.

In recent years various DRAM interfaces such as DDR5, LPDDR5, GDDR6, and HBM have appeared, but all of them share the same basic structure and mechanism. They also share the same drawback: when the chip gets too hot, charge is lost more easily and performance degrades.

Bill Gervasi, principal systems architect at Nantero, an American semiconductor technology company, said that DRAM generally works normally from 0 to 85°C, but above 85°C performance begins to suffer and data tends to be lost. And because these figures are based on the latest 14nm process, Gervasi asked what happens when DRAM is scaled down to 10nm, 7nm, 5nm, or 3nm, warning that the problem gets out of control.

DRAM compensates for the faster charge loss at higher temperatures by refreshing more often, that is, by raising the refresh rate. However, Marc Greenberg, group director of product marketing at Cadence Design Systems, a company that makes software for semiconductor development, points out that this leads to a vicious circle: 'The refresh cycle takes into account the rising temperature of the device and the faster loss of charge from the capacitors. But unfortunately, refreshing that charge draws a lot of current and generates heat inside the DRAM. The hotter it gets, the more it has to refresh, so it keeps getting hotter and the whole thing falls apart.'
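The vicious circle Greenberg describes can be illustrated with a toy feedback model. The sketch below is not taken from the article: the thermal resistance, the refresh power, and the rule that refresh demand doubles every 10°C are all illustrative assumptions, chosen only to show how the loop either settles at a stable temperature or runs away.

```python
def settle_temperature(ambient_c, base_power_w, steps=200):
    """Iterate a toy thermal feedback loop and return the settled die temperature.

    Returns float("inf") if the model runs away. Every constant below is an
    illustrative assumption, not a measured DRAM value.
    """
    thermal_resistance_c_per_w = 20.0  # assumed die-to-ambient thermal resistance
    refresh_power_at_85c_w = 0.2       # assumed refresh power at 85 C
    doubling_step_c = 10.0             # assumed: refresh demand doubles every 10 C
    runaway_limit_c = 150.0            # treat anything hotter as thermal runaway

    temp_c = ambient_c
    for _ in range(steps):
        # Hotter die -> more frequent refresh -> more refresh power.
        refresh_power_w = refresh_power_at_85c_w * 2 ** ((temp_c - 85.0) / doubling_step_c)
        # More power -> hotter die, which feeds back into the next iteration.
        new_temp_c = ambient_c + thermal_resistance_c_per_w * (base_power_w + refresh_power_w)
        if new_temp_c > runaway_limit_c:
            return float("inf")
        temp_c = new_temp_c
    return temp_c


if __name__ == "__main__":
    print(settle_temperature(40.0, 1.0))  # settles a little above 60 C
    print(settle_temperature(60.0, 2.0))  # inf: the loop never stabilizes
```

In this model a modest workload in a cool environment converges, while a heavier workload in a warm environment has no stable temperature at all, which is the failure mode the quote describes.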



Also, because the DRAM chips in a memory system sit close to one another, if one DRAM fails because of heat, other DRAMs in the same system are very likely to fail as well. Steven Woo of Rambus, an American semiconductor technology company, said, 'Even in a robust server memory system, losing just a few DRAMs to thermal failure can bring down the entire system. That is a huge problem for memory systems.'

Even when it does not lead to failure, a higher refresh rate takes up more DRAM bandwidth and slows down both the DRAM and the device. '5% of system performance is spent solely on maintaining what has already been written,' Gervasi said. 'To operate at temperatures above 85°C and still ensure data integrity, you have to sacrifice system performance.'
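As a rough back-of-the-envelope check on the cost of refreshing, the share of time lost to refresh can be estimated as the time one refresh command takes divided by the interval between refresh commands. The timing values below are illustrative assumptions in the range of typical DDR4-class parts, not figures from the article.

```python
def refresh_overhead(t_rfc_ns, t_refi_ns):
    """Fraction of time the DRAM is busy refreshing instead of serving requests."""
    return t_rfc_ns / t_refi_ns


# Illustrative assumptions, not values from the article:
T_RFC_NS = 350.0           # time one refresh command keeps the device busy
T_REFI_NORMAL_NS = 7800.0  # interval between refresh commands up to 85 C
T_REFI_HOT_NS = 3900.0     # interval halved above 85 C (refresh rate doubled)

normal = refresh_overhead(T_RFC_NS, T_REFI_NORMAL_NS)
hot = refresh_overhead(T_RFC_NS, T_REFI_HOT_NS)

print(f"normal: {normal:.1%}")  # normal: 4.5%
print(f"hot:    {hot:.1%}")     # hot:    9.0%
```

Under these assumptions the normal-temperature overhead lands close to the 5% Gervasi mentions, and doubling the refresh rate above 85°C doubles the loss.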

To address these issues, the semiconductor industry is developing ways to minimize thermal problems and increase reliability. For example, LPDDR and DDR5 devices have a temperature sensor built into the die and limit power consumption and temperature rise by adjusting the refresh rate according to the measured temperature. In addition, very high-density DRAM such as LPDDR5 and DDR5 includes an error correction function that restores data from the surrounding bit cells when charge leaks, which is said to help it cope with thermal problems.
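A temperature-driven refresh policy of the kind described above can be sketched as a lookup from the sensor reading to a refresh interval. The temperature bands, multipliers, and base interval here are hypothetical values for illustration only; real devices define their own thresholds.

```python
import bisect

# Hypothetical upper bounds (in C) of each temperature band, lowest band first.
BAND_LIMITS_C = [45.0, 85.0, 95.0]
# Hypothetical refresh-interval multipliers, one per band plus one above the last limit.
INTERVAL_MULTIPLIERS = [2.0, 1.0, 0.5, 0.25]
# Assumed refresh interval in microseconds for the normal band.
BASE_INTERVAL_US = 7.8


def refresh_interval_us(die_temp_c):
    """Pick a refresh interval from the die temperature sensor reading.

    Cooler dies leak charge more slowly, so they can refresh less often and
    save power; hotter dies must refresh more often to keep their data.
    """
    band = bisect.bisect_left(BAND_LIMITS_C, die_temp_c)
    return BASE_INTERVAL_US * INTERVAL_MULTIPLIERS[band]


if __name__ == "__main__":
    for temp_c in (30.0, 70.0, 90.0, 100.0):
        print(temp_c, refresh_interval_us(temp_c))
```

Relaxing the interval when the die is cool is what saves power and avoids extra heating; shortening it only when the sensor reports a hot die keeps the bandwidth penalty confined to the cases that need it.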

In addition to common cooling technologies such as heat sinks and fans, a technique that etches microfluidic channels into the chip so that coolant can flow through it has been attracting the industry's attention for more than a decade. However, although microfluidic cooling has been shown to work in the laboratory, deploying it would require solving fluid erosion and leakage problems, and it is reportedly unlikely to be commercialized.



In addition, Nantero has rethought memory design from the ground up and developed NRAM, a non-volatile random access memory that does not leak charge and retains its contents without a power supply. NRAM is made from carbon nanotubes and is said to withstand extreme thermal conditions. Gervasi believes carbon nanotubes are a breakthrough for avoiding chip thermal problems because they have very high thermal conductivity and can spread heat faster than other materials.

Greenberg argues that heat, whether in a chip or any other component, should not be treated simply as an 'inconvenience that can be fixed later', and that the heat problem needs to be addressed more fundamentally. 'People are willing to buy a bigger heatsink to get the computation they need. People who make battery-powered devices such as smartphones and tablets care about power consumption but do not care much about heat. Many simulation techniques can be used to improve both power consumption and thermal conditions.'

Large data centers use huge amounts of power for cooling, but if chip temperature requirements are relaxed, the power needed for cooling falls accordingly. When a company previously asked JEDEC, a trade association for the electronics industry, to raise the operating temperature specification by 5 degrees, the resulting reduction in power consumption was estimated to be equivalent to closing three coal-fired power plants.

in Hardware, Posted by log1h_ik