Intel announces details of next-generation notebook processor 'Lunar Lake' with emphasis on AI performance and power efficiency

Intel announced details of its next-generation notebook processor 'Lunar Lake' on Monday, May 20, 2024, at the computer trade fair 'COMPUTEX TAIPEI 2024' held in Taiwan. Intel also released the specifications of 'Intel NPU 4', which reaches up to 48 TOPS and meets Microsoft's Copilot+ PC requirements.

Computex: Intel accelerates AI everywhere, redefining power, performance, and affordability.

Architecture All Access: Live at Lunar Lake ITT: Lunar Lake Architecture Overview - YouTube

Intel Unveils Lunar Lake Architecture: New P and E cores, Xe2-LPG Graphics, New NPU 4 Brings More AI Performance

Intel's Lunar Lake is a laptop processor that focuses on improving power efficiency and optimizing overall performance relative to Meteor Lake, which was released in 2023. Lunar Lake is notable for being manufactured not at Intel's own foundry but on TSMC's N3B and N6 processes. At the time of writing, the lineup from the 12th-generation Core processor 'Alder Lake' onward is as follows.
| | Alder/Raptor Lake | Meteor Lake | Lunar Lake | Arrow Lake | Panther Lake |
|---|---|---|---|---|---|
| P-core (high-performance core) architecture | Golden Cove / Raptor Cove | Redwood Cove | Lion Cove | Lion Cove | Cougar Cove? |
| E-core (high-efficiency core) architecture | Gracemont | Crestmont | Skymont | Crestmont? | Darkmont? |
| GPU architecture | Xe-LP | Xe-LPG | Xe 2 | Xe 2? | ? |
| NPU architecture | None | NPU 3720 | NPU 4 | ? | ? |
| Manufacturing process | Intel 7 | Intel 4 | TSMC N3B, N6 | Intel 20A etc. | Intel 18A |
| Featured devices | Notebook PC | Notebook PC | Energy-efficient notebook PC | High-performance notebook PC | |
| Release date | October to December 2021 | October to December 2023 | October to December 2024 | October to December 2024 | 2025 |

◆Lion Cove P-core
The high-performance P core uses a new architecture called 'Lion Cove.'

Architecture All Access: Lion Cove P-core Microarchitecture Explained - YouTube

Lion Cove is designed to raise single-threaded performance, with improved memory and cache subsystems, better power management, and higher clock frequencies.

Lion Cove has significant front-end improvements, including an 8x larger prediction block, wider fetch, larger decode bandwidth, and increased cache capacity and read bandwidth. In particular, Lion Cove redesigns the cache hierarchy of the previous-generation 'Redwood Cove', with 48KB of L0D cache, 192KB of L1D cache, and up to 3MB of extended L2 cache per core. The TLB depth has also been increased from 96 pages to 128 pages to improve hit rates.
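As a rough illustration of what the deeper TLB means, the sketch below computes how much memory the quoted entry counts can map at once. The 4 KiB page size is an assumption for illustration and is not stated in Intel's material; the function name `tlb_reach_kib` is hypothetical.

```python
# Rough TLB-reach arithmetic for the entry counts quoted in the article.
# ASSUMPTION: 4 KiB pages (not stated in the article).
PAGE_SIZE_BYTES = 4 * 1024


def tlb_reach_kib(entries: int, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Memory covered by a fully populated TLB, in KiB."""
    return entries * page_size // 1024


redwood_cove_reach = tlb_reach_kib(96)   # previous generation
lion_cove_reach = tlb_reach_kib(128)     # Lion Cove

print(redwood_cove_reach, lion_cove_reach)  # 384 512
```

Under that assumption, the mapped range grows from 384 KiB to 512 KiB, which is why fewer accesses should miss in the TLB.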

According to Intel, Lion Cove delivers a significant improvement in instructions per cycle (IPC) over the previous-generation Redwood Cove. On hyper-threading, Intel's presentation cites a 30% IPC gain at a 20% increase in dynamic power, although the Lion Cove cores in Lunar Lake ship without hyper-threading. In addition, power management uses an AI self-tuning controller that responds dynamically to the operating environment and sustains higher performance.

◆Skymont E-core
The highly efficient E-core uses the Skymont architecture.

Architecture All Access: Skymont E-core Microarchitecture Explained - YouTube

Skymont improves the front end and out-of-order execution for more efficient performance. Four cores share a 4MB L2 cache, and L2 bandwidth is doubled to 128B per cycle, which reduces memory access latency and improves data throughput.

Compared to Meteor Lake's LP E-cores, Lunar Lake's Skymont E-cores consume one-third the power and deliver 2.9x better multi-core performance.

In terms of single-threaded performance, Intel claims that the Skymont E-core offers a 2% improvement in integer and floating-point performance compared to the Raptor Cove P-core, the P-core two generations earlier.

◆Xe 2 architecture
Xe 2 is a GPU architecture for notebook PCs built on the second-generation Xe core. The second-generation Xe core is said to deliver 1.5 times the graphics performance of the Xe-LPG installed in Meteor Lake, thanks to the Xe Matrix eXtensions (XMX) engine, a matrix calculation engine for AI. At the same time, the Xe Vector Engine (XVE) for vector calculations in the second-generation Xe core has been changed from SIMD8 to SIMD16, increasing the calculation density per clock cycle. This means that the emphasis is on 'quickly processing complex calculations'.
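To make the SIMD8-to-SIMD16 change concrete, the following sketch counts how many vector instructions are needed to cover a fixed number of elements at each width. It is a counting exercise only: the element count and the helper `vector_instructions` are hypothetical, and the result says nothing about total GPU throughput, which also depends on the number of engines and the clock.

```python
import math


def vector_instructions(elements: int, simd_width: int) -> int:
    """Number of SIMD instructions needed to touch every element once."""
    return math.ceil(elements / simd_width)


ELEMENTS = 1024  # hypothetical workload size, for illustration only

simd8_count = vector_instructions(ELEMENTS, 8)
simd16_count = vector_instructions(ELEMENTS, 16)

print(simd8_count, simd16_count)  # 128 64
```

Doubling the width halves the instruction count for the same data, which is the sense in which each instruction becomes "denser".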

Xe 2 also supports the VVC codec, which Intel says reduces file sizes by up to 10% compared to AV1 and allows for lower bitrates for streaming without compromising quality, improving performance in multimedia applications.
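As a back-of-the-envelope reading of the "up to 10%" figure, the sketch below applies that best-case reduction to an AV1 stream. The 20 Mbps starting bitrate and the function name `vvc_bitrate_mbps` are hypothetical values chosen for illustration; actual savings depend on content and encoder settings.

```python
def vvc_bitrate_mbps(av1_bitrate_mbps: float, reduction: float = 0.10) -> float:
    """Bitrate after applying a fractional size reduction relative to AV1."""
    return av1_bitrate_mbps * (1.0 - reduction)


AV1_BITRATE_MBPS = 20.0  # hypothetical 4K stream bitrate, not from the article

best_case = vvc_bitrate_mbps(AV1_BITRATE_MBPS)
print(f"{best_case:.1f} Mbps")  # 18.0 Mbps
```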

In addition, Intel has released a video comparing the power consumption of notebook PCs equipped with Lunar Lake and Meteor Lake while playing a 4K AV1 video on YouTube. It shows that Lunar Lake is more power efficient than Meteor Lake.

Lunar Lake Lowers Power With AV1 YouTube Video Playback On E-Cores | Talking Tech | Intel Technology - YouTube

In addition, Xe 2's software stack on Windows provides comprehensive support for many runtimes, APIs, and frameworks, including D3D, Vulkan, and Intel VPL, improving overall efficiency and compatibility across a wide range of software environments.

◆NPU 4
The biggest highlight of the Lunar Lake processor is the inclusion of NPU 4. The computing performance of the previous-generation NPU 3 was up to 11.5 TOPS, while NPU 4 reaches 48 TOPS.

Compared to the previous generation, NPU 4 has three times the number of compute tiles.

In addition, the efficiency of the multiply-accumulate (MAC) array has been improved, resulting in up to 2x performance at the same power level as the previous generation. The new MAC array features advanced data transformation capabilities to minimize latency and optimize data flow, allowing it to process up to 2048 MAC operations in one clock cycle for INT8 and 1024 MAC operations for FP16. In addition, the vector register length of NPU 4 is now 512 bits, allowing more vector operations to be performed in one clock cycle.

As a result, NPU 4 delivers 12x the vector performance, 4x the TOPS, and 2x the IP bandwidth compared to the previous-generation NPU 3. These improvements make NPU 4 highly efficient and optimized for machine learning applications that demand performance and low latency.
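The relationship between MAC operations per cycle and a TOPS rating can be sketched as follows. The 2048 INT8 MACs per cycle and the 11.5 and 48 TOPS figures come from the article; treating 2048 as a per-tile figure, the tile count of 6, and the 1.95 GHz clock are assumptions made here purely to show how the arithmetic could line up, not specifications confirmed by the article.

```python
def peak_tops(macs_per_cycle: int, tiles: int, clock_ghz: float) -> float:
    """Peak tera-operations per second; one MAC counts as two operations."""
    ops_per_cycle = macs_per_cycle * 2 * tiles
    return ops_per_cycle * clock_ghz * 1e9 / 1e12


# Figures quoted in the article
INT8_MACS_PER_CYCLE = 2048
NPU3_TOPS = 11.5
NPU4_TOPS = 48.0

# ASSUMPTIONS for illustration only (not from the article)
ASSUMED_TILES = 6          # assumes 2048 MACs/cycle is a per-tile figure
ASSUMED_CLOCK_GHZ = 1.95

estimate = peak_tops(INT8_MACS_PER_CYCLE, ASSUMED_TILES, ASSUMED_CLOCK_GHZ)
generation_ratio = NPU4_TOPS / NPU3_TOPS

print(f"{estimate:.1f} TOPS, {generation_ratio:.1f}x")  # 47.9 TOPS, 4.2x
```

The ratio of the two quoted ratings, about 4.2, is consistent with the "4x the TOPS" claim regardless of the assumed tile count and clock.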

For I/O, the Lunar Lake processor natively supports Thunderbolt 4, Thunderbolt Share, and Wi-Fi 7. With Thunderbolt 4, a laptop can now be equipped with three Thunderbolt ports, which improves convenience. Lunar Lake also has a system called 'Thunderbolt Share' that allows multiple PCs to share screens, monitors, keyboards, mice, storage, etc. through Thunderbolt connections.

Lunar Lake's native support for Wi-Fi 7 will enable faster wireless communication for devices equipped with the chip. In addition, Lunar Lake is equipped with 'RF interference mitigation technology' that automatically adjusts the DDR clock frequency to minimize interference with Wi-Fi wireless signals. Intel claims that this reduces throughput degradation due to memory noise by 50%.

Of course, Wi-Fi 7 will enable high-speed wireless communication, enhancing the VR experience.

Lunar Lake is scheduled to be released to the market in the third quarter of 2024. Intel has also announced that Arrow Lake for desktop PCs and high-performance laptops will be released at the same time as Lunar Lake, but details of Arrow Lake have not been revealed at the time of writing.

in Hardware,   Video, Posted by log1i_yk