L2 Cache
The new 45nm technology allowed Intel engineers to increase the transistor count without making the CPU die any bigger, and, as usual, this headroom was spent first of all on a larger L2 cache. While the old 65nm dual-core Conroe processors featured a 4MB shared L2 cache, their 45nm Wolfdale successors have a 6MB cache at their disposal.
The L2 cache now occupies a larger share of the die area than any other functional unit of the processor, as you can clearly see in the Wolfdale (half of Yorkfield) die shot.
As a result, new quad-core Penryn processors will feature a 12MB L2 cache: 6MB for each pair of cores. In other words, the quad-core design remains unchanged with the transition to the new production process: the core pairs still sit on separate dies and exchange data via the system bus and RAM.
Yorkfield processors have a 50% larger cache with higher associativity: the L2 cache is 24-way set associative, compared to 16-way in the previous generation. As a result, Intel hopes to use the L2 cache more efficiently while keeping data lookup fast.
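The 50% capacity increase matches the 16-to-24-way jump exactly, which suggests Intel kept the number of sets constant. A quick back-of-the-envelope check (assuming the usual 64-byte cache line; this is our arithmetic, not an Intel disclosure):

```python
LINE = 64  # assumed cache line size in bytes on these CPUs

def l2_sets(cache_bytes, ways, line=LINE):
    """Number of sets = total capacity / (associativity * line size)."""
    return cache_bytes // (ways * line)

# Kentsfield: 4MB, 16-way -> 4096 sets
# Yorkfield:  6MB, 24-way -> 4096 sets
kentsfield = l2_sets(4 * 1024 * 1024, 16)
yorkfield = l2_sets(6 * 1024 * 1024, 24)
```

Both caches come out to 4096 sets, so the set-index bits are unchanged and all of the extra capacity comes from the additional ways, which is the cheapest way to grow a cache without reworking its addressing.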
Core 2 Extreme QX9650 - on the left, Core 2 Extreme QX6850 - on the right
However, cache latency measurements indicate that the larger L2 cache is slightly slower in the new processors.
| Core 2 Extreme QX9650 (Yorkfield) | Core 2 Extreme QX6850 (Kentsfield) |
L2 latency (CPU-Z) | 15 cycles | 14 cycles |
L2 latency 64 byte stride (ScienceMark 2.0) | 12 cycles | 11 cycles |
L2 latency 256 byte stride (ScienceMark 2.0) | 12 cycles | 12 cycles |
L2 latency 512 byte stride (ScienceMark 2.0) | 13 cycles | 12 cycles |
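Latency tests like the ScienceMark stride results above typically rely on dependent pointer chasing: each load's address comes from the previous load, so the average time per iteration approximates the load-to-use latency of whichever cache level the working set fits into. A minimal sketch of the idea (in Python the interpreter overhead dominates, so this only illustrates the technique; real benchmarks implement the inner loop in C or assembly):

```python
import random
import time

def pointer_chase_ns(size_bytes, stride=64, iters=1_000_000):
    """Average time (ns) per dependent access over a buffer of size_bytes."""
    n = size_bytes // stride
    # Shuffle the chain so a hardware prefetcher could not follow it.
    order = list(range(n))
    random.shuffle(order)
    chain = [0] * n
    for a, b in zip(order, order[1:] + order[:1]):
        chain[a] = b  # each element points at the next one to visit
    idx = 0
    t0 = time.perf_counter()
    for _ in range(iters):
        idx = chain[idx]  # next address depends on the current load
    t1 = time.perf_counter()
    return (t1 - t0) / iters * 1e9
```

Sweeping `size_bytes` from a few KB up past the cache size produces the familiar staircase: flat latency inside L1, a step up inside L2, and a jump once accesses spill to RAM.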
The cache memory of the new processors has not only become larger but has also acquired an additional feature called enhanced cache line split load. It should speed up reads of misaligned data from the cache, namely data that were meant to fit into a single cache line but ended up split across two lines. The feature speculatively predicts which data these might be and fetches them from the cache as quickly as if they sat in a single line. Theoretically, this may speed up applications that scan through data.
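The condition for a "cache line split" access is simple to state: the bytes of the access fall into two different 64-byte lines. A small illustration of that condition (the 64-byte line size and the helper name are our assumptions for the example):

```python
LINE = 64  # assumed cache line size in bytes

def crosses_line(offset, size, line=LINE):
    """True if an access of `size` bytes at `offset` straddles two lines."""
    return offset // line != (offset + size - 1) // line

# An 8-byte load at offset 60 touches bytes 60..67, i.e. lines 0 and 1,
# so it is a split load; at offset 56 the same load fits in one line.
crosses_line(60, 8)  # True: split across two lines
crosses_line(56, 8)  # False: bytes 56..63 stay within one line
```

Before Penryn, such a split load paid a noticeable penalty because the two halves had to be read and stitched together; the new speculative mechanism aims to hide most of that cost.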