Intel's Prescott-2M: Pentium 4 660 and Pentium 4 Extreme Edition 3.73GHz
Introduction
All the big talk in the consumer processor world is of dual-core designs. Two processors on one die, sharing or having exclusive caches, is an extension of multi-processing, a performance-improving technique that's pervasive in current computing. Intel's HyperThreading technology, which shares the execution resources on a processor, is a form of multi-processing and allows the operating system to think there are two distinct processors per single physical package.
While that's the wave of the future, processor vendors still forge ahead with other ways to increase processor performance and scale back the resulting thermal output and power consumption. Following the launch of the initially lacklustre Prescott core for Pentium 4, Intel have definitely been in need of a way to boost basic performance, reduce heat and power consumption and inject some life into an ailing processor that nobody in the press wants to recommend.
Today's release of the 6-series Pentium 4 attempts to address most of what we all generally agree are the Prescott P4's failings, while capitalising on the few things it gets right. Here's what they've done.
2MiB of L2 cache
While Prescott-1M had 1MiB of second level cache memory, used by the processor to store the working set and intermediate data of the thread that's currently executing, the Prescott-2M core introduced today doubles that to 2MiB. Level one cache (L1) stays at 16KiB for data and there's a 12K-entry trace cache that's used to store decoded micro-ops that the processor arranges to form 'full' instructions. The L2 cache retains a 64-byte line size and a 256-bit bus internally to the rest of the processor. L2 cache is inclusive of L1, which means L1 is mirrored in L2 at all times.
The doubling of L2 cache memory increases die area by 20% or so while transistor count, on the 90nm process that Intel use to produce the N0 revision of Prescott-2M, increases to just under 170 million, up from around 125 million on Prescott-1M.
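To put the cache figures above into perspective, here's a small sketch that works out how many 64-byte lines each L2 size holds and how many trips across the 256-bit internal bus one line needs. The function names are mine, and the transfer count assumes every transfer uses the full bus width, which is a simplification rather than anything Intel documents here.

```python
KIB = 1024
MIB = 1024 * KIB

L2_LINE_BYTES = 64   # L2 line size quoted in the article
L2_BUS_BITS = 256    # internal L2 bus width quoted in the article


def cache_lines(size_bytes: int, line_bytes: int = L2_LINE_BYTES) -> int:
    """Number of lines a cache of size_bytes holds."""
    return size_bytes // line_bytes


def transfers_per_line(line_bytes: int = L2_LINE_BYTES,
                       bus_bits: int = L2_BUS_BITS) -> int:
    """Bus transfers needed per line, assuming full-width transfers (simplification)."""
    return (line_bytes * 8) // bus_bits


prescott_1m_lines = cache_lines(1 * MIB)
prescott_2m_lines = cache_lines(2 * MIB)

print(prescott_1m_lines, prescott_2m_lines, transfers_per_line())
```

Doubling the capacity at the same line size simply doubles the number of lines the cache has to track.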
Here you can see a shot of the die on Prescott-2M. I've surrounded the level two cache with a black border so you can see the proportional area of the die it occupies.
Enhanced Intel SpeedStep Technology (EIST)
SpeedStep, until now a technology found only on Intel's mobile processors, dynamically changes the processor's frequency (using multiplier adjustment) and operating voltage, and it shows up on the 6-series.
It's designed to lower power consumption first and foremost and for Prescott-1M that was a big deal. With Prescott-2M, with nearly 45 million more transistors to power, it's an even bigger concern. With a supporting BIOS, the processor will clock itself down and lower its voltage to decrease consumed power during idle periods, with the knock-on effect of reduced heat.
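As a rough illustration of why dropping both multiplier and voltage matters, the sketch below uses the textbook approximation that dynamic power scales with frequency times voltage squared. The 200MHz bus clock comes from this article. The 18x full-speed multiplier (3.6GHz for the 660), the 14x idle multiplier and both voltages are my assumptions, chosen purely for illustration and not measured values.

```python
BUS_MHZ = 200  # 6-series base bus clock, per the article


def core_mhz(multiplier: int, bus_mhz: float = BUS_MHZ) -> float:
    """Core clock is the bus clock times the multiplier."""
    return multiplier * bus_mhz


def relative_dynamic_power(f_low: float, v_low: float,
                           f_high: float, v_high: float) -> float:
    """Ratio of dynamic power at the low state to the high state (P ~ f * V^2)."""
    return (f_low / f_high) * (v_low / v_high) ** 2


full_speed = core_mhz(18)  # assumed full-speed multiplier
idle_speed = core_mhz(14)  # assumed idle multiplier

# Both voltages are hypothetical placeholders, not datasheet figures.
ratio = relative_dynamic_power(idle_speed, 1.2, full_speed, 1.4)

print(full_speed, idle_speed, round(ratio, 3))
```

Under those assumptions the idle state draws a little over half the dynamic power of the full-speed state, and most of the extra saving beyond the clock drop comes from the squared voltage term. Leakage power, a big part of Prescott's problem, isn't modelled here at all.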
Beefed up halt-state support and TM2 thermal monitoring
Amongst the myriad of ACPI-related power states that a modern processor supports (C0 is throttle or FID/VID change, C1 is halt, C2 is stop grant and C3 is sleep), Intel has increased the power that the C1 state has over processor operation on Prescott-2M. You'll notice that C0's FID/VID change is how the OS can control SpeedStep using ACPI interfaces, while C1 is produced by software issuing the HLT processor instruction. Prescott-2M allows C1 to drop the multiplier when HLT is issued. It should drop voltage too, but that's dependent on the BIOS, and on at least one of the mainboards I've tested Prescott-2M on over the last few days it's not implemented in the programmed ACPI table.
Tied in to ACPI and C0 is an improved thermal monitor called TM2 that can clock the CPU back when it gets too hot. TM1 would simply shut off functional units. So with C1 invoked using HLT on the Prescott-2M, operating temperature drops and power consumption falls off until the processor is reawakened with fresh code. TM2 or C0 throttling and/or C0 FID and VID changing can accomplish the same thing, albeit only once the processor has already reached the point of overheating.
NX bit
Introduced on Intel processors with the J revision of Prescott-1M (first shown on the 3.8GHz 570J and now available across the entire 5-series range), the no-execute bit on the processor allows it to mark mapped memory pages as data only, so it won't execute code out of marked pages, raising an exception when it's asked to do so. That protects data in memory, preventing code from being executed out of written data that might have been altered. While Windows XP's implementation of support for the NX bit has recently been found to be lacking, it's a low-level mechanism to protect the user from badly designed and possibly malicious software.
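The mechanism can be modelled in a few lines. In the 64-bit page table entry format used in PAE and long mode, the top bit (bit 63) is the no-execute flag and bit 0 marks the page as present. What follows is only a toy model of that check in Python: the helper names are hypothetical, and a real operating system sets these bits in its page tables rather than calling anything like this.

```python
PRESENT_BIT = 1 << 0   # page is mapped
NX_BIT = 1 << 63       # no-execute flag in a 64-bit page table entry


def mark_no_execute(pte: int) -> int:
    """Return the entry with the no-execute bit set (data-only page)."""
    return pte | NX_BIT


def is_executable(pte: int) -> bool:
    """A page can be executed from only if it is present and NX is clear."""
    return bool(pte & PRESENT_BIT) and not (pte & NX_BIT)


def fetch_instruction(pte: int) -> str:
    """Toy stand-in for an instruction fetch; raises where hardware would fault."""
    if not is_executable(pte):
        raise PermissionError("instruction fetch from a non-executable page")
    return "fetched"


code_page = PRESENT_BIT | 0x1000                    # hypothetical mapped page
data_page = mark_no_execute(PRESENT_BIT | 0x2000)   # same, but data only

print(is_executable(code_page), is_executable(data_page))
```

The point is that the check happens on every fetch and in hardware, so software that has been tricked into jumping into a data page faults instead of running whatever was written there.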
EM64T
I covered EM64T when I reviewed the Nocona revision of the Xeon processor. It's Intel's implementation of the AMD64 ISA that AMD introduced with Athlon 64 and Opteron nearly two years ago. With the upcoming launch of the 64-bit version of Windows XP, it's supported by a consumer desktop processor from Intel for the very first time.
Along with the 6-series range of Prescott-2M processors, there's also a new Extreme Edition that uses the new core and rides the same 266MHz bus (1066MHz effective) that the earlier 3.46GHz Extreme Edition did. The new 6-series lineup continues to use the same 200MHz (800MHz) GTL+ bus implementation that's been around since the Northwood core revision.
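The 'effective' figures come from the bus being quad-pumped: four transfers per base clock. Here's a quick sketch of the arithmetic, using exact fractions because the 266MHz bus is really 266.67MHz. The 64-bit (8-byte) data width used for peak bandwidth and the 14x multiplier for the 3.73GHz part are my assumptions, not figures from the text above.

```python
from fractions import Fraction

TRANSFERS_PER_CLOCK = 4  # quad-pumped bus


def effective_rate(base_mhz):
    """Effective transfer rate in MT/s for a quad-pumped bus."""
    return base_mhz * TRANSFERS_PER_CLOCK


def peak_bandwidth_mb_s(base_mhz, width_bytes=8):
    """Peak bandwidth in MB/s, assuming an 8-byte-wide data bus."""
    return effective_rate(base_mhz) * width_bytes


six_series_bus = 200                   # MHz, per the article
extreme_edition_bus = Fraction(800, 3) # 266.67MHz, quoted as 266MHz

print(effective_rate(six_series_bus))                  # 6-series effective rate
print(int(effective_rate(extreme_edition_bus)))        # Extreme Edition, truncated
print(int(14 * extreme_edition_bus))                   # assumed 14x core clock in MHz
print(peak_bandwidth_mb_s(six_series_bus))             # 6-series peak MB/s
```

Truncating the fractions is why the faster bus is marketed as 1066MHz and the processor as 3.73GHz.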
Let's stick all the data into table form for easy digestion.