Intel Core i7 920, 940 and 965 Processor Review
Manufacturer: | Intel |
Product: | Intel Core i7 CPU Series |
Date: | Mon, Nov 03, 2008 – 12:00 AM |
Written By: | Nathan Kirsch –nate@legitreviews.com |
The Core i7 Series Arrives
Intel has finally lifted the embargo on the yet-to-be-launched Intel Core i7 processors and the Intel X58 Express chipset. Intel strongly believes that this new platform will be the must-have workhorse for digital media and gaming enthusiasts for many months to come. With so much to cover on this new platform, we decided to focus solely on processor performance in this article and take a deeper dive into other features in the weeks to come. This should work out nicely, as the processors won’t be available to purchase until later this month and many companies are only now getting us production-grade triple-channel memory kits and video card drivers for this new platform.
The Intel Core i7 processor (known internally as Nehalem) has some very big architectural changes, as you can tell from the picture above. The new Core i7 processor has 1366 pins, and as a result the processor, socket and heat sink mounting brackets are all larger than those of the LGA 775 based processors that have been out for a couple of years. The die size of Core i7 processors is 263 mm² and the transistor count is 731 million.
Taking a look at the die of the Core i7 processor we see a first for Intel processors: the integrated memory controller. This on-die, triple channel, DDR3 memory controller is unique in that it allows consumers to run three memory modules together for optimal performance. By moving to an integrated memory controller and triple channel memory, the platform has over 25GB/s of throughput between the processor and the DDR3 memory modules!
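For readers who want to see where a figure like 25GB/s comes from, here is a minimal sketch of the usual theoretical peak bandwidth arithmetic. It assumes a 64-bit (8 byte) bus per channel and DDR3-1066 as the baseline memory speed; neither value is stated in this article, and the function name is our own.

```python
def peak_bandwidth_gbs(transfers_mts, channels, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB).

    transfers_mts: memory data rate in mega-transfers per second
    channels:      number of memory channels populated
    bus_bytes:     width of one channel in bytes (assumed 64-bit bus)
    """
    return transfers_mts * bus_bytes * channels / 1000.0


if __name__ == "__main__":
    # Assumed baseline speed (DDR3-1066) and the 1600MHz kit used in testing.
    for label, rate in (("DDR3-1066", 1066), ("DDR3-1600", 1600)):
        print(f"{label} x3 channels: {peak_bandwidth_gbs(rate, 3):.1f} GB/s")
```

These are ceiling numbers only; measured results such as the Sandra scores will land below them.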
For those that follow processor architecture you will notice a brand new cache structure on the Core i7 diagram shown above. All Intel Core i7 processors feature L1, L2, and shared L3 caches. Before, Intel Core 2 Duo and Quad processors had just an L1 and L2 cache. The breakdown of the cache is as follows: there is a 64K L1 cache (32K Instruction, 32K Data) per core, 1MB of total L2 cache, and an impressive 8MB chunk of L3 cache that is shared across all the cores. That means that all Intel Core i7 processors have over 9MB of cache right there on the 45nm processor!
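The "over 9MB" total can be checked with a quick sum. This is just a sketch of the arithmetic using the cache sizes quoted above and a core count of four; the constant and function names are our own.

```python
CORES = 4                      # quad-core Core i7
L1_PER_CORE_KB = 32 + 32       # 32K instruction + 32K data per core
L2_TOTAL_KB = 1024             # 1MB of L2 in total
L3_SHARED_KB = 8 * 1024        # 8MB shared across all cores


def total_cache_kb(cores=CORES):
    """Sum of L1 (per core), total L2 and shared L3 in kilobytes."""
    return cores * L1_PER_CORE_KB + L2_TOTAL_KB + L3_SHARED_KB


if __name__ == "__main__":
    kb = total_cache_kb()
    print(f"Total on-die cache: {kb} KB = {kb / 1024:.2f} MB")
```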
Can it get any better than this?
Of course it can! The new Core i7 processor has a huge list of improvements that have been made to it.
- New SSE4.2 Instructions
- Improved Lock Support
- Additional Caching Hierarchy
- Deeper Buffers
- Improved Loop Streaming
- Simultaneous Multi-Threading
- Faster Virtualization
- Better Branch Prediction
Intel always told us that Hyper-Threading was not dead and they were right, as the technology has surfaced again and is enabled on all of the Core i7 processors. With Hyper-Threading enabled on quad-core Core i7 processors the operating system sees eight virtual cores that can be used. Intel has told Legit Reviews that when Hyper-Threading originally came out the idea was solid, but that the Pentium 4 processor might not have been the best processor to bring it to market. The Core i7 series should highlight all the strong points of Hyper-Threading, as Intel is now calling it Hyper-Threading “done right”. If you want a deeper look at the Intel Core i7 architecture take a look at this presentation that was given at the Spring 2008 IDF and this one that was given at the Fall IDF.
Intel will be releasing three Core i7 processors and all have a TDP of 130W and an on-die shared L3 cache of 8MB. None of the current Core i7 processors is intended for multi-processor motherboards, so each has only one QuickPath Interconnect (QPI) link.
- Core i7 965 Extreme Edition – 3.2GHz with 8MB Shared L3 cache and a 1×6.4GT/s QuickPath interconnect – $999
- Core i7 940 – 2.93GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $562
- Core i7 920 – 2.66GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $284
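To put the GT/s figures in the list above into perspective, here is a rough sketch that converts a QPI transfer rate into bandwidth. It assumes each transfer carries 2 bytes of data in each direction, which is a figure that is not given in this article, so treat the output as an estimate.

```python
def qpi_bandwidth_gbs(gigatransfers_per_s, payload_bytes=2):
    """Return (one_way, both_ways) QPI bandwidth in GB/s.

    payload_bytes is the assumed data payload per transfer per direction.
    """
    one_way = gigatransfers_per_s * payload_bytes
    return one_way, 2 * one_way


if __name__ == "__main__":
    for model, rate in (("Core i7 965", 6.4), ("Core i7 940", 4.8), ("Core i7 920", 4.8)):
        one_way, total = qpi_bandwidth_gbs(rate)
        print(f"{model}: {one_way:.1f} GB/s each way, {total:.1f} GB/s combined")
```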
Now that we know what the general processor improvements are let’s take a closer look at the chipset changes.
The Intel X58 Express Chipset
In order to understand this new platform it is best to look at the motherboard chipsets that are going to be used.
The Intel X58 Express chipset is the chipset that was designed just for the Intel Core i7 series of processors as they require a new socket. Since the DDR3 memory controller is located inside the processor itself hundreds of new pins had to be added and the result was a larger CPU with more pins. Intel designed the X58 Express chipset from the ground up for Core i7, but re-used the ICH10/ICH10R southbridge chipset that has been out for several months now.
Together the Intel X58 Express chipset and the ICH10 southbridge make up what is certain to be a very solid platform for high performance systems. The Intel ICH10/ICH10R southbridge was launched with the Intel P45 Express chipset and has already proven itself a winner with some of the best Solid State Drive performance numbers of any chipset on the market. The X58 Express supports up to 36 lanes of PCI Express 2.0 connectivity, and since many boards using this chipset will have both NVIDIA SLI and ATI CrossFire enabled, Triple-SLI and Quad CrossFireX will be easy to implement. This is possible because NVIDIA is allowing motherboard makers to use a special sBIOS if they pay a licensing fee for SLI Technology. So, finally, multi-GPU technology from both graphics card companies can be used on the same board. If that isn’t enough, Intel has done away with the Front Side Bus and now uses the QuickPath Interconnect to handle the flow of data between the processor and the chipset. The memory now has over 25.5 GB/s of throughput since it has a direct connection to the processor.
The Test System
Before we look at the numbers, here is a brief glance at the test system that was used.
All testing was done on a fresh install of Windows Vista Ultimate 64-bit. All benchmarks were completed on the desktop with no other software programs running. All of the memory modules were run in dual channel mode with a 120mm fan placed on top of them to keep them cool, except on the Core i7 system, which was run in triple channel. The EVGA GeForce 8800 GTS 512MB used NVIDIA ForceWare 169.28 video card drivers. The LGA 775 test system used the ASUS P5E3 motherboard using BIOS version 1201 and the LGA 1366 test system used the ASUS P6T Deluxe motherboard with BIOS v8004. The AMD Phenom testing was done on the MSI K9A2 Platinum motherboard with BIOS v1.5b5 installed along with ATI system driver version 8.50.
Memory Settings:
- Core i7 920, 940, 965 – 1600MHz @ 8-8-8-24 (DDR3)
- QX9775 – 800MHz @ 5-5-5-15 (FB-DIMM)
- QX9770 – 1600MHz @ 7-7-7-20 (DDR3)
- Q9300 – 1333MHz @ 7-7-7-20 (DDR3)
- QX6850 – 1333MHz @ 7-7-7-20 (DDR3)
- Q6600 – 1066MHz @ 7-7-7-20 (DDR3)
- E8500 – 1333MHz @ 7-7-7-20 (DDR3)
- E7200 – 1066MHz @ 7-7-7-20 (DDR3)
- E6750 – 1333MHz @ 7-7-7-20 (DDR3)
- Phenom X4 9950 – 800MHz @ 4-4-4-12 (DDR2)
- Phenom X4 9850 – 800MHz @ 5-5-5-15 (DDR2)
- Phenom X4 9600 – 800MHz @ 5-5-5-15 (DDR2)
- Phenom X4 9350e – 800MHz @ 4-4-4-12 (DDR2)
- Phenom X3 8750 – 800MHz @ 5-5-5-15 (DDR2)
- Athlon 64 X2 5000+ – 800MHz @ 4-4-4-12 (DDR2)
Here is the Intel LGA 1366 Test platform:
Component | Brand/Model
---|---
Processor | See Above
Motherboard | ASUS P6T Deluxe
Memory | 6GB Corsair DDR3 1600MHz
Video Card | EVGA GeForce 8800 GTS 512
Hard Drive | Western Digital RaptorX 150GB
Cooling | Thermaltake BigWater 760i
Power Supply | Corsair HX1000W
Operating System | Windows Vista Ultimate 64-Bit
Here is the Intel LGA 775 Test platform:
Component | Brand/Model
---|---
Processor | See Above
Motherboard | ASUS P5E3 Deluxe
Memory | 4GB Corsair DDR3 1800C7
Video Card | EVGA GeForce 8800 GTS 512
Hard Drive | Western Digital RaptorX 150GB
Cooling | Corsair Nautilus 500
Power Supply | PC Power and Cooling 1KW
Operating System | Windows Vista Ultimate 64-Bit
Here is the Intel Skulltrail Test platform:
Component | Brand/Model
---|---
Processor | 2x Intel Core 2 QX9775
Motherboard | Intel D5400XS
Memory | 4GB Micron 800MHz FB-DIMM
Video Card | EVGA GeForce 8800 GTS 512
Hard Drive | Western Digital RaptorX 150GB
Cooling | Zalman AT Fan/Heatsink
Power Supply | PC Power and Cooling 1KW
Operating System | Windows Vista Ultimate 64-Bit
Here is the AMD Phenom Test platform:
Component | Brand/Model
---|---
Processor | All AM2 and AM2+ CPUs
Motherboard | MSI K9A2 Platinum
Memory | 4GB OCZ Flex PC2-6400
Video Card | EVGA GeForce 8800 GTS 512
Hard Drive | Western Digital RaptorX 150GB
Cooling | Zalman AT Fan/Heatsink
Power Supply | PC Power and Cooling 1KW
Operating System | Windows Vista Ultimate 64-Bit
Sandra 2009 Memory Bandwidth
SiSoft Sandra 2009:
The Sisoft Sandra 2009 benchmark utility just came out recently and we have started to include it in our benchmarking. With Sandra 2009 you can now easily compare the performance of the tested device with its speed and its (published) power (TDP)! Sandra XII SP2 also has SSE4 (Intel) and SSE4A (AMD) benchmark code-paths, which is great for those of you testing next-generation AMD & Intel chips.
Results: Sandra 2009 showed that the Intel Core i7 processors blow away the competition thanks to the new memory design being used. The Core i7 platform used three 2GB memory modules in Triple-Channel at 1600MHz with 8-8-8-24 1T timings, which is what we think will become the standard kit for this platform. Corsair has already announced 1866MHz CL9 kits for this platform and Kingston Technology has announced 2GHz 3GB kits, so enthusiasts will easily break the 30GB/s mark with high performance memory kits.
Photodex ProShow Gold 3.2
ProShow Gold allows the user to combine photos, videos and music to create spectacular slide shows. The software provides the capability to share memories with friends and family on DVD, PC and the Web. ProShow Gold brings still photos to life by adding motion effects like pan, zoom, and rotate. The user can also add captions to a photo or video and choose from over 280 transition effects.
The workload we are using takes 29 high resolution jpeg photos and converts them to an mpeg2, widescreen DVD quality, 3min 9sec slideshow video file. The input photos are in 3872×2592 resolution and total about 170MB in size.
ProShow Gold 3.2 lets you share your slide shows in virtually any format and on any device. You can upload your shows directly to YouTube or choose from over 20 devices to directly output to, including the iPod, Blackberry, Zune and more. Not bad for software that runs under $70 and is optimized for eight cores! Our benchmark testing wasn’t at 100% load the entire time, but averaged around 95% during the testing period.
Benchmark Results: Photodex ProShow software showed that the Intel Core i7 quad-core processors do well with Hyper-Threading, but it wasn’t enough to surpass the true 8-core QX9775 platform. The 3.2GHz Intel Core i7-965 was 11 seconds faster than the Intel Core 2 Extreme QX9770, which is very impressive as they run at the same clock frequency.
Microsoft Excel 2007
Microsoft Office Excel 2007 is a powerful and widely used tool with which you can create and format spreadsheets, and analyze and share information to make more informed decisions. It allows you to import, organize and explore massive data sets within spreadsheets and then communicate your analysis with professional-looking charts. Excel 2007 also provides tools to “see” important trends and find exceptions in your data. Legit Reviews has two benchmarking tests that we do on Microsoft Office Excel 2007.
The first workload executes approximately 28,000 sets of calculations using the most common calculations and functions found in Excel. These include common arithmetic operations like addition, subtraction, division, rounding and square root. It also includes common statistical analysis functions such as Max, Min, Median and Average. The calculations are performed after a spreadsheet with a large dataset is updated with new values and must re-calculate many data points. The input file is the 6.2 MB spreadsheet seen above.
Benchmark Results: Lots of people use Microsoft Office at work and home, so this is an important test for many of our readers. Many people don’t run 28,000 sets of calculations at once, but if you do the CPU will determine how fast the task is completed.
The Black-Scholes model is used in our second Excel test to calculate a theoretical call and put price using the five key determinants of an option’s price: stock price, strike price, volatility, time to expiration, and short-term (risk free) interest rate.
This workload calculates the European Put and Call option valuation for Black-Scholes option pricing using Monte Carlo simulation. It simulates the calculations performed when a spreadsheet with input parameters is updated and must recalculate the option valuation. In this scenario we execute approximately 300,000 iterations of Monte Carlo simulation. In addition, the workload uses Excel lookup functions to compare the put price from the model with the historical market price for 50,000 rows to understand the convergence. The input file is a 70.1 MB spreadsheet and with 10 times the calculations of the first test, this one should take a bit longer to complete.
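For readers unfamiliar with the technique, the idea behind this workload can be sketched in a few lines of Python. This is not the Excel benchmark itself, only a minimal illustration of pricing a European call and put by Monte Carlo simulation and comparing against the closed-form Black-Scholes price. The input values (stock price 100, strike 100, 20% volatility, one year, 5% rate) and all function names are our own assumptions.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, vol, years, rate):
    """Closed-form European (call, put) prices from the five inputs."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * math.sqrt(years))
    d2 = d1 - vol * math.sqrt(years)
    discount = math.exp(-rate * years)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def monte_carlo(spot, strike, vol, years, rate, iterations, seed=1):
    """Estimate European (call, put) prices by simulating terminal prices."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    drift = (rate - 0.5 * vol ** 2) * years
    diffusion = vol * math.sqrt(years)
    call_sum = put_sum = 0.0
    for _ in range(iterations):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        call_sum += max(terminal - strike, 0.0)
        put_sum += max(strike - terminal, 0.0)
    discount = math.exp(-rate * years)
    return discount * call_sum / iterations, discount * put_sum / iterations


if __name__ == "__main__":
    inputs = (100.0, 100.0, 0.2, 1.0, 0.05)  # hypothetical option parameters
    print("closed form:", black_scholes(*inputs))
    print("monte carlo:", monte_carlo(*inputs, iterations=300_000))
```

The simulated prices converge toward the closed-form values as the iteration count grows, which is why a workload like this keeps the CPU busy.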
Benchmark Results: With 300,000 iterations of Monte Carlo simulation taking place in this benchmark it takes all the processors a bit longer to finish as it puts a good load on the system. The Intel Skulltrail system is in a league of its own as it completes the task in less than ten seconds, but the Core i7 processors are right behind.
Cinebench R9.5
MAXON CINEBENCH 9.5:
CINEBENCH is the free benchmarking tool for Windows and Mac OS based on the powerful 3D software CINEMA 4D. Consequently, the results of tests conducted using CINEBENCH 9.5 carry significant weight when analyzing a computer’s performance in everyday use. Especially a system’s CPU and the OpenGL capabilities of its graphics card are put through their paces (even multiprocessor systems with up to 16 dedicated CPUs or processor cores). During the testing procedure, all relevant data is ascertained with which the performance of different computers can subsequently be compared, regardless of operating system. Again, higher Frames/Second and lower rendering time in seconds equal better performance.
Cinebench 9.5 was able to put a 100% load across all the cores, which makes this a great benchmark to look at multi-core platforms.
Benchmark Results: Cinebench 9.5 was tested in both 64-bit and 32-bit, which resulted in some minor performance differences as seen above. The Intel Core i7 family of processors showed some nice performance gains over the current generation quad-core processors!
Cinebench R10
MAXON CINEBENCH R10:
CINEBENCH is the free benchmarking tool for Windows and Mac OS based on the powerful 3D software CINEMA 4D. Consequently, the results of tests conducted using CINEBENCH 10 carry significant weight when analyzing a computer’s performance in everyday use. Especially a system’s CPU and the OpenGL capabilities of its graphics card are put through their paces (even multiprocessor systems with up to 16 dedicated CPUs or processor cores). The test procedure consists of two main components: The first test sequence is dedicated to the computer’s main processor. A 3D scene file is used to render a photorealistic image. The scene makes use of various CPU-intensive features such as reflection, ambient occlusion, area lights and procedural shaders. In the first run, the benchmark only uses one CPU (or CPU core), to ascertain a reference value. On machines that have multiple CPUs or CPU cores, and also on those that simulate multiple CPUs (via HyperThreading or similar technologies), MAXON CINEBENCH will run a second test using all available CPU power. Again, higher Frames/Second and lower rendering time in seconds equal better performance.
Cinebench R10 was able to put a 100% load across all the cores on all of the processors, which makes this a great benchmark to look at multi-core platforms.
Results: Running Cinebench R10 in 64-bit mode showed a significant improvement in performance on all of the processors and the results were in line with what we expected from running Cinebench R9.5! The Intel Core i7 965 was 27% quicker than the Intel Core 2 Extreme QX9770 even though both run at the same clock frequency!
POV-Ray 3.7 Beta 25
Processor Performance on Pov-Ray 3.7 Beta 25:
The Persistence of Vision Ray-Tracer was developed from DKBTrace 2.12 (written by David K. Buck and Aaron A. Collins) by a bunch of people (called the POV-Team) in their spare time. It is a high-quality, totally free tool for creating stunning three-dimensional graphics. It is available in official versions for Windows, Mac OS/Mac OS X and i86 Linux. The POV-Ray package includes detailed instructions on using the ray-tracer and creating scenes. Many stunning scenes are included with POV-Ray so you can start creating images immediately when you get the package. These scenes can be modified so you do not have to start from scratch. In addition to the pre-defined scenes, a large library of pre-defined shapes and materials is provided. You can include these shapes and materials in your own scenes by just including the library file name at the top of your scene file, and by using the shape or material name in your scene. Since this is free software feel free to download this version and try it out on your own.
The most significant change from the end-user point of view between versions 3.6 and 3.7 is the addition of SMP (symmetric multiprocessing) support, which, in a nutshell, allows the renderer to run on as many CPUs as you have installed on your computer. This will be particularly useful for those users who intend on purchasing a dual-core CPU or who already have a two (or more) processor machine. On a two-CPU system the rendering speed in some scenes almost doubles. For our benchmarking we used version 3.7 beta 25, which is the most recent version available. The benchmark used all available cores to complete the render.
Once rendering of the scene we selected was completed, we took the score from the dialog box, which indicates the average PPS (pixels per second) for the benchmark. A higher PPS indicates faster system performance.
Benchmark Results: Looking at POV-Ray 3.7 Beta 25, the Intel Core i7-965 was over 30% faster than the QX9770 and 56% faster than the quickest processor AMD offers.
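As a side note on how "X% faster" figures like these are typically worked out, here is a small sketch. The function names are our own and the numbers in the example are hypothetical placeholders, not scores from our charts. Rate-based results (PPS, frames per second) and time-based results (seconds to finish) need slightly different formulas.

```python
def percent_faster_rate(new_rate, old_rate):
    """Percent advantage when higher is better (PPS, FPS, scores)."""
    return (new_rate / old_rate - 1.0) * 100.0


def percent_faster_time(new_seconds, old_seconds):
    """Percent advantage when lower is better (render or encode times)."""
    return (old_seconds / new_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic.
    print(f"{percent_faster_rate(1300.0, 1000.0):.0f}% faster by rate")
    print(f"{percent_faster_time(100.0, 130.0):.0f}% faster by time")
```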
POV-Ray Real-Time Raytracing
Legit Reviews was e-mailed by one of the developers over at POV-Ray to see if LR could include real-time raytracing in our performance analysis, and we were more than happy to include the data in our testing.
E-Mail From POV-Ray — I thought I might ping you about an experimental feature we’ve added to the POV-Ray SMP beta: real-time raytracing. It’s mostly useful to folks who have multi-core systems and in fact is something that I’ve wanted to do for years but the hardware just wasn’t there (at least not in the consumer price range). It works best on a kentsfield or later, but a core 2 duo should be sufficient if you don’t mind sub-10fps frame rates.
If you want to try it out it please feel free to grab it from: http://www.povray.org/beta/rtr/
This experimental software by POV-Ray was a welcome addition to our testing and was able to spread the work load across all the cores in even our eight core test system as seen above.
Results: POV-Ray Real-Time Raytracing is a great benchmark that we love to use on Legit Reviews and it does a great job at showing how performance scales with CPU cores. The Core i7 series really struts its stuff with Real-Time Raytracing as all three processors rendered the scene at over 20 FPS.
Futuremark 3DMark06
3DMark06
Futuremark’s 3DMark06 has a built-in CPU test, a multi-threaded DirectX gaming metric that’s useful for comparing relative performance between similarly equipped systems. This test consists of two different 3D scenes that are processed with a software renderer that is dependent on the host CPU’s performance. Calculations that are normally reserved for your 3D accelerator are instead sent to the CPU for processing and rendering. The frame-rate generated in each test is used to determine the final score.
Benchmark Results: The 3DMark 2006 CPU test showed that the Intel Core i7 920, 940 and 965 are hands down the fastest Intel quad-core processors we have ever seen! The pair of QX9775 quad-core processors were still the overall leaders, but they cost twice as much and require an expensive dual-socket motherboard.
Overclocking Results
Overclocking greatly varies due to what hardware is being used and who is doing the overclocking. Always remember that no two pieces of hardware will perform the same, so our results will differ from what you might be able to get.
Using the ASUS P6T Deluxe motherboard with BIOS v8004 we pushed the limits of our early revision processor to see what it could do. At stock settings the Intel Core i7 965 processor runs with a 133MHz base clock that is multiplied by the CPU multiplier to get the CPU speed and by the QPI multiplier to get the QPI speed. The Intel Core i7 965 has a 24x multiplier that is used to reach the final core clock of 3.20GHz. As you can see above, the ASUS P6T Deluxe motherboard runs the base clock at 133.6MHz, so the overall clock frequency is about 7MHz higher than the processor’s rated speed.
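The base clock times multiplier relationship is simple enough to sketch in code. This is only an illustration of the arithmetic with the numbers quoted above; the function name is our own.

```python
def core_clock_mhz(base_clock_mhz, multiplier):
    """Core frequency in MHz = base clock x CPU multiplier."""
    return base_clock_mhz * multiplier


if __name__ == "__main__":
    rated_mhz = 3200.0                        # Core i7 965 rated speed
    actual_mhz = core_clock_mhz(133.6, 24)    # base clock as set by the board
    print(f"Actual core clock: {actual_mhz:.1f} MHz")
    print(f"Above rated speed by: {actual_mhz - rated_mhz:.1f} MHz")
```

With a 133.6MHz base clock the result works out to roughly 3206MHz, a few MHz over the rated 3.20GHz.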
By not touching anything in the BIOS other than the CPU Voltage (Auto to 1.35V) we were able to reach 4GHz right off the bat! This is not bad at all and is nearly an 800MHz overclock for a few seconds worth of work.
With a little extra voltage to the processor and a boost of the base clock from 133MHz to 145MHz we were able to hit 4.2GHz. The system wasn’t fully stable at 4.2GHz, but we feel certain that with a little more effort 4.2GHz should be easy to reach on most enthusiast motherboards. The Intel Core i7 series can overclock by over 1GHz, which is a great sign for a brand new architecture!
Final Thoughts and Conclusions
Power Consumption
Since power consumption is a big deal these days, we ran some simple power consumption tests on our test beds. The systems ran with the same power supply, case fan, video card and hard drive model. To measure idle usage, we ran the system at idle for one hour on the desktop with no screen saver and took the measurement. For load measurements, POV-Ray 3.7 was run on all cores to make sure each and every processor was at 100% load. All of the systems used identical hardware minus the motherboard and processor. It should be noted that the Core i7 processors used a Thermaltake BigWater 760i water cooler and the rest of the systems used a Corsair Nautilus 500 water cooler.
Results: When it came to idle power consumption the Intel Core i7 series used more power than we expected, but for having such a large cache they didn’t do badly by any means. The entire system with a water cooler was still under 300 Watts, which is impressive for being the fastest quad-core processor in the world.
Final Thoughts
This is just a quick look at the Intel Core i7 processor family performance on a number of respected benchmarks. Expect more deep dives in the weeks to come as we have numerous boards, cooling solutions and memory kits that we are still trying out on this new platform.
The performance numbers speak for themselves as the Intel Core i7 965 Extreme Edition proved itself to be more than 35% faster than the equally clocked Core 2 Extreme QX9770 processor in a number of benchmarks. This is an impressive number and one that may be higher than many expected. When overclocked the Core i7 965 was wickedly fast and ripped through performance tests faster than anything we have ever seen. Nehalem offers obvious clock-for-clock performance improvements and that is something the community must see before making a platform change. Pricing for the three new Intel Core i7 processors is fairly aggressive and the Core i7 965 Extreme Edition comes in at $999, which is the price that the Intel Core 2 Extreme QX9770 used to be.
- Core i7 965 Extreme Edition – 3.2GHz with 8MB Shared L3 cache and a 1×6.4GT/s QuickPath interconnect – $999
- Core i7 940 – 2.93GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $562
- Core i7 920 – 2.66GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $284
Intel has once again launched a great part that widens the performance gap between it and AMD. With the Intel Core i7 pulling so far ahead of the AMD Phenom series of processors, it almost makes you wonder if AMD will ever be able to catch up.
Legit Bottom Line: The performance benchmarks confirm that the Intel Core i7 series of processors are the real deal and the new platform is solid.