Archive for March 20, 2011

HD 6970 Review Introduction

It’s finally here! AMD has rolled out its latest high-end GPU, codenamed “Cayman”, which tops Northern Islands, AMD’s second-generation DirectX 11 compliant GPU family. From it, AMD is initially carving out two enthusiast-grade products: the AMD Radeon HD 6970 (reviewed here) and the Radeon HD 6950, both released today. There’s also scope for a dual-GPU product based on the HD 6970 in the near future, called HD 6990. The Radeon HD 6970 “Cayman” GPU faced quite a few hiccups en route to today’s launch. It was slated for mid-November, but was delayed by a month due to a component shortage. Meanwhile, NVIDIA went ahead with a hard launch of its GeForce GTX 580 graphics processor and, subsequently, the GeForce GTX 570, to counter the HD 6970.

With Cayman and the HD 6970, AMD is introducing its biggest design change to the GPU’s SIMD processing area since the Radeon HD 2900 series. It is also introducing a greater amount of parallelism to the graphics engine, and doubling the standard memory amount from 1 GB on the previous-generation Radeon HD 5870 and Radeon HD 5850 to 2 GB on both the Radeon HD 6970 and HD 6950. As a brief lesson on AMD’s naming scheme for this generation, the Radeon HD 6950 and HD 6970 represent the high-end single-GPU SKUs that succeed the HD 5800 series, while the recently introduced HD 6800 series sits in a segment of its own with no definitive predecessors.

The Radeon HD 6970 from HIS that we’re reviewing today sticks to AMD’s reference board design, including adherence to reference clock speeds. With the HD 6900 series, AMD made sure that users of all HD 6900 cards, including factory-overclocked ones, have access to reference clock speeds at the flick of a switch (detailed later in the review). The Radeon HD 6970 features 2 GB of GDDR5 memory, carries clock speeds of 880 MHz core and 1375 MHz memory (5500 MHz GDDR5 effective), and offers display outputs including two DVI, one HDMI 1.4a, and two mini DisplayPort 1.2.

Product Positioning

This slide from AMD instantly tells you how much damage the surprise hard launch of the NVIDIA GeForce GTX 580 and GTX 570 did to the positioning of the HD 6970 and HD 6950. Take those two out of the equation, and we’re actually seeing the GTX 480 (which has roughly the same performance as the GTX 570) being edged past by the HD 6970, with the HD 6950 way ahead of whatever else is down there from NVIDIA (GTX 470, GTX 460 1 GB).

AMD is still banking on the previous-generation HD 5970 dual-GPU graphics card to hold the performance leadership. It is holding on to that lead only loosely, and a single good GeForce driver could hand it to the GTX 580. The HD 6970 is positioned a notch lower in price, with performance somewhere between the GTX 570 and GTX 580.

Card             | Radeon HD 6850  | Radeon HD 5850  | GeForce GTX 470 | Radeon HD 6870  | Radeon HD 5870  | Radeon HD 6950  | GeForce GTX 570 | GeForce GTX 480 | Radeon HD 6970  | GeForce GTX 580 | Radeon HD 5970
Shader units     | 960             | 1440            | 448             | 1120            | 1600            | 1408            | 480             | 480             | 1536            | 512             | 2x 1600
ROPs             | 32              | 32              | 40              | 32              | 32              | 32              | 40              | 48              | 32              | 48              | 2x 32
GPU              | Barts           | Cypress         | GF100           | Barts           | Cypress         | Cayman          | GF110           | GF100           | Cayman          | GF110           | 2x Cypress
Transistors      | 1700M           | 2154M           | 3200M           | 1700M           | 2154M           | 2640M           | 3000M           | 3200M           | 2640M           | 3000M           | 2x 2154M
Memory Size      | 1024 MB         | 1024 MB         | 1280 MB         | 1024 MB         | 1024 MB         | 2048 MB         | 1280 MB         | 1536 MB         | 2048 MB         | 1536 MB         | 2x 1024 MB
Memory Bus Width | 256 bit         | 256 bit         | 320 bit         | 256 bit         | 256 bit         | 256 bit         | 320 bit         | 384 bit         | 256 bit         | 384 bit         | 2x 256 bit
Core Clock       | 775 MHz         | 725 MHz         | 607 MHz         | 900 MHz         | 850 MHz         | 800 MHz         | 732 MHz         | 700 MHz         | 880 MHz         | 772 MHz         | 725 MHz
Memory Clock     | 1000 MHz        | 1000 MHz        | 837 MHz         | 1050 MHz        | 1200 MHz        | 1250 MHz        | 950 MHz         | 924 MHz         | 1375 MHz        | 1002 MHz        | 1000 MHz
Price            | $180            | $260            | $260            | $240            | $360            | $300            | $330            | $450            | $370            | $500            | $580

Architecture


Cayman, named after the lovely Cayman Islands in the Caribbean, is AMD’s new high-end GPU. It succeeds Cypress, on which the Radeon HD 5800 series and the dual-GPU HD 5970 were based. Cayman is built on the existing 40 nm process at TSMC. Apart from the processor, most of the components inside are the same as those found in previous-generation GPUs, except that their hierarchy has been changed to add a degree of parallelism that goes a step beyond even Barts. The SIMD cores are completely restructured, too.


With Cypress, there was only one graphics engine (that which computes preliminary data and instructions, and passes them on for low-level processing to the SIMD cores), and one dispatch processor that funneled data and instructions down to the two SIMD engine blocks. Barts introduced a degree of parallelism by giving each SIMD engine block its own dispatch processor, instruction and constant caches. Cayman is taking that a step further, by splitting even the graphics engines between the two SIMD engine blocks. This gives dedicated rasterizers, geometry assemblers to each block, but more importantly, doubles the number of tessellation units, with each graphics engine having one.


As mentioned earlier, AMD brought about a radical change in the stream processor design. Compared to the older VLIW5 design, in which an SIMD core consisted of four simple stream processors and one complex one with some common resources, the new design, dubbed VLIW4, combines four equally capable complex stream processors, with two of the four getting special functions. Overall, with a stream processor count of 1536, the Radeon HD 6970 clocked at 880 MHz is able to churn out single-precision floating point (IEEE754-SP) performance of 2.7 TFLOPs and double-precision (IEEE754-DP) performance of 675 GFLOPs. The VLIW4 architecture is thus aimed at increasing performance per mm² of die area. The render back-ends have also been redesigned to make 16-bit integer and 32-bit floating-point operations two times faster.
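The throughput figures quoted above can be reproduced with simple arithmetic. The sketch below assumes each stream processor retires one fused multiply-add (two floating-point operations) per clock and that double precision runs at a quarter of the single-precision rate. Those two assumptions are ours, not something AMD's slide spells out, and the function name is purely illustrative.

```python
def peak_gflops(stream_processors: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak in GFLOPs, assuming `flops_per_clock` operations
    per stream processor per clock (2 = one multiply-add, an assumption)."""
    return stream_processors * flops_per_clock * clock_mhz / 1000.0


# Radeon HD 6970: 1536 stream processors at 880 MHz (figures from the review).
single_precision = peak_gflops(1536, 880)
# Assumed 1:4 double-precision ratio, which lands close to the quoted 675 GFLOPs.
double_precision = single_precision / 4

print(f"SP: {single_precision / 1000:.2f} TFLOPs")
print(f"DP: {double_precision:.1f} GFLOPs")
```

These are paper numbers only; real workloads rarely keep every VLIW4 slot busy.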

In a nutshell, the Cayman die measures 389 mm² and holds 2.64 billion transistors. It is built on the 40 nm TSMC process. It has 24 SIMD engines spread across two SIMD engine blocks, for 1536 stream processors in all. There are 96 texture memory units (TMUs) and 32 raster operation processors (ROPs). New, faster memory controllers allow the use of new 5.5 Gbps memory chips. The memory bus is 256 bits wide, and the GPU uses it to connect to eight 2 Gbit memory chips to achieve 2 GB of total memory.
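As a quick sanity check on the memory figures, the capacity and the theoretical peak bandwidth follow directly from the chip count, chip density, bus width and per-pin data rate given above. This is a minimal sketch with illustrative function names; the bandwidth it prints is a derived theoretical peak, not a measured result.

```python
def memory_capacity_gb(chips: int, gbit_per_chip: int) -> float:
    """Total capacity in GB: chip count times density in Gbit, 8 bits per byte."""
    return chips * gbit_per_chip / 8


def memory_bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Theoretical peak bandwidth in GB/s: bus width times per-pin rate, 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


capacity = memory_capacity_gb(chips=8, gbit_per_chip=2)
bandwidth = memory_bandwidth_gbs(bus_width_bits=256, gbps_per_pin=5.5)
bits_per_chip = 256 // 8  # each of the eight chips sits on a 32-bit slice of the bus

print(f"Capacity: {capacity:.0f} GB")
print(f"Peak bandwidth: {bandwidth:.0f} GB/s")
print(f"Bus slice per chip: {bits_per_chip} bit")
```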

Packaging


HIS uses their standard package design for the Radeon HD 6970.

Contents

You will receive:

  • Graphics card
  • Driver CD + Documentation
  • DVI adapter
  • PCI-Express power cables

The Card


The HIS Radeon HD 6970 is a complete reference design implementation, with the only difference being the sticker on the cooler. The card also uses the same cooler and PCB as the HD 6950 reference design.


HD 6970 requires two slots in your system.


The card has two DVI ports, two mini-DisplayPorts and one HDMI port. AMD’s display output logic is clearly superior to what NVIDIA has to offer at this time. Vendors are free to combine six TMDS links into any output configuration they want (dual-link DVI consuming two links) – and use them all at the same time. AMD has also introduced DisplayPort 1.2 support with their new cards which allows the use of a DisplayPort hub to connect multiple monitors, or daisy chain them together.

An HDMI sound device is also included in the GPU. The HDMI interface is HDMI 1.4a compatible, which includes Dolby TrueHD, DTS-HD, AC-3, DTS and up to 7.1 channel audio with 192 kHz / 24-bit output. The new revision also brings support for Blu-ray 3D movies, which will become important later this year when the first Blu-ray 3D titles start shipping.


You may combine up to four HD 6950 and HD 6970 cards in CrossFire for increased performance or improved image quality settings.


Here are the front and the back of the card, high-res versions are also available (front, back). If you choose to use these images for voltmods etc, please include a link back to this site or let us post your article.

A Closer Look


The first piece to come off the card is the backplate. It serves no special purpose other than to protect the card from physical damage and spread the heat around a bit. Since there are no memory chips or other important circuitry on this side of the card, there is no need for a backplate to cool them.


The AMD reference cooler uses a big vapor chamber base to transfer heat away quickly from the GPU. In addition to the GPU, you can also see cooling pads for memory and voltage regulation circuitry.


The Radeon HD 6970 uses a 6-pin + 8-pin PCI-Express power input configuration.
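For readers wondering what that connector combination allows on paper, here is a small sketch that adds up the power limits. The wattage values (75 W from the slot, 75 W per 6-pin, 150 W per 8-pin) are the commonly cited PCI-Express limits and are not taken from this review. The result is a ceiling for the connector layout, not the card's actual power draw.

```python
# Commonly cited PCI-Express power limits in watts (assumed, not from the review).
PCIE_SLOT_W = 75
CONNECTOR_LIMITS_W = {"6-pin": 75, "8-pin": 150}


def board_power_limit_w(connectors: list[str]) -> int:
    """Upper bound on board power: slot power plus every auxiliary connector."""
    return PCIE_SLOT_W + sum(CONNECTOR_LIMITS_W[name] for name in connectors)


hd6970_limit = board_power_limit_w(["6-pin", "8-pin"])
print(f"6-pin + 8-pin ceiling: {hd6970_limit} W")
```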


AMD has added a small switch on the card that lets you toggle between two VGA BIOSes. The first one is the normal one and can be flashed. The second one acts as a backup and is write-protected, so you cannot “destroy” it with a bad flash. Should you flash your card with the wrong BIOS, you can switch to the backup BIOS to boot the card, then move the switch back to the normal BIOS before flashing again. This looks like a good system, but I wonder if it’s worth the added cost.


The GDDR5 memory chips are made by Hynix and carry the model number H5GQ2H24MFR-R0C. They are specified to run at 1500 MHz (6000 MHz GDDR5 effective).
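The “effective” figures quoted for GDDR5 in this review are four times the listed memory clock. The sketch below applies that ratio and works out how far the chip rating sits above the card's stock 1375 MHz memory clock. The headroom it prints is only a rating on paper, since memory controller and board limits decide what is actually reachable.

```python
GDDR5_TRANSFERS_PER_CLOCK = 4  # ratio implied by 1375 MHz -> 5500 MHz in the review


def gddr5_effective_mhz(memory_clock_mhz: float) -> float:
    """Effective GDDR5 data rate for a given memory clock."""
    return memory_clock_mhz * GDDR5_TRANSFERS_PER_CLOCK


stock_clock = 1375   # HD 6970 reference memory clock in MHz
rated_clock = 1500   # rating of the Hynix chips in MHz
headroom_pct = (rated_clock - stock_clock) / stock_clock * 100

print(f"Stock: {gddr5_effective_mhz(stock_clock):.0f} MHz effective")
print(f"Rated: {gddr5_effective_mhz(rated_clock):.0f} MHz effective")
print(f"Headroom on paper: {headroom_pct:.1f}%")
```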


The Radeon HD 6900 Series are the first graphics cards to use the Volterra VT1556. It offers extensive voltage control and monitoring via I2C. No software supports this controller yet, but I am sure this will change in the weeks to come.

AMD’s new Cayman graphics processor is made on a 40 nm process at TSMC Taiwan. It uses approximately 2.64 billion transistors on a die area of 389 mm².

Test System

Test System – VGA Rev. 12
CPU: Intel Core i7 920 @ 3.8 GHz (Bloomfield, 8192 KB Cache)
Motherboard: Gigabyte X58 Extreme (Intel X58 & ICH10R)
Memory: 3x 2048 MB Mushkin Redline XP3-12800 DDR3 @ 1520 MHz 8-7-7-16
Harddisk: WD Caviar Black 6401AALS 640 GB
Power Supply: akasa 1200W
Software: Windows 7 64-bit
Drivers: GTX 570 & 580: 263.09; NVIDIA: 260.99; HD 6900: 8.79.6.2 RC2; ATI: Catalyst 10.11
Display: LG Flatron W3000H 30″ 2560×1600

Benchmark scores in other reviews are only comparable when this exact same configuration is used.

  • All video card results were obtained on this exact system with the exact same configuration.
  • All games were set to their highest quality setting.

Each benchmark was tested at the following settings and resolution:

  • 1024 x 768, No Anti-aliasing. This is a standard resolution without demanding display settings.
  • 1280 x 1024, 2x Anti-aliasing. Common resolution for most smaller flatscreens today (17″ – 19″). A bit of eye candy turned on in the drivers.
  • 1680 x 1050, 4x Anti-aliasing. Most common widescreen resolution on larger displays (19″ – 22″). Very good looking driver graphics settings.
  • 1920 x 1200, 4x Anti-aliasing. Typical widescreen resolution for large displays (22″ – 26″). Very good looking driver graphics settings.
  • 2560 x 1600, 4x Anti-aliasing. Highest possible resolution for commonly available displays (30″). Very good looking driver graphics settings.
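To put the five test settings in perspective, the sketch below lists them as a small table of data and compares the number of pixels rendered per frame against the lowest resolution. The resolutions and anti-aliasing levels come from the list above. The variable and function names are just for illustration, and pixel count is only a rough proxy for GPU load.

```python
# (width, height, anti-aliasing samples) for each tested setting, from the list above.
TEST_SETTINGS = [
    (1024, 768, 0),
    (1280, 1024, 2),
    (1680, 1050, 4),
    (1920, 1200, 4),
    (2560, 1600, 4),
]


def pixel_count(width: int, height: int) -> int:
    """Number of pixels rendered per frame."""
    return width * height


baseline = pixel_count(1024, 768)
for width, height, aa in TEST_SETTINGS:
    pixels = pixel_count(width, height)
    aa_label = f"{aa}x AA" if aa else "no AA"
    print(f"{width} x {height}, {aa_label}: "
          f"{pixels / 1e6:.2f} MP, {pixels / baseline:.2f}x the lowest setting")
```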

Aliens vs. Predator


Aliens vs. Predator merges the Aliens and Predator franchises: two legendary alien species in conflict with each other, fighting to the death with human marines caught in between. The first-person shooter was developed by Rebellion Studios, who also developed the first AVP PC title, and was released in February 2010. It was one of the first DirectX 11 games with support for new features like tessellation, which is why AMD heavily promoted it at the time of their DX 11 card launches. We used the AVP benchmark utility with tessellation and advanced DX11 shadows enabled.

Battlefield: Bad Company 2


Battlefield: Bad Company 2, released in March 2010 by Electronic Arts, is the most successful DirectX 11 title so far. Even though it contains a full single-player campaign during which the player has to work with a squad to secure a secret weapon, the game is best known for its fast-paced, exciting multiplayer squad action. Thanks to a CPU-based Havok physics engine and skillful use of scripting, the game has destructible objects, vegetation and terrain without requiring NVIDIA PhysX.
We tested the truck chase scene of the second single-player mission at maximum settings with DirectX 11 enabled.

BattleForge


BattleForge, a card-based RTS, is developed by the German EA Phenomic studio. A few months after launch the game was turned into a Play 4 Free branded title. That move, and the fact that it was bundled with a large number of ATI cards, made it one of the better-known RTS games of 2009. As a player you assemble your deck before the game to select the units that will be available. You can choose from the forces of Fire, Frost, Nature and Shadow, which complement each other.
The BattleForge engine has full support for DX 9, DX 10 and DX 10.1. We used the internal benchmark tool in DirectX 11 mode to acquire our results.

Call of Duty 4


Call of Duty 4 is a first-person shooter that builds on the award-winning Call of Duty series. It is the first installment to be set in modern times. In a near-future conflict between the United States, Europe and Russia you get to play as a United States Marine and a British SAS operative. The engine is Infinity Ward’s own creation and has true dynamic lighting, depth of field, dynamic shadows and HDR. Even though the game plot is scripted, you will find yourself in intense battles, often working together with computer-controlled teammates.

Call of Juarez 2


Call of Juarez 2: Bound in Blood is a prequel to the first Call of Juarez game, which was one of the first DX10 titles available on the market. This time the plot revolves around two brothers, and before each mission you may pick one to play. Your choice affects the gameplay since the two characters have different ways of handling situations and fighting.
Call of Juarez 2 uses Techland’s Chrome Engine 4, one of the first engines on the market to add Edge Anti Aliasing. Edge Anti Aliasing looks similar to normal AA but comes with a considerably smaller performance drop. However, due to the deferred shading design of Edge AA, normal AA can’t be used on top of it.

Crysis


After the tremendous success of Far Cry, the German game studio Crytek released their latest shooter, Crysis, in 2007. It was by far the most hyped and anticipated game of 2007, and the forums were full of “Can my system run Crysis?” threads because of its high hardware requirements. Just like in Far Cry, the plot unfolds on a small island with a thick and richly detailed jungle world. A lot of attention has been given to small details like correct physics. For example, when you fire on a tree trunk, it will shatter and the tree will fall over, leaving a stump behind. Enemies in a car can be stopped by shooting the car’s tires. The graphics are the best seen in a PC game so far, yet the game still runs well on most computers.

Warhammer 40,000: Dawn of War 2


Warhammer 40,000: Dawn of War II by Relic Entertainment is an RTS game based on the Warhammer 40,000 universe. Unlike other Dawn of War titles, there is no base-building element in the game; you simply command units on the battlefield. Due to the non-linear mission design, your choice of which mission and objective to pursue has a considerable impact on gameplay and mission difficulty. A “hero” unit concept adds RPG elements to the game, allowing you to advance the unit in terms of levels and abilities. Dawn of War 2 uses the Essence Engine 2.0; version 1.0 was used in the Company of Heroes series.

DiRT 2


DiRT 2 is the first game to offer basic DirectX 11 features. Even though they are very limited, the title has been used extensively by AMD to market their DX11 products. The game features a large number of different racing events all over the world, with tracks ranging from off-road through stadiums to complex city courses. We chose not to benchmark DX 11 at this time because the number of DX11 effects is not worth the performance hit.

Formula One 2010


F1 2010 is an official implementation of the Formula One 2010 season with accurate teams, drivers and cars. Highlights of the game are the extensive realism options and the detailed weather effects. You pick a driver and get to race over several seasons, constantly improving your skill and trying to impress the big teams so that they offer you a contract and a faster car to race for the world championship. The game is based on an improved Dirt 2 engine and features the latest in DirectX 11 technology. We used the highest details setting for our testing.

Far Cry 2


Four years after the success of Far Cry, Ubisoft has published the sequel called Far Cry 2. While the first part was set on an island, Far Cry 2 takes you deep into Africa with gameplay that resembles Grand Theft Auto much more than the original Far Cry, which was a classical 3D shooter. Ubisoft engineered a completely new 3D engine called “Dunia” which offers a large number of popular features like DirectX 9 and DirectX 10 support, destructible environments, physics and non-scripted AI while not being as much of a resource hog as Crytek’s CryEngine. We tested the Ranch Medium level at DirectX 10 with highest details.

Tom Clancy’s HAWX


Tom Clancy’s H.A.W.X. is one of the very few recent flight simulator games on the market. Being a console conversion, it emphasizes “flight” more than “simulator”. It is set in a near future in which private military companies have begun fighting conflicts for nations with their own military gear. You play an elite pilot who was recruited by such a private company. During the game you get to fly over 50 different aircraft, ranging from the MiG-21 to the mighty F-22 Raptor. One notable feature of its engine is the use of GeoEye satellite imagery for terrain generation, which offers one of the most realistic renditions of battlefield terrain available today.

Metro 2033


Metro 2033 is a first-person shooter set in a post-apocalyptic Moscow, and as the name suggests, inside its metro system. You will fight mutants and other humans who would like to take away your shelter. The game has many gameplay elements similar to STALKER, and the engine has similar features as well. This is because two STALKER engine programmers left GSC Game World and started their own company, which is now making Metro 2033.
The engine has support for all the latest eye candy like DirectX 11 and tessellation. Unfortunately it leaves a less than optimized impression, making it a candidate to surpass Crysis for the highest hardware requirements. We tested in DirectX 11 mode with details set to “Very High”.

The Chronicles of Riddick: Assault on Dark Athena


The Chronicles of Riddick: Assault on Dark Athena is a first person shooter game set in a far future. You are Riddick, a notorious space criminal played by Vin Diesel in the movies. Dark Athena continues where Escape from Butcher Bay ended. A major aspect of the game is its tactical use of shadows and stealth so that enemies can’t detect you. Vin Diesel’s voice acting also adds greatly to the game experience.

S.T.A.L.K.E.R. – Clear Sky


STALKER Clear Sky is GSC Gameworld’s prequel to the 2007 hit “STALKER”. Just like the first part, the game is set around the Ukrainian area of Chernobyl and Pripyat, best known for the nuclear accident that occurred there. You play the role of a mercenary who spends his days in The Zone trying to make a living. The Zone is an area affected by so-called anomalies, which cause mutants to appear and the laws of physics to change. While you investigate these anomalies, the plot leads up to the events that happened right before the first game starts. A new in-game faction system encourages you to befriend various groups in The Zone in exchange for information or items. While the graphics of Clear Sky are based on the first Stalker game engine, there are numerous improvements, including support for DirectX 10 and depth-of-field/volumetric effects. The 0.0 FPS scores for NVIDIA cards at 2560×1600 are caused by driver crashes, which seem to be related to cards with 512 MB of memory and below. Since the game works fine on ATI, this is not a game problem but an NVIDIA driver issue.

Core I7 Review

Posted: March 20, 2011 in Core I7, Uncategorized

Intel Core i7 920, 940 and 965 Processor Review

Manufacturer: Intel
Product: Intel Core i7 CPU Series
Date: Mon, Nov 03, 2008 – 12:00 AM
Written By: Nathan Kirsch –nate@legitreviews.com

The Core i7 Series Arrives

Intel Core i7 Processor - LGA 1366

Intel has finally lifted the embargo on the yet-to-be-launched Intel Core i7 processors and the Intel X58 Express chipset. Intel strongly believes that this new platform will be the must-have workhorse for digital media & gaming enthusiasts for many months to come. With so much to talk about on this new platform, we made the decision to focus just on processor performance for this article and then take a deeper dive into other features in the weeks to come. This should work out nicely, as the processors won’t be available to purchase until later this month and many companies are just now getting us production-grade triple-channel memory kits and video card drivers for this new platform.

Intel Core i7 Processor - LGA 1366

The Intel Core i7 Processor (known as Nehalem internally) has some very big architecture changes, as you can tell from the picture above. The new Core i7 processor has 1366 pins, and as a result the processor, socket and heat sink mounting brackets are all larger than those of the LGA 775 based processors that have been out for a couple of years now. The die size of Core i7 processors is 263 mm² and the transistor count is 731 million.

Intel Core i7 Nehalem Die Diagram

Taking a look at the die of the Core i7 processor we see a first for Intel processors: the integrated memory controller. This on-die, triple-channel DDR3 memory controller is unique in that it allows consumers to run three memory modules together for optimal performance. By moving to an integrated memory controller and triple-channel memory, the platform has over 25GB/s of throughput between the processor and the DDR3 memory modules!
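The throughput figure is easy to reconstruct. The sketch below assumes each DDR3 channel is 64 bits (8 bytes) wide and that the quoted number refers to DDR3-1066, which is our reading rather than something stated above. The 1600MHz line shows the theoretical peak for the kit speed used in our test system.

```python
def ddr_peak_bandwidth_gbs(data_rate_mt_s: float, channels: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for `channels` memory channels,
    each moving `bytes_per_transfer` bytes per transfer (8 bytes assumed)."""
    return data_rate_mt_s * bytes_per_transfer * channels / 1000.0


# Assumed DDR3-1066 (1066.67 MT/s) in triple channel.
baseline = ddr_peak_bandwidth_gbs(1066.67, channels=3)
# The 1600MHz kit used on the Core i7 test system, still triple channel.
test_kit = ddr_peak_bandwidth_gbs(1600, channels=3)

print(f"Triple-channel DDR3-1066: {baseline:.1f} GB/s")
print(f"Triple-channel DDR3-1600: {test_kit:.1f} GB/s")
```

These are ceilings; measured results such as the Sandra scores later in the article will come in lower.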

For those that follow processor architecture, you will notice a brand new cache structure on the Core i7 diagram shown above. All Intel Core i7 processors feature L1, L2, and shared L3 caches. Before, Intel Core 2 Duo and Quad processors had just an L1 and L2 cache. The breakdown of the cache is as follows: there is a 64K L1 cache (32K Instruction, 32K Data) per core, 1MB of total L2 cache, and an impressive 8MB chunk of L3 cache that is shared across all the cores. That means that all Intel Core i7 processors have over 9MB of cache memory right there on the 45nm processor!
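Adding up the cache sizes above shows where the “over 9MB” figure comes from. This sketch assumes a four-core part, so the per-core L2 share is simply the 1MB total divided by four; the constant names are our own.

```python
CORES = 4               # quad-core Core i7 (assumed for this tally)
L1_PER_CORE_KB = 64     # 32K instruction + 32K data per core
L2_TOTAL_KB = 1024      # 1MB of L2 in total
L3_SHARED_KB = 8192     # 8MB shared across all cores

l2_per_core_kb = L2_TOTAL_KB // CORES
total_kb = CORES * L1_PER_CORE_KB + L2_TOTAL_KB + L3_SHARED_KB
total_mb = total_kb / 1024

print(f"L2 per core: {l2_per_core_kb} KB")
print(f"Total on-die cache: {total_kb} KB ({total_mb:.2f} MB)")
```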

Can it get any better than this?

Intel Core i7 965 Performance Features

Of course it can! The new Core i7 processor has a huge list of improvements that have been made to it.

  • New SSE4.2 Instructions
  • Improved Lock Support
  • Additional Caching Hierarchy
  • Deeper Buffers
  • Improved Loop Streaming
  • Simultaneous Multi-Threading
  • Faster Virtualization
  • Better Branch Prediction

Intel always told us that Hyper-Threading was not dead, and they were right, as the technology has surfaced again and is enabled on all of the Core i7 processors. With Hyper-Threading enabled on quad-core Core i7 processors, the operating system sees eight logical cores that can be used. Intel has told Legit Reviews that when Hyper-Threading originally came out the idea was solid, but that the Pentium 4 processor might not have been the best processor to bring it to market. The Core i7 series should highlight all the strong points of Hyper-Threading, as they are now calling it Hyper-Threading “done right”. If you want a deeper look at the Intel Core i7 architecture, take a look at this presentation that was given at the Spring 2008 IDF and this one that was given at the Fall IDF.

Intel will be releasing three Core i7 processors, and all have a TDP of 130W and an on-die shared L3 cache of 8MB. None of the current Core i7 processors is intended for multi-processor motherboards, so each has only one Quick Path Interconnect (QPI) link.

  • Core i7 965 Extreme Edition – 3.2GHz with 8MB Shared L3 cache and a 1×6.4GT/s QuickPath interconnect – $999
  • Core i7 940 – 2.93GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $562
  • Core i7 920 – 2.66GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $284
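The QuickPath speeds in the list translate into bandwidth as sketched below. We assume each transfer carries 16 bits (2 bytes) of data per direction, which is the figure usually given for QPI but is not stated in the list above, and that the link runs in both directions at once.

```python
QPI_BYTES_PER_TRANSFER = 2  # 16 data bits per direction per transfer (assumption)


def qpi_bandwidth_gbs(gigatransfers_per_s: float) -> tuple[float, float]:
    """Return (per-direction, both-directions) QPI bandwidth in GB/s."""
    one_way = gigatransfers_per_s * QPI_BYTES_PER_TRANSFER
    return one_way, one_way * 2


for model, rate in [("Core i7 965 Extreme Edition", 6.4),
                    ("Core i7 940", 4.8),
                    ("Core i7 920", 4.8)]:
    one_way, both_ways = qpi_bandwidth_gbs(rate)
    print(f"{model}: {one_way:.1f} GB/s each way, {both_ways:.1f} GB/s total")
```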

Now that we know what the general processor improvements are let’s take a closer look at the chipset changes.

The Intel X58 Express Chipset

In order to understand this new platform it is best to look at the motherboard chipsets that are going to be used.

The Intel X58 Express Block Diagram

The Intel X58 Express chipset was designed just for the Intel Core i7 series of processors, as they require a new socket. Since the DDR3 memory controller is located inside the processor itself, hundreds of new pins had to be added, and the result was a larger CPU with more pins. Intel designed the X58 Express chipset from the ground up for Core i7, but re-used the ICH10/ICH10R southbridge that has been out for several months now.

The Intel X58 Express Block Chipset

Together the Intel X58 Express chipset and the ICH10 Southbridge make up what is certain to be a very solid platform for high performance systems. The Intel ICH10/ICH10R Southbridge was launched with the Intel P45 Express chipset and has already proven itself a winner with some of the best Solid State Drive performance numbers of any chipset on the market. The X58 Express supports up to 36 lanes of PCI Express 2.0 connectivity, and since many boards using this chipset will have both NVIDIA SLI and ATI CrossFire enabled, Triple-SLI and Quad CrossFireX will be easy to implement. This is because NVIDIA is allowing motherboard makers to use a special sBIOS if they pay a licensing fee for SLI Technology. So, finally, multi-GPU technology from both graphics card companies can be used on the same board. If that isn’t enough, Intel has done away with the Front Side Bus and now has the QuickPath Interconnect to handle the flow of data between the processor and the chipset. The memory now has over 25.5 GB/s of throughput since it has a direct connection to the processor.

The Test System

Before we look at the numbers, here is a brief glance at the test system that was used.

The Intel Core i7 Test System

All testing was done on a fresh install of Windows Vista Ultimate 64-bit. All benchmarks were completed on the desktop with no other software programs running. All of the memory modules were run in dual-channel mode with a 120mm fan placed on top of them to keep them cool, except on the Core i7 system, which was run in triple-channel mode. The EVGA GeForce 8800 GTS 512MB used NVIDIA ForceWare 169.28 video card drivers. The LGA 775 test system used the ASUS P5E3 motherboard with BIOS version 1201, and the LGA 1366 test system used the ASUS P6T Deluxe motherboard with BIOS v8004. The AMD Phenom testing was done on the MSI K9A2 Platinum motherboard with BIOS v1.5b5 installed along with ATI system driver version 8.50.

Memory Settings:

  • Core i7 920, 940, 965 – 1600MHz @ 8-8-8-24 (DDR3)
  • QX9775 – 800MHz @ 5-5-5-15 (FB-DIMM)
  • QX9770 – 1600MHz @ 7-7-7-20 (DDR3)
  • Q9300 – 1333MHz @ 7-7-7-20 (DDR3)
  • QX6850 – 1333MHz @ 7-7-7-20 (DDR3)
  • Q6600 – 1066MHz @ 7-7-7-20 (DDR3)
  • E8500 – 1333MHz @ 7-7-7-20 (DDR3)
  • E7200 – 1066MHz @ 7-7-7-20 (DDR3)
  • E6750 – 1333MHz @ 7-7-7-20 (DDR3)
  • Phenom X4 9950 – 800MHz @ 4-4-4-12 (DDR2)
  • Phenom X4 9850 – 800MHz @ 5-5-5-15 (DDR2)
  • Phenom X4 9600 – 800MHz @ 5-5-5-15 (DDR2)
  • Phenom X4 9350e – 800MHz @ 4-4-4-12 (DDR2)
  • Phenom X3 8750 – 800MHz @ 5-5-5-15 (DDR2)
  • Athlon 64 X2 5000+ – 800MHz @ 4-4-4-12 (DDR2)
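Timings such as 8-8-8-24 are counted in clock cycles, so they only become comparable across memory speeds once converted to time. The sketch below converts the first number (CAS latency) to nanoseconds. It treats the speeds in the list as effective data rates, with two transfers per clock, which is our assumption about how the figures are quoted.

```python
def cas_latency_ns(data_rate_mhz: float, cas_cycles: int) -> float:
    """CAS latency in nanoseconds for a DDR module.

    `data_rate_mhz` is the effective rate as listed above; the real clock
    is half of it because DDR moves two transfers per clock cycle.
    """
    clock_mhz = data_rate_mhz / 2
    return cas_cycles / clock_mhz * 1000.0


# A few entries from the memory settings list: (label, data rate, CAS cycles).
examples = [
    ("Core i7, 1600MHz CL8", 1600, 8),
    ("QX9770, 1600MHz CL7", 1600, 7),
    ("Phenom X4 9950, 800MHz CL4", 800, 4),
    ("Phenom X4 9850, 800MHz CL5", 800, 5),
]
for label, rate, cas in examples:
    print(f"{label}: {cas_latency_ns(rate, cas):.2f} ns")
```

By this measure the DDR3 kit on the Core i7 and the CL4 DDR2 kit on the Phenom have the same CAS latency in absolute terms, even though the cycle counts differ.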

Here is the Intel LGA 1366 Test platform:

Intel Test Platform
Processor: See Above
Motherboard: ASUS P6T Deluxe
Memory: 6GB Corsair DDR3 1600MHz
Video Card: EVGA GeForce 8800 GTS 512
Hard Drive: Western Digital RaptorX 150GB
Cooling: Thermaltake BigWater 760i
Power Supply: Corsair HX1000W
Operating System: Windows Vista Ultimate 64-Bit

Here is the Intel LGA 775 Test platform:

Intel Test Platform
Processor: See Above
Motherboard: ASUS P5E3 Deluxe
Memory: 4GB Corsair DDR3 1800C7
Video Card: EVGA GeForce 8800 GTS 512
Hard Drive: Western Digital RaptorX 150GB
Cooling: Corsair Nautilus 500
Power Supply: PC Power and Cooling 1KW
Operating System: Windows Vista Ultimate 64-Bit

Here is the Intel Skulltrail Test platform:

Skulltrail Test Platform
Processor: 2x Intel Core 2 QX9775
Motherboard: Intel D5400XS
Memory: 4GB Micron 800MHz FB-DIMM
Video Card: EVGA GeForce 8800 GTS 512
Hard Drive: Western Digital RaptorX 150GB
Cooling: Zalman AT Fan/Heatsink
Power Supply: PC Power and Cooling 1KW
Operating System: Windows Vista Ultimate 64-Bit

The AMD Phenom X4 9950 Processor Test System

Here is the AMD Phenom Test platform:

AMD Test Platform
Processor: All AM2 and AM2+ CPUs
Motherboard: MSI K9A2 Platinum
Memory: 4GB OCZ Flex PC2-6400
Video Card: EVGA GeForce 8800 GTS 512
Hard Drive: Western Digital RaptorX 150GB
Cooling: Zalman AT Fan/Heatsink
Power Supply: PC Power and Cooling 1KW
Operating System: Windows Vista Ultimate 64-Bit

Sandra 2009 Memory Bandwidth

SiSoftware Sandra 2009:

Sisoftware Sandra 2009

The Sisoft Sandra 2009 benchmark utility just came out recently and we have started to include it in our benchmarking. With Sandra 2009 you can now easily compare the performance of the tested device with its speed and its (published) power (TDP)! Sandra XII SP2 also has SSE4 (Intel) and SSE4A (AMD) benchmark code-paths, which is great for those of you testing next-generation AMD & Intel chips.

Sandra XII SP1 Benchmark Scores

Results: Sandra 2009 showed that the Intel Core i7 processors blow away the competition thanks to the new memory design being used. The Core i7 platform used three 2GB memory modules in Triple-Channel at 1600MHz with 8-8-8-24 1T timings, which is what we think will become the standard kit for this platform. Corsair has already announced 1866MHz CL9 kits for this platform and Kingston Technology has announced 2GHz 3GB kits, so enthusiasts will easily break the 30GB/sec mark with high performance memory kits.

Photodex ProShow Gold 3.2

ProShow Gold allows the user to combine photos, videos and music to create spectacular slide shows. The software provides the capability to share memories with friends and family on DVD, PC and the Web. ProShow Gold brings still photos to life by adding motion effects like pan, zoom, and rotate. The user can also add captions to a photo or video and choose from over 280 transition effects.

Photodex Proshow Gold 3.2 Benchmark Settings

The workload we are using takes 29 high resolution jpeg photos and converts them to an mpeg2, widescreen DVD quality, 3min 9sec slideshow video file. The input photos are in 3872×2592 resolution and total about 170MB in size.

Photodex Proshow Gold 3.2 Benchmarking

ProShow Gold 3.2 lets you share your slide shows in virtually any format and on any device. You can upload your shows directly to YouTube or choose from over 20 devices to output to directly, including the iPod, Blackberry, Zune and more. Not bad for software that costs under $70 and is optimized for eight cores! Our benchmark testing wasn’t at 100% load the entire time, but averaged around 95% during the testing period.

Photodex Proshow Gold 3.2 Benchmark Results

Benchmark Results: Photodex Proshow software showed that the Intel Core i7 quad-core processors do well with Hyper-Threading, but it wasn’t enough to pass up the true 8-core QX9775 platform. The 3.2GHz Intel Core i7-965 was 11 seconds faster than the Intel Core 2 Quad QX9770, which is very impressive as they offer the same clock frequency.

Microsoft Excel 2007

Microsoft Office Excel 2007 is a powerful and widely used tool with which you can create and format spreadsheets, and analyze and share information to make more informed decisions. It allows you to import, organize and explore massive data sets within spreadsheets and then communicate your analysis with professional-looking charts. Excel 2007 also provides tools to “see” important trends and find exceptions in your data. Legit Reviews has two benchmarking tests that we do on Microsoft Office Excel 2007.

Microsoft Excel 2007 Testing

The first workload executes approximately 28,000 sets of calculations using the most common calculations and functions found in Excel. These include common arithmetic operations like addition, subtraction, division, rounding and square root. It also includes common statistical analysis functions such as Max, Min, Median and Average. The calculations are performed after a spreadsheet with a large dataset is updated with new values and must re-calculate many data points. The input file is the 6.2 MB spreadsheet seen above.

Microsoft Excel 2007 Benchmark Results

Benchmark Results: Lots of people use Microsoft Office at work and home, so this is an important test for many of our readers. Most people don’t run 28,000 sets of calculations at once, but if you do, the CPU will determine how fast the task is completed.

The Black-Scholes model is used in our second Excel test to calculate a theoretical call and put price using the five key determinants of an option’s price: stock price, strike price, volatility, time to expiration, and short-term (risk free) interest rate.

Microsoft Excel 2007 Testing

This workload calculates the European Put and Call option valuation for Black-Scholes option pricing using Monte Carlo simulation. It simulates the calculations performed when a spreadsheet with input parameters is updated and must recalculate the option valuation. In this scenario we execute approximately 300,000 iterations of Monte Carlo simulation. In addition, the workload uses Excel lookup functions to compare the put price from the model with the historical market price for 50,000 rows to understand the convergence. The input file is a 70.1 MB spreadsheet and with 10 times the calculations of the first test, this one should take a bit longer to complete.
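
For readers curious what this kind of workload looks like outside of a spreadsheet, here is a minimal sketch of Monte Carlo pricing for a European call and put, checked against the closed-form Black-Scholes price. It is not the benchmark's actual spreadsheet logic. The function names, the sample inputs and the fixed random seed are all our own assumptions for illustration.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, vol, years, rate):
    """Closed-form European (call, put) prices from the five inputs."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    d2 = d1 - vol * math.sqrt(years)
    discount = math.exp(-rate * years)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def monte_carlo(spot, strike, vol, years, rate, iterations=300_000, seed=1):
    """Estimate (call, put) by simulating terminal stock prices."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    drift = (rate - 0.5 * vol ** 2) * years
    diffusion = vol * math.sqrt(years)
    call_sum = put_sum = 0.0
    for _ in range(iterations):
        # One simulated stock price at expiration (geometric Brownian motion).
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        call_sum += max(terminal - strike, 0.0)
        put_sum += max(strike - terminal, 0.0)
    discount = math.exp(-rate * years)
    return discount * call_sum / iterations, discount * put_sum / iterations


if __name__ == "__main__":
    # Hypothetical inputs: $100 stock, $100 strike, 20% volatility,
    # one year to expiration, 5% risk-free rate.
    inputs = dict(spot=100.0, strike=100.0, vol=0.20, years=1.0, rate=0.05)
    print("closed form:", black_scholes(**inputs))
    print("monte carlo:", monte_carlo(**inputs))
```

Each iteration is independent of the others, which is why a workload like this spreads so well across multiple cores.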

Microsoft Excel 2007 Benchmark Results

Benchmark Results: With 300,000 iterations of Monte Carlo simulation taking place in this benchmark it takes all the processors a bit longer to finish as it puts a good load on the system.  The Intel Skulltrail system is in a league of its own as it completes the task in less than ten seconds, but the Core i7 processors are right behind.

Cinebench R9.5

MAXON; CINEBENCH 9.5:

CINEBENCH is the free benchmarking tool for Windows and Mac OS based on the powerful 3D software CINEMA 4D. Consequently, the results of tests conducted using CINEBENCH 9.5 carry significant weight when analyzing a computer’s performance in everyday use. Especially a system’s CPU and the OpenGL capabilities of its graphics card are put through their paces (even multiprocessor systems with up to 16 dedicated CPUs or processor cores). During the testing procedure, all relevant data is ascertained with which the performance of different computers can subsequently be compared, regardless of operating system. Again, higher Frames/Second and lower rendering time in seconds equal better performance.

Cinebench 9.5 Benchmarking

Cinebench 9.5 was able to put a 100% load across all the cores, which makes this a great benchmark to look at multi-core platforms.

Cinebench 9.5 Benchmark Results

Benchmark Results: Cinebench 9.5 was tested in both 64-bit and 32-bit, which resulted in some minor performance differences as seen above. The Intel Core i7 family of processors showed some nice performance gains over the current generation quad-core processors!

Cinebench R10

MAXON; CINEBENCH R10:

CINEBENCH is the free benchmarking tool for Windows and Mac OS based on the powerful 3D software CINEMA 4D. Consequently, the results of tests conducted using CINEBENCH 10 carry significant weight when analyzing a computer’s performance in everyday use. Especially a system’s CPU and the OpenGL capabilities of its graphics card are put through their paces (even multiprocessor systems with up to 16 dedicated CPUs or processor cores). The test procedure consists of two main components: The first test sequence is dedicated to the computer’s main processor. A 3D scene file is used to render a photorealistic image. The scene makes use of various CPU-intensive features such as reflection, ambient occlusion, area lights and procedural shaders. In the first run, the benchmark only uses one CPU (or CPU core), to ascertain a reference value. On machines that have multiple CPUs or CPU cores, and also on those that simulate multiple CPUs (via HyperThreading or similar technologies), MAXON CINEBENCH will run a second test using all available CPU power. Again, higher Frames/Second and lower rendering time in seconds equal better performance.

Cinebench 10

Cinebench R10 was able to put a 100% load across all the cores on all of the processors, which makes this a great benchmark to look at multi-core platforms.

Cinebench R10 Results

Results: Running Cinebench R10 in 64-bit mode showed a significant improvement in performance on all of the processors and the results were in line with what we expected from running Cinebench R9.5! The Intel Core i7 965 was 27% quicker than the Intel Core 2 Quad QX9770, and both run at the same clock frequency!

POV-Ray 3.7 Beta 25

Processor Performance on Pov-Ray 3.7 Beta 25:

The Persistence of Vision Ray-Tracer was developed from DKBTrace 2.12 (written by David K. Buck and Aaron A. Collins) by a bunch of people (called the POV-Team) in their spare time. It is a high-quality, totally free tool for creating stunning three-dimensional graphics. It is available in official versions for Windows, Mac OS/Mac OS X and i86 Linux. The POV-Ray package includes detailed instructions on using the ray-tracer and creating scenes. Many stunning scenes are included with POV-Ray so you can start creating images immediately when you get the package. These scenes can be modified so you do not have to start from scratch. In addition to the pre-defined scenes, a large library of pre-defined shapes and materials is provided. You can include these shapes and materials in your own scenes by just including the library file name at the top of your scene file, and by using the shape or material name in your scene. Since this is free software feel free to download this version and try it out on your own.

The most significant change from the end-user point of view between versions 3.6 and 3.7 is the addition of SMP (symmetric multiprocessing) support, which, in a nutshell, allows the renderer to run on as many CPUs as you have installed in your computer. This will be particularly useful for those users who intend on purchasing a dual-core CPU or who already have a two (or more) processor machine. On a two-CPU system the rendering speed in some scenes almost doubles. For our benchmarking we used version 3.7 beta 25, which is the most recent version available. The benchmark used all available cores to complete the render.

Pov-Ray 3.7 Beta 25

Once rendering of the object we selected was completed, we took the score from the dialog box, which indicates the average PPS (pixels per second) for the benchmark. A higher PPS indicates faster system performance.

Pov-Ray 3.7 Beta 25

Benchmark Results: Looking at POV-Ray 3.7 Beta 25, the Intel Core i7-965 was over 30% faster than the QX9770 and 56% faster than the quickest processor AMD offers.

POV-Ray Real-Time Raytracing

Legit Reviews was e-mailed by one of the developers over at POV-Ray to see if LR could include real-time raytracing in our performance analysis, and we were more than happy to include the data in our testing.

E-Mail From POV-Ray — I thought I might ping you about an experimental feature we’ve added to the POV-Ray SMP beta: real-time raytracing. It’s mostly useful to folks who have multi-core systems and in fact is something that I’ve wanted to do for years but the hardware just wasn’t there (at least not in the consumer price range). It works best on a kentsfield or later, but a core 2 duo should be sufficient if you don’t mind sub-10fps frame rates.

If you want to try it out, please feel free to grab it from: http://www.povray.org/beta/rtr/

POV-Ray real-time raytracing

This experimental software by POV-Ray was a welcome addition to our testing and was able to spread the workload across all the cores in even our eight-core test system as seen above.

POV Ray RTR Benchmark Chart

Results: POV-Ray Real-Time Raytracing is a great benchmark that we love to use on Legit Reviews and it does a great job at showing how performance scales with CPU cores. The Core i7 series really struts its stuff with Real-Time Raytracing as all three processors rendered the scene at over 20FPS.

Futuremark 3DMark06

Futuremark 3DMark 2006

3DMark06

Futuremark’s 3DMark06 has a built-in CPU test, a multi-threaded DirectX gaming metric that’s useful for comparing relative performance between similarly equipped systems. This test consists of two different 3D scenes that are processed with a software renderer that is dependent on the host CPU’s performance. Calculations that are normally reserved for your 3D accelerator are instead sent to the CPU for processing and rendering. The frame-rate generated in each test is used to determine the final score.

Futuremark CPU Benchmark Results

Benchmark Results: The 3DMark 2006 CPU test showed that the Intel Core i7 920, 940 and 965 are hands down the fastest Intel quad-core processors we have ever seen!  The pair of 9775 quad-core processors were still the overall leaders, but they cost twice as much and require an expensive dual-socket motherboard.

Overclocking Results

Overclocking results vary greatly depending on what hardware is being used and who is doing the overclocking. Always remember that no two pieces of hardware will perform the same, so our results will differ from what you might be able to get.

Intel Core i7 965 Processor Overclocking

Using the ASUS P6T motherboard with BIOS v8004 we pushed the limits of our early revision processor to see what it could do. At stock settings the Intel Core i7 965 processor runs with a 133MHz baseclock that is multiplied by the CPU multiplier to get the CPU speed and by the QPI multiplier to get the QPI speed. The Intel Core i7 965 has a 24x multiplier that is used to reach the final core clock of 3.20GHz. As you can see above, the ASUS P6T Deluxe motherboard runs the baseclock at 133.6MHz, so the overall clock frequency is about 7MHz higher than the processor’s rated speed.
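
The arithmetic behind those numbers is simple enough to sketch. The helper name below is our own, and the 133.6MHz input is the rounded baseclock quoted above, so the computed offset lands within a MHz of the figure mentioned in the text.

```python
def core_clock_mhz(base_clock_mhz, multiplier):
    """Core clock is the baseclock multiplied by the CPU multiplier."""
    return base_clock_mhz * multiplier


RATED_MHZ = 3200.0  # Core i7 965 rated speed (3.20GHz)

if __name__ == "__main__":
    # Baseclock as reported on our board, with the stock 24x multiplier.
    actual = core_clock_mhz(133.6, 24)
    print(f"actual core clock: {actual:.1f} MHz")
    print(f"offset over rated: {actual - RATED_MHZ:.1f} MHz")
```

The same relationship is what makes baseclock overclocking work: raise either the baseclock or the multiplier and the core clock rises in proportion.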

Intel Core i7 965 Processor Overclocking

By not touching anything in the BIOS other than the CPU Voltage (Auto to 1.35V) we were able to reach 4GHz right off the bat!  This is not bad at all and is nearly an 800MHz overclock for a few seconds worth of work.

Intel Core i7 965 Processor Overclocking

With a little extra voltage to the processor and a boost of the baseclock from 133MHz to 145MHz we were able to hit 4.2GHz. The system wasn’t fully stable at 4.2GHz, but we feel certain that with a little more effort 4.2GHz should be easily reached on most enthusiast motherboards. The Intel Core i7 series can overclock by 1GHz, which is a great sign for a brand new architecture!

Final Thoughts and Conclusions

Power Consumption

Since power consumption is a big deal these days, we ran some simple power consumption tests on our test beds. The systems all ran with the same power supply, case fan, video card and hard drive model. To measure idle usage, we ran the system at idle for one hour on the desktop with no screen saver and took the measurement. For load measurements, POV-Ray 3.7 was run on all cores to make sure each and every processor was at 100% load. All of the systems used identical hardware apart from the motherboard and processor. It should be noted that the Core i7 processors used a Thermaltake BigWater 760i water cooler and the rest of the systems used a Corsair Nautilus 500 water cooler.

Power Consumption Results

Results: When it came to idle power consumption the Intel Core i7 series used more power than we expected, but for having such a large cache they didn’t do badly by any means. The entire system with a water cooler was still under 300 Watts, which is impressive for being the fastest quad-core processor in the world.

Intel Core i7 Retail Heatsink

Final Thoughts

This is just a quick look at the Intel Core i7 processor family performance on a number of respected benchmarks. Expect more deep dives in the weeks to come as we have numerous boards, cooling solutions and memory kits that we are still trying out on this new platform.

The performance numbers speak for themselves as the Intel Core i7 965 Extreme Edition proved itself to be more than 35% faster than the equally clocked Core 2 Extreme QX9770 processor in a number of benchmarks.  This is an impressive number and one that may be higher than many expected.  When overclocked the Core i7 965 was wickedly fast and ripped through performance tests faster than anything we have ever seen.  Nehalem offers obvious clock-for-clock performance improvements and that is something the community must see before making a platform change. Pricing for the three new Intel Core i7 processors is fairly aggressive and the Core i7 965 Extreme Edition comes in at $999, which is the price that the Intel Core 2 Extreme QX9770 used to be.

  • Core i7 965 Extreme Edition – 3.2GHz with 8MB Shared L3 cache and a 1×6.4GT/s QuickPath interconnect – $999
  • Core i7 940 – 2.93GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $562
  • Core i7 920 – 2.66GHz with 8MB Shared L3 cache and a 1×4.8GT/s QuickPath interconnect – $284

Intel has once again launched a great part that increases the performance gap between it and AMD. With the Intel Core i7 pulling so far ahead of the AMD Phenom series of processors it almost makes you wonder if AMD will ever be able to catch up.

Legit Reviews Editor's Choice

Legit Bottom Line: The performance benchmarks confirm that the Intel Core i7 series of processors are the real deal and the new platform is solid.