NVIDIA's latest offering is cheaper than the GTX 670 - but is it cheap enough for the performance you get?
By Bennett Ring on August 21, 2012 at 9:30 am
It’s not often that I get to say this, but there’s never been a better time to be a PC gamer. Today’s console dinosaurs have both feet in the silicon grave, leaving the PC to reclaim its place as the enthusiast gamer’s platform of choice. These aren’t merely the claims of a PC gamer desperately trying to justify a maxed-out credit card to his disapproving spouse; even EA recently acknowledged that the PC was the second biggest platform over the last financial quarter, not to mention the fastest growing.
It’s not just about the popularity of PC games though; we’re also spoilt for choice when it comes to hardware. The latest round in the ongoing NVIDIA versus AMD slug-fest has been closer than ever, with both companies trading blows over each successive GPU release. AMD delivered an opening haymaker with the powerful Radeon HD 7970 and 7950 back in December 2011, delivering oodles of performance at the cost of watt-guzzling energy needs. It took a few months for NVIDIA to get back on its feet with a technically perfect uppercut in the Kepler-based GTX 680 and 670 products, exceeding AMD’s performance while also delivering products that didn’t need a fusion reactor for power. AMD’s response? Slashing and burning prices, making the choice of which high-end card to buy harder than ever.
This brings us to today’s magical moment, the release of a new product that could not have existed without such close competition between the two GPU giants. The GeForce GTX 660 Ti is NVIDIA’s offering to those who demand the impossible – high performance at mainstream prices. Once again NVIDIA has wheeled out the impressive Kepler design to do the polygon pushing, but unlike the previous GeForce GTX 560 Ti it hasn’t taken a silicon sledgehammer to the card’s ankles. As you’ll see, the GTX 660 Ti has barely been touched by the engineer’s laser scalpel.
Kepler – the nitty-gritty
At the heart of the GTX 660 Ti is the Kepler architecture. First announced in 2010, this design was largely a result of the PC industry’s new love affair with mobility. While core gamers crave the desktop behemoths that make up a tiny percentage of NVIDIA’s business, laptops are a far bigger slice of the pie.
NVIDIA’s prior GPU design, Fermi, was a gas-guzzling brick that doubled as a cosy foot heater during the winter months, making it a laptop-melting monster. The new GPU design needed to be as comfortable ripping up Battlefield 3 in a desktop as it was in a laptop. Accordingly, NVIDIA’s biggest goal for Kepler was to improve performance per watt — and the company used a variety of means to arrive at its new eco-friendly destination.
First and foremost was the move to a smaller manufacturing process; smaller transistors need less power to operate and also pump out less heat in the process. Fermi’s transistors were built on a 40 nanometre process, while Kepler adopted a smaller 28 nanometre process. Take a peek inside the Radeon 7970’s GPU and you’ll find the same 28nm process at work, yet AMD’s latest cards pump out exorbitant levels of heat by comparison. Obviously something else helped drop NVIDIA’s electricity bills, and that’s where Kepler’s CUDA cores come into play.
Most of the transistors inside Kepler are grouped into units called CUDA cores, and they’re the worker bees in this silicon hive. To lower each CUDA core’s energy consumption, Kepler runs them at the same speed as the graphics clock, whereas Fermi ran them at twice the speed of the graphics clock. There’s an obvious problem with this approach: half the speed usually equals half the performance, so NVIDIA came up with a brute-force solution. Where the GTX 580 Fermi had just 512 CUDA cores, the Kepler found in the GTX 680 has a whopping 1536 CUDA cores. These are in turn grouped into eight Streaming Multiprocessor (SMX) units, each comprising 192 CUDA cores.
All of these SMX units need to be fed data from the graphics card’s on-board memory, and NVIDIA also whipped Kepler’s memory subsystem into shape. Although the bus was trimmed to 256 bits wide, the memory runs at a whopping 6GHz effective clock, giving the GTX 680 a rather impressive 192.3GB/sec of memory bandwidth – roughly matching the GTX 580, despite that card’s wider 384-bit bus – a figure it shares with the GTX 670.
These are just three of the major improvements to Kepler over Fermi, but there are many other smaller improvements too numerous to mention here. The end result was a product that could out-perform AMD’s best, all while using considerably less energy to get the job done. To desktop gamers with a permanent lifeline to the electricity grid, power improvements might not sound too exciting – unless you’ve had to live with a high-end PC that sounds like a hive of wasps every time your GTX 580 kicks into sixth gear. The Kepler purring at the heart of the GTX 680 delivers blistering performance, with whisper-quiet cooler noise as an added bonus.
When it came time to release the cheaper GTX 670, the solution was simple; cut a single SMX out of the 680’s loop. This also gave NVIDIA a way to use chips that had minor flaws in one SMX unit, something that happens surprisingly frequently when you’re building a postage stamp out of more than three billion transistors. The GPU frequency was also dropped slightly, yet the memory bandwidth remained identical. Now, most techies expected NVIDIA to do exactly the same thing with the GTX 660 Ti, disabling one more SMX unit to deliver a product one more rung down the performance ladder from the GTX 670. If the race between AMD and NVIDIA wasn’t so close, it’s quite likely that the GTX 660 Ti would only have had six SMX units, comprising a total of 1152 CUDA cores.
The good news is that this isn’t the case.
NVIDIA hasn’t touched the number of CUDA cores within the GTX 660 Ti when compared to the GTX 670 – both are built using seven SMX units, for a total of 1344 CUDA cores. Even better news is that the GTX 660 Ti GPU runs at the exact same base speed as the GTX 670: 915MHz. And thanks to NVIDIA’s new GPU Boost technology, when the GPU detects it has thermal headroom to spare, the GTX 660 Ti ramps up to the very same boost clock as the GTX 670, up to 980MHz. By now you’re probably wondering just what the hell NVIDIA has done to justify selling the GTX 660 Ti for $100 less than the GTX 670.
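Before we get to the answer, the SMX arithmetic above is easy to sanity-check. A quick sketch – the 192-cores-per-SMX figure comes straight from NVIDIA’s published Kepler specs:

```python
CORES_PER_SMX = 192  # Kepler packs 192 CUDA cores into each SMX unit

def cuda_cores(smx_units):
    """Total CUDA cores for a Kepler GPU with the given SMX count."""
    return smx_units * CORES_PER_SMX

print(cuda_cores(8))  # GTX 680: the full GK104 chip -> 1536
print(cuda_cores(7))  # GTX 670 and GTX 660 Ti -> 1344
print(cuda_cores(6))  # the cut-down part many expected -> 1152
```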
Instead of touching the number of CUDA cores, NVIDIA has lowered memory bandwidth. While the GTX 680 and 670 use 256-bit memory buses, the GTX 660 Ti drops this to a 192-bit memory bus. However, NVIDIA has kept the same 6GHz memory frequency, theoretically dropping memory bandwidth by just 25%. This should only be a concern for those running stupidly high resolutions or anti-aliasing.
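The maths behind those bandwidth figures is straightforward: bandwidth is simply the bus width (in bytes) multiplied by the effective memory clock. A quick sketch, using the 6008MT/s effective rate that sits behind the quoted "6GHz" (taken from the spec sheets of the era):

```python
def bandwidth_gb_s(bus_bits, effective_clock_hz):
    # bytes per second = (bus width in bytes) * transfers per second
    return bus_bits / 8 * effective_clock_hz / 1e9

EFFECTIVE_CLOCK = 6.008e9  # "6GHz" GDDR5, 6008MT/s to be exact

gtx670 = bandwidth_gb_s(256, EFFECTIVE_CLOCK)    # GTX 680/670: 256-bit bus
gtx660ti = bandwidth_gb_s(192, EFFECTIVE_CLOCK)  # GTX 660 Ti: 192-bit bus

print(round(gtx670, 1))    # the GTX 680/670's 192.3GB/sec figure
print(round(gtx660ti, 1))  # the GTX 660 Ti's 144.2GB/sec
print(f"{1 - gtx660ti / gtx670:.0%}")  # the 25% cut (192/256 = 0.75)
```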
Unfortunately this has been achieved by dropping one of the four ROP partitions that usually make up the 256-bit memory bus. Without getting too technical, the original four-way ROP design plays nicely with the 2GB of on-board memory usually found on GTX 660 Ti cards, but a 192-bit bus is usually associated with either 1.5GB or 3GB of memory. The former is too little for the 660 Ti to be competitive, while the latter is too expensive at this price point (though 3GB GTX 660 Ti products will be possible). Instead NVIDIA is using proprietary technology to force the ill-fitting memory to be friends with the three ROP partitions, technology which it is keeping secret for the time being.
We’re left with a card that is almost identical to the GTX 670, with slightly less memory bandwidth and a recommended retail price $100 cheaper. As you’ll see though, RRPs don’t mean a whole lot in the real world, and the price pressure on every other product has left the GTX 660 Ti in a rather sticky situation. Let’s see why.
To the testbench
We’d like to thank Gigabyte for supplying our review sample in the form of the new Gigabyte GTX 660 Ti Windforce 2X. As seen with the release of the GTX 670, most GTX 660 Ti cards will ship pre-overclocked with faster base and boost frequencies, and the Gigabyte card is no exception. Base frequency has been given a healthy shot in the arm, increasing to 1032MHz, while Boost speed has also risen, up to 1111MHz. As the name suggests, Gigabyte has employed its unique Windforce 2X cooler on this card, while output duties are handled by dual DVI ports plus a single DisplayPort and HDMI port. Gigabyte has set a recommended retail price of $399 for this card, but at the time of writing retailers are selling it for anywhere between $398 and $450.
When compared to AMD’s offerings, today’s pricing puts it somewhere between an AMD Radeon HD 7950 and a 7970, the former of which has just received a healthy 15% performance increase courtesy of a new AMD BIOS. It’s also dangerously close in price to the cheapest GTX 670s on the market.
To ensure the testbench wasn’t the limiting factor in the benchmarks, I tested the cards on a machine out of the price range of mere mortals. At its heart is Intel’s rather zippy and ridiculously expensive i7 3960X CPU, mounted in an Intel DX79SI motherboard with 8GB of DDR3-1800. Continuing the Intel theme was the use of a 520 series SSD, while all audio was disabled for testing. Given that this card is aimed at users with 1920 x 1080 displays, all tests were run at this resolution. Ultra detail was selected for every benchmark, and I used the in-game benchmarks for both DiRT 3 and Shogun 2. Battlefield 3 doesn’t include an automatic benchmark, so instead I recorded the minimum and average FPS of the opening 60 seconds of the Operation: Swordbreaker mission with FRAPS.
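For the curious, FRAPS-style minimum and average FPS boil down to simple arithmetic over per-frame render times. A minimal sketch of one common way to compute them – the helper name and the frame-time figures below are made up for illustration, not taken from the actual benchmark run:

```python
def fps_stats(frame_times_ms):
    """Return (min FPS, average FPS) from a list of per-frame render times in ms."""
    # Average FPS: total frames divided by total elapsed time in seconds.
    total_s = sum(frame_times_ms) / 1000.0
    avg_fps = len(frame_times_ms) / total_s

    # Minimum FPS: frames rendered during the slowest one-second window.
    fps_per_second = []
    elapsed, count = 0.0, 0
    for ft in frame_times_ms:
        elapsed += ft
        count += 1
        if elapsed >= 1000.0:
            fps_per_second.append(count)
            elapsed -= 1000.0
            count = 0
    min_fps = min(fps_per_second) if fps_per_second else avg_fps
    return min_fps, avg_fps

# One smooth second (10ms frames) followed by one choppy second (20ms frames)
min_fps, avg_fps = fps_stats([10.0] * 100 + [20.0] * 50)
print(min_fps, avg_fps)  # the choppy second drags the minimum well below the average
```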
So, how did the card perform? As expected, 3DMark 11 performance was extremely respectable, almost matching the more expensive Radeon HD 7970. DiRT 3 sadly wasn’t nearly as impressive, with the 660 Ti left eating the dust of the 7970 but proving an even match for the more affordable Radeon HD 7950. Battlefield 3 performance also saw the 660 Ti keeping pace with the 7950, and it even managed a better minimum frame rate. The final test of Shogun 2 saw the GTX 660 Ti record a healthy lead over its more expensive brethren, which caused more than a few raised eyebrows. After pinging my local NVIDIA rep, it turns out that this is caused by a performance fix incorporated into the review drivers for the GTX 660 Ti, which should soon make their way into NVIDIA’s WHQL drivers.
The final test used a sound meter to measure fan noise while each card was under extreme load, and the Gigabyte card came out in the middle of the pack. Amazingly, Gigabyte’s cooler stayed at the exact same volume during my overclocking tests, where I reached a maximum stable GPU core speed of 1250MHz and a memory frequency of 7GHz. At these speeds it’s safe to expect another 10% boost in performance, if not a little more.
To buy, or not to buy
There’s no denying that ordinarily the Gigabyte GTX 660 Ti would be a very capable product for the price. But these are no ordinary times, with the fierce competition between AMD and NVIDIA causing video card prices to plummet. At the time of writing, the cheapest GTX 670 in Australia is selling for just $399, while the most affordable Radeon HD 7970 is only $50 more – prices that fall squarely within the range the Gigabyte GTX 660 Ti is currently selling for. At these prices both offer noticeably better performance than the GTX 660 Ti, so they obviously get our nod of approval.
Thankfully the shine will start to wear off the GTX 660 Ti’s launch in a month or two, and street prices should drop to around the $350 price point. But even then it’s going to be hard to swallow a 20% performance drop for a mere $50 saving, so Gigabyte is going to have to figure out how to drop prices even further. As we’ve seen in the past with the GTX 670 versus the GTX 680, NVIDIA’s (and Gigabyte’s) greatest competition isn’t even AMD – it’s their own products.
Pros
- Excellent performance while remaining whisper quiet
- Decent overclocker
- Solid value for money

Cons
- Outshone by the GTX 670 and Radeon 7970
- Retailers aren’t sticking to the RRP