GTX 970 memory bug reportedly cripples performance in memory intensive scenarios

Started by Silver Knight, 26-01-2015


Silver Knight

Quote
Ever since the GTX 970 and GTX 980 launched, I've made no secret of preferring the GTX 970 as a gaming solution for the vast majority of people. For $200 less, this GPU seemed to pack nearly all the benefits of Nvidia's Maxwell, with very little downside. Now, reports are surfacing that the GTX 970 may suffer from a serious memory bug that could cripple performance in certain scenarios.

To explain, I first need to cover how GPU memory access is supposed to work. One reason for building a large ring bus and multiple memory controllers around the outside of the die is that accesses to every block of RAM are supposed to have the same latency and take the same amount of time. Accessing the first 500MB of GPU VRAM shouldn't be faster or slower than accessing the last 500MB.

Unfortunately, that's not what the GTX 970 is doing, as evidenced by benchmark results like those below. What follows is a simple RAM bandwidth test of both the DRAM and the L2 cache on the GTX 970 (left) and the GTX 980 (right).

Quote
Note that performance is constant on both cards until you hit the end of the memory pool. When the GTX 970 hits the 3.2GiB mark, performance craters, falling to less than 20% of its initial rate in some cases. The GTX 980, in contrast, maintains full bandwidth across its entire memory pool: 178GB/s for DRAM and 554GB/s for L2 cache.
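To put the "less than 20%" figure in perspective, here is a rough Python sketch of how per-chunk bandwidth readings from a test like this could be checked for a drop-off. The `flag_slow_chunks` helper and the sample readings are made up for illustration and are not output from the actual benchmark; only the 178GB/s figure and the 20% threshold come from the text above.

```python
def flag_slow_chunks(bandwidths_gbs, threshold=0.20):
    """Return indices of chunks whose bandwidth is below
    `threshold` times the bandwidth of the first chunk."""
    if not bandwidths_gbs:
        return []
    baseline = bandwidths_gbs[0]
    return [i for i, bw in enumerate(bandwidths_gbs)
            if bw < threshold * baseline]


# Hypothetical readings in GB/s, one per memory chunk (illustrative only).
sample = [178.0, 178.0, 177.5, 178.0, 22.0, 22.0]

print(flag_slow_chunks(sample))   # indices of the chunks that fall off
print(round(0.20 * 178.0, 1))     # what 20% of 178GB/s works out to
```

On a card with uniform memory access, the returned list would be expected to be empty.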

Nvidia has been looking into the issue and has now issued the following statement:

Quote
The GeForce GTX 970 is equipped with 4GB of dedicated graphics memory. However the 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section. The GPU has higher priority access to the 3.5GB section. When a game needs less than 3.5GB of video memory per draw command then it will only access the first partition, and 3rd party applications that measure memory usage will report 3.5GB of memory in use on GTX 970, but may report more for GTX 980 if there is more memory used by other commands. When a game requires more than 3.5GB of memory then we use both segments.
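As a very simplified way to picture the segmentation described above, the following Python sketch splits a memory requirement across the two sections. The `split_usage` function is hypothetical and assumes the 3.5GB section is always filled first; the statement does not describe the driver's actual allocation policy, so treat this as a toy model only.

```python
FAST_SEGMENT_GB = 3.5  # higher-priority section, per the statement above
SLOW_SEGMENT_GB = 0.5  # lower-priority section


def split_usage(required_gb):
    """Return (fast_gb, slow_gb) for a given memory requirement.

    Assumption: the fast section is filled first and the slow
    section only takes the overflow. Anything beyond 4GB total
    cannot fit in VRAM and is ignored here.
    """
    if required_gb < 0:
        raise ValueError("required_gb must be non-negative")
    capped = min(required_gb, FAST_SEGMENT_GB + SLOW_SEGMENT_GB)
    fast = min(capped, FAST_SEGMENT_GB)
    slow = capped - fast
    return fast, slow


print(split_usage(3.0))  # fits entirely in the 3.5GB section
print(split_usage(4.0))  # spills into the 0.5GB section
```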

We understand there have been some questions about how the GTX 970 will perform when it accesses the 0.5GB memory segment. The best way to test that is to look at game performance. Compare a GTX 980 to a 970 on a game that uses less than 3.5GB. Then turn up the settings so the game needs more than 3.5GB and compare 980 and 970 performance again.

Here's an example of some performance data:

Shadows of Mordor
<3.5GB setting = 2688x1512 Very High: (figures missing)
>3.5GB setting = 3456x1944: GTX 980 55fps (-24%), GTX 970 45fps (-25%)

Battlefield 4
<3.5GB setting = 3840x2160 2xMSAA: (figures missing)
>3.5GB setting = 3840x2160 135% res: GTX 980 19fps (-47%), GTX 970 15fps (-50%)

Call of Duty: Advanced Warfare
<3.5GB setting = 3840x2160 FSMAA T2x, Supersampling off: (figures missing)
>3.5GB setting = 3840x2160 FSMAA T2x, Supersampling on: GTX 980 48fps (-41%), GTX 970 40fps (-44%)
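The comparison Nvidia suggests comes down to simple arithmetic, sketched below in Python. The frame rates and percentage drops are the ones quoted above; the assumption (not stated in this post) is that the first figure in each pair is the GTX 980 and the second the GTX 970. The helper names are made up, and the back-calculated baselines are only estimates, since the quoted percentages are rounded.

```python
# (fps at the >3.5GB setting, fractional drop) for each card.
# Assumed order: GTX 980 first, GTX 970 second.
results = {
    "Shadows of Mordor": ((55, 0.24), (45, 0.25)),
    "Battlefield 4": ((19, 0.47), (15, 0.50)),
    "Call of Duty: Advanced Warfare": ((48, 0.41), (40, 0.44)),
}


def implied_baseline(fps_after, drop):
    """Estimate the frame rate before the drop from the rate after it."""
    return fps_after / (1.0 - drop)


def extra_penalty_points(drop_980, drop_970):
    """How many percentage points more the second card lost."""
    return round((drop_970 - drop_980) * 100)


for game, ((fps_980, drop_980), (fps_970, drop_970)) in results.items():
    print(
        game,
        "| estimated baselines:",
        round(implied_baseline(fps_980, drop_980)),
        round(implied_baseline(fps_970, drop_970)),
        "| extra drop on second card (points):",
        extra_penalty_points(drop_980, drop_970),
    )
```

By these figures the second card loses only one to three percentage points more than the first when the settings push past 3.5GB, which is the point Nvidia's statement is making.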

