Blog Archive

Tuesday, March 27, 2012

Kepler Unveiled: Nvidia's GTX 680 Benchmarked In-Depth!

Johannes Kepler wrote: "Nature uses as little as possible of anything."

Nvidia's most recent GPU, code-named Kepler after the German mathematician, seems inspired by that economy, and by its namesake's mathematical prowess. The new GPU, the GTX 680, offers excellent graphics power but requires only two 6-pin PCI Express power connectors. That is a welcome departure from the previous-generation GTX 580, which was fast but power hungry.

We'll discuss performance in a bit, but first let's look at the underlying architecture of Kepler.

Smaller Equals Bigger

Kepler GPUs are built on a 28nm manufacturing process, allowing Nvidia to pack more circuitry into a smaller die area.

Like Fermi, Kepler is a modular architecture, allowing Nvidia to scale the design up or down by adding or subtracting functional blocks. In Fermi, streaming multiprocessors, or SMs for short, were the basic building blocks from which the GTX 500 line of GPUs was built. The number of CUDA cores inside an SM could vary. For example, each SM in the GTX 560 Ti contained 48 CUDA cores, while each GTX 580 SM was built with 32 cores. The GTX 580 had 16 SMs of 32 cores each, for a total of 512 CUDA cores.

Kepler's functional block is the SMX. Because Kepler is built on a 28nm process, Nvidia's architects were able to scale things a little differently. So Nvidia has increased the number of cores in a Kepler SMX to an impressive 192 CUDA cores each.

The GTX 680 GPU is built from eight SMX blocks, arranged in pairs called GPCs (Graphics Processing Clusters). This gives the GTX 680 a total of 1536 CUDA cores.
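The core counts quoted so far are just blocks multiplied by cores per block. Here is a minimal Python sketch of that arithmetic, using only the figures given in this article; the function name is made up for illustration.

```python
def total_cuda_cores(blocks: int, cores_per_block: int) -> int:
    """Total CUDA cores = number of SM/SMX blocks x cores in each block."""
    return blocks * cores_per_block


# Figures as quoted in the article.
gtx_580_cores = total_cuda_cores(blocks=16, cores_per_block=32)   # Fermi SMs
gtx_680_cores = total_cuda_cores(blocks=8, cores_per_block=192)   # Kepler SMX units

# The eight SMX blocks are grouped in pairs, so the GTX 680 has four GPCs.
gtx_680_gpcs = 8 // 2

print(gtx_580_cores, gtx_680_cores, gtx_680_gpcs)
```

So the GTX 680 has half as many blocks as the GTX 580, but each block is six times wider, for three times the CUDA cores overall.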

The SMX is not only home to the CUDA cores, however. Built into each SMX is the new PolyMorph engine, which contains the tessellation hardware along with setup and related features. Also included are 16 texture units. This gives the GTX 680 128 texture units (compared to the 64 texture units built into the GTX 580). Interestingly, the cache has changed a bit. Each SMX still has 64KB of L1 cache, some of which can be used as shared memory for GPU computing. But this means that the total L1 cache has decreased (8 x 64KB versus 16 x 64KB), as there are only eight SMX units in the GTX 680, not 16 SMs as in the GTX 580. The L2 cache is also smaller, at 512KB instead of Fermi's 768KB.
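The per-chip totals in this paragraph can be checked the same way. This is a rough sketch that multiplies out the article's own per-block figures; the constant names are invented for illustration, and the 64KB-per-block L1 figure for the GTX 580 is taken from the article's statement that each SMX "still" has 64KB.

```python
# Per-block figures as quoted in the article.
SMX_COUNT_GTX_680 = 8
SM_COUNT_GTX_580 = 16
TEXTURE_UNITS_PER_SMX = 16
L1_KB_PER_BLOCK = 64  # same per-block size on both chips, per the article

texture_units_gtx_680 = SMX_COUNT_GTX_680 * TEXTURE_UNITS_PER_SMX
l1_total_kb_gtx_680 = SMX_COUNT_GTX_680 * L1_KB_PER_BLOCK
l1_total_kb_gtx_580 = SM_COUNT_GTX_580 * L1_KB_PER_BLOCK

print("GTX 680 texture units:", texture_units_gtx_680)
print("Total L1 (KB): GTX 680 =", l1_total_kb_gtx_680,
      "GTX 580 =", l1_total_kb_gtx_580)
```

By these numbers the GTX 680 ends up with 128 texture units and 512KB of total L1, against 1024KB of total L1 on the GTX 580.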

Another interesting change is that instruction pre-decoding and dependency checking have been offloaded to software, whereas Fermi handled them in hardware. What Nvidia got in return was more efficient instruction handling and more die space. Interestingly, the transistor count of the GTX 680 GPU is 3.5 billion, only somewhat more than the GTX 580's 3 billion. The die size has shrunk, however, to a much more manageable 294mm2. By comparison, Intel's 32nm quad-core Sandy Bridge CPU is 216mm2.

Textures, antialiasing, and more

One of the cooler new features is bindless textures. Before Kepler, Nvidia GPUs were limited to 128 simultaneous textures. Kepler raises that limit by allowing textures to be allocated as needed within the shader program, with up to 1 million simultaneous textures available. It is doubtful that games will use that many textures, but certain types of architectural rendering might benefit.

Nvidia continues to build in support for its proprietary FXAA antialiasing mode, but has added a new mode called TXAA. The "T" stands for "temporal". TXAA in its standard mode is actually a variant of 2x multisampling AA, but the sampling pattern varies over time (i.e., across several frames). The result is better edge quality than even 8x MSAA, but the performance hit is more like that of 2x multisampling.
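To get a feel for why varying a 2-sample pattern over time can improve edge quality, here is a toy Python model. It is not Nvidia's TXAA algorithm, which this article does not describe in detail. It only illustrates the general idea of averaging a small, shifting sample pattern across frames. The sample offsets and function names are hypothetical, and the "edge" is simply a vertical line crossing one pixel.

```python
def coverage(sample_xs, edge):
    """Fraction of samples whose x-position lies left of the edge (i.e. covered)."""
    return sum(1 for x in sample_xs if x < edge) / len(sample_xs)


# Hypothetical 2-sample patterns (x offsets inside one pixel), one per frame.
PATTERNS = [
    (0.125, 0.625),
    (0.375, 0.875),
    (0.250, 0.750),
    (0.000, 0.500),
]


def temporal_coverage(edge, frames):
    """Average the per-frame 2-sample coverage over several frames."""
    per_frame = [coverage(PATTERNS[f % len(PATTERNS)], edge) for f in range(frames)]
    return sum(per_frame) / len(per_frame)


edge = 0.3                                # true coverage of this pixel is 30%
single_frame = coverage(PATTERNS[0], edge)
blended = temporal_coverage(edge, frames=4)

print("2 samples, one frame:   ", single_frame)
print("2 samples, four frames: ", blended)
```

With a fixed pattern the pixel can only ever read 0%, 50% or 100% covered. Averaging four shifted patterns gives eight distinct sample positions for the cost of two per frame, so the estimate lands closer to the true 30%.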

Another cool new feature supported on Nvidia GPUs is adaptive Vsync. Today, if you lock vertical sync to the refresh rate of your monitor (typically 60Hz, but as high as 120Hz on some displays), you get smoother gameplay. However, you may see stuttering as the frame rate drops to 30 fps or less, because frame output is locked to vsync. On the other hand, if you run with vsync off, you may see tearing, as new frames are sent to the display before the old one is complete.

Adaptive Vsync locks the frame rate to the vertical refresh rate until the driver detects that the frame rate has fallen below the refresh rate. Vsync is then temporarily disabled until the frame rate climbs back up to the monitor's refresh rate. The end result is much smoother performance from the user's perspective.
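The decision described here can be sketched in a few lines of Python. This is a simplified model, not Nvidia's driver logic. It assumes classic double-buffered vsync, where a frame that misses a refresh has to wait for the next one, and the function names are invented for illustration.

```python
import math


def vsync_fps(render_fps: float, refresh_hz: float) -> float:
    """Displayed rate with vsync always on (double-buffering assumed).

    Each frame must wait for a refresh boundary, so the displayed rate
    snaps to refresh_hz / n for a whole number n. Assumes render_fps > 0.
    """
    refreshes_per_frame = max(math.ceil(refresh_hz / render_fps), 1)
    return refresh_hz / refreshes_per_frame


def adaptive_vsync(render_fps: float, refresh_hz: float):
    """Return (vsync_enabled, displayed_fps) under the adaptive scheme."""
    if render_fps >= refresh_hz:
        return True, refresh_hz      # fast enough: lock to the refresh rate
    return False, render_fps         # too slow: drop vsync, accept some tearing


# A GPU that can render 50 fps on a 60Hz monitor:
print(vsync_fps(50, 60))        # plain vsync falls all the way to 30 fps
print(adaptive_vsync(50, 60))   # adaptive keeps 50 fps with vsync off
print(adaptive_vsync(75, 60))   # above the refresh rate, vsync is back on
```

In this model plain vsync turns a small shortfall (50 fps instead of 60) into a sudden drop to 30 fps, which is the stutter described earlier. The adaptive scheme avoids that drop by giving up vsync only while the GPU is below the refresh rate.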

