Besides the photographs of the NVIDIA Tesla graphics accelerator, the official announcement of the GT300 graphics processor brought new information about the architecture code-named Fermi. The slides show the card's general characteristics, the organization and internal structure of the streaming multiprocessors, floating-point performance results, and more.
The new-generation GT300 graphics processor contains three billion transistors, has 512 shader cores, and delivers double-precision performance eight times that of the GT200 chip.
The streaming multiprocessors are arranged around a shared second-level (L2) cache. On the slide, each multiprocessor is a vertical rectangle containing an orange part (scheduler and dispatch), a green part (execution units), and blue parts (register file and first-level cache).
The following slide shows the internal structure of a multiprocessor. Each of the 16 multiprocessors has 32 shader cores.
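The per-multiprocessor figure is consistent with the total of 512 shader cores quoted earlier. A minimal arithmetic check in Python, where the variable names are illustrative and not NVIDIA terminology:

```python
# Figures taken from the text: 16 multiprocessors with 32 shader cores each.
multiprocessors = 16
cores_per_multiprocessor = 32

# Total shader cores on the chip.
total_cores = multiprocessors * cores_per_multiprocessor
print(total_cores)  # 512
```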
As for memory, the new GPU has six 64-bit GDDR5 memory controllers, which gives a 384-bit memory bus and support for up to 6 GB of GDDR5 memory. Fermi is the first GPU architecture to support error-correcting code (ECC) for data stored in memory. The NVIDIA Parallel DataCache technology considerably accelerates mathematical calculations and other operations.
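The bus width follows directly from the controller count. A short Python sketch of that arithmetic, using only the figures given in the text (the variable names are illustrative):

```python
# Figures taken from the text: six memory controllers, each 64 bits wide.
memory_controllers = 6
bits_per_controller = 64

# Combined width of the memory bus, in bits and in bytes per transfer.
bus_width_bits = memory_controllers * bits_per_controller
bus_width_bytes = bus_width_bits // 8

print(bus_width_bits)   # 384
print(bus_width_bytes)  # 48
```

Actual memory bandwidth also depends on the memory clock, which the announcement does not state, so it is not computed here.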
A slide compares double-precision floating-point performance between the Tesla C1060 and the new model based on the Fermi architecture. In a test with 20,480 objects, the new product achieves 18.16 frames per second, performing 7.61 billion iterations per second. Its predecessor manages only 3.52 frames per second, or 1.47 billion iterations per second.
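The two pairs of figures from this test imply a speedup of roughly five times. A quick Python check using only the numbers quoted from the slide (the variable names are illustrative):

```python
# Results quoted from the slide for the test with 20,480 objects.
fermi_fps, c1060_fps = 18.16, 3.52                # frames per second
fermi_iter, c1060_iter = 7.61e9, 1.47e9           # iterations per second

# Speedup of the Fermi-based model over the Tesla C1060.
fps_speedup = fermi_fps / c1060_fps
iter_speedup = fermi_iter / c1060_iter

print(round(fps_speedup, 2))   # 5.16
print(round(iter_speedup, 2))  # 5.18
```

This is the ratio for one benchmark only; it is lower than the eightfold double-precision figure quoted for the chip as a whole.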
Fermi-based products are described as the world's first computational GPUs. Thanks to the second-generation Parallel Thread eXecution instruction set (PTX 2.0), they provide hardware support for programming in C, C++, and Fortran, along with other features and interfaces (such as a unified address space, OpenCL, and DirectCompute).
Fermi's main task is considered to be moving computations on large data arrays to the GPU.
NVIDIA is expected to finish work on the GT300 chip in the coming months.
In conclusion, here is a comparison table of the characteristics of the GT300 chip and its predecessors:
GPU | G80 | GT200 | GT300
Transistors | 681 million | 1.4 billion | 3.0 billion
Stream processors | 128 | 240 | 512
Double-precision floating point | None | 30 FMA ops/clock | 256 FMA ops/clock
Single-precision floating point | 128 MAD ops/clock | 240 MAD ops/clock | 512 FMA ops/clock
Warp schedulers | 1 | 1 | 2
Special function units (SFUs) | 2 | 2 | 4
Shared memory | 16 KB | 16 KB | up to 48 KB
L1 cache | None | None | up to 48 KB
L2 cache | None | None | 768 KB
ECC memory support | No | No | Yes
Concurrent kernels | No | No | up to 16
Address width | 32 bits | 32 bits | 64 bits
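The double-precision row of the table (30 FMA for GT200 versus 256 FMA for GT300) is consistent with the roughly eightfold gain quoted at the start of the article. A minimal Python check, assuming both figures are per-clock rates; clock speeds are not given here, so this is not a measure of absolute throughput:

```python
# Double-precision FMA figures from the comparison table.
gt200_dp_fma = 30
gt300_dp_fma = 256

# Ratio of GT300 to GT200, assuming both are per-clock rates.
dp_ratio = gt300_dp_fma / gt200_dp_fma
print(round(dp_ratio, 2))  # 8.53
```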