GPUs have been optimized to process large amounts of memory from a single location or from sequential locations (so-called “streaming operation”); this is in contrast to a CPU, which is designed for random memory accesses [Boyd10]. Moreover, because vertices and pixels are processed independently, GPUs have been architected to be massively parallel; for example, the NVIDIA “Fermi” architecture supports up to 16 streaming multiprocessors of 32 CUDA cores each, for a total of 512 CUDA cores [NVIDIA09].