I am fascinated by all things related to electronics, but from almost any point of view, today's Field Programmable Gate Arrays (FPGAs) stand out from the crowd as truly remarkable devices. In this era of intelligent computing, anyone building skills in this field who has not paid attention to FPGAs risks being left behind by the times.
Below, GPUs and FPGAs are compared from several perspectives.
In terms of peak performance, the GPU (around 10 TFLOPS) is much higher than the FPGA (under 1 TFLOPS). It is still spectacular that thousands of cores on a GPU run at GHz frequencies at the same time; the latest GPUs can reach a peak of 10 TFLOPS or more. The GPU architecture has been carefully designed (for example, using deep pipelining and retiming), the circuits are implemented with a standard-cell library, and critical paths can be hand-optimized. Where necessary, the semiconductor fab can even fine-tune the process to the design requirements, which is why so many cores can run at very high frequencies simultaneously. By comparison, FPGA design resources are far more constrained. To add a few more cores to a GPU you only need to increase the chip area, whereas an FPGA's logic resources are fixed once the device is selected (and floating-point arithmetic consumes a great deal of those resources). Moreover, the logic cells in an FPGA are built from SRAM look-up tables, whose performance is much worse than the standard cells used in a GPU. Finally, the FPGA's routing resources are also limited (some signals have to take long detours), unlike a GPU whose routing goes through an ASIC flow, and this further limits performance.
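As a rough illustration of where figures like "10 TFLOPS" come from, the minimal sketch below multiplies core count, clock frequency, and floating-point operations issued per core per cycle. The specific numbers are assumptions chosen for illustration, not the specifications of any particular device.

```python
def peak_tflops(cores: int, clock_ghz: float, flops_per_core_per_cycle: int) -> float:
    """Theoretical peak = cores x clock x FLOPs issued per core per cycle."""
    return cores * clock_ghz * 1e9 * flops_per_core_per_cycle / 1e12

# Illustrative (assumed) numbers, not the specs of a real device:
gpu_peak  = peak_tflops(cores=3584, clock_ghz=1.4, flops_per_core_per_cycle=2)  # ~10 TFLOPS
fpga_peak = peak_tflops(cores=1500, clock_ghz=0.2, flops_per_core_per_cycle=2)  # DSP blocks at ~200 MHz, <1 TFLOPS

print(f"GPU  peak ~ {gpu_peak:.1f} TFLOPS")
print(f"FPGA peak ~ {fpga_peak:.1f} TFLOPS")
```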
Beyond chip performance, another advantage of GPUs over FPGAs is the memory interface. The bandwidth of the GPU's memory interface (traditionally GDDR, and more recently HBM and HBM2) is much higher than that of the FPGA's traditional DDR interface, and the well-known server-side machine learning algorithms require frequent memory access.
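The bandwidth gap can be estimated the same way: peak bandwidth is roughly transfer rate times bus width. The figures below are assumed, round-number examples of an HBM2-class GPU interface versus a single DDR4 channel on an FPGA board.

```python
def peak_bandwidth_gbps(transfer_rate_gtps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s = transfers per second x bytes per transfer."""
    return transfer_rate_gtps * (bus_width_bits / 8)

# Assumed, round-number examples:
hbm2_gpu  = peak_bandwidth_gbps(transfer_rate_gtps=2.0, bus_width_bits=4096)  # ~1000 GB/s
ddr4_fpga = peak_bandwidth_gbps(transfer_rate_gtps=2.4, bus_width_bits=64)    # ~19 GB/s

print(f"HBM2-class GPU interface ~ {hbm2_gpu:.0f} GB/s")
print(f"DDR4 channel on an FPGA  ~ {ddr4_fpga:.0f} GB/s")
```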
But in terms of flexibility, FPGAs are far better than GPUs. An FPGA can be programmed at the hardware level for a specific application (for example, if an application contains many addition operations, a large share of the logic resources can be devoted to adders), whereas a GPU cannot be modified once it has been designed and cannot adapt its hardware resources to the application. Most of today's machine learning workloads fit the SIMD architecture (a single instruction processes large amounts of data in parallel), which suits the GPU. However, some applications are MISD (a single data item must be processed by many instructions in parallel; in 2014 Microsoft presented an MISD example that extracts features in parallel). In such cases, implementing an MISD architecture on an FPGA has an advantage over a GPU. That said, FPGA programming is not easy for software programmers, so letting machine learning engineers use FPGAs conveniently usually requires secondary development on top of the compiler tools provided by the FPGA vendor, and only large companies can afford to do that.
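To make the SIMD/MISD distinction concrete, here is a minimal Python sketch: the SIMD style applies one operation to a whole batch of data at once, while the MISD-style pipeline pushes a single data item through a chain of different operations, each of which could be a dedicated hardware block on an FPGA. The stages are hypothetical placeholders, not any real feature-extraction pipeline.

```python
import numpy as np

# SIMD style (GPU-friendly): one instruction applied to many data elements at once.
batch = np.random.rand(1024, 256)                    # a large batch of inputs
weights = np.random.rand(256, 64)
simd_result = np.maximum(batch @ weights, 0.0)       # same matmul + ReLU applied to every row

# MISD / pipeline style (FPGA-friendly): one data item flows through many different
# operations; on an FPGA each stage could run concurrently on a stream of items.
def stage_tokenize(text):   return text.split()                       # hypothetical stage 1
def stage_hash(tokens):     return [hash(t) % 1024 for t in tokens]   # hypothetical stage 2
def stage_score(ids):       return sum(ids) / max(len(ids), 1)        # hypothetical stage 3

pipeline = [stage_tokenize, stage_hash, stage_score]

def run_pipeline(item):
    for stage in pipeline:
        item = stage(item)
    return item

print(run_pipeline("a single query flowing through many instructions"))
```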
An FPGA-based machine learning accelerator is architecturally optimized for a specific application and therefore has an architectural advantage over a GPU, but the GPU's operating frequency (>1 GHz) is an advantage over the FPGA's (~200 MHz).
Therefore, average performance comes down to whether the architectural advantage of the FPGA accelerator can compensate for its disadvantage in operating frequency. If the architectural optimization on the FPGA brings a two to three order of magnitude advantage over the GPU architecture, then the FPGA will beat the GPU in average performance. For example, a paper Baidu published at Hot Chips shows that on standard batched SIMD benchmarks such as matrix operations, the GPU's average performance is much better than the FPGA's; however, when handling many small processing requests on the server side (that is, requests arrive frequently but each carries little data and little computation), the FPGA's average performance is better than the GPU's.
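A back-of-the-envelope way to think about this trade-off: effective throughput is roughly clock frequency times useful work per cycle, so the FPGA wins only when its per-cycle architectural advantage outweighs its roughly fivefold clock deficit. The factors below are assumptions chosen purely for illustration.

```python
def effective_throughput(clock_ghz: float, work_per_cycle: float) -> float:
    """Relative effective throughput = clock frequency x useful work done per cycle."""
    return clock_ghz * work_per_cycle

gpu = effective_throughput(clock_ghz=1.0, work_per_cycle=1.0)                     # baseline
fpga_big_batch  = effective_throughput(clock_ghz=0.2, work_per_cycle=1.0)         # no architectural edge on large SIMD batches
fpga_small_reqs = effective_throughput(clock_ghz=0.2, work_per_cycle=10.0)        # assumed 10x per-cycle edge on small, frequent requests

print(f"large-batch SIMD: FPGA/GPU = {fpga_big_batch / gpu:.2f}")   # ~0.2 -> GPU wins
print(f"small requests:   FPGA/GPU = {fpga_small_reqs / gpu:.2f}")  # ~2.0 -> FPGA wins
```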
In terms of power consumption, the GPU (around 200 W) draws far more than the FPGA (around 10 W), but a fair comparison should look at the power needed to deliver the same performance. If the FPGA's architecture can be optimized well enough that its average performance approaches that of a GPU, then the total power of the FPGA solution is much lower than the GPU's and the heat dissipation problem is greatly reduced. Conversely, if twenty FPGAs are needed to match the average performance of one GPU, the FPGA has no advantage in power consumption.
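In other words, the comparison hinges on how many FPGAs it takes to match one GPU. A minimal sketch, using the wattages quoted above and assumed replication factors:

```python
def fpga_solution_power(fpgas_needed: int, watts_per_fpga: float = 10.0) -> float:
    """Total power for an FPGA solution that matches one GPU's average performance."""
    return fpgas_needed * watts_per_fpga

gpu_power = 200.0  # watts, from the figures above

for n in (1, 5, 20):  # assumed replication factors
    total = fpga_solution_power(n)
    verdict = "FPGA wins" if total < gpu_power else "no power advantage"
    print(f"{n:2d} FPGA(s): {total:5.0f} W vs GPU {gpu_power:.0f} W -> {verdict}")
```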
The comparison of energy efficiency is similar. Energy efficiency here refers to the energy consumed to finish executing a program, and energy equals power multiplied by execution time. Although the GPU's power consumption is much higher than the FPGA's, if the FPGA takes tens of times longer than the GPU to execute the same program, then the FPGA has no advantage in energy efficiency; conversely, if the hardware architecture implemented on the FPGA is well optimized for a particular machine learning application, so that its execution time is only a few times that of the GPU or even close to it, then the FPGA's energy efficiency is better than the GPU's.
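Since energy is power multiplied by execution time, a short calculation shows both regimes described above; the wattages and runtimes are assumptions chosen only to illustrate the crossover.

```python
def energy_joules(power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed to finish the job = power x execution time."""
    return power_watts * runtime_seconds

gpu_energy = energy_joules(power_watts=200, runtime_seconds=1.0)

# Case 1 (assumed): the FPGA is ~50x slower -> worse energy efficiency despite lower power.
fpga_slow = energy_joules(power_watts=10, runtime_seconds=50.0)
# Case 2 (assumed): a well-optimised FPGA design is only ~3x slower -> better energy efficiency.
fpga_fast = energy_joules(power_watts=10, runtime_seconds=3.0)

print(f"GPU:       {gpu_energy:.0f} J")
print(f"slow FPGA: {fpga_slow:.0f} J")  # 500 J, worse than the GPU
print(f"fast FPGA: {fpga_fast:.0f} J")  # 30 J, far better than the GPU
```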