
Open-Dragonfly6825 OP t1_j74oes8 wrote

I guess the suitability of acceleration devices changes depending on your specific development and/or application context. Deep learning is such a broad field with so many applications that it seems reasonable that different applications benefit more from different accelerators.

Thank you for your comment.

2

Open-Dragonfly6825 OP t1_j74ntpw wrote

Hey, maybe it's true that I know my fair share about acceleration devices. But, until you mentioned it, I had actually forgotten about backpropagation, which is fundamental to deep learning. (Or, rather than forgotten, I hadn't thought about it.)

Now that you mention it, it makes much more sense why FPGAs might be better suited for inference only.
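To check my own understanding: training runs the forward pass *and* backpropagation, while inference is the forward pass alone, a fixed feed-forward dataflow that maps naturally onto an FPGA pipeline. A toy NumPy sketch (the names and shapes are mine, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # layer weights
x = rng.standard_normal(3)        # input vector

# Inference: forward pass only. A fixed dataflow of multiply-accumulates,
# which is what an FPGA pipeline can be specialized for.
y = np.maximum(W @ x, 0.0)        # dense layer + ReLU

# Training: forward pass *plus* backpropagation. The backward pass needs
# the stored activations, adds gradient computations and memory traffic,
# and ends with a weight update.
grad_y = y - np.ones(4)           # pretend upstream loss gradient
grad_pre = grad_y * (W @ x > 0)   # ReLU derivative
grad_W = np.outer(grad_pre, x)    # gradient w.r.t. weights
W -= 0.01 * grad_W                # weight update
```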

1

Open-Dragonfly6825 OP t1_j72yyst wrote

Could you elaborate on some of the points you make? I have read the opposite of what you say regarding the following points:

  • Many scientific works claim that FPGAs achieve similar or better energy efficiency than GPUs in almost all applications.
  • FPGAs are considered a good AI technology for embedded devices where low energy consumption is key. Deep learning models can be trained elsewhere on GPUs, and, theoretically, inference can then run on the embedded FPGAs with good speed and energy efficiency. (Thus, FPGAs are supposedly well suited for inference.)
  • Modern high-end (data center) FPGAs target base clock speeds around 300 MHz. It is not unusual for designs to exceed 300 MHz, though not by much unless you heavily optimize the design and use some complex tricks to boost the clock speed.

The comparison you make about the largest FPGA being comparable only to small embedded GPUs is interesting. I might look more into that.

1

Open-Dragonfly6825 OP t1_j72s5ov wrote

One question: what do you mean by "kernels" here? Is it the convolution operation you apply to the layers? (As I said, I am not familiar with deep learning, and "kernel" means something else in GPU and FPGA programming.)

I know about TPUs and I understand they are the "best solution" for deep learning. However, I did not mention them since I won't be working with them.

Why wouldn't GPU parallelization make inference faster? Isn't inference composed mainly of matrix multiplications as well? Maybe I don't understand well enough how GPU training is performed and how it differs from inference.
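In case anyone else trips over the same overloaded word, here is a toy sketch (my own, purely illustrative) of the two meanings of "kernel" I had in mind:

```python
import numpy as np

# Meaning 1 (deep learning): a "kernel" is the small filter that a
# convolutional layer slides over its input.
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)   # 3x3 edge-detect filter

out = np.zeros((3, 3))                          # valid convolution output
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

# Meaning 2 (GPU/FPGA programming): a "kernel" is the function launched
# on the device, e.g. a CUDA __global__ function or an OpenCL __kernel.
# Same word, completely different object.
```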

1

Open-Dragonfly6825 OP t1_j72qtao wrote

That actually makes sense. FPGAs are very complex to program, even though High-Level Synthesis (e.g. OpenCL) has narrowed the gap between software and hardware programming. I can see how it is just easier to use a GPU, which is simpler to program, or a TPU, which already has compatible libraries that abstract away the low-level details.

However, FPGAs have been growing in area and available resources in recent years. Is that still not enough circuitry?
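For reference, the kind of HLS-style code I mean: in principle, the same OpenCL kernel source can be compiled for a GPU driver or, with vendor toolchains (e.g. Intel's OpenCL SDK for FPGAs), into an FPGA bitstream. A minimal vector-add sketch using pyopencl, assuming a working OpenCL platform (the details are illustrative, not vendor-specific):

```python
import numpy as np
import pyopencl as cl

# One kernel source, multiple possible targets (GPU driver or FPGA toolchain).
SRC = """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out) {
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""

a = np.arange(16, dtype=np.float32)
b = np.ones(16, dtype=np.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

prog = cl.Program(ctx, SRC).build()
prog.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)

result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)
print(result)  # a + b
```

Even so, writing the kernel is the easy part; on an FPGA the build step is hours of place-and-route rather than a quick driver compile, which is part of why the gap still feels wide.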

1

Open-Dragonfly6825 OP t1_j72pzlc wrote

FPGAs are reconfigurable hardware accelerators. That is, you could theoretically "synthesize" (implement) any digital circuit on an FPGA, given that the FPGA has a high enough amount of "resources".

This would let the user deploy custom hardware solutions for virtually any application, which could be far more optimized than software solutions (including those using GPUs).

You could implement tensor cores or a TPU using an FPGA. But, obviously, an ASIC is faster and more energy efficient than its equivalent FPGA implementation.
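(By "implement a TPU" I mean the structure at its heart: a systolic array of multiply-accumulate units. A toy software model of that dataflow, my own sketch of what one would actually describe in RTL or HLS:)

```python
import numpy as np

def systolic_matmul(A, B):
    """Toy model of an output-stationary systolic array: each (i, j)
    processing element holds one accumulator and performs one
    multiply-accumulate per 'clock cycle' k. In hardware, all PEs
    fire in parallel each cycle; this loop only models the math."""
    n, k_dim = A.shape
    _, m = B.shape
    acc = np.zeros((n, m))                       # one accumulator per PE
    for k in range(k_dim):                       # one step per cycle
        for i in range(n):
            for j in range(m):
                acc[i, j] += A[i, k] * B[k, j]   # the PE's MAC
    return acc

A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

The ASIC wins because that same MAC grid is hard-wired, instead of being emulated on reconfigurable fabric at lower clock speeds.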

Linking to what you say: besides all the "this is just theory, in practice things are different" caveats around FPGAs, programming GPUs with CUDA is way, way easier than programming FPGAs as of today.

2

Open-Dragonfly6825 OP t1_j72om7m wrote

Maybe I missed it, but the posts I read don't specify that. Some scientific works claim that FPGAs are better than GPUs for both training and inference.

Why would you say they are better only for inference? Wouldn't a GPU be faster for inference too? Or is it just that inference doesn't require high speeds, and FPGAs are chosen for their energy efficiency?

1