<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>
<channel>
	<title>
	Comments on: Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning	</title>
	<atom:link href="https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/feed/" rel="self" type="application/rss+xml" />
	<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/</link>
	<description>Making deep learning accessible.</description>
	<lastBuildDate>Wed, 10 Dec 2025 15:06:20 +0000</lastBuildDate>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.0.11</generator>
	<item>
		<title>
		By: Zoran		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-204944</link>

		<dc:creator><![CDATA[Zoran]]></dc:creator>
		<pubDate>Mon, 14 Apr 2025 18:12:28 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-204944</guid>

					<description><![CDATA[In my tests, the Threadripper 3945WX runs almost 2x slower than the Intel 12400.]]></description>
			<content:encoded><![CDATA[<p>In my tests, the Threadripper 3945WX runs almost 2x slower than the Intel 12400.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Zoran		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-132150</link>

		<dc:creator><![CDATA[Zoran]]></dc:creator>
		<pubDate>Fri, 16 Feb 2024 06:01:42 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-132150</guid>

					<description><![CDATA[Where can I find the code for the 8-bit inference?]]></description>
			<content:encoded><![CDATA[<p>Where can I find the code for the 8-bit inference?</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Andrea de Luca		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-127183</link>

		<dc:creator><![CDATA[Andrea de Luca]]></dc:creator>
		<pubDate>Mon, 18 Dec 2023 18:24:43 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-127183</guid>

					<description><![CDATA[Hi Tim. I think there is a bit of confusion in the article regarding the RTX A6000. You wrote:
&quot;The best GPUs for academic and startup servers seem to be A6000 Ada GPUs (not to be confused with A6000 Turing).&quot;

The RTX A6000 is a 48 GB Ampere card, not Turing. Its performance (in any domain) is slightly better than a 3090, while in the charts it performs equal to the old Turing workstation cards (which is frankly impossible). Other than that, the Ada 48 GB workstation card is officially called &quot;RTX 6000 Ada&quot; (without the &quot;A&quot;). Thanks.]]></description>
			<content:encoded><![CDATA[<p>Hi Tim. I think there is a bit of confusion in the article regarding the RTX A6000. You wrote:<br />
&#8220;The best GPUs for academic and startup servers seem to be A6000 Ada GPUs (not to be confused with A6000 Turing).&#8221;</p>
<p>The RTX A6000 is a 48 GB Ampere card, not Turing. Its performance (in any domain) is slightly better than a 3090, while in the charts it performs equal to the old Turing workstation cards (which is frankly impossible). Other than that, the Ada 48 GB workstation card is officially called &#8220;RTX 6000 Ada&#8221; (without the &#8220;A&#8221;). Thanks.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Zoran		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-118648</link>

		<dc:creator><![CDATA[Zoran]]></dc:creator>
		<pubDate>Sun, 30 Apr 2023 15:21:59 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-118648</guid>

					<description><![CDATA[Hello,

Have you had any chance to test high-end CPU-only inference, for example on an Intel 13900K? It seems like it could be pretty fast, and a dedicated server with it can cost around 100 EUR per month, much cheaper than a dedicated GPU server or cloud.]]></description>
			<content:encoded><![CDATA[<p>Hello,</p>
<p>Have you had any chance to test high-end CPU-only inference, for example on an Intel 13900K? It seems like it could be pretty fast, and a dedicated server with it can cost around 100 EUR per month, much cheaper than a dedicated GPU server or cloud.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: David Laxer		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-115887</link>

		<dc:creator><![CDATA[David Laxer]]></dc:creator>
		<pubDate>Tue, 17 Jan 2023 20:24:30 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-115887</guid>

					<description><![CDATA[Hi Tim,

Thanks for your posts.

Do you have any comments on Apple&#039;s M1/M2 chips for Deep Learning research?
Apple&#039;s Metal Performance Shaders API only supports float32 as well as 32-bit complex numbers. Do you see this as a
&#039;show stopper&#039; for Deep Learning research on M1/M2 chips?
The M1/M2 processors do use considerably less power than NVIDIA GPUs;
does this significantly change the trade-off calculus?

Thanks in advance.]]></description>
			<content:encoded><![CDATA[<p>Hi Tim,</p>
<p>Thanks for your posts.</p>
<p>Do you have any comments on Apple&#8217;s M1/M2 chips for Deep Learning research?<br />
Apple&#8217;s Metal Performance Shaders API only supports float32 as well as 32-bit complex numbers. Do you see this as a<br />
&#8216;show stopper&#8217; for Deep Learning research on M1/M2 chips?<br />
The M1/M2 processors do use considerably less power than NVIDIA GPUs;<br />
does this significantly change the trade-off calculus?</p>
<p>Thanks in advance.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Tim Dettmers		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-104394</link>

		<dc:creator><![CDATA[Tim Dettmers]]></dc:creator>
		<pubDate>Mon, 14 Mar 2022 17:09:07 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-104394</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-102444&quot;&gt;Paulo Ricardo&lt;/a&gt;.

Yes, I would love to see a Portuguese translation! Go for it. The only thing that I would ask you to do if you translate it section by section is to include the source somewhere in the introduction to the translation.]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-102444">Paulo Ricardo</a>.</p>
<p>Yes, I would love to see a Portuguese translation! Go for it. The only thing that I would ask you to do if you translate it section by section is to include the source somewhere in the introduction to the translation.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Paulo Ricardo		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-102444</link>

		<dc:creator><![CDATA[Paulo Ricardo]]></dc:creator>
		<pubDate>Thu, 10 Feb 2022 14:04:39 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-102444</guid>

					<description><![CDATA[What a wonderful text.

Can I translate it to Portuguese?]]></description>
			<content:encoded><![CDATA[<p>What a wonderful text.</p>
<p>Can I translate it to Portuguese?</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Simon Demeule		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-98251</link>

		<dc:creator><![CDATA[Simon Demeule]]></dc:creator>
		<pubDate>Sun, 07 Nov 2021 06:34:57 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-98251</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-97547&quot;&gt;Tim Dettmers&lt;/a&gt;.

Hi Tim!

Thank you for your insight. After some more research, I have found that in synthetic benchmarks the 5950X actually outperforms the 3975WX in both single- and multi-core performance (which is very surprising given it has half the core count); it seems the architecture version (Zen 3 vs. Zen 2) plays a large role here. It really doesn&#039;t make sense to choose Threadripper Pro over Ryzen for my use case, at least not right now. Zen 3 Threadripper Pro is likely coming out soon, as some benchmark scores have been leaked (and they are definitely scoring higher than Ryzen there). I will likely order the GPUs soon, and maybe wait a bit until the next-generation Threadripper Pro CPUs are revealed.]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-97547">Tim Dettmers</a>.</p>
<p>Hi Tim!</p>
<p>Thank you for your insight. After some more research, I have found that in synthetic benchmarks the 5950X actually outperforms the 3975WX in both single- and multi-core performance (which is very surprising given it has half the core count); it seems the architecture version (Zen 3 vs. Zen 2) plays a large role here. It really doesn&#8217;t make sense to choose Threadripper Pro over Ryzen for my use case, at least not right now. Zen 3 Threadripper Pro is likely coming out soon, as some benchmark scores have been leaked (and they are definitely scoring higher than Ryzen there). I will likely order the GPUs soon, and maybe wait a bit until the next-generation Threadripper Pro CPUs are revealed.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Tim Dettmers		</title>
		<link>https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-97547</link>

		<dc:creator><![CDATA[Tim Dettmers]]></dc:creator>
		<pubDate>Sun, 24 Oct 2021 18:51:51 +0000</pubDate>
		<guid isPermaLink="false">http://timdettmers.wordpress.com/?p=4#comment-97547</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-95663&quot;&gt;Simon Demeule&lt;/a&gt;.

Hi Simon! I think the choice of A5000 GPUs is great for your use case. I think both CPU options can have their advantages. For inference, a CPU can sometimes be critical to achieving good latency, but that depends on many factors, mainly how much pre- and post-processing needs to be done on the CPU. Depending on this, a good CPU may or may not bring large benefits. On the other hand, such performance will not be very noticeable for general use, but might be critical for user-facing applications. So if you aim to build projects that are user-facing and require low latency, it might be worth it to go with the 3975WX. However, since CPUs have quirks and many threads are difficult to use, in many cases you will get the same performance from the much cheaper 5950X. As such, it might also be a waste of money. I think I would probably go with the 5950X, and if the user needs to wait an additional 50ms, then the user just needs to wait for that!]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/comment-page-2/#comment-95663">Simon Demeule</a>.</p>
<p>Hi Simon! I think the choice of A5000 GPUs is great for your use case. I think both CPU options can have their advantages. For inference, a CPU can sometimes be critical to achieving good latency, but that depends on many factors, mainly how much pre- and post-processing needs to be done on the CPU. Depending on this, a good CPU may or may not bring large benefits. On the other hand, such performance will not be very noticeable for general use, but might be critical for user-facing applications. So if you aim to build projects that are user-facing and require low latency, it might be worth it to go with the 3975WX. However, since CPUs have quirks and many threads are difficult to use, in many cases you will get the same performance from the much cheaper 5950X. As such, it might also be a waste of money. I think I would probably go with the 5950X, and if the user needs to wait an additional 50ms, then the user just needs to wait for that!</p>
]]></content:encoded>
		
			</item>
	</channel>
</rss>
