With the release of the Titan V, we have now entered deep learning hardware limbo. It is unclear whether NVIDIA will be able to keep its spot as the main deep learning hardware vendor in 2018; both AMD and Intel Nervana will have a shot at overtaking it. So for consumers, I cannot recommend buying any hardware right now. The most prudent choice is to wait until the hardware limbo passes. This might take as little as 3 months or as long as 9 months. So why did we enter deep learning hardware limbo just now?
NVIDIA has decided that it needs to cash in on its monopoly position before the competition emerges. It needs the cash in order to defend itself in the next 1-2 years. This is reflected by the choice to price the Titan V at $3000. With TensorCores the Titan V has a shiny new deep learning feature, but at the same time, its cost/performance ratio is abysmal. This makes the Titan V very unattractive. But because there is no alternative, people will need to eat what they are served – at least for now.
The competition is strong. AMD's hardware is already better than NVIDIA's, and the company plans to get itself together and produce deep learning software which is actually usable. With this step, the cost/performance ratio of AMD cards will easily outmatch NVIDIA's, and AMD will become the new standard. NVIDIA's cash advantage will help it fight AMD off, so we might see very cheap NVIDIA cards in the future. Note that this will only happen if AMD is able to push forward with good software; if AMD falters, NVIDIA cards will remain expensive and AMD will have lost its opportunity to grab the throne.
There is also a new contender in town: the Neural Network Processor (NNP) from Intel Nervana. With several unique features, it packs quite a punch. These new features make me drool – they are exactly what I want as a CUDA developer. The NNP solves most problems I face when I want to write CUDA kernels which are optimized for deep learning. This chip is the first true deep learning chip.
In general, for a 1-chip vs 1-chip ranking, we will see Nervana > AMD > NVIDIA, just because NVIDIA has to service gaming, deep learning, and high-performance computing at once, while AMD only needs to service gaming and deep learning, whereas Nervana can concentrate purely on deep learning – a huge advantage. The more focused an architecture's design, the less chip area is wasted on features irrelevant to deep learning.
However, the winner is not determined by pure performance, and not even by pure cost/performance. It is determined by cost/performance + community + deep learning frameworks.
Let’s have a closer look at the individual positions of Nervana, AMD, and NVIDIA to see where they stand.
Nervana’s Neural Network Processors
[Embedded tweet announcing the NNP, from Intel AI (@IntelAI), October 31, 2017]
Nervana’s design is very special mainly due to its large programmable caches (similar to CUDA shared memory), which are 10 times bigger per chip compared to GPUs and 50 times bigger per compute unit compared to GPUs. With these, one will be able to design in-cache algorithms and models. This will speed up inference by at least an order of magnitude, and one will be able to easily train on terabytes of data with small in-cache deep learning models, say, a multi-layer LSTM with 200 units. This will make the chip very attractive for startups and larger companies. Due to a special datatype, Flexpoint, one is able to store more data in caches/RAM and compute faster, yielding even more benefits. All of this could mean a speedup of about 10x compared to current NVIDIA GPUs for everybody. But this is only so if the main obstacles can be overcome: community and software.
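To see why a small in-cache model is plausible, here is a back-of-the-envelope sketch of the memory footprint of the kind of model mentioned above, a multi-layer LSTM with 200 units. The 3-layer depth and the 16-bit storage are my own illustrative assumptions; the exact NNP cache sizes are not public.

```python
# Rough sizing: does a small stacked LSTM fit in a large on-chip cache?
# Hypothetical configuration -- exact NNP cache figures are not public.

def lstm_params(input_size, hidden_size, num_layers):
    """Parameter count of a stacked LSTM (4 gates; input weights,
    recurrent weights, and one bias vector per gate)."""
    total = 0
    for layer in range(num_layers):
        in_size = input_size if layer == 0 else hidden_size
        total += 4 * (in_size * hidden_size + hidden_size * hidden_size + hidden_size)
    return total

params = lstm_params(input_size=200, hidden_size=200, num_layers=3)
bytes_16bit = params * 2  # 16-bit storage, e.g. a Flexpoint-like format

print(f"{params:,} parameters -> {bytes_16bit / 1e6:.1f} MB at 16 bits")
# A model of roughly this size is tiny by GPU standards, which is what
# makes the in-cache training idea credible.
```

The point is that the whole model is on the order of a couple of megabytes, so caches an order of magnitude larger than a GPU's could hold it entirely.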
For normal users and researchers, it will all depend on the community. Without community, we will not see in-cache algorithms. Without community, we will not see good software frameworks, and it will be difficult to work with the chip. Everybody wants to use solid deep learning frameworks, and it is questionable if Neon, Nervana’s deep learning framework, is up to the task. Software comes before hardware. If Nervana only ships pretty chips and does not push the software and community aspect effectively, it will lose out to AMD and NVIDIA.
The community and software question is tightly bound to the price. If the price is too high, and students are not able to afford the NNP, then no community can form around it. You do not get robust communities by catering only to industry. Although industry yields the main income for hardware companies, students are the main driver of the community. So if the price is right and students can afford it, then the community and the software will follow. Anything above $3000 will not work out. Anything above $2000 is critical and would require special discounts for students to create a robust community. An NNP priced at $2000 will be manageable and find some adoption. Anything below $1500 will make Nervana the market leader for at least 2-3 years. An NNP at $1000 would make it extremely tough for NVIDIA and AMD to compete – software would not even be a question here, it would follow automatically.
I personally will switch to NNPs if they are priced below $2500. They are just so much superior to GPUs for deep learning, and I would be able to do things which are simply impossible with NVIDIA hardware. Above $2500, they reach my pain threshold even for good hardware. I save up a lot of money to buy hardware – good hardware is just important to me – but I have to live on something.
For ordinary consumers, not only the price will be important, but also how the community is handled. If we do not see Intel immediately pumping resources into the community to start up solid software machinery, then the NNP is likely to stagnate and die off. Unfortunately, Intel has a history of mismanaging communities – it would be a shame if this happens, because I really would like to see Nervana succeed.
In summary, Nervana’s NNP will emerge as a clear winner if it is priced below $2000 and if we see strong community and software development within the first few months after its release. With a higher price and less community support, the NNP will be strong, but might not be able to surpass other solutions in terms of cost/performance and convenience. If the software and community efforts fail, or if the NNP is priced at $4000, it will likely fail. A price above $2000 will require significant discounts for students for the NNP to be viable.
AMD: Cheap and Powerful – If You Can Use It
AMD’s cards are incredible. The Vega Frontier Edition series clearly outmatches its NVIDIA counterparts, and, from unbiased benchmarks of Volta vs Pascal, it seems that the Vega Frontier will be on a par with or better than a Titan V if it is liquid cooled. Note that the Vega is based on an old architecture while the Titan V is brand new. The new AMD architecture, which will be released in 2018Q3, will increase performance further still.
AMD hopes to advance deep learning hardware by just switching from 32-bit floats to 16-bit floats. This is a very simple and powerful strategy. The chips will not be useful for high-performance computing, but they will be solid for gamers and the deep learning community while development costs will be low because 16-bit float computation is straightforward.
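The payoff of that strategy is simple arithmetic: halving the float width doubles how many values fit in the same memory and halves the bytes moved per value. A minimal sketch, with the 16 GB card memory being an illustrative figure rather than a specific AMD spec:

```python
# Halving the float width doubles capacity and halves bytes moved per
# value -- the core of the 32-bit -> 16-bit strategy.
fp32_bytes, fp16_bytes = 4, 2

gpu_memory_gb = 16  # illustrative card memory, not a specific AMD spec
values_fp32 = gpu_memory_gb * 1024**3 // fp32_bytes
values_fp16 = gpu_memory_gb * 1024**3 // fp16_bytes

print(f"fp32 capacity: {values_fp32:,} values")
print(f"fp16 capacity: {values_fp16:,} values")
```

Since deep learning workloads are usually bound by memory bandwidth rather than precision, this factor of two translates almost directly into speed and model size.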
They will not be able to compete in terms of performance with Nervana’s NNP, but the cost/performance might outmatch everything on the market. You can get a liquid cooled Vega Frontier for $700 which might be just a little worse than a $3000 Titan V.
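To put a rough number on that cost/performance gap: suppose the liquid-cooled Vega Frontier reaches 80% of Titan V throughput (the 80% figure is my own illustrative assumption, not a benchmark result). Then:

```python
# Rough perf-per-dollar comparison under a stated assumption:
# the Vega Frontier delivers 80% of Titan V throughput (assumption,
# not a measured benchmark).
vega_price, titan_price = 700, 3000
vega_perf, titan_perf = 0.8, 1.0  # relative throughput

vega_ratio = vega_perf / vega_price
titan_ratio = titan_perf / titan_price
advantage = vega_ratio / titan_ratio

print(f"Vega perf-per-dollar advantage: {advantage:.1f}x")
```

Even under pessimistic performance assumptions, the Vega comes out more than three times better per dollar, which is the sense in which it "might outmatch everything on the market".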
The problem is software. Even if you have this powerful AMD GPU, you will hardly be able to use it – no major framework supports AMD GPUs well enough.
AMD is in limbo itself – in software limbo. It seems they want to abandon OpenCL for HIP, but currently they officially still push and support the OpenCL path. If they push through with HIP, and if they put some good deep learning software on the market (not only libraries for convolution and matrix multiplication but full deep learning frameworks, say, HIP support for PyTorch) in the next 9 months, then the release of their new GPU in 2018Q3 has the potential to demolish all competitors.
So in summary, if AMD gets its shit together in terms of software, it might become the dominant deep learning hardware solution.
NVIDIA: The Titan
NVIDIA’s position is solid. They have the best software, the best tools, their hardware is good and the community is large, strong and well integrated.
NVIDIA’s main issue is that it has to serve multiple communities – high-performance computing people, deep learning people, and gamers – and this is a huge strain on its hardware. It is expensive to design chips which are custom-made for each of these communities, and NVIDIA’s current strategy is to design a one-size-fits-all architecture. This worked until it didn’t. The Titan V is just mediocre all-around.
With the emerging competitors, NVIDIA has two choices: (1) push the price on its cards down until it starves the competition to death, or (2) develop specialized deep learning hardware of its own. NVIDIA has the resources to pursue the first strategy, and it also has the expertise for the second. A new design, however, will take some time, and NVIDIA might lose the throne to another company in the meantime. So we might see both strategies played out at once: starving competitors so that NVIDIA can compete until its own deep learning chip hits the market.
In summary, NVIDIA’s throne is threatened, but it has the resources and the expertise to fight off emerging players. We will probably see cheaper NVIDIA cards in the future and chips which are more specialized for deep learning. If NVIDIA does not lower its prices, it might (temporarily) pass the throne to another player.
Deep learning hardware limbo means that it makes no sense to invest in deep learning hardware right now, but it also means we will have cheaper NVIDIA cards, usable AMD cards, and ultra-fast Nervana cards quite soon. It is an exciting time and we consumers will profit from this immensely. But for now, we have to be patient. We have to wait. I will keep you updated as the situation changes.