In my last blog post I explained what model and data parallelism are and analyzed how to use data parallelism effectively in deep learning. In this blog post I will focus on model parallelism.
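To make the distinction concrete, here is a minimal sketch of the model parallelism idea, assuming nothing beyond NumPy: the weight matrix of a single fully connected layer is split across two "devices" (simulated here as plain arrays), each device computes its share of the output, and the pieces are joined afterwards. On real hardware that join step is the communication between GPUs.

```python
import numpy as np

# Toy model parallelism: the weight matrix of one fully connected layer is
# split column-wise across two "devices" (here just two NumPy arrays).
# Each device computes its slice of the output; the slices are concatenated.

rng = np.random.default_rng(0)
batch, n_in, n_out = 32, 1024, 512

x = rng.standard_normal((batch, n_in))
W = rng.standard_normal((n_in, n_out))

# Split the layer's parameters across two devices.
W_dev0, W_dev1 = np.split(W, 2, axis=1)

# Each device computes the activations for its half of the output units.
out_dev0 = x @ W_dev0
out_dev1 = x @ W_dev1

# The full layer output requires communicating and concatenating the pieces.
out = np.concatenate([out_dev0, out_dev1], axis=1)

assert np.allclose(out, x @ W)
```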
In my last blog post I showed what to look out for when you build a GPU cluster. Most importantly, you want a fast network connection between your servers, and using MPI in your code will make things much easier than using the communication options available in CUDA itself.
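To illustrate how little code the MPI route needs, here is a minimal sketch of averaging gradients across servers with the mpi4py bindings. The choice of mpi4py and the launch command are my assumptions for this example; any MPI implementation exposes the same Allreduce collective.

```python
import numpy as np
from mpi4py import MPI  # assumes the mpi4py bindings are installed

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each worker computes a gradient on its own slice of the data
# (here just a random placeholder gradient).
local_grad = np.random.default_rng(rank).standard_normal(1_000_000)

# One collective call sums the gradients across all servers/GPUs;
# dividing by the number of workers gives the average.
global_grad = np.empty_like(local_grad)
comm.Allreduce(local_grad, global_grad, op=MPI.SUM)
global_grad /= size

if rank == 0:
    print(f"averaged gradients across {size} workers")
```

Launched with something like `mpirun -np 4 python allreduce_demo.py`, each process would run on one GPU or server, and the single `Allreduce` call hides all the point-to-point communication you would otherwise have to write yourself.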
In this blog post I explain how to use such a cluster to parallelize neural networks in different ways, and what the advantages and drawbacks of these approaches are. The two main approaches are data parallelism and model parallelism. In this blog entry I will focus on data parallelism.
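Before going into detail, here is a minimal sketch of the data parallelism scheme, using a plain NumPy linear model purely for illustration: the mini-batch is split across devices, each device computes a gradient on its shard, and averaging the shard gradients recovers the full-batch gradient. In a multi-GPU setup the averaging step is where the network communication happens.

```python
import numpy as np

# Toy data parallelism: the mini-batch is split across "devices", each device
# computes the gradient of a linear model on its shard, and the gradients are
# averaged before the weight update.

rng = np.random.default_rng(0)
n_devices, batch, n_features = 4, 128, 64

X = rng.standard_normal((batch, n_features))
y = rng.standard_normal(batch)
w = np.zeros(n_features)

# Each device holds one shard of the batch.
X_shards = np.split(X, n_devices)
y_shards = np.split(y, n_devices)

# Every device computes the gradient of the squared error on its shard.
local_grads = [
    2.0 * Xs.T @ (Xs @ w - ys) / len(ys)
    for Xs, ys in zip(X_shards, y_shards)
]

# Averaging the local gradients recovers the full-batch gradient,
# so all devices can apply the same weight update.
grad = np.mean(local_grads, axis=0)
w -= 0.01 * grad
```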
It never ceases to amaze me how much of a speedup you get when you use GPUs for deep learning: compared to CPUs, 5x speedups are typical, and on larger problems you can achieve 10x speedups. With GPUs you can try out new ideas, algorithms, and experiments much faster than usual and get almost immediate feedback on what works and what does not. If you are serious about deep learning, you should definitely get a GPU. But which one should you get? In this blog post I will guide you through the choices to the GPU that is best for you.