There are zero DL frameworks that support AMD cards as a primary target.
Most have some kind of branch or patchset with OpenCL support. The problem is that they aren't great. If you need any new layers there is no support. There is nothing like CuDNN so you don't get the high speed convolutional kernels.
It's easy to blame developers for supporting NVIDIA, but the thing is, NVIDIA is great to work with. They dedicate large teams to deep learning support (not the two or three part-time devs AMD does), and they publish good research and tutorials. AMD does nothing like this.
AMD has always had first-class support for OpenCL. CUDA is NVIDIA proprietary, and although NVIDIA nominally "supports" OpenCL, that support is quite poor. Issue #22 on TensorFlow is about OpenCL support.
Additionally, AMD supports converting CUDA source to HIP, an intermediate layer that can be compiled to target NVIDIA (via nvcc) or AMD (via HCC).
NVIDIA has built a lot of tooling for DL, such as cuDNN, and the new Tesla cards have dedicated silicon for tensor calculation. AMD does have a cuDNN equivalent called MIOpen. They have also ported Caffe via HIP, and it works well. AMD is working right now on Torch, MXNet, and TensorFlow to add support for AMD hardware with minimal burden on the maintainers of those projects.
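For context on what cuDNN and MIOpen actually provide: they are libraries of heavily tuned GPU kernels for operations like convolution. A naive NumPy sketch of direct 2-D convolution (illustration only, not either library's implementation) shows the kind of loop nest these libraries replace:

```python
import numpy as np

def conv2d(image, kernel):
    """Naive direct 2-D convolution with "valid" padding.

    This is the operation that cuDNN (NVIDIA) and MIOpen (AMD) ship
    hand-optimized GPU kernels for; a framework without such a library
    falls back to something closer to this slow loop nest.
    """
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Multiply-accumulate over the kernel window.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((2, 2))
print(conv2d(image, kernel))  # 3x3 output; top-left element is 0+1+4+5 = 10
```

The performance gap between this and a tuned kernel is routinely one to two orders of magnitude, which is why framework authors gravitate to whichever vendor ships the better library.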
I think it's particularly bad form on the part of everyone in the DL framework and library world to cater only to NVIDIA and CUDA; they very much walked into this shakedown with open arms.
The original comment is correct in that contributing support for OpenCL (which works on mobile too) will alleviate this to a fair degree. It's one of those things where the more momentum is behind it, the more device manufacturers will focus on ensuring their OpenCL compilers build properly optimized kernels for their hardware.
Start contributing to OpenCL, or adding HIP support to existing projects, and we'll see viable alternatives pop up not only from AMD but from players like Qualcomm and Samsung.
I'm not saying it's a great situation, I'm saying that NVidia has always had better libraries, tools and performance, and it isn't surprising that developers use them.
Deep learning is hard and slow enough without using second class tools.
Much respect to NVIDIA and their software team but the situation is changing. PlaidML is like cuDNN for every GPU. Fully open source, faster in many cases than TF+cuDNN on NVIDIA, beats vendor tools on other architectures, Linux/Mac/Win. Supports Keras currently but more frameworks are not difficult (patches welcome).
The PlaidML benchmarks are suspect. They compare against Keras + TensorFlow, which is a really unfair comparison, since 1) TensorFlow is probably the slowest of the big deep learning frameworks out there (compared to PyTorch, MXNet, etc.), and 2) Keras itself is quite slow. Keras is optimized for ease of use, introduces lots of abstractions, and often doesn't take advantage of many TF optimizations. For just one example, until very recently Keras did not use TF's fused batch norm, which the TF docs claim provides a 10-30% speedup in overall network performance; that alone could account for many of the benchmarks showing PlaidML ahead.
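To make the fused batch norm point concrete: an unfused implementation computes the mean, variance, normalization, and affine transform as separate passes over the data, each materializing an intermediate tensor, while a fused kernel does it all in one. A rough NumPy sketch of what batch norm computes (illustration only, not TF's or Keras's actual code):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization over axis 0, written the "unfused" way.

    Each line below is conceptually a separate pass over the data with
    its own intermediate tensor; a fused kernel (like TF's
    fused_batch_norm) performs the whole computation in one pass,
    which is where the claimed 10-30% network-level speedup comes from.
    """
    mean = x.mean(axis=0)                  # pass 1: per-feature mean
    var = x.var(axis=0)                    # pass 2: per-feature variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # pass 3: normalize
    return gamma * x_hat + beta            # pass 4: scale and shift

x = np.arange(12.0).reshape(4, 3)
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0))  # each column normalized to (approximately) zero mean
```

The math is identical either way; only the memory traffic differs, which is exactly why a backend that skips the fused kernel can lose a benchmark without the model being any different.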
In my opinion it's extremely fair. The benchmarks compare Keras+PlaidML against Keras+TensorFlow, which lets us run exactly the same nets (imported straight from the Keras included applications), and whatever penalty Keras might impose is equal in the two cases. Having one very direct comparison is actually why we constructed the tests that way (none of the other frameworks run on our high-priority platforms).
That said we'd be pretty excited if someone wanted to add support for TF, PyTorch, MXNet, etc. We like Keras but are happy to have integrations for all frameworks. With work you could pair it with Docker and containerize GPU-accelerated workloads without the guests even needing to know what hardware it's running on. Lots of possibilities.
> whatever penalty Keras might impose is equal in the two cases.
The penalty Keras imposes when using Tensorflow depends on its Tensorflow implementation. The penalty Keras imposes when using MXNet depends on its MXNet implementation. The penalty Keras imposes when using PlaidML depends on whatever the PlaidML devs implemented. When you build a Keras layer, it's calling different Keras code for each backend.
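The per-backend dispatch described above can be sketched as a tiny backend-abstraction layer (hypothetical class names, purely to illustrate why the "Keras penalty" differs by backend):

```python
import numpy as np

# Each "backend" supplies its own implementation of the same primitive,
# just as Keras's real backend modules do for TensorFlow, Theano, etc.
# The layer code is identical; the cost is whatever the backend's
# implementation happens to be.

class FastBackend:
    @staticmethod
    def dot(a, b):
        return np.dot(a, b)  # delegates to an optimized BLAS routine

class SlowBackend:
    @staticmethod
    def dot(a, b):
        # Same math, naive triple loop: a deliberately slow implementation.
        out = np.zeros((a.shape[0], b.shape[1]))
        for i in range(a.shape[0]):
            for j in range(b.shape[1]):
                out[i, j] = sum(a[i, k] * b[k, j] for k in range(a.shape[1]))
        return out

def dense_layer(backend, x, w):
    # The "layer" is backend-agnostic; its speed is entirely determined
    # by which backend implementation it dispatches to.
    return backend.dot(x, w)

x = np.ones((2, 3))
w = np.ones((3, 4))
print(dense_layer(FastBackend, x, w))  # identical result from either backend
```

Both backends produce the same numbers, so benchmarking "Keras + backend A" against "Keras + backend B" measures the backend implementations as much as it measures anything else, which is the parent's point.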
The comparison would be fair if PlaidML claimed to be the fastest Keras backend, not if it claims to be faster than TensorFlow.
There was someone on reddit/ml who posted some pretty interesting numbers for training.
I think they have a lot of challenges ahead of them, but I’m still more optimistic about Plaid than AMD’s own efforts.
AMD says that they don’t care about ML[1], and their actions back that up.
Edit: and to be clear, I think comparing Keras+PlaidML vs Keras+TF is an entirely valid thing to do. Lots of people work in Keras, and if you download random NN code off GitHub, it's likely to be Keras (or PyTorch now, of course).
Batch-1 inference on convnets is key for us internally, but training does work pretty well. The underlying machinery can do much more. Here's a blog post that talks about how it works, with some links to more detailed docs and the actual implementations:
Two of the big motivators for opening the code were 1) giving students taking the popular courses a way to get started with GPU in whatever machine they've got (recent Intel GPUs in say a MacBook Air are enough) and 2) giving researchers a platform where it's simple to add efficient GPU-accelerated ops.
For scale on #2 check out the entire implementation of convolution: