Ilia Karmanov has created a GitHub repository for comparing the performance of neural networks across different popular platforms/frameworks.
He ran the Jupyter notebooks containing the NN code on an Nvidia K80 GPU on a Microsoft Azure VM. As per Ilia:
"The notebooks are not specifically written for speed, instead they aim to create an easy comparison between the frameworks." [source]
Ilia tested a convolutional neural network on the CIFAR-10 dataset for image recognition across a few libraries. Contrary to what some practitioners might expect, the top 3 libraries in terms of training time (in seconds) were:
- Caffe2 - took 149 seconds to train the CNN for an accuracy of 79%
- MXNet - took 149 seconds to train the CNN for an accuracy of 77%
- Gluon - took 157 seconds to train the CNN for an accuracy of 77%
Libraries like TensorFlow, Keras, and PyTorch were slightly slower, though their results were very close.
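The metric being compared here is wall-clock training time in seconds. A minimal, framework-agnostic sketch of how such a measurement can be taken is below; the helper name `time_training` and the dummy training function are my own illustration, not code from Ilia's repo.

```python
import time

def time_training(train_fn, *args, **kwargs):
    """Run a training function and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = train_fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Hypothetical stand-in for a framework's training call;
# a real benchmark would run the framework's fit/train loop here.
def train_dummy(epochs):
    steps = 0
    for _ in range(epochs):
        steps += 1  # a real step would update model weights
    return steps

result, seconds = time_training(train_dummy, epochs=10)
print(f"trained in {seconds:.4f}s, result={result}")
```

Wrapping the whole training call this way measures the same thing the benchmark reports: total seconds to reach the stated accuracy, regardless of which framework runs inside.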
Ilia also tested a recurrent neural network on the IMDB dataset for sentiment analysis. The top 3 performing libraries were:
- CNTK - took 29 seconds to train the RNN for an accuracy of 86%
- MXNet - took 29 seconds to train the RNN for an accuracy of 86% (same as CNTK)
- PyTorch - took 32 seconds to train the RNN for an accuracy of 85%
I'd say that the performance of these libraries is really close, the difference being a few hundred seconds between the top and bottom performers. This might be really important on very large datasets and complicated NN designs, but it's not as important on smaller datasets and simpler designs. For the very technical and insightful details, you can check out the repo below:
Running Neural Networks on Different Frameworks - [Comparison]
To stay in touch with me, follow:
Cristi Vlad, Self-Experimenter and Author