While studying backpropagation, I came across a great mathematical breakdown of how it works. Matt Mazur, a data scientist at Help Scout, wrote about it on his personal blog.
The mathematical details that power neural networks excite me so much that I enjoy working through the actual computations on paper, both for forward propagation (the logits and the activations) and for backpropagation with gradient descent (the partial derivatives).
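To make the forward-pass part concrete, here is a minimal sketch for a single layer, using toy numbers of my own rather than the values from any particular course or post: the logits are the weighted sums of the inputs, and the activations are what you get after passing those sums through a nonlinearity such as the sigmoid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.1])            # toy inputs (arbitrary values)
W = np.array([[0.4, 0.6],           # weights, one row per neuron
              [0.2, 0.8]])
b = np.array([0.3, 0.3])            # biases

z = W @ x + b                       # logits: the raw weighted sums
a = sigmoid(z)                      # activations: logits squashed by the sigmoid
print("logits:", z)
print("activations:", a)
```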
This passion and excitement grew in me while I was taking the courses of Professor Andrew Ng's Deep Learning Specialization on Coursera. This part of neural networks is explained very well in the first course.
Anyway, backpropagation simply means propagating the error you get at the output of a network back through it, with the purpose of updating the weights so as to minimize that error. Matt details both the forward pass and the backward pass through a neural network, along with the equations and derivatives they involve.
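Here is a minimal sketch of that idea for a single sigmoid output neuron with a squared-error loss, again with toy numbers of my own and not the network from Matt's post: the error at the output is pushed back through the chain rule to get the partial derivatives, and gradient descent then nudges each weight against its derivative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.1])            # inputs to the layer (toy values)
W = np.array([[0.4, 0.6]])          # one output neuron, two weights
b = np.array([0.3])
target = np.array([1.0])            # desired output
lr = 0.5                            # learning rate

# Forward pass
z = W @ x + b                       # logit
a = sigmoid(z)                      # activation (the network's output)
error = 0.5 * (target - a) ** 2     # squared error

# Backward pass: chain rule, dE/dW = dE/da * da/dz * dz/dW
dE_da = -(target - a)
da_dz = a * (1 - a)                 # derivative of the sigmoid
delta = dE_da * da_dz               # the error signal at the output
dE_dW = np.outer(delta, x)
dE_db = delta

# Gradient descent update: step each weight against its partial derivative
W -= lr * dE_dW
b -= lr * dE_db
print("error:", error, "updated weights:", W)
```

Repeating this forward-then-backward cycle over many iterations shrinks the error, which is exactly the "minimize that error" part.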
So, if you're a geek who's excited about this stuff, I'd recommend taking a piece of paper and replicating the work Matt did on his blog. Also, do check the visualizations he provides, as they enhance your understanding of backprop.
To stay in touch with me, follow Cristi Vlad, Self-Experimenter and Author.