Musical expression lives in the minute fractal deviations from regularity: micro-rhythms, micro-tones, and micro-sounds. It flows from fingers and lips, carrying the influence of pulse and breathing, the awareness of people (ensemble and audience), and the events surrounding the musicians at a point in historical time.
Pop music based on MIDI is fake: the quantized irrational pitch (the equal-tempered semitone ratio of 2^(1/12)), metronomic time, over-regularized synthesized waveforms, and computerized sequencing are suitable for background Muzak and soundtracks, but they do not pass muster as music to trained musicians, unless those musicians over-intellectualize the music, hearing only the notation.
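To put a number on that pitch quantization (a back-of-the-envelope sketch of my own, using only the standard MIDI note-to-frequency convention, not anything from the concerts below): a tempered major third is exactly 400 cents, while the pure 5:4 third is about 386 cents, so every MIDI major third lands roughly 14 cents sharp of just intonation.

```python
# Sketch: how 12-tone equal temperament quantizes pitch.
# f(n) = 440 * 2**((n - 69) / 12) is the standard MIDI-note-to-frequency
# mapping; the cents figures show how far a just-intonation interval
# sits from the tempered grid.
import math

def midi_to_hz(note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note number (A4 = 69)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)

def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)

tempered_third = midi_to_hz(64) / midi_to_hz(60)  # C4 -> E4, four semitones
just_third = 5 / 4                                # pure major third

print(f"tempered major third: {cents(tempered_third):.1f} cents")  # 400.0
print(f"just major third:     {cents(just_third):.1f} cents")      # ~386.3
print(f"quantization error:   {cents(tempered_third / just_third):.1f} cents")
```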
I was reminded of this by contrasts in the great TIME:SPANS concert series. The opening live concert featured six conservatory-trained pianists generating MIDI control information that fed six instances of Pianoteq, each running a physical model of a micro-tuned piano, in a brilliant piece.
The MIDI timing carried milliseconds of uncertainty, passing through an asynchronous interface, a computer architecture, and communications protocols that are not designed for real-time information. I am familiar with the pianists, having recorded five of them, and their timing on acoustic pianos is considerably better than what I heard. (Three of them played ensemble with two pianos.)
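For scale, here is my own back-of-the-envelope arithmetic, assuming the classic 31,250-baud DIN transport (a USB-MIDI rig instead adds roughly a millisecond of polling granularity on full-speed USB, plus OS scheduling noise): a three-byte Note On takes about a millisecond on the wire, and "simultaneous" notes merged onto one stream are serialized one after another, so a six-note attack smears across several milliseconds before synthesis even begins.

```python
# Back-of-the-envelope sketch of MIDI serialization delay, assuming the
# classic 31,250-baud DIN transport (my assumption; the actual concert
# rig and its jitter sources are unknown to me).
BAUD = 31_250        # bits per second on a DIN MIDI cable
BITS_PER_BYTE = 10   # 8 data bits + start bit + stop bit
NOTE_ON_BYTES = 3    # status byte, note number, velocity

msg_seconds = NOTE_ON_BYTES * BITS_PER_BYTE / BAUD  # ~0.96 ms per message

# Six "simultaneous" notes merged onto one stream leave the wire
# strictly one after another, so the last note trails the first.
for i in range(6):
    print(f"note {i + 1} finishes transmitting at {i * msg_seconds * 1000:.2f} ms")

print(f"chord smear: {5 * msg_seconds * 1000:.2f} ms")  # ~4.8 ms
```

Ensemble pianists control their onsets to within a few milliseconds of one another, so timing noise of this size sits exactly where trained ears notice it.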
The converse was a concert of solo cello and electronics, with assistance from an electronicist. In one piece, a complex of loops with processing was triggered by the cellist's feet. He had practiced for years to play the piece in that modality, like a drummer or organist with independent limb rhythms, and the musico-emotional effect was powerful.