Neural Networks Create Accurate Facial Reconstructions
Supasorn Suwajanakorn recently posted a video showing a neural network that, given only Obama's speech as input, was able to generate these facial animations:
Concerns
Naturally, this has raised concerns* over possible malicious abuse of these techniques to create fake news and spread confusion and propaganda. I think the best medicine for ignorance is to freely share knowledge and educate ourselves on these matters.
Synthesizing Obama: Learning Lip Sync from Audio
Supasorn Suwajanakorn, Steven M. Seitz and Ira Kemelmacher-Shlizerman of the University of Washington wrote a paper titled "Synthesizing Obama: Learning Lip Sync from Audio", in which they state the following:
Given audio of President Barack Obama, we synthesize a high-quality video
of him speaking with accurate lip sync, composited into a target video clip.
Trained on many hours of his weekly address footage, a recurrent neural
network learns the mapping from raw audio features to mouth shapes. Given
the mouth shape at each time instant, we synthesize high-quality mouth
texture and composite it with proper 3D pose matching to change what he
appears to be saying in a target video to match the input audio track. Our
approach produces photorealistic results.
Here's a nice high-level overview image from the paper, showing the neural network architecture used:
Source: http://grail.cs.washington.edu/projects/AudioToObama/siggraph17_obama.pdf
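To make the "audio features to mouth shapes" idea from the abstract a bit more concrete, here is a minimal sketch of a recurrent network that reads a sequence of audio feature frames and emits one mouth-shape vector per frame. Everything in it is an assumption made for illustration: the class name TinyRecurrentNet, the plain tanh recurrent cell, the dimensions, and the random untrained weights are mine, not the paper's. The authors' actual network, features, and training procedure are described in the paper.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyRecurrentNet:
    """Hypothetical, untrained recurrent net: audio frames in, mouth shapes out."""

    def __init__(self, n_audio, n_hidden, n_mouth, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.w_in = make_matrix(n_hidden, n_audio, rng)    # audio  -> hidden
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)  # hidden -> hidden
        self.w_out = make_matrix(n_mouth, n_hidden, rng)   # hidden -> mouth shape

    def forward(self, audio_frames):
        """Map each audio frame to a mouth-shape vector, carrying state over time."""
        hidden = [0.0] * self.n_hidden
        mouth_shapes = []
        for frame in audio_frames:
            # The hidden state mixes the current frame with what came before,
            # so the output at time t depends on earlier audio as well.
            pre = [a + b for a, b in zip(matvec(self.w_in, frame),
                                         matvec(self.w_rec, hidden))]
            hidden = [math.tanh(p) for p in pre]
            mouth_shapes.append(matvec(self.w_out, hidden))
        return mouth_shapes


# Illustrative sizes only; they are not taken from the paper.
N_AUDIO, N_HIDDEN, N_MOUTH = 13, 16, 4

net = TinyRecurrentNet(N_AUDIO, N_HIDDEN, N_MOUTH)
rng = random.Random(1)
audio = [[rng.gauss(0, 1) for _ in range(N_AUDIO)] for _ in range(5)]  # 5 fake frames
shapes = net.forward(audio)
print(len(shapes), len(shapes[0]))  # prints "5 4": one mouth-shape vector per frame
```

In a real system the weights would be learned from many hours of aligned audio and video, and the predicted mouth shapes would then drive the texture synthesis and compositing steps the abstract mentions. This sketch only shows the shape of the data flow.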
I highly encourage you to read the rest of the paper too if you're so inclined; it's great.
Links:
http://grail.cs.washington.edu/projects/AudioToObama
http://grail.cs.washington.edu/projects/AudioToObama/siggraph17_obama.pdf
http://nordic.businessinsider.com/cgi-ai-fake-news-videos-real-2017-7?r=US&IR=T*