by Lisa Vaas
Google has released a data set of thousands of deepfake videos, produced using paid, consenting actors, to help researchers in the ongoing work of developing detection methods.
To train and test automated detection tools, researchers need to feed them a whole lot of deepfakes to scrutinize. Google is helping by making its data set available to researchers, who can use it to train algorithms that spot deepfakes.
The data set, available on GitHub, contains more than 3,000 deepfake videos. Google said on its artificial intelligence (AI) blog that the hyperrealistic videos, created in collaboration with its Jigsaw technology incubator, have been incorporated into the Technical University of Munich and the University Federico II of Naples’ new FaceForensics benchmark – an effort that Google co-sponsors.
To produce the videos, Google used 28 actors, placing pairs of them in quotidian settings: hugging, talking, expressing emotion and the like.
A sample of videos from Google’s contribution to the FaceForensics benchmark. To generate these, pairs of actors were selected randomly and deep neural networks swapped the face of one actor onto the head of another.
To transform their faces, Google used publicly available, state-of-the-art, automatic deepfake algorithms: Deepfakes, Face2Face, FaceSwap and NeuralTextures. You can read more about those algorithms in this white paper from the FaceForensics team. In January 2019, the academic team, led by a researcher from the Technical University of Munich, created another data set of deepfakes, FaceForensics++, by performing those four common face manipulation methods on nearly 1,000 YouTube videos.
Google added to those efforts with another face manipulation method based on a family of dueling computer programs known as generative adversarial networks (GANs): machine learning systems that pit two neural networks against each other, one generating images and the other judging them, in order to generate convincing photos of people who don’t exist.
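The adversarial idea can be sketched in a few lines. This is a hedged toy illustration, not Google’s or anyone’s production model: a linear “generator” learns to mimic samples from a normal distribution N(4, 0.5), while a logistic “discriminator” tries to tell real samples from forgeries, the two improving against each other through alternating gradient steps.

```python
# Toy GAN sketch (assumed example, not a real deepfake model).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    # Clipped for numerical stability when the discriminator saturates.
    return 1.0 / (1.0 + np.exp(-np.clip(s, -30, 30)))

# Discriminator D(x) = sigmoid(wd*x + bd); generator G(z) = wg*z + bg.
wd, bd = 0.1, 0.0
wg, bg = 1.0, 0.0
lr, batch = 0.02, 32

for _ in range(3000):
    real = rng.normal(4.0, 0.5, batch)      # "real" data the forger mimics
    z = rng.normal(0.0, 1.0, batch)         # generator's random seed
    fake = wg * z + bg                      # generator's forgeries

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    dr, df = sigmoid(wd * real + bd), sigmoid(wd * fake + bd)
    wd -= lr * (np.mean((dr - 1) * real) + np.mean(df * fake))
    bd -= lr * (np.mean(dr - 1) + np.mean(df))

    # Generator step: push D(fake) toward 1 (non-saturating GAN loss).
    df = sigmoid(wd * fake + bd)
    gs = (df - 1) * wd                      # gradient of loss w.r.t. fake
    wg -= lr * np.mean(gs * z)
    bg -= lr * np.mean(gs)

# The generator's output mean (bg) drifts toward the real mean of 4.
print(round(bg, 2))
```

The same tug-of-war, scaled up to deep convolutional networks and face images, is what produces photorealistic people who were never photographed.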
Yet another data set of deepfakes is in the works, this one from Facebook. Earlier this month, it announced that it was launching a $10m project, the DeepFake Detection Challenge, to help people detect deepfakes. Like Google, Facebook is going to make its data set available to researchers.
An arms race
This is, of course, an ongoing battle. As recently as last month, when we heard about mice being pretty good at detecting deepfake audio, that meant the critters were close to the 92% median accuracy of state-of-the-art detection algorithms: algorithms that look for unusual head movements, inconsistent lighting or, in shoddier deepfakes, subjects who don’t blink. (The US Defense Advanced Research Projects Agency [DARPA] found that a lack of blinking, at least as of the technology’s state of evolution circa August 2018, was a giveaway.)
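The blink heuristic mentioned above can be illustrated with a common landmark-based trick: the eye aspect ratio (EAR), which dips sharply whenever an eye closes. This is a hedged sketch of the general idea, not DARPA’s or Google’s actual detector; the landmark layout and the toy EAR traces below are assumptions for illustration.

```python
# Sketch: counting blinks via the eye aspect ratio (EAR) over six
# landmarks p1..p6 around one eye. A deepfake subject that never
# blinks produces a flat EAR trace with no dips.
from math import dist  # Python 3.8+

def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks p1..p6; p1/p4 are the eye corners."""
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def count_blinks(ear_series, threshold=0.2):
    """Count downward crossings of the blink threshold."""
    blinks, closed = 0, False
    for ear in ear_series:
        if ear < threshold and not closed:
            blinks, closed = blinks + 1, True
        elif ear >= threshold:
            closed = False
    return blinks

# Toy per-frame EAR traces standing in for real landmark tracking:
real_video = [0.32, 0.31, 0.12, 0.30, 0.33, 0.11, 0.31]  # two dips
deepfake   = [0.32, 0.31, 0.30, 0.31, 0.33, 0.32, 0.31]  # no dips

print(count_blinks(real_video), count_blinks(deepfake))  # 2 0
```

A suspiciously low blink count over a long enough clip is one of the simple signals that, as the article notes, only worked on shoddier, circa-2018 fakes.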
In spite of the current, fairly high detection rate, we need all the help we can get to withstand the ever more sophisticated fakes that are coming. Deepfake technology is evolving at breakneck speed, and just because detection is fairly reliable now doesn’t mean it’s going to stay that way. That’s why difficult-to-detect impersonation was a “significant” topic at this year’s Black Hat and Def Con conferences, as the BBC reported last month.
We’ve already seen GANs used in the wild: an AP investigation recently suggested that a deepfake was behind the LinkedIn profile of a comely young woman who was suspiciously well connected to people in power.
Forensic experts easily spotted 30-year-old “Katie Jones” as a deepfake. That was fairly recent: the story was published in June. Then we got DeepNude, an app that also used GANs and that appeared to have pushed the technology that much further, putting it into anybody’s hands: the app could generate a deepfake within 30 seconds.
This isn’t Google’s first contribution to the field of unmasking fakes: in January, it released a database of synthetic speech to help out with fake audio detection. Google says that it also plans to add to its deepfake dataset as deepfake generation technology evolves:
We firmly believe in supporting a thriving research community around mitigating potential harms from misuses of synthetic media, and today’s release of our deepfake dataset in the FaceForensics benchmark is an important step in that direction.
Link to original article: Google made thousands of deepfakes to aid detection efforts
My two sats on this...
Even if you're not busy constantly feeding your hunger for paranoia, it's pretty easy to put two and two together regarding the more transparent steps taken to evolve AI tech beyond what uninformed people would see as science fiction.
Never underestimate the quick progress that can be made once you start teaching machines how to learn!
Imagine what's possible with AI in terms of facial recognition and voice recognition if AI is already starting to master differentiating between actual footage and pics and deepfakes.
More or less every pattern can be used to analyze and detect, or to analyze and profile, us all!
A certain style of walking, sniffing or blinking, for instance, can be enough to identify you and possibly profile you!
One example of this would be the possibility of detecting certain diseases based on walking style!
You think I'm kidding... No, I'm not!
This may or may not be AI put to good use, but imagine what a health insurance company could do with this information to reduce its risk!