DeepFakes: a New Threat to Face Recognition? Assessment and Detection

TLDR

Automatic face swapping with GANs is now easy, and high-profile scandals have highlighted the need for reliable Deepfake detection. This study releases the first publicly available Deepfake video set derived from the VidTIMIT database to aid detection research. We generated 640 videos (320 low-quality and 320 high-quality) using open-source GAN software, varying training and blending parameters to affect visual fidelity. Face recognition models based on VGG and Facenet suffered false acceptance rates of 85.62% and 95.00% respectively, an audio-visual lip-sync method could not distinguish Deepfakes from genuine videos, and the best baseline, based on visual quality metrics, reached an 8.97% equal error rate on high-quality Deepfakes, underscoring the challenge.

Abstract

It is becoming increasingly easy to automatically replace the face of one person in a video with the face of another person by using a pre-trained generative adversarial network (GAN). Recent public scandals, e.g., the faces of celebrities being swapped onto pornographic videos, call for automated ways to detect these Deepfake videos. To help develop such methods, in this paper we present the first publicly available set of Deepfake videos generated from videos of the VidTIMIT database. We used open source software based on GANs to create the Deepfakes, and we emphasize that training and blending parameters can significantly impact the quality of the resulting videos. To demonstrate this impact, we generated videos with low and high visual quality (320 videos each) using differently tuned parameter sets. We showed that state-of-the-art face recognition systems based on VGG and Facenet neural networks are vulnerable to Deepfake videos, with 85.62% and 95.00% false acceptance rates respectively, which means methods for detecting Deepfake videos are necessary. By considering several baseline approaches, we found that an audio-visual approach based on lip-sync inconsistency detection was not able to distinguish Deepfake videos. The best performing method, which is based on visual quality metrics and is often used in the presentation attack detection domain, resulted in an 8.97% equal error rate on high quality Deepfakes. Our experiments demonstrate that GAN-generated Deepfake videos are challenging for both face recognition systems and existing detection methods, and the further development of face swapping technology will make them even more so.
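The two figures quoted above, false acceptance rate and equal error rate, can be made concrete with a small sketch. The function names and the toy scores below are illustrative assumptions and are not taken from the paper; the authors' evaluation protocol (for example, how the decision threshold is chosen) may differ. The sketch assumes that higher scores mean "more likely genuine" and that a sample is accepted when its score is at or above the threshold.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) at a given threshold.

    FAR: fraction of impostor samples (e.g. Deepfakes) that are accepted.
    FRR: fraction of genuine samples that are rejected.
    Assumes a sample is accepted when score >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) at the point where |FAR - FRR| is smallest;
    the EER is reported as the mean of FAR and FRR at that point.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    _, eer, threshold = best
    return eer, threshold


# Hypothetical scores, for illustration only (not data from the paper).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
deepfake_scores = [0.1, 0.2, 0.3, 0.6]

eer, thr = equal_error_rate(genuine_scores, deepfake_scores)
print(f"EER = {eer:.2%} at threshold {thr}")
```

Under this reading, a false acceptance rate of 95.00% means that 95 out of every 100 Deepfake attempts scored above the recognition system's acceptance threshold, while an equal error rate of 8.97% is the operating point at which a detector wrongly accepts Deepfakes and wrongly rejects genuine videos at the same rate.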
