Detecting Compressed Deepfake Videos in Social Networks Using Frame-Temporality Two-Stream Convolutional Network

TLDR

Deepfake videos are increasingly produced and widely shared on social networks, yet detecting them after compression remains challenging. This study develops a method for identifying compressed Deepfake videos. The authors introduce a two-stream convolutional network: a frame-level stream that is gradually pruned so the model does not fit compression noise, and a temporality-level stream that extracts temporal correlation features. Combining the scores from the two streams outperforms state-of-the-art methods for compressed Deepfake video detection.

Abstract

The development of technologies that can generate Deepfake videos is expanding rapidly. These videos are easily synthesized without leaving obvious traces of manipulation. Though forensic detection on high-definition video datasets has achieved remarkable results, the forensics of compressed videos is worth exploring further. In fact, compressed videos are common in social networks, such as videos from Instagram, WeChat, and TikTok. Therefore, how to identify compressed Deepfake videos becomes a fundamental issue. In this paper, we propose a two-stream method that analyzes compressed Deepfake videos at the frame level and the temporality level. Because video compression introduces a large amount of redundant information into frames, the proposed frame-level stream gradually prunes the network to prevent the model from fitting the compression noise. To address the problem that temporal consistency in Deepfake videos might be ignored, we apply a temporality-level stream to extract temporal correlation features. When the scores from the two streams are combined, our proposed method performs better than the state-of-the-art methods in compressed Deepfake video detection.
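The abstract says the final decision combines scores from the two streams but does not state the fusion rule. The following is a minimal sketch of one common choice, a weighted average of a video-level frame score and a temporal score. The function name `fuse_scores`, the `weight` and `threshold` parameters, and the averaging of per-frame scores are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean


def fuse_scores(frame_scores, temporal_score, weight=0.5, threshold=0.5):
    """Combine two stream outputs into one fake/real decision.

    Hypothetical helper: the weighted-average rule, the default weight,
    and the threshold are illustrative assumptions, not taken from the paper.

    frame_scores   -- per-frame fake probabilities from the frame-level stream
    temporal_score -- one fake probability from the temporality-level stream
    """
    if not frame_scores:
        raise ValueError("frame_scores must contain at least one score")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    # Collapse per-frame probabilities into a single video-level score.
    frame_level = mean(frame_scores)

    # Weighted average of the two stream scores.
    fused = weight * frame_level + (1.0 - weight) * temporal_score

    # Flag the video as fake when the fused score reaches the threshold.
    return fused, fused >= threshold


if __name__ == "__main__":
    score, is_fake = fuse_scores([0.8, 0.6], temporal_score=0.9)
    print(f"fused score: {score:.2f}, flagged as fake: {is_fake}")
```

Under this assumed rule, setting `weight` to 1.0 or 0.0 reduces the decision to a single stream, which gives a simple way to compare each stream alone against the combination.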
