How To Backdoor Federated Learning

TLDR

Federated learning lets many participants train a shared deep learning model without exchanging private data; for example, smartphones can jointly build a next-word predictor while keeping user inputs confidential. The authors design and evaluate a model-replacement poisoning attack that lets any participant embed a hidden backdoor, triggered only by attacker-chosen inputs, into the global model. An attacker selected in a single federated-learning round can drive the global model to 100% accuracy on the backdoor task. The attack outperforms data-poisoning baselines and evades anomaly-detection defenses through a constrain-and-scale technique.

Abstract

Federated learning enables thousands of participants to construct a deep learning model without sharing their private training data with each other. For example, multiple smartphones can jointly train a next-word predictor for keyboards without revealing what individual users type. We demonstrate that any participant in federated learning can introduce hidden backdoor functionality into the joint global model, e.g., to ensure that an image classifier assigns an attacker-chosen label to images with certain features, or that a word predictor completes certain sentences with an attacker-chosen word. We design and evaluate a new model-poisoning methodology based on model replacement. An attacker selected in a single round of federated learning can cause the global model to immediately reach 100% accuracy on the backdoor task. We evaluate the attack under different assumptions for the standard federated-learning tasks and show that it greatly outperforms data poisoning. Our generic constrain-and-scale technique also evades anomaly detection-based defenses by incorporating the evasion into the attacker's loss function during training.
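To make the idea of model replacement concrete, here is a minimal sketch in plain Python. It rests on several assumptions that the abstract does not state: the server is taken to apply a federated-averaging rule of the form G_next = G + (eta / n) * sum(L_i - G), the benign participants are assumed to submit models close to the current global model, and the attacker is assumed to know n and eta. The function names, the toy weight vectors, and the weighting parameter alpha are all hypothetical and chosen only for illustration.

```python
def fedavg_update(global_model, local_models, n, eta):
    """Assumed aggregation rule: G_next = G + (eta / n) * sum_i (L_i - G)."""
    updated = list(global_model)
    for local in local_models:
        for k, (l, g) in enumerate(zip(local, global_model)):
            updated[k] += (eta / n) * (l - g)
    return updated


def scaled_malicious_update(global_model, backdoored_model, n, eta):
    """Scale the attacker's deviation by gamma = n / eta so that it
    survives averaging and the aggregate lands on the backdoored model."""
    gamma = n / eta
    return [g + gamma * (x - g) for x, g in zip(backdoored_model, global_model)]


def constrain_and_scale_loss(class_loss, anomaly_loss, alpha):
    """Hypothetical combined objective: task loss traded off against an
    anomaly-detection penalty, weighted by alpha in [0, 1]."""
    return alpha * class_loss + (1 - alpha) * anomaly_loss


# Toy example: 3-weight "models", 4 participants in the round.
G = [0.0, 0.0, 0.0]           # current global model
X = [1.0, -2.0, 0.5]          # attacker's backdoored target model
n, eta = 4, 1.0

benign = [list(G) for _ in range(n - 1)]   # assumption: benign updates are ~G
attacker = scaled_malicious_update(G, X, n, eta)
new_G = fedavg_update(G, benign + [attacker], n, eta)
```

Under these assumptions, the scaled update cancels the averaging factor, so `new_G` coincides with the attacker's model `X` after a single round. In practice benign updates are not exactly zero, so the replacement would be approximate rather than exact.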
