Federated learning improves site performance in multicenter deep learning without data sharing

TLDR

The study aims to enable multi‑institutional deep learning training without centralizing or sharing patient data by using federated learning. A model was trained locally at each institution on its own data, and an additional federated learning model was trained across all sites. The federated learning model outperformed every single‑institution model, with significantly better overall performance on held‑out test sets from each institution and on an external challenge dataset. The results, obtained across three academic institutions, indicate improved generalizability while avoiding the privacy risk of transferring and pooling patient data.

Abstract

Objective: To demonstrate enabling multi-institutional training without centralizing or sharing the underlying physical data via federated learning (FL).

Materials and Methods: Deep learning models were trained at each participating institution using local clinical data, and an additional model was trained using FL across all of the institutions.

Results: We found that the FL model exhibited superior performance and generalizability to the models trained at single institutions, with an overall performance level that was significantly better than that of any of the institutional models alone when evaluated on held-out test sets from each institution and an outside challenge dataset.

Discussion: The power of FL was successfully demonstrated across 3 academic institutions while avoiding the privacy risk associated with the transfer and pooling of patient data.

Conclusion: Federated learning is an effective methodology that merits further study to enable accelerated development of models across institutions, enabling greater generalizability in clinical use.
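The abstract does not say which aggregation scheme the FL model used. As an illustration only, the sketch below assumes a federated-averaging style loop: each site updates a copy of the shared weights on its own data, and only the weights, never the data, are combined by a sample-size-weighted average. A toy one-variable linear model stands in for the deep learning models, and all function names (`federated_average`, `local_update`, `run_federated_rounds`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately at one site


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sizes: Sequence[int]) -> List[float]:
    """Average per-site weight vectors, weighting each site by its sample count."""
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one weight vector and one size per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(site_weights[0])
    if any(len(w) != n_params for w in site_weights):
        raise ValueError("all sites must share the same model shape")
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float) -> List[float]:
    """One pass of per-sample gradient descent for y = w*x + b on local data."""
    w, b = weights
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def run_federated_rounds(site_data: Sequence[Sequence[Sample]],
                         rounds: int = 300, lr: float = 0.1) -> List[float]:
    """Alternate local training and weight averaging; raw data never leaves a site."""
    global_weights = [0.0, 0.0]
    sizes = [len(d) for d in site_data]
    for _ in range(rounds):
        # Each site starts from the current shared weights.
        local = [local_update(global_weights, d, lr) for d in site_data]
        # Only the updated weights are sent back and combined.
        global_weights = federated_average(local, sizes)
    return global_weights


if __name__ == "__main__":
    # Three hypothetical sites, each holding samples of y = 2x + 1.
    sites = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(2.0, 5.0), (1.0, 3.0)],
        [(0.5, 2.0), (1.5, 4.0)],
    ]
    print(run_federated_rounds(sites))
```

Weighting by sample count is one common design choice, so that larger sites contribute proportionally more to the shared model; an unweighted mean or other schemes are equally plausible readings of the abstract.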
