Publication | Open Access
A whole-slide foundation model for digital pathology from real-world data
Year: 2024 · Citations: 489 · References: 46
Digital pathology faces computational challenges because gigapixel slides contain tens of thousands of tiles, and prior models that subsample tiles miss critical slide-level context. The study introduces Prov-GigaPath, a whole-slide pathology foundation model built on GigaPath, a vision transformer architecture that adapts LongNet to scale slide-level learning to tens of thousands of tiles per gigapixel slide. Prov-GigaPath is pretrained on 1.3 billion 256 × 256 tiles from 171,189 slides spanning 31 tissue types, collected across 28 cancer centres, and evaluated on a benchmark of 9 cancer subtyping and 17 pathomics tasks. It achieves state-of-the-art performance on 25 of 26 benchmark tasks, significantly outperforming the next best method on 18, and also shows strong results in vision-language pretraining, underscoring the value of real-world data and whole-slide modelling.
Abstract
Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles1–3. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context4. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet5 method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data6. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision–language pretraining for pathology7,8 by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
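To illustrate the scale the abstract describes, the following is a minimal sketch (not the authors' code) of how a gigapixel slide decomposes into non-overlapping 256 × 256 tiles; the slide dimensions used here are hypothetical, chosen only to show why a single slide yields tens of thousands of tiles:

```python
import math

def tile_grid(slide_w: int, slide_h: int, tile: int = 256):
    """Return (cols, rows, total) for covering a slide_w x slide_h pixel
    slide with non-overlapping tiles (partial edge tiles counted)."""
    cols = math.ceil(slide_w / tile)
    rows = math.ceil(slide_h / tile)
    return cols, rows, cols * rows

# Hypothetical gigapixel slide, e.g. 100,000 x 60,000 pixels at high magnification.
cols, rows, total = tile_grid(100_000, 60_000)
print(cols, rows, total)  # 391 235 91885 — tens of thousands of tiles
```

A sequence of this length is far beyond the context of a standard vision transformer over tiles, which is why the paper adapts LongNet's long-sequence attention for the slide-level encoder.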