Publication | Open Access

Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer

Citations: 222

References: 25

Year: 2021

TLDR

High-dimensional multi-omics data are now standard and can greatly enhance biological understanding when integrated effectively. Joint dimensionality reduction (jDR) methods are among the most efficient approaches to this integration, but many such methods exist, which calls for a benchmark. The study systematically evaluates nine representative jDR methods across three complementary benchmarks: (1) recovering ground-truth sample clustering from simulated multi-omics data, (2) predicting survival, clinical annotations, and known pathways in TCGA cancer data, and (3) classifying multi-omics single-cell data. intNMF performs best at clustering, while MCIA performs well across many contexts. The benchmark code is shared in a Jupyter notebook, multi-omics mix (momix), to support reproducibility.

Abstract

High-dimensional multi-omics data are now standard in biology. They can greatly enhance our understanding of biological systems when effectively integrated. To achieve proper integration, joint Dimensionality Reduction (jDR) methods are among the most efficient approaches. However, several jDR methods are available, which calls for a comprehensive benchmark with practical guidelines. We perform a systematic evaluation of nine representative jDR methods using three complementary benchmarks. First, we evaluate their performance in retrieving ground-truth sample clustering from simulated multi-omics datasets. Second, we use TCGA cancer data to assess their strengths in predicting survival, clinical annotations and known pathways/biological processes. Finally, we assess their classification of multi-omics single-cell data. From these in-depth comparisons, we observe that intNMF performs best in clustering, while MCIA performs effectively across many contexts. The code developed for this benchmark study is implemented in a Jupyter notebook, multi-omics mix (momix), to foster reproducibility and to support users and future developers.
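
The first benchmark scores how well a method's output recovers a known sample clustering. The abstract does not name the agreement measure, so the sketch below assumes one common choice: a Jaccard index over pairs of samples that share a cluster. The function names and the toy labels are hypothetical and are not taken from the momix notebook.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Return the set of sample-index pairs that share a cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pairwise_jaccard(truth, predicted):
    """Jaccard index between the co-clustered pairs of two labelings."""
    if len(truth) != len(predicted):
        raise ValueError("labelings must cover the same samples")
    a = co_clustered_pairs(truth)
    b = co_clustered_pairs(predicted)
    union = a | b
    if not union:
        # Neither labeling groups any two samples together.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical example: six simulated samples in three ground-truth clusters.
truth = [0, 0, 1, 1, 2, 2]
# Made-up cluster assignments, standing in for clusters derived from
# the factors of some jDR method.
predicted = [1, 1, 0, 0, 2, 0]

score = pairwise_jaccard(truth, predicted)
print(score)
```

Because the score compares pairs of samples rather than label values, it does not depend on how the clusters are numbered, which matters when the predicted labels come from an unsupervised step.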
