Concepedia

TLDR

Cancer translational research relies on identifying clinically relevant tumor subtypes and omics signatures, yet despite large‑scale genomic profiling resources such as TCGA, few computationally efficient methods exist for integrative clustering of multi‑type omics data. The study aims to develop iClusterBayes, a fully Bayesian latent variable method that jointly models continuous and discrete omics data to identify tumor subtypes and relevant omics features. iClusterBayes employs a small number of latent variables to capture the shared structure across multiple omics datasets, enabling joint dimensionality reduction, clustering of tumor samples in latent space, and Bayesian variable selection to identify features driving the clusters. Compared to iClusterPlus, iClusterBayes delivers superior statistical inference and faster computation, and analyses of TCGA and simulated datasets show it effectively reveals clinically meaningful tumor subtypes and driver omics features.

Abstract

Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features.
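
To make the workflow in the abstract concrete (joint dimension reduction, clustering in the latent space, then picking out the features that drive the clusters), here is a minimal Python sketch. It is not iClusterBayes: it swaps the Bayesian latent variable model for a plain SVD of the stacked, standardized data blocks, swaps the clustering step for a hand-written k-means, and uses absolute loadings as a crude stand-in for Bayesian variable selection. The simulated data, the function names (`standardize`, `latent_scores`, `kmeans`), and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit variance (constant columns stay at zero)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0
    return (X - mu) / sd


def latent_scores(blocks, k):
    """Stack standardized blocks column-wise; return k latent scores and feature loadings.

    SVD is used here as a simple stand-in for the latent variable model.
    """
    stacked = np.hstack([standardize(b) for b in blocks])
    U, S, Vt = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :k] * S[:k], Vt[:k].T


def kmeans(Z, n_clusters, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Hypothetical simulated cohort: 100 samples, two subtypes of 50 samples each.
rng = np.random.default_rng(0)
n = 100
truth = np.repeat([0, 1], n // 2)

# Continuous block (e.g. expression): 20 features, the first 5 shifted in subtype 1.
expr = rng.normal(size=(n, 20))
expr[:, :5] += 3.0 * truth[:, None]

# Binary block (e.g. mutation status): 10 features, the first 3 enriched in subtype 1.
p = np.full((n, 10), 0.1)
p[truth == 1, :3] = 0.9
mut = (rng.random((n, 10)) < p).astype(int)

# Joint dimension reduction to one latent variable, then clustering in that space.
Z, W = latent_scores([expr, mut], k=1)
labels = kmeans(Z, 2)

# Cluster labels are arbitrary, so score agreement up to a label swap.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))

# Rank features by absolute loading; columns 0-19 are expr, 20-29 are mut.
top_features = np.argsort(-np.abs(W[:, 0]))[:8]
print("agreement with simulated subtypes:", agreement)
print("top-loading feature columns:", sorted(top_features.tolist()))
```

The sketch only conveys the shape of the pipeline. The method described in the abstract instead models continuous and discrete data types jointly within a Bayesian framework and selects features through Bayesian variable selection rather than a loading cutoff.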
