Publication | Open Access
Regularized Multivariate Regression for Identifying Master Predictors with Application to Integrative Genomics Study of Breast Cancer.
Citations: 247 | References: 55 | Year: 2010
Topics: Breast Oncology, Engineering, Genetics, Gene Expression Profiling, High Dimensionality, Biomedical Data Science, Identifying Master Predictors, Biostatistics, Molecular Diagnostics, Microarray Data Analysis, Statistics, Cancer Research, Prediction Modelling, Master Predictors, Predictive Analytics, Gene Expression, Bioinformatics, Functional Genomics, Computational Biology, Breast Cancer, Regulatory Network Modelling, Systems Biology, Medicine, Multivariate Regression
The study proposes remMap, a regularized multivariate regression method for identifying master predictors in high-dimension-low-sample-size settings. The method is motivated by the problem of investigating regulatory relationships among biological molecules using high-dimensional genomic data, particularly the influence of DNA copy number alterations on RNA transcript levels. remMap models RNA expression levels as a function of DNA copy numbers via multivariate linear regression, with regularization used both to handle the high dimensionality and to incorporate desired network structures. Criteria for selecting the tuning parameters are discussed, and the method is evaluated through simulations and applied to 172 breast cancer tumor samples. The analysis identified a trans-hub region in cytoband 17q12-q21 whose amplification influences the RNA expression levels of more than 30 unlinked genes, a finding that may improve the understanding of breast cancer pathology.
In this paper, we propose a new method, remMap (REgularized Multivariate regression for identifying MAster Predictors), for fitting multivariate response regression models under the high-dimension-low-sample-size setting. remMap is motivated by investigating the regulatory relationships among different biological molecules based on multiple types of high-dimensional genomic data. In particular, we are interested in studying the influence of DNA copy number alterations on RNA transcript levels. For this purpose, we model the dependence of the RNA expression levels on DNA copy numbers through multivariate linear regressions and utilize proper regularization to deal with the high dimensionality as well as to incorporate desired network structures. Criteria for selecting the tuning parameters are also discussed. The performance of the proposed method is illustrated through extensive simulation studies. Finally, remMap is applied to a breast cancer study, in which genome-wide RNA transcript levels and DNA copy numbers were measured for 172 tumor samples. We identify a trans-hub region in cytoband 17q12-q21, whose amplification influences the RNA expression levels of more than 30 unlinked genes. These findings may lead to a better understanding of breast cancer pathology.
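
The idea of regularized multivariate regression with "master predictors" can be illustrated with a small sketch. The abstract does not state the penalty, so this example assumes a generic combination of an elementwise L1 penalty and a row-wise L2 (group) penalty on the coefficient matrix, fitted by proximal gradient descent. The function names (`prox_row_sparse`, `fit_row_sparse_regression`, `simulate`), the tuning values, and the toy data are all hypothetical. It is not the authors' implementation, and for simplicity the toy data has more samples than predictors.

```python
import numpy as np

def prox_row_sparse(B, l1, l2):
    """Proximal step for l1*||B||_1 + l2*sum_p ||B[p, :]||_2 (assumed penalty)."""
    # Elementwise soft-thresholding handles the L1 part.
    S = np.sign(B) * np.maximum(np.abs(B) - l1, 0.0)
    # Row-wise shrinkage handles the group part; whole rows can become zero.
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - l2 / np.maximum(norms, 1e-12), 0.0)
    return S * scale

def fit_row_sparse_regression(X, Y, lam1, lam2, n_iter=500):
    """Minimize 0.5*||Y - X B||_F^2 plus the assumed penalty by proximal gradient."""
    p, q = X.shape[1], Y.shape[1]
    B = np.zeros((p, q))
    # Step size 1/L, where L is the largest eigenvalue of X^T X.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y)
        B = prox_row_sparse(B - step * grad, step * lam1, step * lam2)
    return B

def simulate(n=100, p=30, q=15, seed=0):
    """Toy data: predictor 3 is a 'hub' affecting 10 of the q responses."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    B_true = np.zeros((p, q))
    B_true[3, :10] = 2.0
    Y = X @ B_true + 0.1 * rng.standard_normal((n, q))
    return X, Y, B_true

if __name__ == "__main__":
    X, Y, B_true = simulate()
    B_hat = fit_row_sparse_regression(X, Y, lam1=5.0, lam2=5.0)
    row_norms = np.linalg.norm(B_hat, axis=1)
    print("predictor with largest row norm:", int(np.argmax(row_norms)))
```

In this sketch, each row of the estimated coefficient matrix corresponds to one predictor (for example, one DNA copy number region) and each column to one response (one RNA transcript). A row with a large norm and many nonzero entries plays the role of a hub predictor that influences many responses.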