Publication | Open Access

PriPath: Identifying Dysregulated Pathways from Differential Gene Expression via Grouping, Scoring and Modeling with an Embedded Machine Learning Approach

Citations: 11

References: 38

Year: 2022

Abstract

Background: Cell homeostasis relies on the concerted actions of several genes, and dysregulated genes lead to disease manifestations. In living organisms, genes and their products do not act alone but within a large network. Subsets of these networks can be viewed as modules that provide specific functionality in an organism. The Kyoto Encyclopedia of Genes and Genomes (KEGG) systematically analyzes gene functions, proteins, and molecules, and provides a PATHWAY database. Measurements of gene expression (e.g., RNA-seq data) can be mapped onto KEGG pathways to determine which modules are affected or dysregulated in a disease. However, genes that act in multiple pathways, along with other inherent issues, complicate such analyses. To detect dysregulated pathways, current approaches may employ only gene expression data and neglect some of the existing knowledge stored in KEGG pathways. A more holistic association between gene expression and pathways requires new approaches that take more of the compiled information into account.

Results: PriPath is a novel approach that applies the generic scheme of grouping and scoring followed by modeling to the analysis of gene expression with KEGG pathways. In PriPath, KEGG pathways serve as the grouping information, which is passed to a machine learning algorithm that selects the most significant KEGG pathways. The selected groups are then used to train a machine learning model for the classification task. We tested PriPath on 13 gene expression datasets of various cancers and other diseases. The approach successfully assigned biologically and clinically relevant KEGG terms to the differentially expressed genes. We comparatively evaluated the performance of PriPath against other comparable tools. For each dataset, we manually confirmed the top results of PriPath in the literature and compared PriPath predictions with those of Reactome and DAVID.

Conclusions: PriPath can thus aid in determining dysregulated pathways, which is applicable to medical diagnostics. In the future, we aim to advance this approach so that it can support patient stratification based on gene expression and the identification of druggable targets, thereby covering two aspects of precision medicine.
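The grouping, scoring, and modeling steps described in the Results can be illustrated with a minimal Python sketch. This is not the PriPath implementation: the abstract does not name the scoring algorithm, so a nearest-centroid classifier evaluated with leave-one-out accuracy is used here purely as a stand-in. The gene names, pathway names, expression values, and all function names are hypothetical.

```python
from statistics import mean

def predict_nearest_centroid(train_X, train_y, x):
    """Return the label whose class centroid is closest to sample x."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda c: sum((p - q) ** 2 for p, q in zip(centroids[c], x)),
    )

def loo_accuracy(X, y):
    """Leave-one-out accuracy of the stand-in classifier (an assumption)."""
    hits = sum(
        predict_nearest_centroid(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i]) == y[i]
        for i in range(len(X))
    )
    return hits / len(X)

def project(expr, genes, members):
    """Keep only the expression columns of genes that belong to a group."""
    idx = [genes.index(g) for g in members if g in genes]
    return [[row[j] for j in idx] for row in expr]

def score_pathways(expr, labels, genes, pathways):
    """Grouping and scoring: rank pathways by how well their genes separate classes."""
    scores = {
        name: loo_accuracy(project(expr, genes, members), labels)
        for name, members in pathways.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

def top_genes(ranked, pathways, k):
    """Modeling input: union of genes from the k best-scoring pathways."""
    selected = []
    for name, _ in ranked[:k]:
        selected += [g for g in pathways[name] if g not in selected]
    return selected

# Hypothetical toy data: rows are samples, columns follow `genes`.
genes = ["g1", "g2", "g3", "g4"]
pathways = {"pathway_A": ["g1", "g2"], "pathway_B": ["g3", "g4"]}
expr = [
    [1.0, 1.2, 5.0, 3.0],
    [0.9, 1.0, 3.0, 5.0],
    [1.1, 0.8, 4.0, 4.1],
    [3.0, 3.1, 4.9, 3.1],
    [3.2, 2.9, 3.1, 4.9],
    [2.9, 3.3, 4.0, 4.0],
]
labels = ["control"] * 3 + ["disease"] * 3

ranked = score_pathways(expr, labels, genes, pathways)  # grouping + scoring
selected = top_genes(ranked, pathways, k=1)             # keep the best group
final_accuracy = loo_accuracy(project(expr, genes, selected), labels)  # modeling
print(ranked, selected, final_accuracy)
```

In this toy setup, only the genes of `pathway_A` differ between the two classes, so that group ranks first and the final model is trained on its genes alone. A real analysis would substitute actual KEGG gene sets, a held-out test split, and the learner of choice.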
