Publication | Open Access
Towards Universal Cell Embeddings: Integrating Single-cell RNA-seq Datasets across Species with SATURN
Citations: 21
References: 39
Year: 2023
Unknown Venue
Engineering, Genetics, Genomics, Bioinformatics Database, Protein Embeddings, Molecular Ecology, Single Cell Sequencing, Single-cell RNA-seq Datasets, Computational Genomics, Language Models, Translational Bioinformatics, RNA Biology, Single-cell Genomics, Gene Expression, Single-cell Analysis, Functional Genomics, Bioinformatics, Biology, Gene Sequence Annotation, Evolutionary Biology, Computational Biology, Universal Cell Embeddings, Systems Biology, Medicine
Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, inter-species genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here, we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN has a unique ability to detect functionally related genes co-expressed across species, redefining differential expression for cross-species analysis. We apply SATURN to three-species whole-organism atlases and to frog and zebrafish embryogenesis datasets. We show that cell embeddings learnt in SATURN can be effectively used to transfer annotations across species and to identify both homologous and species-specific cell types, even across evolutionarily remote species. Finally, we use SATURN to reannotate the five-species Cell Atlas of Human Trabecular Meshwork and Aqueous Outflow Structures and find evidence of potentially divergent functions between glaucoma-associated genes in humans and other species.
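The idea of coupling protein embeddings with RNA expression can be illustrated with a toy sketch. This is not SATURN's implementation: the function name `shared_space_expression`, the softmax weighting, the `temperature` parameter, and the random stand-ins for protein embeddings and expression counts are all assumptions made for illustration. The sketch only shows how two species with different gene sets might be projected into a common feature space when each gene is described by an embedding rather than by its name.

```python
import numpy as np

def shared_space_expression(expr, gene_emb, centroids, temperature=1.0):
    """Project per-gene expression onto shared centroid features (toy sketch).

    expr:      (cells, genes) expression counts for one species
    gene_emb:  (genes, dim) one embedding per gene (random stand-ins here)
    centroids: (k, dim) points in embedding space shared by all species
    """
    # Squared Euclidean distance from every gene to every centroid.
    d2 = ((gene_emb[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    # Softmax over centroids: genes close to a centroid contribute more to it.
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # (cells, genes) @ (genes, k) -> (cells, k)
    return expr @ weights

rng = np.random.default_rng(0)
dim, k = 8, 4
centroids = rng.normal(size=(k, dim))  # hypothetical shared anchors

# Two hypothetical species with different numbers of genes and no shared gene names.
frog_expr = rng.poisson(2.0, size=(5, 30)).astype(float)
fish_expr = rng.poisson(2.0, size=(6, 45)).astype(float)
frog_emb = rng.normal(size=(30, dim))
fish_emb = rng.normal(size=(45, dim))

frog_feat = shared_space_expression(frog_expr, frog_emb, centroids)
fish_feat = shared_space_expression(fish_expr, fish_emb, centroids)
print(frog_feat.shape, fish_feat.shape)  # (5, 4) (6, 4)
```

Both species end up with the same number of features, so their cells could be compared or fed to a common model without restricting the analysis to homologous genes. In practice the embeddings would come from a protein language model rather than a random generator.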