Publication | Open Access
A step-by-step workflow for low-level analysis of single-cell RNA-seq data
Citations: 821 | References: 44 | Year: 2016
Engineering, Genetics, Single-cell RNA-seq Data, Multiomics, Transcriptomics Technology, Genomics, Stem Cell Biology, Gene Expression Profiling, Trajectory Analysis, Single Cell Sequencing, Bioinformatics Pipelines, Transcriptomics, Molecular Diagnostics, RNA Sequencing, Translatomics, Single-cell Genomics, Omics, Bulk RNA Sequencing, Gene Expression, Single-cell Analysis, Functional Genomics, Sequencing, Cell Biology, Bioinformatics, scRNA-seq Data, Computational Biology, Single-cell Biology, Systems Biology, Medicine
Single‑cell RNA sequencing provides cell‑level transcriptomic resolution that bulk RNA‑seq cannot match, but its higher technical noise and data complexity require dedicated analytical methods rather than repurposing bulk pipelines. This article presents a computational workflow for low‑level scRNA‑seq analysis built on Bioconductor software. The workflow includes quality control, data exploration, normalization, cell‑cycle assignment, identification of highly variable and correlated genes, clustering into subpopulations, and marker gene detection, and is illustrated on multiple publicly available datasets.
Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. Because of these differences, scRNA-seq data cannot be analysed by recycling bioinformatics pipelines designed for bulk RNA-seq data. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. The analyses are demonstrated on gene-level count data from several publicly available data sets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. These examples provide a range of usage scenarios from which readers can construct their own analysis pipelines.
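To make the first few steps concrete, the following is a minimal sketch of quality control, normalization and highly variable gene ranking on a gene-by-cell count matrix. It is not the workflow's own code: the workflow is built on Bioconductor packages in R, whereas this sketch uses plain NumPy. The function names (`is_low_outlier`, `basic_workflow`), the threshold of 3 median absolute deviations, the simple library-size scaling and the ranking of genes by raw variance of log-expression are all assumptions chosen for illustration, and the toy counts are invented.

```python
import numpy as np


def is_low_outlier(values, nmads=3.0):
    """Flag values more than `nmads` MADs below the median (hypothetical helper)."""
    med = np.median(values)
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    mad = 1.4826 * np.median(np.abs(values - med))
    return values < med - nmads * mad


def basic_workflow(counts, nmads=3.0, n_top=2):
    """Sketch of QC, normalization and variable-gene ranking.

    counts: genes x cells matrix of non-negative counts.
    Returns (keep mask, size factors, log-expression, indices of top genes).
    """
    counts = np.asarray(counts, dtype=float)

    # Quality control: drop cells with an unusually small library size
    # or unusually few detected genes, both judged on a log scale.
    lib_size = counts.sum(axis=0)
    n_detected = (counts > 0).sum(axis=0)
    keep = ~(
        is_low_outlier(np.log10(lib_size + 1), nmads)
        | is_low_outlier(np.log10(n_detected + 1), nmads)
    )
    kept = counts[:, keep]

    # Normalization: library-size factors scaled to have mean 1
    # (an assumption; more robust estimators exist).
    size_factors = kept.sum(axis=0)
    size_factors = size_factors / size_factors.mean()
    logexpr = np.log2(kept / size_factors + 1.0)

    # Variable genes: rank by variance of log-expression. This simplification
    # does not separate technical from biological variance.
    gene_var = logexpr.var(axis=1, ddof=1)
    top = np.argsort(gene_var)[::-1][:n_top]
    return keep, size_factors, logexpr, top


# Invented toy data: 4 genes x 6 cells; the last cell has almost no counts.
toy_counts = [
    [10, 12, 11, 9, 10, 1],
    [20, 18, 22, 21, 19, 0],
    [5, 50, 5, 50, 5, 0],
    [30, 31, 29, 30, 30, 1],
]
keep, size_factors, logexpr, top = basic_workflow(toy_counts, n_top=1)
print("cells kept:", keep)
print("most variable gene index:", top[0])
```

In this toy example the near-empty sixth cell is removed at the quality control step, and the third gene, whose counts alternate between 5 and 50, ranks as the most variable. A real analysis would also need to model technical noise before calling a gene highly variable.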