Publication | Open Access
HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient
Citations: 585 | References: 31 | Year: 2017
Engineering, Genetics, Stratum-adjusted Correlation Coefficient, Data Preparation, Molecular Biology, Hi-C Data Reproducibility, Genomics, Computational Reproducibility, Bioinformatics Database, Reproducible Research, Data Science, Scientific Data Management, Computational Genomics, Data Integration, Biostatistics, Data Management, Statistics, Hi-C Data, Sequence Analysis, Omics, Domain Structure, Bioinformatics, Functional Genomics, Chromatin, Next-generation Sequencing, Computational Biology, Systems Biology, Medicine, Present HiCRep, Sequence Assembly, Data Modeling
Hi-C is a powerful technology for studying genome-wide chromatin interactions, yet existing methods for assessing reproducibility often give misleading results because they ignore spatial features such as domain structure and distance dependence. We present HiCRep, a framework that systematically accounts for these spatial features when assessing Hi-C data reproducibility. HiCRep introduces the stratum-adjusted correlation coefficient (SCC), a novel similarity measure that incorporates domain structure and distance dependence, and is implemented in a freely available R package. SCC provides a statistically sound, accurate, and scalable assessment of Hi-C reproducibility that outperforms existing methods. It can also quantify differences between contact matrices and help determine the optimal sequencing depth, and it serves as an easy-to-interpret, automatable quality-control metric.
Hi-C is a powerful technology for studying genome-wide chromatin interactions. However, current methods for assessing Hi-C data reproducibility can produce misleading results because they ignore spatial features in Hi-C data, such as domain structure and distance dependence. We present HiCRep, a framework for assessing the reproducibility of Hi-C data that systematically accounts for these features. In particular, we introduce a novel similarity measure, the stratum-adjusted correlation coefficient (SCC), for quantifying the similarity between Hi-C interaction matrices. SCC not only provides a statistically sound and reliable evaluation of reproducibility, but can also be used to quantify differences between Hi-C contact matrices and to determine the optimal sequencing depth for a desired resolution. The measure consistently shows higher accuracy than existing approaches in distinguishing subtle differences in reproducibility and in depicting interrelationships of cell lineages. It is straightforward to interpret and easy to compute, making it well suited to standardized, interpretable, automatable, and scalable quality control. The freely available R package HiCRep implements our approach.
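To make the idea of a stratum-adjusted correlation concrete, the following is a minimal Python sketch, not the HiCRep R package itself. It assumes that each stratum is one diagonal of the contact matrix (all bin pairs at the same genomic distance), that a Pearson correlation is computed per stratum, and that strata are combined with weights equal to the stratum size times the product of the two standard deviations. That weighting is an assumption made for illustration; the abstract does not state it. The function names `diagonal` and `stratum_adjusted_correlation` and the `max_offset` parameter are hypothetical, and any smoothing or normalization of the matrices beforehand is omitted.

```python
import math


def diagonal(matrix, k):
    """Return the k-th upper diagonal: all bin pairs at genomic distance k."""
    return [matrix[i][i + k] for i in range(len(matrix) - k)]


def stratum_adjusted_correlation(x, y, max_offset):
    """Weighted average of per-diagonal Pearson correlations (illustrative).

    x, y       : square contact matrices (lists of lists) of equal size
    max_offset : largest diagonal offset (distance in bins) to include
    """
    n = len(x)
    if len(y) != n or any(len(r) != n for r in x) or any(len(r) != n for r in y):
        raise ValueError("x and y must be square matrices of the same size")

    weighted_sum = 0.0
    weight_total = 0.0
    # A stratum needs at least two entries, so the offset cannot exceed n - 2.
    for k in range(min(max_offset, n - 2) + 1):
        xk, yk = diagonal(x, k), diagonal(y, k)
        nk = len(xk)
        mean_x = sum(xk) / nk
        mean_y = sum(yk) / nk
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xk, yk)) / nk
        sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in xk) / nk)
        sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in yk) / nk)
        if sd_x == 0.0 or sd_y == 0.0:
            continue  # correlation is undefined for a constant stratum
        rho = cov / (sd_x * sd_y)
        weight = nk * sd_x * sd_y  # assumed weighting, see the note above
        weighted_sum += weight * rho
        weight_total += weight

    return weighted_sum / weight_total if weight_total > 0 else float("nan")
```

Because each stratum is correlated separately, the strong decay of contact frequency with genomic distance cannot inflate the score the way it does when a single correlation is computed over the whole matrix. In this sketch, comparing a matrix with itself returns a value of 1 up to floating-point error, and the function returns NaN when no stratum has any variance.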