Publication | Open Access
A rapid, accurate approach to inferring pedigrees in endogamous populations
Citations: 21
References: 62
Year: 2020
Unknown Venue
Genetic Testing, Genetics, Genetic Epidemiology, Linkage Analysis, Haplotype Sharing, Genotype-phenotype Association, Human Variation, Breeding, Biostatistics, Endogamous Populations, Public Health, Accurate Reconstruction, Pedigree Analysis, Statistical Genetics, Genetic Variation, Population Genetics, Evolutionary Biology, Rapid Algorithm, Genetic Admixture, Population Genomics, Medicine
ABSTRACT Accurate reconstruction of pedigrees from genetic data remains a challenging problem. Pedigree inference algorithms are often trained only on European-descent families in urban locations. Many relationship categories can be difficult to distinguish (e.g. half-sibships versus avuncular pairs) without external information. Furthermore, existing methods perform poorly in endogamous populations, where pedigrees may contain reticulations and haplotype sharing is elevated. We present a simple, rapid algorithm that initially uses only high-confidence first-degree relationships to seed a machine learning step based on summary statistics of identity-by-descent (IBD) sharing. One of these statistics, our “haplotype score”, is novel and can be used to: (1) distinguish half-sibling pairs from avuncular or grandparent-grandchild pairs; and (2) assign individuals to the ancestor versus the descendant generation. We test our approach in a sample of 700 individuals from northern Namibia, drawn from an endogamous population called the Himba. Due to a culture of concurrent relationships in the Himba, there is a high proportion of half-sibships. We accurately identify first- through fourth-degree relationships and distinguish between the various second-degree relationships: half-sibships, avuncular pairs, and grandparent-grandchild pairs. We further validate our approach in a second, diverse African-descent dataset, the Barbados Asthma Genetics Study (BAGS). Accurate reconstruction of pedigrees holds promise for tracing allele frequency trajectories, improving phasing, and addressing other population genomic questions.
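To illustrate why second-degree relationships are hard to separate from IBD proportions alone, here is a minimal sketch in Python. It is not the authors' algorithm and does not implement their haplotype score. It only uses the textbook expectation that relatives of degree d in an outbred pedigree share a proportion of about 2^-d of their alleles IBD. The function names `expected_ibd_proportion` and `nearest_degree` and the cutoff `max_degree` are hypothetical choices made for illustration.

```python
import math
from typing import Optional


def expected_ibd_proportion(degree: int) -> float:
    """Textbook expected proportion of alleles shared IBD for a
    relationship of the given degree in an outbred pedigree (2^-degree)."""
    return 0.5 ** degree


def nearest_degree(ibd_proportion: float, max_degree: int = 4) -> Optional[int]:
    """Assign the degree whose expected sharing is closest on a log2 scale.

    Returns None when the pair shares too little to fall within max_degree.
    This is an illustrative rule, not the method described in the abstract.
    """
    if ibd_proportion <= 0:
        return None
    # Invert 2^-d: a proportion of 0.25 maps to degree 2, 0.125 to degree 3.
    degree = max(1, round(-math.log2(ibd_proportion)))
    return degree if degree <= max_degree else None


# Half-siblings, avuncular pairs, and grandparent-grandchild pairs all have
# the same expected proportion, so this rule cannot tell them apart.
second_degree_types = ["half-sibling", "avuncular", "grandparent-grandchild"]
for name in second_degree_types:
    print(name, expected_ibd_proportion(2), nearest_degree(0.25))
```

All three second-degree categories map to the same expected value, which is the ambiguity the abstract says the haplotype score is meant to resolve. In an endogamous population, background sharing would also inflate the observed proportions, so fixed outbred expectations like these would tend to overestimate relatedness.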