Publication | Closed Access
Inferring weak population structure with the assistance of sample group information
Citations: 3.5K | References: 18 | Year: 2009
Genetics, Population Dynamic, Sampling Technique, Sample Group Information, Genetic Analysis, Data Science, Molecular Ecology, Human Variation, Computational Genomics, Biostatistics, Public Health, Statistics, Genetic Clustering Algorithms, Sampling Theory, Statistical Genetics, Sampling (Statistics), Genetic Variation, Population Study, Population Genetics, Bioinformatics, Structure Program, Computational Biology, New Models, Statistical Inference, Genetic Admixture, Population Genomics, Medicine, Weak Population Structure
Genetic clustering algorithms require a certain amount of data to produce informative results. In the common situation that individuals are sampled at several locations, we show how sample group information can be used to achieve better results when the amount of data is limited. We develop new structure models that modify the prior distribution for each individual's population assignment, allowing cluster proportions to vary by location, and test them on simulated data and CEPH microsatellite data. We demonstrate that the new models detect structure at lower divergence levels or with less data than original structure models or principal components methods, are unbiased when structure is absent, and are implemented in a freely available online version of structure.
Genetic clustering algorithms require a certain amount of data to produce informative results. In the common situation that individuals are sampled at several locations, we show how sample group information can be used to achieve better results when the amount of data is limited. New models are developed for the structure program, both for the cases of admixture and no admixture. These models work by modifying the prior distribution for each individual's population assignment. The new prior distributions allow the proportion of individuals assigned to a particular cluster to vary by location. The models are tested on simulated data, and illustrated using microsatellite data from the CEPH Human Genome Diversity Panel. We demonstrate that the new models allow structure to be detected at lower levels of divergence, or with less data, than the original structure models or principal components methods, and that they are not biased towards detecting structure when it is not present. These models are implemented in a new version of structure which is freely available online at http://pritch.bsd.uchicago.edu/structure.html.
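The abstract does not spell out the exact form of the new priors, so the following is only a rough sketch of the general idea, not the structure implementation. It assumes a no-admixture setting in which each location has its own cluster proportions, drawn from a Dirichlet distribution centred on global proportions. The names `eta` (global proportions), `r` (how tightly locations follow the global proportions), and both helper functions are hypothetical and chosen for illustration.

```python
import math
import random


def sample_location_proportions(eta, r, rng):
    """Draw location-specific cluster proportions from Dirichlet(r * eta).

    Assumption for illustration: a Dirichlet draw built from normalized
    gamma variates. Larger r keeps each location closer to eta.
    """
    draws = [rng.gammavariate(r * e, 1.0) for e in eta]
    total = sum(draws)
    return [d / total for d in draws]


def assignment_posterior(log_liks, prior):
    """Posterior over clusters, proportional to prior[k] * exp(log_liks[k])."""
    weights = [
        (math.log(p) if p > 0 else float("-inf")) + ll
        for p, ll in zip(prior, log_liks)
    ]
    top = max(weights)  # subtract the maximum for numerical stability
    unnorm = [math.exp(w - top) for w in weights]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical individual whose genotype only weakly favours cluster 0.
log_liks = [-10.0, -10.2]

# Prior that ignores sampling location.
uniform_post = assignment_posterior(log_liks, [0.5, 0.5])

# Prior for a location where most individuals belong to cluster 1.
location_post = assignment_posterior(log_liks, [0.1, 0.9])

# One random draw of location-specific proportions around global ones.
rng = random.Random(0)
gamma_l = sample_location_proportions([0.5, 0.5], r=5.0, rng=rng)
```

With weak genetic signal, the location-aware prior can shift the assignment towards the cluster that is common at the sampling location, whereas the location-agnostic prior leaves the individual close to evenly split. When the data are informative, the likelihood term dominates and the two priors give similar assignments.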
| Year | Citations | |
|---|---|---|
| 2000 | 33.7K | |
| 2005 | 21.6K | |
| 1979 | 14.2K | |
| 2003 | 8K | |
| 2006 | 5.5K | |
| 2004 | 5.1K | |
| 2007 | 3.5K | |
| 2002 | 3K | |
| 2003 | 874 | |
| 2006 | 645 | |