Publication | Open Access
Maximum likelihood estimation of the heterogeneity of substitution rate among nucleotide sites.
Year: 1995
Maximum Likelihood Estimation, Genetics, Genomics, Sequence Alignment, Substitution Rate, Molecular Ecology, Rate Heterogeneity, Computational Genomics, Genome Analysis, Heterogeneity Rho, Biostatistics, Sequence Analysis, Statistical Genetics, Maximum Likelihood Approach, Genetic Variation, Population Genetics, Bioinformatics, Linkage Disequilibrium, Nucleotide Sites, Natural Sciences, Evolutionary Biology, Population Genomics, Medicine
This paper presents a maximum likelihood approach to estimating the variation of substitution rate among nucleotide sites. We assume that the rate varies among sites according to an invariant + gamma distribution, which has two parameters: the gamma parameter alpha and the proportion of invariable sites theta. Theoretical treatments for three, four, and five sequences have been conducted, and computer programs have been developed. It is shown that rho = (1 + theta*alpha)/(1 + alpha) is a good measure of the rate heterogeneity among sites. Extensive simulations show that (1) if the proportion of invariable sites is negligible, i.e., theta = 0, the gamma parameter alpha can be satisfactorily estimated, even with three sequences; (2) if the proportion of invariable sites is not negligible, the heterogeneity rho can still be suitably estimated with four or more sequences; and (3) the distances estimated by the proposed method are almost unbiased and are robust against violation of the assumption of the invariant + gamma distribution.
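To make the two parameters concrete, the following is a minimal Python sketch, not the authors' program. It evaluates the heterogeneity measure rho from the formula in the abstract and draws per-site rates from an invariant + gamma mixture. The function names `rho` and `sample_site_rates` are hypothetical. Two further points are assumptions made here for illustration rather than details taken from the paper: alpha is treated as the gamma shape parameter, and the gamma component is scaled so that the mean rate over all sites is 1.

```python
import random


def rho(alpha, theta):
    """Heterogeneity measure rho = (1 + theta*alpha) / (1 + alpha)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    return (1.0 + theta * alpha) / (1.0 + alpha)


def sample_site_rates(n_sites, alpha, theta, seed=0):
    """Draw per-site relative rates from an invariant + gamma mixture.

    With probability theta a site is invariable (rate 0). Otherwise its
    rate is gamma distributed with shape alpha. ASSUMPTION: the scale is
    chosen as 1 / (alpha * (1 - theta)) so that the mean rate over all
    sites equals 1; the abstract does not state a normalization.
    """
    if alpha <= 0 or not 0.0 <= theta < 1.0:
        raise ValueError("need alpha > 0 and 0 <= theta < 1")
    rng = random.Random(seed)
    scale = 1.0 / (alpha * (1.0 - theta))
    rates = []
    for _ in range(n_sites):
        if rng.random() < theta:
            rates.append(0.0)  # invariable site
        else:
            rates.append(rng.gammavariate(alpha, scale))  # variable site
    return rates


if __name__ == "__main__":
    # theta = 0 reduces to the pure gamma case, where rho = 1 / (1 + alpha).
    print(rho(1.0, 0.5))  # prints 0.75
    rates = sample_site_rates(10, alpha=0.5, theta=0.2)
    print(rates)
```

In this sketch rho approaches 0 as alpha grows with theta = 0 (nearly uniform rates) and approaches 1 as alpha shrinks or theta grows (strong heterogeneity), which is consistent with its use as a single summary of rate variation among sites.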