Publication | Open Access
PANTHER: A Library of Protein Families and Subfamilies Indexed by Function
Citations: 3K | References: 43 | Year: 2003
In the genomic era, a fundamental goal is to characterize protein function on a large scale. We describe PANTHER, a method that robustly and accurately relates protein sequence relationships to functional relationships. PANTHER comprises a library of protein families, each represented by a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree, together with an abbreviated ontology for summarizing and navigating molecular functions and biological processes, which enables mapping of sequence to function. Applied to three areas, PANTHER characterizes the size and sequence diversity of families and subfamilies, provides a high-level map of gene function across the human and mouse genomes, and ranks missense SNPs on a database-wide scale by their likelihood of affecting protein function.
In the genomic era, one of the fundamental goals is to characterize the function of proteins on a large scale. We describe a method, PANTHER, for relating protein sequence relationships to function relationships in a robust and accurate way. PANTHER is composed of two main components: the PANTHER library (PANTHER/LIB) and the PANTHER index (PANTHER/X). PANTHER/LIB is a collection of “books,” each representing a protein family as a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree. Functional divergence within the family is represented by dividing the tree into subtrees based on shared function, and by subtree HMMs. PANTHER/X is an abbreviated ontology for summarizing and navigating molecular functions and biological processes associated with the families and subfamilies. We apply PANTHER to three areas of active research. First, we report the size and sequence diversity of the families and subfamilies, characterizing the relationship between sequence divergence and functional divergence across a wide range of protein families. Second, we use the PANTHER/X ontology to give a high-level representation of gene function across the human and mouse genomes. Third, we use the family HMMs to rank missense single nucleotide polymorphisms (SNPs), on a database-wide scale, according to their likelihood of affecting protein function.
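The third application, ranking missense SNPs with the family HMMs, can be illustrated with a minimal sketch. The abstract does not state the scoring formula, so the sketch assumes one plausible form: the negative absolute log ratio of the position-specific probabilities of the wild-type and variant residues, so that more negative scores mark substitutions less compatible with the family. The function names, the probability floor, and the toy probabilities are hypothetical and stand in for values that would come from a real family HMM.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical per-position amino-acid probabilities, standing in for the
# match-state emission probabilities of a family HMM (values are invented).
PositionProbs = Dict[int, Dict[str, float]]
Snp = Tuple[int, str, str]  # (alignment position, wild-type residue, variant residue)


def substitution_score(probs: PositionProbs, position: int,
                       wild_type: str, variant: str,
                       floor: float = 1e-4) -> float:
    """Assumed score: -|ln(P_wild_type / P_variant)| at one position.

    Residues missing from the column get a small floor probability
    (an assumption) so the logarithm stays finite.
    """
    column = probs[position]
    p_wt = max(column.get(wild_type, 0.0), floor)
    p_var = max(column.get(variant, 0.0), floor)
    return -abs(math.log(p_wt / p_var))


def rank_snps(probs: PositionProbs, snps: List[Snp]) -> List[Tuple[Snp, float]]:
    """Return SNPs sorted from most to least likely to affect function."""
    scored = [(snp, substitution_score(probs, *snp)) for snp in snps]
    return sorted(scored, key=lambda item: item[1])


# Toy family: position 10 is strongly conserved, position 20 is variable.
toy_probs: PositionProbs = {
    10: {"G": 0.90, "A": 0.05, "S": 0.05},
    20: {"L": 0.30, "I": 0.30, "V": 0.30, "M": 0.10},
}
toy_snps: List[Snp] = [(20, "L", "I"), (10, "G", "A"), (10, "G", "W")]

ranked = rank_snps(toy_probs, toy_snps)
for snp, score in ranked:
    print(snp, round(score, 2))
```

In this toy example, the substitution at the conserved position to a residue never observed there ranks first, while the swap between equally probable residues at the variable position ranks last.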