Publication | Open Access
ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data
Citations: 15.1K | References: 22 | Year: 2010
Genetics, Genomics, Genetic Medicine, Clinical Genetics, Annotation Databases, Genome-wide Association Study, Molecular Ecology, Computational Genomics, Whole Genome Studies, Public Health, Molecular Diagnostics, Annovar Tool, Variant Interpretation, Statistical Genetics, Omics, Functional Annotation, Sequencing, Functional Genomics, Bioinformatics, Candidate Gene Analysis, Gene Sequence Annotation, Next-generation Sequencing, Annotation Data, Systems Biology, Medicine
High‑throughput sequencing generates vast genetic variation data, yet identifying functionally important variants remains challenging. The authors developed ANNOVAR to annotate SNVs and indels for functional impact, cytogenetic banding, importance scores, conserved regions, and known variant databases. ANNOVAR uses UCSC Genome Browser or any GFF3‑compatible database to annotate variants and applies a variants‑reduction protocol to filter millions of SNVs and indels, exemplified on a human genome with two causal mutations for Miller syndrome. The stepwise protocol excluded unlikely variants and identified 20 candidate genes, including the causal gene, and ANNOVAR completes gene‑based annotation in ~4 min and variants‑reduction in ~15 min on 4.7 million variants, enabling processing of hundreds of genomes per day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
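The stepwise 'variants reduction' idea described in the abstract can be illustrated with a minimal sketch. This is not ANNOVAR's implementation or its command-line interface: the `Variant` record, the effect labels in `PROTEIN_CHANGING`, the `in_known_db` flag, the `reduce_variants` function, and the two-hits-per-gene rule for a recessive disease are all assumptions chosen for illustration, and the toy data is invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, already-annotated variant record (not an ANNOVAR format)."""
    gene: str
    effect: str        # assumed labels, e.g. "nonsynonymous", "synonymous"
    in_known_db: bool  # assumed flag: variant is reported in a public catalogue


# Assumed set of effects treated as likely to alter the protein.
PROTEIN_CHANGING = {"nonsynonymous", "stopgain", "frameshift", "splicing"}


def reduce_variants(variants, min_hits_per_gene=2):
    """Apply three illustrative filters in sequence and return candidate genes."""
    # Step 1: keep only variants predicted to change the protein.
    step1 = [v for v in variants if v.effect in PROTEIN_CHANGING]
    # Step 2: drop variants already reported in known-variant databases.
    step2 = [v for v in step1 if not v.in_known_db]
    # Step 3: for a recessive model, keep genes with at least two remaining hits.
    by_gene = defaultdict(list)
    for v in step2:
        by_gene[v.gene].append(v)
    return {g: vs for g, vs in by_gene.items() if len(vs) >= min_hits_per_gene}


# Invented toy input: only GENE_A has two novel protein-changing variants.
toy = [
    Variant("GENE_A", "nonsynonymous", False),
    Variant("GENE_A", "nonsynonymous", False),
    Variant("GENE_B", "stopgain", False),
    Variant("GENE_C", "nonsynonymous", True),
    Variant("GENE_C", "nonsynonymous", False),
    Variant("GENE_D", "synonymous", False),
    Variant("GENE_D", "synonymous", False),
]

candidates = reduce_variants(toy)
print(sorted(candidates))
```

Each step only shrinks the list, so the filters can be reordered or extended without changing the overall shape of the procedure; the real protocol may use different or additional criteria.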
| Year | Citations |
|---|---|
| 2003 | 6.7K |
| 2006 | 4.6K |
| 2005 | 4.2K |
| 2008 | 3.7K |
| 2009 | 2.4K |
| 2002 | 2.3K |
| 2009 | 2K |
| 2009 | 1.9K |
| 2001 | 896 |
| 2001 | 661 |