Publication | Open Access
Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
Citations: 1.8K | References: 24 | Year: 2005
Genome‑wide expression profiling, especially with Affymetrix GeneChips, is a powerful tool, but its probe design relied on outdated genome and transcriptome annotations, causing significant informatics problems that distort data interpretation. This study aims to resolve these annotation issues by providing updated probe set definitions. The authors identified probe‑level problems using current genome and transcriptome databases, then reorganized probes on popular GeneChips into gene‑, transcript‑, and exon‑specific sets based on updated annotations, cDNA/EST clustering, and SNP information. Comparison of analyses using the original versus redefined probe sets revealed a 30–50% discrepancy in differentially expressed genes, demonstrating that many past GeneChip conclusions are likely flawed and underscoring the need to reanalyze existing data with updated definitions.
Genome-wide expression profiling is a powerful tool for implicating novel gene ensembles in cellular mechanisms of health and disease. The most popular platform for genome-wide expression profiling is the Affymetrix GeneChip. However, its selection of probes relied on earlier genome and transcriptome annotations, which differ significantly from current knowledge. The resultant informatics problems have a profound impact on the analysis and interpretation of the data. Here, we address these critical issues and offer a solution. We identified several classes of problems at the individual probe level in the existing annotation, under the assumption that current genome and transcriptome databases are more accurate than those used for GeneChip design. We then reorganized probes on more than a dozen popular GeneChips into gene-, transcript- and exon-specific probe sets in light of up-to-date genome, cDNA/EST clustering and single nucleotide polymorphism information. Comparing analysis results between the original and the redefined probe sets reveals a ∼30–50% discrepancy in the genes previously identified as differentially expressed, regardless of analysis method. Our results demonstrate that the original Affymetrix probe set definitions are inaccurate, and many conclusions derived from past GeneChip analyses may be significantly flawed. It will be beneficial to re-analyze existing GeneChip data with updated probe set definitions.
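To make the reported discrepancy concrete, the following is a minimal sketch of how such a figure could be computed from two lists of differentially expressed genes. It assumes the discrepancy is the fraction of genes called with the original probe sets that are no longer called with the redefined ones; the abstract does not spell out the exact definition. The function name and the gene identifiers are hypothetical placeholders, not data from the study.

```python
def discrepancy(original_hits, redefined_hits):
    """Fraction of originally called genes that are absent from the redefined call set.

    Assumed definition for illustration only; the study's exact metric is not
    given in the abstract.
    """
    original = set(original_hits)
    if not original:
        raise ValueError("original_hits must contain at least one gene")
    # Genes flagged with the original probe sets but not with the redefined ones.
    missing = original - set(redefined_hits)
    return len(missing) / len(original)


# Hypothetical placeholder gene lists (not real results).
original_hits = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
redefined_hits = ["GENE_A", "GENE_B", "GENE_C", "GENE_F"]

rate = discrepancy(original_hits, redefined_hits)
print(f"{rate:.0%} of the originally called genes are not recovered")
```

Under this assumed definition, genes that appear only in the redefined list (such as the placeholder `GENE_F`) do not affect the figure; a symmetric measure would need a different denominator.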