Publication | Open Access
TagRecon: High-Throughput Mutation Identification through Sequence Tagging
Year: 2010 | Citations: 119 | References: 36
Engineering, Genetics, Pathology, Genomics, Sequence Tagging, Bioinformatics Database, Software Analysis, Biostatistics, Shotgun Proteomics, Proteomics, Tandem Mass Spectra, Mutated Peptides, Translational Bioinformatics, Biological Database, Sequence Analysis, Omics, Functional Genomics, Bioinformatics, Protein Bioinformatics, Mutation-based Testing, Next-generation Sequencing, Computational Biology, Systems Biology, Medicine
Shotgun proteomics produces collections of tandem mass spectra that contain all the data needed to identify mutated peptides from clinical samples. Identifying these sequence variations, however, has not been feasible with conventional database search strategies, which require exact matches between observed and expected sequences. Searching for mutations as mass shifts on specified residues through database search can incur significant performance penalties and generate substantial false positive rates. Here we describe TagRecon, an algorithm that leverages inferred sequence tags to identify unanticipated mutations in clinical proteomic data sets. TagRecon identifies unmodified peptides as sensitively as the related MyriMatch database search engine. In both LTQ and Orbitrap data sets, TagRecon outperformed state-of-the-art software in recognizing sequence mismatches from data sets with known variants. We developed guidelines for filtering putative mutations from clinical samples, and we applied them in an analysis of cancer cell lines and an examination of colon tissue. Mutations were found in up to 6% of identified peptides, and only a small fraction corresponded to dbSNP entries. The RKO cell line, which is DNA mismatch repair-deficient, yielded more mutant peptides than the mismatch repair-proficient SW480 line. Analysis of colon cancer tumor and adjacent tissue revealed hydroxyproline modifications associated with extracellular matrix degradation. These results demonstrate the value of using sequence tagging algorithms to fully interrogate clinical proteomic data sets.
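
The general idea of reconciling a sequence tag against a database peptide can be illustrated with a minimal sketch. It is not TagRecon's implementation: the abstract does not describe the algorithm's internals, so the function names (`reconcile`, `explain_by_substitution`), the fixed mass tolerance, and the restriction to a single substitution in one flank are all assumptions made for illustration. A tag here is a short inferred sequence plus the observed residue mass on each side of it; a mismatch between an observed flanking mass and the database flanking mass is tested against every single amino acid substitution.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_mass(seq):
    """Sum of residue masses for a sequence (no terminal groups)."""
    return sum(RESIDUE_MASS[aa] for aa in seq)


def explain_by_substitution(delta, flank, tol):
    """List (index, old, new) substitutions in `flank` whose mass change is `delta`."""
    hits = []
    for i, old in enumerate(flank):
        for new, mass in RESIDUE_MASS.items():
            if new != old and abs((mass - RESIDUE_MASS[old]) - delta) <= tol:
                hits.append((i, old, new))
    return hits


def reconcile(tag, n_mass, c_mass, peptide, tol=0.01):
    """Compare a tag and its observed flanking masses with a database peptide.

    Hypothetical helper: returns ("exact", None) when both flanks match, or
    ("substitution", (index, old, new)) for each single substitution that
    explains a mismatch confined to one flank.
    """
    results = []
    start = peptide.find(tag)
    while start != -1:
        n_flank = peptide[:start]
        c_flank = peptide[start + len(tag):]
        dn = n_mass - residues_mass(n_flank)  # unexplained mass, N-terminal side
        dc = c_mass - residues_mass(c_flank)  # unexplained mass, C-terminal side
        if abs(dn) <= tol and abs(dc) <= tol:
            results.append(("exact", None))
        elif abs(dc) <= tol:
            for i, old, new in explain_by_substitution(dn, n_flank, tol):
                results.append(("substitution", (i, old, new)))
        elif abs(dn) <= tol:
            offset = start + len(tag)
            for i, old, new in explain_by_substitution(dc, c_flank, tol):
                results.append(("substitution", (offset + i, old, new)))
        start = peptide.find(tag, start + 1)
    return results


# Observed peptide SGMPLEK against database peptide SAMPLEK, using tag "PLE".
matches = reconcile("PLE", residues_mass("SGM"), residues_mass("K"), "SAMPLEK")
print(matches)
```

Because the tag anchors the comparison, only peptides containing the tag need to be examined, and the mass difference is localized to one flank rather than searched across every residue of every candidate. A practical tool would also have to handle modifications, isobaric residues such as leucine and isoleucine, and scoring against the spectrum, none of which this sketch attempts.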