Publication | Closed Access
Interrogating the human genome using uninterpreted mass spectrometry data
Citations: 85 | References: 0 | Year: 2001
Genetics, Molecular Biology, Genomics, Bioinformatics Database, High Throughput Sequencing, Public Availability, Human Genome, Proteomics, DNA Sequencing, Sequence Analysis, Omics, Computational Mass Spectrometry, Functional Genomics, Bioinformatics, EST Database, Natural Sciences, Next-generation Sequencing, Mass Spectrometry, Draft Assembly, Systems Biology, Medicine
The public availability of a draft assembly of the human genome has enabled us to demonstrate, for the first time, the feasibility of searching a complete, unmasked eukaryotic genome using uninterpreted mass spectrometry data. A complex LC-MS/MS data set, containing peptides from at least 22 human proteins, was searched against a comprehensive, nonidentical protein database, an expressed sequence tag (EST) database, and the International Human Genome Project draft assembly of the human genome. The results from the three searches are compared in detail, and the merits of the different databases for this application are discussed. In the case of the EST database, the UniGene index provided a method of simplifying and summarising the search results. In the case of the genomic DNA, the presence of introns prevented matching of roughly one quarter of the spectra, but the technique can provide primary experimental verification of predicted coding sequences, and has the potential to identify novel coding sequences.
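The effect of introns on genomic matching can be illustrated with a toy sketch. It assumes the genomic search works by translating the DNA in all six reading frames, which the abstract does not state, and it uses plain substring matching as a stand-in for scoring uninterpreted MS/MS spectra against candidate peptides. The sequences, peptide strings, and function names below are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands at offsets 0, 1 and 2 (six frames in total)."""
    dna = dna.upper()
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


def peptide_in_genome(peptide, dna):
    """Substring match standing in for a real spectrum-to-sequence search."""
    return any(peptide in frame for frame in six_frame_translations(dna))


# Hypothetical two-exon gene: the intron is present in genomic DNA only.
EXON1 = "ATGGCTGAACTGAAA"   # encodes M A E L K
INTRON = "GTAAGTCCTTTCAG"   # toy intron with GT...AG boundaries
EXON2 = "GACTGGATCCGTTAA"   # encodes D W I R, then a stop codon
GENOMIC = EXON1 + INTRON + EXON2
MRNA = EXON1 + EXON2        # spliced transcript, written as DNA

print(translate(MRNA))                      # MAELKDWIR*
print(peptide_in_genome("AELK", GENOMIC))   # True: lies within exon 1
print(peptide_in_genome("LKDW", GENOMIC))   # False: spans the exon junction
```

In this toy case the peptide that lies entirely within one exon is found in the genomic translation, while the peptide that spans the exon-exon junction is found only in the spliced transcript. This is the kind of loss the abstract attributes to introns, and it is why a transcript-derived resource such as an EST database can match spectra that the genomic search misses.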