Publication | Open Access
Support Vector Machine Prediction of N- and O-glycosylation Sites Using Whole Sequence Information and Subcellular Localization
Year: 2009 | Citations: 19 | References: 23
Topics: Engineering; Biomolecular Structure Prediction; Glycobiology; Subcellular Localization; Molecular Biology; Sugar Chains; Glycoproteomics; Support Vector Machine; Molecular Characterization; Glycosylation Sites; Computational Biochemistry; Glycosylation; Biochemistry; Protein Modeling; Protein Structure Prediction; Solution NMR Spectroscopy; Bioinformatics; Protein Bioinformatics; Biomolecular Engineering; Natural Sciences; Computational Biology; Synthetic Biology; Systems Biology; Carbohydrate-protein Interaction
Background: Glycans, or sugar chains, are one of the three types of chain (DNA, protein and glycan) that constitute living organisms; they are often called “the third chain of the living organism”. Based on the SWISS-PROT database, about half of all proteins are estimated to be glycosylated. Glycosylation is one of the most important post-translational modifications, affecting the tertiary structure of proteins and many of their critical functions, including cellular communication. To computationally predict N-glycosylation and O-glycosylation sites, we developed three kinds of support vector machine (SVM) models, which use local information, general protein information and/or subcellular localization. This design takes into account the binding specificity of glycosyltransferases and the characteristic subcellular localization of glycoproteins.

Results: In our computational experiment, the model integrating all three kinds of information achieved about 90% accuracy in predicting both N-glycosylation and O-glycosylation sites. Moreover, we applied our model to a protein whose glycosylation sites had not previously been identified, and showed that the predicted glycosylation sites were structurally reasonable.

Conclusions: In the present study, we developed a comprehensive and effective computational method for predicting glycosylation sites, and we conclude that it is applicable at a genome-wide level.
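To make the idea of combining three kinds of information concrete, the following is a minimal sketch of how a feature vector for one candidate site might be assembled before being passed to an SVM. It is not the authors' encoding: the window size, the use of amino-acid composition as "general protein information", the localization categories in `LOCALIZATIONS`, and all function names are assumptions chosen for illustration. The abstract also does not say how candidate sites are selected; the sketch uses the well-known N-X-S/T sequon (X not proline) as one plausible filter for N-glycosylation candidates.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical localization categories; the source does not list the ones it uses.
LOCALIZATIONS = ["extracellular", "membrane", "cytoplasm", "nucleus"]


def local_window_features(seq, pos, half_width=3):
    """One-hot encode the residues in a window centered on pos.

    Window positions that fall outside the sequence are encoded as all zeros.
    """
    feats = []
    for i in range(pos - half_width, pos + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= i < len(seq):
            idx = AMINO_ACIDS.find(seq[i])
            if idx >= 0:
                one_hot[idx] = 1.0
        feats.extend(one_hot)
    return feats


def composition_features(seq):
    """Whole-sequence amino-acid composition (an assumed stand-in for
    'general protein information')."""
    n = len(seq) or 1
    return [seq.count(a) / n for a in AMINO_ACIDS]


def localization_features(loc):
    """One-hot encode the subcellular localization label."""
    return [1.0 if loc == name else 0.0 for name in LOCALIZATIONS]


def encode_site(seq, pos, loc, half_width=3):
    """Concatenate local, whole-sequence, and localization features."""
    return (
        local_window_features(seq, pos, half_width)
        + composition_features(seq)
        + localization_features(loc)
    )


def candidate_n_sites(seq):
    """Indices of Asn residues in an N-X-S/T sequon with X != Pro."""
    return [
        i
        for i in range(len(seq) - 2)
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST"
    ]
```

Each vector returned by `encode_site` could then be used as one training or prediction example for any SVM implementation; dropping the composition or localization part would correspond to a model that uses less of the available information.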