Publication | Closed Access
A probabilistic classification system for predicting the cellular localization sites of proteins.
Citations: 298
References: 6
Year: 1996
Artificial Intelligence, Engineering, Machine Learning, Model-based Reasoning, Signal Recognition, Molecular Biology, Statistical Relational Learning, Cellular Localization Sites, Data Science, Data Mining, Pattern Recognition, Probabilistic Reasoning, Biostatistics, Proteomics, Probabilistic Classification System, Instance-based Learning, Knowledge Discovery, Protein Localization Sites, Protein Modeling, Protein Structure Prediction, Intelligent Classification, Computer Science, Cell Biology, Bioinformatics, Protein Bioinformatics, Expert Knowledge, Automated Reasoning, Natural Sciences, Computational Biology, Cellular Biochemistry, Systems Biology, Learning Classifier System, Cell Detection
The authors developed software that implements a probabilistic classification model to assign proteins to cellular localization sites based on their amino-acid sequences. The system achieved 81% accuracy on 336 E. coli proteins (8 classes) and 55% accuracy on 1484 yeast proteins (10 classes).
We have defined a simple model of classification which combines human-provided expert knowledge with probabilistic reasoning. We have developed software to implement this model and have applied it to the problem of classifying proteins into their various cellular localization sites based on their amino acid sequences. Since our system requires no hand tuning to learn training data, we can now evaluate the prediction accuracy of protein localization sites by a more objective cross-validation method than earlier studies using production-rule-type expert systems. 336 E. coli proteins were classified into 8 classes with an accuracy of 81%, while 1484 yeast proteins were classified into 10 classes with an accuracy of 55%. Additionally, we report empirical results using three different strategies for handling continuously valued variables in our probabilistic reasoning system.
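The abstract mentions two ideas that a small example can make concrete: evaluating accuracy by cross-validation, and turning continuously valued variables into something a probabilistic classifier can count. The sketch below is illustrative only. It substitutes a plain naive Bayes classifier for the authors' expert-knowledge model, and it uses equal-width binning, which is one common discretization strategy and not necessarily one of the three the authors compared. All function names and parameters (`equal_width_bins`, `train`, `predict`, `cross_validate`, `n_bins`, `k`) are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins):
    """Return a function mapping a continuous value to a bin index in [0, n_bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def to_bin(x):
        idx = int((x - lo) / width)
        return max(0, min(n_bins - 1, idx))  # clamp values outside the training range

    return to_bin


def train(rows, labels, n_bins):
    """Fit bin boundaries, class counts, and per-feature bin counts on training data only."""
    n_features = len(rows[0])
    binners = [equal_width_bins([r[j] for r in rows], n_bins) for j in range(n_features)]
    class_counts = Counter(labels)
    bin_counts = defaultdict(int)  # key: (class, feature index, bin index)
    for row, y in zip(rows, labels):
        for j, x in enumerate(row):
            bin_counts[(y, j, binners[j](x))] += 1
    return binners, class_counts, bin_counts, n_bins


def predict(model, row):
    """Return the class with the highest log posterior under naive Bayes (assumed model)."""
    binners, class_counts, bin_counts, n_bins = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, cy in class_counts.items():
        score = math.log(cy / total)  # log prior
        for j, x in enumerate(row):
            count = bin_counts.get((y, j, binners[j](x)), 0)
            score += math.log((count + 1) / (cy + n_bins))  # Laplace smoothing
        if score > best_score:
            best, best_score = y, score
    return best


def cross_validate(rows, labels, k=5, n_bins=4, seed=0):
    """Estimate accuracy by k-fold cross-validation; each item is held out exactly once."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        model = train([rows[i] for i in train_idx], [labels[i] for i in train_idx], n_bins)
        correct += sum(predict(model, rows[i]) == labels[i] for i in fold)
    return correct / len(rows)
```

Calling `cross_validate(rows, labels)` with a list of numeric feature vectors and a parallel list of class labels returns the fraction of held-out items classified correctly. Because the bin boundaries and counts are refit inside each fold, no information from the held-out proteins leaks into training, which is the property that makes cross-validation a more objective estimate than accuracy measured on hand-tuned rules.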