
Publication | Open Access

Protein–Sol: a web tool for predicting protein solubility from sequence

Citations: 776

References: 18

Year: 2017

TLDR

Protein solubility is a key property for industrial and therapeutic applications, yet predicting it remains challenging despite a growing understanding of the relevant physicochemical properties. Protein-Sol is a web server built on available data for E. coli protein solubility in a cell-free expression system. It calculates 35 sequence-based properties, with feature weights determined from the separation of low and high solubility subsets, and returns a predicted solubility together with the features that deviate most from average values. It also profiles fold propensity and net segment charge in windowed calculations along the sequence. The utility of these two profiles is illustrated with the example of thioredoxin.

Abstract

Protein solubility is an important property in industrial and therapeutic applications. Prediction is a challenge, despite a growing understanding of the relevant physicochemical properties. Protein-Sol is a web server for predicting protein solubility. Using available data for Escherichia coli protein solubility in a cell-free expression system, 35 sequence-based properties are calculated. Feature weights are determined from separation of low and high solubility subsets. The model returns a predicted solubility and an indication of the features which deviate most from average values. Two other properties are profiled in windowed calculation along the sequence: fold propensity, and net segment charge. The utility of these additional features is demonstrated with the example of thioredoxin. The Protein-Sol webserver is available at http://protein-sol.manchester.ac.uk. Contact: jim.warwicker@manchester.ac.uk.
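
The windowed net segment charge profile mentioned in the abstract can be illustrated with a minimal sketch. This is not the Protein-Sol implementation: the function name `net_segment_charge`, the default window of 21 residues, and the simple charge model (K and R as +1, D and E as -1, all other residues including histidine as 0) are assumptions made here for illustration only.

```python
def net_segment_charge(sequence, window=21):
    """Net charge of each full-length window along a protein sequence.

    Assumed charge model (illustrative, not taken from Protein-Sol):
    K and R count +1, D and E count -1, everything else counts 0.
    """
    charge = {"K": 1, "R": 1, "D": -1, "E": -1}
    per_residue = [charge.get(aa, 0) for aa in sequence.upper()]

    # No full window fits: return an empty profile.
    if window < 1 or window > len(per_residue):
        return []

    # Sliding sum: add the residue entering the window, drop the one leaving.
    total = sum(per_residue[:window])
    profile = [total]
    for i in range(window, len(per_residue)):
        total += per_residue[i] - per_residue[i - window]
        profile.append(total)
    return profile
```

For a sequence of length n the profile has n - window + 1 values, one per window position. Runs of strongly positive or negative values would mark charged segments along the sequence.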
