Publication | Open Access
Unsupervised evolution of protein and antibody complexes with a structure-informed language model
Citations: 107 | References: 73 | Year: 2024
Engineering, Structural Bioinformatics, Immunology, Molecular Biology, Structure-informed Language Model, Viral Structural Protein, Large Language Models, Protein Folding, Computational Linguistics, Protein Complexes, Protein Modeling, Protein Structure Prediction, Bioinformatics, Protein Bioinformatics, Antibody Complexes, Structural Biology, Computational Biology, Protein Evolution, Protein Engineering, Systems Biology, Medicine, Linguistics
Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
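The abstract does not spell out how candidate variants are chosen, so the following is only an illustrative sketch of one common way such a screen could be set up: enumerate every single amino acid substitution and rank them by how much a structure-conditioned score improves over the wild type. The names `rank_single_substitutions`, `score`, and `toy_score` are hypothetical. In practice `score` might wrap a sequence log-likelihood from an inverse folding model such as ESM-IF1 conditioned on fixed backbone coordinates, but no real model API is called here, and the default `top_k=30` merely echoes the "about 30 variants" mentioned above.

```python
from typing import Callable, List, Tuple

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def rank_single_substitutions(
    wild_type: str,
    score: Callable[[str], float],
    top_k: int = 30,
) -> List[Tuple[str, float]]:
    """Rank all single substitutions by score gain over the wild type.

    `score` is a hypothetical stand-in for a structure-conditioned
    sequence score (higher is better). It is an assumption of this
    sketch, not an API taken from the source.
    """
    base = score(wild_type)
    ranked = []
    for pos, wt_aa in enumerate(wild_type):
        for aa in AMINO_ACIDS:
            if aa == wt_aa:
                continue  # skip the identity "mutation"
            mutant = wild_type[:pos] + aa + wild_type[pos + 1:]
            label = f"{wt_aa}{pos + 1}{aa}"  # e.g. "A1W", 1-indexed position
            ranked.append((label, score(mutant) - base))
    # Largest gain first; Python's sort is stable, so ties keep scan order.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def toy_score(sequence: str) -> float:
    """Placeholder scorer for demonstration only: counts tryptophans."""
    return float(sequence.count("W"))


top_variants = rank_single_substitutions("AC", toy_score, top_k=3)
```

Because the scorer is passed in as a plain callable, the ranking logic stays independent of any particular model, and no task-specific labels are needed to produce the shortlist.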