Publication | Open Access
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Citations: 53
References: 82
Year: 2020
Unknown Venue
Engineering, Structural Bioinformatics, Biomolecular Structure Prediction, Molecular Biology, Psycholinguistics, Multilingual Pretraining, Large Language Model, Natural Language Processing, Protein Folding, Computational Linguistics, Proteomics, Biophysics, Machine Translation, Natural Language, Medicine, Protein Modeling, Protein Structure Prediction, Inner Workings, Protein Bioinformatics, Structural Biology, Abstract Transformer Architectures, Computational Biology, Systems Biology, Bertology Meets Biology, Linguistics, Folding Structure
Abstract
Transformer architectures have proven to learn useful representations for protein classification and generation tasks. However, these representations present challenges in interpretability. Through the lens of attention, we analyze the inner workings of the Transformer and explore how the model discerns structural and functional properties of proteins. We show that attention (1) captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence, but spatially close in the three-dimensional structure, (2) targets binding sites, a key functional component of proteins, and (3) focuses on progressively more complex biophysical properties with increasing layer depth. We also present a three-dimensional visualization of the interaction between attention and protein structure. Our findings align with known biological processes and provide a tool to aid discovery in protein engineering and synthetic biology. The code for visualization and analysis is available at https://github.com/salesforce/provis.
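To make finding (1) concrete, the following is a minimal sketch of one way to check whether an attention head favors residue pairs that are far apart in sequence but close in space. It is not the authors' implementation (see the provis repository for that). The function names, the 8 Å distance cutoff, the minimum sequence separation of 6, the attention threshold of 0.3, and the toy hairpin coordinates are all illustrative assumptions.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=6):
    """Return ordered residue pairs (i, j) that are spatially close
    (distance < cutoff) yet at least min_separation apart in sequence.
    Both parameters are assumed values chosen for illustration."""
    contacts = set()
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            if abs(i - j) >= min_separation and math.dist(a, b) < cutoff:
                contacts.add((i, j))
    return contacts


def attention_contact_agreement(attention, contacts, threshold=0.3):
    """Fraction of high-attention pairs (weight > threshold) that are contacts.
    Returns None when no attention weight exceeds the threshold."""
    high = [
        (i, j)
        for i, row in enumerate(attention)
        for j, weight in enumerate(row)
        if weight > threshold
    ]
    if not high:
        return None
    return sum(pair in contacts for pair in high) / len(high)


# Toy hairpin: residues 0-3 run along one strand, residues 4-7 fold back
# along a parallel strand 5 units away (hypothetical coordinates).
toy_coords = [
    (3.8 * k, 0.0, 0.0) if k < 4 else (3.8 * (7 - k), 5.0, 0.0)
    for k in range(8)
]
toy_contacts = contact_map(toy_coords)

# Hypothetical attention matrix for a single head: two weights land on the
# long-range contact between residues 0 and 7, one on sequence neighbors 3 and 4.
toy_attention = [[0.0] * 8 for _ in range(8)]
toy_attention[0][7] = 0.6
toy_attention[7][0] = 0.5
toy_attention[3][4] = 0.4

toy_score = attention_contact_agreement(toy_attention, toy_contacts)
print(f"Fraction of high attention on contacts: {toy_score:.2f}")
```

In practice the attention matrix would come from one head of a trained protein language model and the coordinates from a solved structure; a score well above the background contact frequency would suggest that the head tracks spatial proximity rather than sequence adjacency.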