Publication | Open Access
AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms
Citations: 31
References: 38
Year: 2022
Engineering, Structural Bioinformatics, Biomolecular Structure Prediction, Molecular Biology, Protein Structure Space, Protein Folding, Af2 Domain Structures, Proteomics, Biochemistry, Af2 Models, Model Organisms, Protein Modeling, Protein Structure Prediction, Computational Modeling, Deep Learning, Bioinformatics, Protein Bioinformatics, Structural Biology, New Af2 Models, Biology, Computational Biology, Protein Evolution, Alphafold2 Reveals Commonalities, Medicine, Foundation Models
Abstract
Over the last year, there have been substantial improvements in protein structure prediction, particularly in methods like DeepMind’s AlphaFold2 (AF2) that exploit deep learning strategies. Here we report a new CATH-Assign protocol, which we use to analyse the first tranche of AF2 models predicted for 21 model organisms, and discuss the insights these models bring into the nature of protein structure space. We restrict the analysis to good-quality models that have no unusual structural characteristics, i.e., features rarely seen in experimental structures. For the ∼370,000 models that meet these criteria, we observe that 92% can be assigned to evolutionary superfamilies in CATH. The remaining domains cluster into 2,367 putative novel superfamilies. Detailed manual analysis of a subset of 618 of these, each with at least one human relative, revealed some extremely remote homologies and some further unusual features, but 26 could be confirmed as novel superfamilies, and one of these has an alpha-beta propeller architectural arrangement never seen before. By clustering both experimental and predicted AF2 domain structures into distinct ‘global fold’ groups, we observe that the new AF2 models in CATH increase information on structural diversity by 36%. This expansion in structural diversity will help to reveal associated functional diversity not previously detected. Our novel CATH-Assign protocol scales well and will be able to harness the huge expansion (at least 100 million models) in structural data promised by DeepMind, providing more comprehensive coverage of even the most diverse superfamilies and helping to rationalise evolutionary changes in their functions.
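The headline figures in the abstract imply some approximate counts that are not stated outright. The short sketch below works through that arithmetic. It is only an illustration: the total of 370,000 is the rounded value quoted above, the function names are hypothetical, and treating "models" and "domains" as the same unit when estimating a mean cluster size is an assumption rather than something the abstract states.

```python
# Figures quoted in the abstract. The total is approximate ("~370,000").
TOTAL_MODELS = 370_000
ASSIGNED_PERCENT = 92
PUTATIVE_NOVEL_SUPERFAMILIES = 2_367
MANUALLY_ANALYSED = 618
CONFIRMED_NOVEL = 26


def split_assigned(total: int, assigned_percent: int) -> tuple[int, int]:
    """Return (assigned, unassigned) counts implied by a percentage."""
    assigned = round(total * assigned_percent / 100)
    return assigned, total - assigned


def mean_cluster_size(unassigned: int, n_clusters: int) -> float:
    """Rough mean number of unassigned items per putative superfamily.

    Assumption: every unassigned item falls into exactly one cluster.
    """
    return unassigned / n_clusters


def confirmed_fraction(confirmed: int, analysed: int) -> float:
    """Fraction of manually analysed clusters confirmed as novel."""
    return confirmed / analysed


if __name__ == "__main__":
    assigned, unassigned = split_assigned(TOTAL_MODELS, ASSIGNED_PERCENT)
    print(f"assigned to CATH superfamilies: about {assigned:,}")
    print(f"left unassigned: about {unassigned:,}")
    size = mean_cluster_size(unassigned, PUTATIVE_NOVEL_SUPERFAMILIES)
    print(f"mean items per putative superfamily: about {size:.1f}")
    frac = confirmed_fraction(CONFIRMED_NOVEL, MANUALLY_ANALYSED)
    print(f"confirmed novel among those analysed: about {frac:.1%}")
```

Because the inputs are rounded, the outputs should be read as order-of-magnitude estimates: roughly 340,000 assigned models, roughly 30,000 unassigned, and a confirmation rate of a few percent among the manually analysed subset.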