Publication | Open Access

UniProt: the Universal Protein Knowledgebase in 2023

Citations: 6.5K

References: 37

Year: 2022

TLDR

The UniProt Knowledgebase aims to provide users with a comprehensive, high-quality, freely accessible set of protein sequences annotated with functional information. This publication describes enhancements to its data-processing pipeline and website that accommodate an ever-increasing volume of information. UniProtKB now contains over 227 million sequences, and work is under way to include a reference proteome for each taxonomic group. Detailed annotations continue to be extracted from the literature to update or create reviewed entries, while unreviewed entries are supplemented with annotations from automated systems that use a variety of machine-learning techniques. The scientific community also continues to contribute publications and annotations to entries of interest. The new website provides access to AlphaFold structures for more than 85% of entries and improved visualisations of subcellular localisation.

Abstract

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication we describe enhancements made to our data processing pipeline and to our website to adapt to an ever-increasing information content. The number of sequences in UniProtKB has risen to over 227 million and we are working towards including a reference proteome for each taxonomic group. We continue to extract detailed annotations from the literature to update or create reviewed entries, while unreviewed entries are supplemented with annotations provided by automated systems using a variety of machine-learning techniques. In addition, the scientific community continues their contributions of publications and annotations to UniProt entries of their interest. Finally, we describe our new website (https://www.uniprot.org/), designed to enhance our users’ experience and make our data easily accessible to the research community. This interface includes access to AlphaFold structures for more than 85% of all entries as well as improved visualisations for subcellular localisation of proteins.
