Publication | Open Access

CD-HIT Suite: a web server for clustering and comparing biological sequences

Citations: 2.7K

References: 9

Year: 2010

TLDR

CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. The authors upgraded it with additional functions and better accuracy, scalability, and flexibility, and introduced the CD-HIT Suite web server, which clusters a user-uploaded sequence dataset or compares it to another dataset at different identity levels. Users can interactively explore the resulting clusters in a web browser. Precomputed clusters for several public databases (NCBI NR, SwissProt, and PDB) are available for download at different identity levels. The server is freely accessible at http://cd-hit.org; contact: liwz@sdsc.edu. Supplementary data are available at Bioinformatics online.

Abstract

Summary: CD-HIT is a widely used program for clustering and comparing large biological sequence datasets. In order to further assist the CD-HIT users, we significantly improved this program with more functions and better accuracy, scalability and flexibility. Most importantly, we developed a new web server, CD-HIT Suite, for clustering a user-uploaded sequence dataset or comparing it to another dataset at different identity levels. Users can now interactively explore the clusters within web browsers. We also provide downloadable clusters for several public databases (NCBI NR, Swissprot and PDB) at different identity levels.

Availability: Free access at http://cd-hit.org

Contact: liwz@sdsc.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
