Publication | Open Access

Roary: rapid large-scale prokaryote pan genome analysis

Year: 2015 | Citations: 5.6K | References: 7

TLDR

Prokaryote population sequencing studies now routinely involve hundreds or thousands of isolates, and interrogating these datasets gives detailed insight into the genetic structure of prokaryotic genomes. Roary, implemented in Perl, rapidly builds large-scale pan genomes by clustering orthologous genes across isolates and identifying the core and accessory genes. It makes pan genome construction for thousands of samples feasible on a standard desktop without compromising accuracy: on a single CPU it produces a 1000-isolate pan genome in 4.5 hours using 13 GB of RAM, with further speedups possible on multiple processors. Roary is freely available under the GPLv3 license at http://sanger-pathogens.github.io/Roary (contact: roary@sanger.ac.uk); supplementary data are available at Bioinformatics online.
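To make the core/accessory distinction concrete, the following is a minimal Python sketch of how gene clusters might be partitioned once orthologous genes have already been grouped. It is not Roary's implementation (Roary is written in Perl); the function name `split_core_accessory`, the `core_fraction` parameter, the toy gene and isolate names, and the default rule that a core gene must be present in every isolate are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def split_core_accessory(
    clusters: Dict[str, Set[str]],
    n_isolates: int,
    core_fraction: float = 1.0,
) -> Tuple[List[str], List[str]]:
    """Partition gene clusters into core and accessory gene lists.

    clusters maps a gene cluster name to the set of isolates carrying it.
    A cluster counts as core when it is present in at least
    core_fraction of all isolates (hypothetical threshold, default 100%).
    """
    if n_isolates <= 0:
        raise ValueError("n_isolates must be positive")

    core: List[str] = []
    accessory: List[str] = []
    # Sort by name so the output order is deterministic.
    for gene, isolates in sorted(clusters.items()):
        if len(isolates) / n_isolates >= core_fraction:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Toy input: three isolates and four gene clusters (illustrative only).
clusters = {
    "gyrA": {"iso1", "iso2", "iso3"},
    "recA": {"iso1", "iso2", "iso3"},
    "blaTEM": {"iso2"},
    "tetM": {"iso1", "iso3"},
}

core, accessory = split_core_accessory(clusters, n_isolates=3)
print(core)       # ['gyrA', 'recA']
print(accessory)  # ['blaTEM', 'tetM']
```

Lowering `core_fraction` relaxes the definition so that genes missing from a few isolates still count as core, which can be useful when some assemblies are incomplete.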

Abstract

Summary: A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU, Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors.

Availability and implementation: Roary is implemented in Perl and is freely available under an open source GPLv3 license from http://sanger-pathogens.github.io/Roary

Contact: roary@sanger.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.