Publication | Open Access
Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads
Year: 2017 | Citations: 8.2K | References: 47
Bioinformatics, Biology, Long Sequencing Reads, Bacterial Genome Assemblies, Long-read Sequencing, DNA Sequencing, Next-generation Sequencing, Genetics, Bacterial Genomes, Genome Sequencing, Microbiology, Genomics, Illumina DNA, Short Reads, Medicine, Sequencing, Sequence Assembly, High Throughput Sequencing
Illumina short reads are accurate yet fragmented, while PacBio and Oxford Nanopore long reads can produce complete assemblies but are costly and error‑prone, creating a demand for hybrid assembly tools that combine both strengths. Unicycler is introduced as a hybrid assembler that combines short and long reads to produce accurate, complete, and cost‑effective bacterial genome assemblies. Unicycler constructs an initial assembly graph from short reads with SPAdes, then refines it by aligning long reads to the graph using a novel semi‑global aligner and simplifying the graph with combined read information. Benchmarking on synthetic and real data demonstrates that Unicycler produces larger contigs with fewer misassemblies than competing hybrid assemblers, even at low long‑read depth and accuracy, and is released as open‑source software on GitHub.
The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies DNA sequencing platforms generate long reads that can produce complete genome assemblies, but the sequencing is more expensive and error-prone. There is significant interest in combining data from these complementary sequencing technologies to generate more accurate "hybrid" assemblies. However, few tools exist that truly leverage the benefits of both types of data, namely the accuracy of short reads and the structural resolving power of long reads. Here we present Unicycler, a new tool for assembling bacterial genomes from a combination of short and long reads, which produces assemblies that are accurate, complete and cost-effective. Unicycler builds an initial assembly graph from short reads using the de novo assembler SPAdes and then simplifies the graph using information from short and long reads. Unicycler uses a novel semi-global aligner to align long reads to the assembly graph. Tests on both synthetic and real reads show Unicycler can assemble larger contigs with fewer misassemblies than other hybrid assemblers, even when long-read depth and accuracy are low. Unicycler is open source (GPLv3) and available at github.com/rrwick/Unicycler.
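To illustrate what "semi-global" alignment means in general, the following is a minimal textbook sketch in Python. It is not Unicycler's aligner, which aligns long reads to an assembly graph and is not described in detail here. The sketch assumes a linear reference string, and the function name `semi_global_score` and the scoring values (match +1, mismatch -1, gap -1) are hypothetical choices made for illustration. The whole read must be aligned, while unaligned reference sequence before and after the read is not penalised.

```python
def semi_global_score(read, ref, match=1, mismatch=-1, gap=-1):
    """Best score for aligning all of `read` to any substring of `ref`.

    Illustrative dynamic-programming sketch only; scoring values are
    assumptions, not parameters taken from Unicycler.
    """
    # Row 0: the alignment may start at any reference position for free.
    prev = [0] * (len(ref) + 1)
    for i in range(1, len(read) + 1):
        # Column 0: read bases placed before the reference starts cost gaps.
        curr = [prev[0] + gap]
        for j in range(1, len(ref) + 1):
            same = read[i - 1] == ref[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)
            up = prev[j] + gap        # read base aligned to a gap
            left = curr[j - 1] + gap  # reference base aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    # Last row: the alignment may end at any reference position for free.
    return max(prev)


if __name__ == "__main__":
    # A read that occurs exactly inside the reference.
    print(semi_global_score("ACGT", "TTACGTTT"))
    # A read with one erroneous base, as is common in long reads.
    print(semi_global_score("ACGGT", "TTACGTTT"))
```

A global alignment would instead penalise the unaligned reference flanks, which is why the free start and end are the relevant design choice when a read covers only part of a longer sequence.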