Publication | Open Access

Fast and sensitive taxonomic classification for metagenomics with Kaiju

Citations: 2.1K

References: 17

Year: 2016

TLDR

Metagenomics is increasingly important for microbial ecology and human health, yet fast k-mer-based taxonomic classifiers often leave many reads unclassified because they have limited sensitivity to evolutionary divergence. The study introduces Kaiju, a metagenomic classifier that finds maximum exact or inexact matches between reads and protein sequences using the Burrows–Wheeler transform. In a genome exclusion benchmark, Kaiju achieved higher sensitivity than k-mer-based classifiers at similar precision, and it classified up to ten times more reads in real metagenomes. Kaiju processes millions of reads per minute on a standard PC, and its source code and web server are publicly available.

Abstract

Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows–Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.
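
To make the idea of exact matching with the Burrows–Wheeler transform more concrete, here is a minimal sketch in Python. It is not Kaiju's implementation: the function names `build_fm_index` and `longest_suffix_match`, the toy reference string, and the naive suffix-array construction are all assumptions made for illustration. The sketch also simplifies the problem: it only finds the longest exact match that ends at the last residue of an already translated query, and it does not cover inexact matches or taxonomic assignment.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, occurrence counts).

    Illustrative only: the suffix array is built by naive sorting, which
    is far too slow for a real protein database.
    """
    text += "$"  # sentinel; sorts before every amino-acid letter
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0)
    bwt = "".join(text[i - 1] for i in sa)
    # C[c]: number of characters in text that are strictly smaller than c
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, C, occ


def longest_suffix_match(query, sa, C, occ):
    """Backward search from the end of query.

    Extends the match one residue to the left at a time and stops as soon
    as no reference position supports the extension.
    Returns (match_length, sorted_start_positions_in_reference).
    """
    lo, hi = 0, len(sa)  # half-open suffix-array interval
    length = 0
    for ch in reversed(query):
        if ch not in C:
            break  # residue never occurs in the reference
        new_lo = C[ch] + occ[ch][lo]
        new_hi = C[ch] + occ[ch][hi]
        if new_lo >= new_hi:
            break  # extension has no occurrence; keep the previous interval
        lo, hi = new_lo, new_hi
        length += 1
    positions = sorted(sa[lo:hi]) if length else []
    return length, positions


# Hypothetical toy data: one short "protein" and one translated read fragment
reference = "MKTAYIAKQRQISFVKSHFSRQ"
query = "FFAKQRQIS"

sa, C, occ = build_fm_index(reference)
length, positions = longest_suffix_match(query, sa, C, occ)
print(length, positions, query[len(query) - length:])
```

In this toy case the search stops when the extension to `FAKQRQIS` no longer occurs in the reference, leaving the seven-residue match `AKQRQIS`. A search for maximum exact matches in the sense of the abstract would repeat this kind of backward extension from other end positions in the query and keep the longest hit.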
