
Publication | Open Access

SAM/BAM format v1.5 extensions for de novo assemblies

Citations: 21

References: 13

Year: 2015

Abstract

Summary: The plain text Sequence Alignment/Map (SAM) file format and its companion binary form (BAM) are a generic alignment format for storing read alignments against reference sequences (and unmapped reads) together with structured meta-data (Li et al., 2009). Driven by the needs of the 1000 Genomes Project, which sequenced many individual human genomes, early SAM/BAM usage focused on pairwise alignments of reads to a reference. However, multiple sequence alignments can also be preserved through the CIGAR P operator. Herein we describe clarifications and additions in version 1.5 of the specification to facilitate storing de novo sequence alignments: padded reference sequences (with gap characters), annotation of reads or regions of the reference, and the option of embedding the reference sequence within the file.

Availability: The latest public release of the specification is at http://samtools.sourceforge.net/SAM1.pdf, with in-development drafts under version control at https://github.com/samtools/hts-specs/.

Contact: peter.cock@hutton.ac.uk
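To make the role of the CIGAR P operator concrete, here is a minimal Python sketch that splits a CIGAR string into operations and counts how many bases each kind consumes. It relies on the consumption rules in the SAM specification (M, I, S, = and X consume query bases; M, D, N, = and X consume reference bases; P consumes neither). The function names `parse_cigar` and `cigar_lengths` and the example CIGAR string are illustrative assumptions, not part of the specification or of any SAM library.

```python
import re

# One CIGAR operation: a run length followed by an operator character.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")

# Consumption rules as tabulated in the SAM specification.
CONSUMES_QUERY = set("MIS=X")
CONSUMES_REFERENCE = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs (hypothetical helper)."""
    if cigar == "*":  # SAM uses "*" when no CIGAR is available
        return []
    tokens = CIGAR_TOKEN.findall(cigar)
    # Reject strings containing anything other than well-formed operations.
    if "".join(length + op for length, op in tokens) != cigar:
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return [(int(length), op) for length, op in tokens]


def cigar_lengths(cigar):
    """Return (query_bases, reference_bases, padding_columns) for a CIGAR string."""
    query = reference = padding = 0
    for length, op in parse_cigar(cigar):
        if op in CONSUMES_QUERY:
            query += length
        if op in CONSUMES_REFERENCE:
            reference += length
        if op == "P":
            # Padding: a column where another read has an insertion
            # and this read has a gap; no bases are consumed.
            padding += length
    return query, reference, padding


# Illustrative read: 3 matches, 1 padding column, 1 inserted base, 2 matches.
print(cigar_lengths("3M1P1I2M"))
```

In this sketch the padding columns are counted separately because they are what lets several reads with different insertions at the same reference position be laid out as one multiple sequence alignment.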
