
Publication | Open Access

Recompleting the <i>Caenorhabditis elegans</i> genome

Citations: 162

References: 88

Year: 2019

Abstract

<i>Caenorhabditis elegans</i> was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard <i>C. elegans</i> strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any <i>C. elegans</i> available today. To provide a more accurate <i>C. elegans</i> genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of <i>C. elegans</i> should be a valuable resource for genetics, genomics, and systems biology.
