Publication | Closed Access
An Upper Bound on the Capacity of the DNA Storage Channel
34
Citations
13
References
2019
Year
Unknown Venue
GeneticsDna AnalysisMolecular BiologyBiological ComputingGenomicsDna NanotechnologyData ScienceDna ComputingData ManagementVariable-length CodeDna SequencingInformation TheoryDna Storage ChannelDna ReplicationShannon CapacityUpper BoundBioinformaticsData CompressionBiologyChromatinNatural SciencesComputational BiologyChannel ModelMedicineGenome EditingSequence Assembly
Paved by recent advances in sequencing and synthesis technologies, DNA has evolved to a competitive medium for long-term data storage. In this paper we conduct an information theoretic study of the storage channel-the entity that formulates the relation between stored and sequenced strands. In particular, we derive an upper bound on the Shannon capacity of the channel. In our channel model, we incorporate the main attributes that characterize DNA-based data storage. That is, information is synthesized on many short DNA strands, and each strand is copied many times. Due to the storage and sequencing methods, the receiver draws strands from the original sequences in an uncontrollable manner, where it is possible that copies of the same sequence are drawn multiple times. Additionally, due to imperfections, the obtained strands can be perturbed by errors. We show that for a large range of parameters, the channel decomposes into sub-channels from each input sequence to multiple output sequences, so-called clusters. The cluster sizes hereby follow a Poisson distribution. Furthermore, the ordering of sub-channels is unknown to the receiver. Our results can be used to guide future experiments for DNA-based data storage by giving an upper bound on the achievable rate of any error-correcting code. We further give a detailed discussion and intuitive interpretation of the channel that provide insights about the nature of the channel and can inspire new ideas for error-correcting codes and decoding methods.
| Year | Citations | |
|---|---|---|
Page 1
Page 1