Concepedia

Publication | Open Access

Prevalence of quadruplexes in the human genome

Citations: 1.7K | References: 42 | Year: 2005

TLDR

Guanine‑rich DNA sequences can fold into four‑stranded G‑quadruplex structures. The study proposes a working rule to predict which primary sequences can form them and a search algorithm to locate such sequences in genomic DNA. The authors count the quadruplex‑forming sequences in the human genome and compare the observed number with predictions from Bernoulli and Markov chain models of DNA, using windows of various sizes. The loop‑length distribution differs significantly from the random expectation, which gives an indication of how many quadruplex‑forming sequences are potentially relevant. Quadruplexes are also significantly repressed in the coding strand of exonic regions, suggesting that quadruplex‑forming patterns are disfavoured in sequences that will form RNA.

Abstract

Guanine-rich DNA sequences of a particular form have the ability to fold into four-stranded structures called G-quadruplexes. In this paper, we present a working rule to predict which primary sequences can form this structure, and describe a search algorithm to identify such sequences in genomic DNA. We count the number of quadruplexes found in the human genome and compare that with the figure predicted by modelling DNA as a Bernoulli stream or as a Markov chain, using windows of various sizes. We demonstrate that the distribution of loop lengths is significantly different from what would be expected in a random case, providing an indication of the number of potentially relevant quadruplex-forming sequences. In particular, we show that there is a significant repression of quadruplexes in the coding strand of exonic regions, which suggests that quadruplex-forming patterns are disfavoured in sequences that will form RNA.
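
The abstract does not spell out the working rule or the search algorithm, so the following is only a minimal sketch of how such a search might look. It assumes the commonly cited form of the rule, four runs of at least three guanines separated by three loops of 1 to 7 bases each, and expresses it as a regular expression. The names `G4_PATTERN`, `reverse_complement` and `find_g4` are illustrative and do not come from the paper. The sketch reports non-overlapping matches only, which may differ from how the authors count overlapping candidates.

```python
import re

# Assumed rule (not stated in the abstract): G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}
G4_PATTERN = re.compile(
    r"(G{3,})([ACGT]{1,7})(G{3,})([ACGT]{1,7})(G{3,})([ACGT]{1,7})(G{3,})"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4(seq):
    """Find candidate quadruplex-forming sequences on both strands.

    Returns a list of (strand, start, end, loop_lengths) tuples. Coordinates
    are 0-based, end-exclusive, and always given on the input strand. Loop
    lengths are listed in the 5'-to-3' order of the strand that matched.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in G4_PATTERN.finditer(s):  # non-overlapping matches only
            loops = tuple(len(m.group(i)) for i in (2, 4, 6))
            if strand == "+":
                start, end = m.start(), m.end()
            else:
                # Map reverse-complement coordinates back to the input strand.
                start, end = n - m.end(), n - m.start()
            hits.append((strand, start, end, loops))
    return hits


if __name__ == "__main__":
    for hit in find_g4("TTGGGAGGGTGGGCGGGTT"):
        print(hit)
```

Searching the reverse complement as well is what allows a comparison between strands, such as the coding-strand repression the abstract reports, and the recorded loop lengths are the quantity whose distribution the abstract compares with the random case.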
