Publication | Open Access
DDIG-in: detecting disease-causing genetic variations due to frameshifting indels and nonsense mutations employing sequence and structural properties at nucleotide and protein levels
Year: 2015
Citations: 64
References: 34
We have built a machine learning method, DDIG-in (FS), trained on real human genetic variations: inherited disease-causing variants from the Human Gene Mutation Database and putatively neutral variants from the 1000 Genomes Project (GP). The method incorporates both sequence and predicted structural features and performs robustly in 10-fold cross-validation and in independent tests on both frameshifting (FS) indels and nonsense (NS) variants. We showed that human-derived NS variants and FS indels derived from animal orthologs can be used effectively for independent testing of the method, which was trained on human-derived FS indels. DDIG-in (FS) achieves a Matthews correlation coefficient (MCC) of 0.59, a sensitivity of 86%, and a specificity of 72% for FS indels. Applied to NS variants, DDIG-in (FS) yields essentially the same performance (MCC of 0.43) as a method trained specifically on NS variants. DDIG-in (FS) was also shown to improve significantly on existing techniques.