Publication | Closed Access
DNA visual and analytic data mining
Citations: 173
References: 6
Year: 2002
Unknown Venue
Engineering, Genetics, DNA Analysis, Data Visualization, DNA Sequences, Gene Recognition, Data Science, Data Mining, Pattern Recognition, Biostatistics, Neural Network Classifiers, Biological Network Visualization, Sequence Analysis, Knowledge Discovery, Visual Data Mining, Average Mutual Information, Bioinformatics, Analytic Data Mining, Computational Biology, Classification, Medicine
This paper describes data exploration techniques designed to classify DNA sequences. Several visualization and data mining techniques were used to validate existing methods, and to try to discover new ones, for distinguishing coding DNA sequences (exons) from non-coding DNA sequences (introns). The goal of the data mining was to see whether some other, possibly non-linear, combination of the fundamental position-dependent DNA nucleotide frequency values could be a better predictor than the average mutual information (AMI). We tried many different classification techniques, including rule-based classifiers and neural networks. We also visualized both the original data and the results of the data mining to help verify patterns and to understand the distinctions between the different types of data and classifications. In particular, the visualization helped us develop refinements to neural network classifiers, which achieve accuracies as high as those of any known method. Finally, we discuss the interactions between visualization and data mining and suggest an integrated approach.
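To make the AMI baseline concrete, here is a minimal sketch of one common way to compute mutual information between nucleotides a fixed distance apart and to average it over a range of lags. The abstract does not give the paper's exact definition, so the function names, the lag range (`max_lag`), and the simple unweighted average are assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Mutual information (in bits) between nucleotides k positions apart.

    Probabilities are estimated from the pairs (seq[i], seq[i + k]).
    Assumed definition: sum over (a, b) of p_ab * log2(p_ab / (p_a * p_b)).
    """
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of (a, b) pairs
    left = Counter(a for a, _ in pairs)    # counts of the first symbol
    right = Counter(b for _, b in pairs)   # counts of the second symbol
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        ratio = p_ab * n * n / (left[a] * right[b])
        mi += p_ab * math.log2(ratio)
    return mi


def average_mutual_information(seq, max_lag=18):
    """Unweighted mean of the mutual information over lags 1..max_lag.

    max_lag=18 is an arbitrary illustrative choice, not a value from the paper.
    """
    lags = range(1, max_lag + 1)
    return sum(mutual_information(seq, k) for k in lags) / max_lag


if __name__ == "__main__":
    periodic = "ACG" * 50    # strong period-3 structure
    uniform = "A" * 150      # no information between positions
    print(average_mutual_information(periodic, max_lag=6))
    print(average_mutual_information(uniform, max_lag=6))
```

A sequence with strong positional structure scores higher than one without it, which is the kind of signal a single-number predictor such as AMI would rely on. The position-dependent nucleotide frequencies mentioned in the abstract are the raw counts from which such a score is built.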