Publication | Closed Access
ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences.
Citations: 1.1K
References: 15
Year: 1999
Engineering, Genetics, Gene Discovery, Genomics, Gene Recognition, Software Analysis, High Throughput Sequencing, Data Science, Data Mining, Molecular Ecology, Hidden Markov Model, Biostatistics, Pattern Analysis, Coding Theory, Variable-length Code, Sequence Analysis, Quality Control, Computer Science, Functional Genomics, Bioinformatics, Signal Processing, Error Correction Code, Program Analysis, Next-generation Sequencing, EST Sequences, Computational Biology, Systems Biology, Medicine, Sequence Assembly, Potential Coding Regions
One of the problems in large‑scale analysis of unannotated, low‑quality EST sequences is detecting coding regions and correcting the frameshift errors they often contain. The study introduces a hidden Markov model that explicitly handles sequence errors and incorporates a correction method. This model was implemented in the efficient, robust program ESTScan. ESTScan detects and extracts coding regions from low‑quality sequences with high selectivity and sensitivity and accurately corrects frameshift errors. Within genome sequencing projects, it could aid gene discovery, quality control, and the assembly of contigs representing coding regions.
One of the problems associated with the large-scale analysis of unannotated, low-quality EST sequences is the detection of coding regions and the correction of the frameshift errors that they often contain. We introduce a new type of hidden Markov model that explicitly deals with the possibility of errors in the sequence being analyzed, and incorporates a method for correcting these errors. This model was implemented in an efficient and robust program, ESTScan. We show that ESTScan can detect and extract coding regions from low-quality sequences with high selectivity and sensitivity, and is able to accurately correct frameshift errors. In the framework of genome sequencing projects, ESTScan could become a very useful tool for gene discovery, for quality control, and for the assembly of contigs representing the coding regions of genes.
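To make the idea of error-tolerant frame tracking concrete, here is a toy sketch. It is not ESTScan's model: instead of a trained hidden Markov model, it uses a small dynamic program with hand-picked scores. The function names and the constants `CODON_SCORE`, `STOP_PENALTY`, and `INDEL_PENALTY` are assumptions made for illustration only. The program reads the sequence codon by codon and, at a cost, may skip one base (an assumed spurious insertion) or pad a two-base fragment with `N` (an assumed missing base), so that in-frame stop codons caused by frameshifts can be avoided.

```python
# Toy illustration only: hand-picked scores, not a trained HMM and not ESTScan.
STOP_CODONS = {"TAA", "TAG", "TGA"}
CODON_SCORE = 1.0      # assumed reward for reading a sense codon
STOP_PENALTY = -10.0   # assumed penalty for an in-frame stop codon
INDEL_PENALTY = -4.0   # assumed cost of postulating one sequencing error


def codon_score(codon):
    """Score a codon; codons padded with N are treated as sense codons."""
    return STOP_PENALTY if codon in STOP_CODONS else CODON_SCORE


def correct_frameshifts(seq):
    """Return (corrected_sequence, number_of_corrections) for the best path.

    best[i] is the best score of any path that has consumed seq[:i] and
    stands on a codon boundary of the corrected sequence.
    """
    n = len(seq)
    best = [float("-inf")] * (n + 1)
    back = [None] * (n + 1)
    best[0] = 0.0
    for i in range(n):
        moves = []
        if i + 3 <= n:  # ordinary codon, no error assumed
            codon = seq[i:i + 3]
            moves.append((i + 3, codon_score(codon), codon, 0))
        if i + 2 <= n:  # assume one base is missing: pad with N
            codon = seq[i:i + 2] + "N"
            moves.append((i + 2, INDEL_PENALTY + codon_score(codon), codon, 1))
        # assume seq[i] is a spurious extra base: skip it
        moves.append((i + 1, INDEL_PENALTY, "", 1))
        for j, gain, emitted, fixed in moves:
            if best[i] + gain > best[j]:
                best[j] = best[i] + gain
                back[j] = (i, emitted, fixed)
    # Trace back from the end of the read to recover the corrected sequence.
    pieces, fixes, j = [], 0, n
    while j > 0:
        i, emitted, fixed = back[j]
        pieces.append(emitted)
        fixes += fixed
        j = i
    return "".join(reversed(pieces)), fixes


if __name__ == "__main__":
    # "ATGGCTAAGCTGACC" with one extra C inserted after the first codon.
    print(correct_frameshifts("ATGCGCTAAGCTGACC"))
```

Several placements of a correction can score equally, so the sketch recovers a consistent reading frame rather than the exact position of the error. A probabilistic model trained on real coding sequences, as the abstract describes, would discriminate between such alternatives far better than these fixed scores.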