Publication | Open Access
SyMAP: A system for discovering and viewing syntenic regions of FPC maps
Year: 2006 | Citations: 191 | References: 30
Engineering, Geovisualization, Geographic Information Retrieval, Genetics, Linkage Analysis, Genomics, Sequence Alignment, Geospatial Mapping, Data Science, Data Mining, Molecular Ecology, Computational Geometry, Syntenic Regions, Cartography, BAC End Sequences, Machine Vision, Hybridization Markers, Knowledge Discovery, Genome Structure, Statistical Genetics, Genetic Variation, Computer Science, Bioinformatics, Functional Genomics, FPC Maps, Spatial Verification, Linked Data Visualization, Biology, Reference Genome, Systems Biology, Medicine, Sequence Assembly
Prior methods for comparing gene and chromosome organization across genomes relied on genetic maps or genomic sequences. SyMAP aligns an FPC-based physical map to a genomic sequence using BAC end sequences and sequence-tagged hybridization markers, and aligns two FPC maps to each other using shared markers and fingerprints. It computes synteny blocks with a dynamic-programming algorithm that sets gap parameters automatically per data set and accepts chains based on anchor count and Pearson correlation, then visualizes the results with interactive Java graphics. The algorithm was tested on three data sets that differ in the number of BACs, BAC end sequences, and hybridization markers, in the distance between anchors, and in duplication history.
Previous approaches to comparing gene and chromosome organization between two genomes have been based on genetic maps or genomic sequences. We have developed a system to align an FPC-based physical map to a genomic sequence based on BAC end sequences and sequence-tagged hybridization markers and to align two FPC maps to one another based on shared markers and fingerprints. The system, called SyMAP (Synteny Mapping and Analysis Program), consists of an algorithm to compute synteny blocks and Web-based graphics to visualize the results. The approach to calculating the anchors (corresponding elements on the respective maps) maximizes the inclusion of anchors with different rates of divergence. Chains (putative syntenic sets of anchors) are computed using a dynamic programming algorithm, which includes off-diagonal anchors that result from map coordinate errors and small inversions. As the gap parameters (the distances allowed between anchors in a chain) can vary over different data sets and can be difficult to set manually, they are automatically computed per data set. The criterion for a chain to be acceptable is based on the number of anchors and the Pearson correlation coefficient. Neighboring chains are merged into synteny blocks for display. This algorithm has been tested with three data sets that vary in the number of BACs, BAC end sequences, hybridization markers, distance between anchors, and number and antiquity of genome duplication events. The Web-based graphics use Java for a highly interactive display that allows the user to interrogate the evidence of synteny.
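The chaining and acceptance steps described in the abstract can be illustrated with a minimal sketch. This is not SyMAP's actual implementation: the function names (`best_chain`, `accept_chain`, `pearson`), the single hand-supplied `max_gap` parameter, and the thresholds `min_anchors` and `min_abs_r` are all hypothetical, and the automatic per-data-set computation of gap parameters is not reproduced. Anchors are assumed to be `(x, y)` coordinate pairs on the two maps. Allowing `y` to step backwards within the gap is one simple way to tolerate off-diagonal anchors, and taking the absolute value of the correlation is an assumption made here so that inverted chains can also pass.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant axis
    return sxy / sqrt(sxx * syy)


def best_chain(anchors, max_gap):
    """Return the chain with the most anchors (quadratic-time DP).

    Consecutive anchors must lie within max_gap of each other on both
    axes. The y step may be negative, so slightly off-diagonal anchors
    are still admitted (a simplification, labeled as an assumption).
    """
    pts = sorted(anchors)
    if not pts:
        return []
    score = [1] * len(pts)   # best chain length ending at each anchor
    prev = [-1] * len(pts)   # back-pointer for reconstruction
    for j, (xj, yj) in enumerate(pts):
        for i in range(j):
            xi, yi = pts[i]
            within_gap = xj - xi <= max_gap and abs(yj - yi) <= max_gap
            if within_gap and score[i] + 1 > score[j]:
                score[j] = score[i] + 1
                prev[j] = i
    end = max(range(len(pts)), key=score.__getitem__)
    chain = []
    while end != -1:
        chain.append(pts[end])
        end = prev[end]
    return chain[::-1]


def accept_chain(chain, min_anchors, min_abs_r):
    """Accept a chain on anchor count and |Pearson r| (hypothetical thresholds)."""
    if len(chain) < min_anchors:
        return False
    xs = [x for x, _ in chain]
    ys = [y for _, y in chain]
    return abs(pearson(xs, ys)) >= min_abs_r


# Toy data: five nearby anchors (two of them slightly off-diagonal) and one outlier.
anchors = [(1, 1), (2, 2), (3, 3), (4, 5), (5, 4), (100, 100)]
chain = best_chain(anchors, max_gap=3)
print(chain, accept_chain(chain, min_anchors=3, min_abs_r=0.8))
```

In this toy example the distant anchor at `(100, 100)` is excluded by the gap limit, while the two swapped anchors `(4, 5)` and `(5, 4)` stay in the chain. A fuller version would also merge neighboring accepted chains into blocks, as the abstract describes.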