Publication | Closed Access
Disconnected Maximum Common Substructures under Constraints
12
Citations
27
References
2020
Year
Mathematical ProgrammingEngineeringMaximum Common SubstructureCombinatorial DesignMolecular BiologyMaximum Common SubstructuresComputational ChemistryStructural OptimizationChemistryMolecular DesignMolecular ComputingDiscrete MathematicsCombinatorial OptimizationBiophysicsSmall Molecule LibraryConnected ComponentsExtremal Set TheoryScaffold DetectionBioinformaticsStructural BiologyGraph TheoryMolecular PropertyComputational BiologyStructure DiscoveryStructural TopologyMedicine
The maximum common substructure (MCS) problem is an important, well-studied problem in cheminformatics. It is applied in several application scenarios like molecule superimposition and scaffold detection or as a similarity measure in virtual screening and clustering. In many cases, the connected MCS is preferred since it is faster to calculate and a highly fragmented MCS is not very meaningful from a chemical point of view. Nevertheless, a disconnected MCS (dMCS) can be very instructive if it consists of reasonably sized molecular parts connected by variable groups. We present a new algorithm named RIMACS, which is able to calculate the dMCS under constraints. We can control the maximum number of connected components and their minimal size using a modified local substructure mapping approach. A formal proof of correctness is provided as well as extended runtime evaluations on chemical data. The evaluation of RIMACS shows that a small number of connected components helps us to improve MCS similarity in a meaningful way while keeping the runtime requirements in a reasonable range.
| Year | Citations | |
|---|---|---|
Page 1
Page 1