Publication | Closed Access
Bounds on the Complexity of the Longest Common Subsequence Problem
247
Citations
10
References
1976
Year
Computational Complexity TheoryEngineeringGeneral Lower BoundComputational ComplexitySequence AlignmentString-searching AlgorithmPhylogeneticsString ProcessingExtremal CombinatoricsDecision Tree ModelDiscrete MathematicsCoding TheoryCombinatorial OptimizationSequence AnalysisLower BoundAlphabet SizeComputer ScienceAlgorithmic Information TheoryBioinformaticsNatural SciencesEvolutionary BiologyCombinatorial Pattern MatchingTime ComplexityOrder-sorted Logic
The longest common subsequence problem, used for file comparison and genetic sequence analysis, seeks the longest subsequence common to two strings. The authors analyze its computational difficulty via a decision‑tree model where each node tests symbol equality or inequality. They prove that without an alphabet size bound, any algorithm requires time proportional to the product of string lengths, and provide tighter lower bounds depending on the alphabet‑to‑length ratio, showing that forbidding intra‑string comparisons yields linear time for two symbols and quadratic for three or more.
The problem of finding a longest common subsequence of two strings is discussed. This problem arises in data processing applications such as comparing two files and in genetic applications such as studying molecular evolution. The difficulty of computing a longest common subsequence of two strings is examined using the decision tree model of computation, in which vertices represent “equal - unequal” comparisons. It is shown that unless a bound on the total number of distinct symbols is assumed, every solution to the problem can consume an amount of time that is proportional to the product of the lengths of the two strings. A general lower bound as a function of the ratio of alphabet size to string length is derived. The case where comparisons between symbols of the same string are forbidden is also considered and it is shown that this problem is of linear complexity for a two-symbol alphabet and quadratic for an alphabet of three or more symbols.
| Year | Citations | |
|---|---|---|
Page 1
Page 1