Incremental String Comparison

Published: 01 April 1998

Abstract

The problem of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them has been much studied. In this paper we consider the following incremental version of these problems: given an appropriate encoding of a comparison between A and B, can one incrementally compute the answer for A and bB, and the answer for A and Bb, with equal efficiency, where b is an additional symbol? Our main result is a theorem exposing a surprising relationship between the dynamic programming solutions for two such "adjacent" problems. Given a threshold k on the number of differences to be permitted in an alignment, the theorem leads directly to an O(k) algorithm for incrementally computing a new solution from an old one, in contrast to the O(k^2) time required to compute a solution from scratch. We further show, with a series of applications, that this algorithm is indeed more powerful than its nonincremental counterpart, by solving the applications with greater asymptotic efficiency than heretofore possible. For example, we obtain O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison.
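
To make the incremental setting concrete, here is a minimal sketch of the textbook edit-distance dynamic program, kept as a single row. It is not the paper's O(k) algorithm and uses no threshold k; the function names are hypothetical. It shows only why the two directions differ in the naive approach: appending a symbol b to B is one O(|A|) row update, whereas prepending b changes every row, so this sketch has to recompute from scratch.

```python
def initial_row(a):
    """DP row for (A, empty B): turning a[:j] into "" costs j edits."""
    return list(range(len(a) + 1))


def append_symbol(a, row, b):
    """Given the row for (A, B), return the row for (A, B + b) in O(|A|) time.

    row[j] holds the edit distance between a[:j] and all of B.
    """
    new = [row[0] + 1]  # a[:0] is empty, so the new symbol costs one more edit
    for j in range(1, len(a) + 1):
        cost = 0 if a[j - 1] == b else 1
        new.append(min(row[j] + 1,          # b is left unmatched
                       new[j - 1] + 1,      # a[j-1] is left unmatched
                       row[j - 1] + cost))  # a[j-1] aligned with b
    return new


def edit_distance(a, b):
    """Unit-cost edit distance, built by appending the symbols of b one at a time."""
    row = initial_row(a)
    for ch in b:
        row = append_symbol(a, row, ch)
    return row[-1]


def prepend_symbol_naive(a, b_old, b):
    """Answer for (A, b + B): the saved row does not help, so recompute everything."""
    return edit_distance(a, b + b_old)
```

In this sketch, `append_symbol` reuses the previous answer while `prepend_symbol_naive` discards it. The abstract's question is whether an encoding of the comparison exists under which both updates are equally cheap.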

Published In

SIAM Journal on Computing  Volume 27, Issue 2
April 1998
286 pages
ISSN:0097-5397

Publisher

Society for Industrial and Applied Mathematics

United States

Author Tags

  1. dynamic programming
  2. edit-distance
  3. string matching

Qualifiers

  • Article
