Edit distance: Sketching, streaming, and document exchange
D Belazzougui, Q Zhang - 2016 IEEE 57th Annual Symposium …, 2016 - ieeexplore.ieee.org
2016 IEEE 57th Annual Symposium on Foundations of Computer Science …, 2016•ieeexplore.ieee.org
We show that in the document exchange problem, where Alice holds x ϵ {0, 1} n and Bob
holds y ϵ {0, 1} n, Alice can send Bob a message of size O (K (log 2 K+ log n)) bits such that
Bob can recover x using the message and his input y if the edit distance between x and y is
no more than K, and output" error" otherwise. Both the encoding and decoding can be done
in time Õ (n+ poly (K)). This result significantly improves on the previous communication
bounds under polynomial encoding/decoding time. We also show that in the referee model …
holds y ϵ {0, 1} n, Alice can send Bob a message of size O (K (log 2 K+ log n)) bits such that
Bob can recover x using the message and his input y if the edit distance between x and y is
no more than K, and output" error" otherwise. Both the encoding and decoding can be done
in time Õ (n+ poly (K)). This result significantly improves on the previous communication
bounds under polynomial encoding/decoding time. We also show that in the referee model …
We show that in the document exchange problem, where Alice holds x ϵ {0, 1} n and Bob holds y ϵ {0, 1} n , Alice can send Bob a message of size O(K(log 2 K + log n)) bits such that Bob can recover x using the message and his input y if the edit distance between x and y is no more than K, and output "error" otherwise. Both the encoding and decoding can be done in time Õ(n + poly(K)). This result significantly improves on the previous communication bounds under polynomial encoding/decoding time. We also show that in the referee model, where Alice and Bob hold x and y respectively, they can compute sketches of x and y of sizes poly(K log n) bits (the encoding), and send to the referee, who can then compute the edit distance between x and y together with all the edit operations if the edit distance is no more than K, and output "error" otherwise (the decoding). To the best of our knowledge, this is the first result for sketching edit distance using poly(K log n) bits. Moreover, the encoding phase of our sketching algorithm can be performed by scanning the input string in one pass. Thus our sketching algorithm also implies the first streaming algorithm for computing edit distance and all the edits exactly using poly(K log n) bits of space.
ieeexplore.ieee.org