## How Hard is to Compute the Edit Distance

#### Giovanni Pighizzini

###### Dipartimento di Scienze dell'Informazione

Università degli Studi di Milano

via Comelico 39 -- 20135 Milano, Italy

##### Abstract

The notion of *edit distance* arises in very different fields, such
as self--correcting codes,
parsing theory, speech recognition, and molecular biology.
The edit distance between an input string and a
language *L* is the minimum cost of a sequence of *edit operations*
(*substitution* of a
symbol in another incorrect symbol, *insertion* of an extraneous symbol,
*deletion* of a symbol) needed to change the input string into
a sentence of *L.*
In this paper we study the complexity of computing the edit distance,
discovering sharp boundaries between classes of languages for which this
function can be efficiently evaluated and classes of languages for which
it seems to be difficult to compute. Our main result is a parallel algorithm for
computing the edit distance for the class of languages accepted by
one--way nondeterministic auxiliary pushdown automata working in
polynomial time, a class that strictly contains context--free languages.
Moreover, we show that this algorithm can be extended in order to find a
sentence of the language with minimum distance with respect to the input
string.

**Download**

Compressed postscript

Compressed dvi