Luís M. S. Russo,

R – INESC-ID Lisboa

Abstract:

Approximate string matching is the problem of finding a pattern in a text
allowing for some errors to occur.
Indexing methods for the approximate string matching problem spend a
considerable effort generating condensed neighborhoods. In this session we
point out that condensed neighborhoods are not a minimal
representation of a pattern neighborhood. We show that we can restrict
our attention to super condensed neighborhoods which are minimal. We
then present a basic algorithm for generating Super Condensed
Neighborhoods. The algorithm runs in O(m s), where m is the
pattern size and s is the size of the super condensed
neighborhood. Previous algorithms took O(m c) time, where c is the
size of the condensed neighborhood. We present an implementation of the
algorithm using modern Bit-Parallelism and Increased Bit-Parallelism
techniques.

 

Date: 2005-Feb-10     Time: 16:30:00     Room: Room 425


For more information: