[Python-ideas] Regex support code as std lib functions

MRAB python at mrabarnett.plus.com
Tue Sep 25 18:31:44 CEST 2012


On 2012-09-25 17:05, Matt Chaput wrote:
> I'm not sure where regex is in its long march toward replacing re, but I
> just noticed something interesting: the regex module seems to include
> implementations of some useful functions to support its regular
> expression matching, e.g.:
>
> - Levenshtein distance
>
> - Unicode case folding
>
> Both of these would be useful as fast functions in the std lib. If/when
> regex replaces re, any possibility all the useful functions that support
> it could be added to the std lib in the appropriate modules as part of
> integrating it?
>
Python 3.3 includes case-folding:

 >>> "\N{LATIN SMALL LETTER SHARP S}".casefold()
'ss'
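
For caseless comparison, a minimal sketch (the helper name
caseless_equal is made up for illustration, not part of the std lib)
might look like this; it only assumes str.casefold() from Python 3.3+:

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings using full Unicode case folding (hypothetical helper)."""
    return a.casefold() == b.casefold()


german = "Stra\N{LATIN SMALL LETTER SHARP S}e"

# casefold() maps sharp s to "ss", so the two spellings compare equal.
print(caseless_equal(german, "STRASSE"))    # True

# lower() leaves sharp s unchanged, so the same comparison fails.
print(german.lower() == "STRASSE".lower())  # False
```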

The regex module doesn't support Levenshtein distance as such. Instead,
it supports fuzzy (approximate) matching, where you're concerned not so
much with the _minimum_ edit distance as with whether there are no more
than a certain number of errors when matching a regex pattern.

It would be more efficient to implement Levenshtein distance separately.
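
As a rough illustration of what a separate implementation could look
like, here is a pure-Python sketch of the usual two-row dynamic
programming approach. The function name levenshtein is my own choice,
and this is not code taken from the regex module; a std lib version
would presumably be written in C for speed.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    # Keep the shorter string in b so each row is as small as possible.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))

    for i, ca in enumerate(a, start=1):
        # current[0] is the distance between a[:i] and the empty string.
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current

    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("flaw", "lawn"))       # 2
```

Note that this always computes the exact minimum, whereas a fuzzy
match only has to stay within its error limit.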


