Re module help?
Tim Peters
tim_one at email.msn.com
Tue Jan 4 03:09:38 EST 2000
[Elf Sternberg]
> There is a line in 'xmllib.py' that has me puzzled. I
> thought I understood regular expressions and XML, but this
> one has me confused.
>
> interesting = re.compile('[]&<]')
>
> I don't get it. What is this looking for?
One of the three characters
] & <
> The '[]' can't mean an empty set, can it?
Right, the way to write a character class that can't match anything is
[^\000-\377]
<arghggh!>.
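A minimal sketch of that never-matching class, assuming a modern Python 3 interpreter (which postdates this post). With a bytes pattern, every possible byte falls inside \000-\377, so the negated class has nothing left to match. With a str pattern in Python 3 the same class can match characters above U+00FF, so the trick only holds for 8-bit data. The variable names are illustrative only.

```python
import re

# Bytes pattern: every byte value lies in \000-\377 (octal for 0..255),
# so the negated class can never match.
never = re.compile(rb'[^\000-\377]')
matched_any_byte = any(never.search(bytes([b])) for b in range(256))
print(matched_any_byte)  # False

# Assumption: Python 3 str patterns work on Unicode code points, so the
# same class does match a character outside the 0..255 range.
str_pattern = re.compile(r'[^\000-\377]')
matches_beyond_latin1 = str_pattern.search('\u0100') is not None
print(matches_beyond_latin1)  # True
```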
> That doesn't make any sense. Is this a special case
> where '[]...' allows you to include the ']' in things
> to search for without escaping it?
Yes, and it's common in regexp packages. I think the line would be much clearer
as
interesting = re.compile(r'[\]&<]')
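A quick check that the two spellings behave the same, sketched against Python's standard re module. The sample string and variable names are made up for illustration.

```python
import re

# Original spelling: a ']' placed first in the class is taken literally.
unescaped = re.compile('[]&<]')
# Clearer spelling: the ']' is escaped explicitly.
escaped = re.compile(r'[\]&<]')

sample = 'a]b&c<d>e'
unescaped_hits = unescaped.findall(sample)
escaped_hits = escaped.findall(sample)
print(unescaped_hits)  # [']', '&', '<']
print(escaped_hits)    # [']', '&', '<']
```

Both patterns pick out exactly the three characters ] & < and skip everything else, including the '>'.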
There are similarly revolting special cases involving "-" in character sets,
which also ascribe meaning to what *looks* like an error. That's a very
un-Pythonic thing to do, but Python's re package is intended to be
compatible with Perl's.
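To illustrate the "-" cases, here is a small sketch using Python's re module. Between two characters "-" forms a range, but first or last in the class it is a literal hyphen. The patterns and sample text are illustrative and not taken from xmllib.py.

```python
import re

range_class = re.compile('[a-c]')     # '-' between two characters: range a..c
leading_dash = re.compile('[-ac]')    # '-' first in the class: literal hyphen
trailing_dash = re.compile('[ac-]')   # '-' last in the class: literal hyphen
escaped_dash = re.compile(r'[a\-c]')  # escaped: literal hyphen, no guessing

text = 'a-b-c'
range_hits = range_class.findall(text)
leading_hits = leading_dash.findall(text)
trailing_hits = trailing_dash.findall(text)
escaped_hits_dash = escaped_dash.findall(text)

print(range_hits)         # ['a', 'b', 'c']
print(leading_hits)       # ['a', '-', '-', 'c']
print(trailing_hits)      # ['a', '-', '-', 'c']
print(escaped_hits_dash)  # ['a', '-', '-', 'c']
```

As with the ']' case, escaping the hyphen makes the intent visible without relying on its position in the class.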
> Or is it, in fact, a bug?
Perl? Yes <wink>.
non-judgmentally y'rs - tim