ignore case only for a part of the regex?
steve+comp.lang.python at pearwood.info
Tue Jan 1 05:14:26 CET 2013
On Sun, 30 Dec 2012 10:20:19 -0500, Roy Smith wrote:
> The way I would typically do something like this is build my regexes in
> all lower case and .lower() the text I was matching against them. I'm
> curious what you're doing where you want to enforce case sensitivity in
> one part of a header, but not in another.
Well, sometimes you have things that are case sensitive, and other things
which are not, and sometimes you need to match them at the same time. I
don't think this is any more unusual than (say) wanting to match an
otherwise lowercase word whether or not it comes at the start of a
sentence. That is conceptually equivalent to "match case-insensitive `p`, and
case-sensitive everything after it".
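As a rough sketch of what "case-insensitive for only part of the regex" can look like in Python: the word "programming" below is only an illustrative choice, and the helper name `matches` is made up for this example. The character-class form works on any Python version. The scoped inline flag `(?i:...)` assumes Python 3.6 or later, so it would not have been available when this thread was written.

```python
import re

# Option 1: a character class makes just the first letter case-insensitive.
pattern_class = re.compile(r"[Pp]rogramming")

# Option 2 (assumes Python 3.6+): a scoped inline flag applies IGNORECASE
# only to the group it wraps, leaving the rest of the pattern case-sensitive.
pattern_scoped = re.compile(r"(?i:p)rogramming")


def matches(pattern, text):
    """Hypothetical helper: True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


for word in ("programming", "Programming", "PROGRAMMING"):
    print(word, matches(pattern_class, word), matches(pattern_scoped, word))
```

Both patterns accept "programming" and "Programming" and reject "PROGRAMMING", since only the leading `p` ignores case.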
By the way, although there is probably nothing you can (easily) do about
this prior to Python 3.3, converting to lowercase is not the right way to
do case-insensitive matching. It happens to work correctly for ASCII, but
it is not correct for all alphabetic characters.
The right way is to casefold first, then match:
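A minimal sketch of the difference, assuming Python 3.3 or later (where `str.casefold()` exists). The helper names `caseless_equal` and `caseless_search` are invented for this illustration and are not part of any library.

```python
import re


def caseless_equal(a, b):
    """Compare two strings after casefolding both (needs Python 3.3+)."""
    return a.casefold() == b.casefold()


def caseless_search(literal, text):
    """Casefold both the literal and the text, then search.

    The literal is escaped so it is matched as plain text, not as a regex.
    """
    return re.search(re.escape(literal.casefold()), text.casefold()) is not None


# lower() leaves the sharp s alone, so the two spellings do not compare equal.
print('Straße'.lower() == 'STRASSE'.lower())

# casefold() maps the sharp s to "ss", so they do.
print(caseless_equal('Straße', 'STRASSE'))
print(caseless_search('STRASSE', 'Die Straße ist lang'))
```

One thing to keep in mind with this approach: casefolding can change the length of the string, so any match offsets refer to the casefolded text, not the original.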
Curiously, there is an uppercase ß in old German. In recent years some
typographers have started using it instead of SS, but it's still rare,
and the official German rules have ß transform into SS and vice versa.
It's in Unicode, but few fonts show it:
py> import unicodedata
py> unicodedata.lookup('LATIN CAPITAL LETTER SHARP S')
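Since the character may not display in your font, here is a small sketch that inspects it by code point and case mappings instead. It assumes Python 3.3 or later for `casefold()`; the variable name is just for illustration.

```python
import unicodedata

# Look the character up by its Unicode name rather than typing it directly.
capital_sharp_s = unicodedata.lookup('LATIN CAPITAL LETTER SHARP S')

# Show the code point, which is readable even if the glyph is not.
print(hex(ord(capital_sharp_s)))

# Lowercasing gives the ordinary sharp s; casefolding gives "ss".
print(capital_sharp_s.lower() == 'ß')
print(capital_sharp_s.casefold() == 'ss')

# Going the other way, upper() on the ordinary sharp s gives "SS",
# not the capital sharp s.
print('ß'.upper() == 'SS')
```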