[Tutor] regex question

Richard D. Moores rdmoores at gmail.com
Wed Jan 5 05:44:11 CET 2011


On Tue, Jan 4, 2011 at 14:58, Steven D'Aprano <steve at pearwood.info> wrote:
> Dave Angel wrote:
>
>> One hazard is if the string the user inputs has any regex special
>> characters in it.  If it's anything but letters and digits you probably want
>> to escape it before combining it with your \\b strings.
>
> It is best to escape any user-input before passing it to regex regardless.
> The re.escape function will do the right thing whether the string is all
> letters and digits or not.
>
> >>> re.escape("dev")
> 'dev'
> >>> re.escape("dev+")
> 'dev\\+'

I didn't know about re.escape.

From the 3.1.3 docs:
re.escape(string)
    Return string with all non-alphanumerics backslashed; this is
useful if you want to match an arbitrary literal string that may have
regular expression metacharacters in it.
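
As a rough illustration of what that means in practice, here is a minimal sketch. The helper names literal_pattern and word_pattern are made up for this example and are not from my script; only re.escape, re.compile and \b come from the re module.

```python
import re

def literal_pattern(user_text):
    """Compile a pattern that matches user_text exactly as typed."""
    # re.escape backslashes anything that could act as regex syntax,
    # so "." below matches only a real dot, not "any character".
    return re.compile(re.escape(user_text))

def word_pattern(user_text):
    """Compile a whole-word pattern that treats user_text literally."""
    # \b is added AFTER escaping, so the boundaries stay regex syntax.
    return re.compile(r"\b" + re.escape(user_text) + r"\b")

print(bool(literal_pattern("a.c").search("abc")))        # False
print(bool(literal_pattern("a.c").search("x a.c y")))    # True
print(bool(word_pattern("dev").search("the dev team")))  # True
print(bool(word_pattern("dev").search("development")))   # False
```

Note that \b only behaves as expected when the search text begins and ends with a letter, digit or underscore.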

I'm writing the script for my own use, and don't expect to be
searching on non-alphanumerics. Even so, I'd like to incorporate
re.escape. However, I'm using ' ' to set case-sensitive searches, and
'=' to set word searches. Would you take a look at my revised script
at <http://tutoree7.pastebin.com/wQHVV68U>, lines 72-97? I tried using
line 80, but I can't, because re.escape backslashes '=' along with every
other non-alphanumeric. I could use some other character instead of '=', but I
would want it to be one that can be typed easily without using the
shift key. '=' is the best, I think. I did try to use 'qq' instead of
'=', but that got messy. Or is there another, completely different
way to do what I do in lines 72-97 with ' ' and '=' that wouldn't
involve increasing the number of prompts? Right now, the user has to
respond to 4 prompts, even though some responses are quickly made:
either by entering nothing, or by entering anything.
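
One direction that might work, sketched here under assumptions: peel the marker characters off the input first, and only then pass what is left to re.escape. The function name build_pattern is hypothetical, and so is the convention that ' ' and '=' are typed at the END of the search text; the pastebin script may place them differently.

```python
import re

def build_pattern(raw):
    """Turn raw user input into a compiled pattern.

    Assumed convention (hypothetical, not taken from the script):
    a trailing ' ' asks for a case-sensitive search and a trailing
    '=' asks for a whole-word search; both may be given, in any order.
    """
    case_sensitive = False
    whole_word = False
    # Strip the markers BEFORE escaping, so re.escape never sees them.
    while raw and raw[-1] in " =":
        if raw[-1] == " ":
            case_sensitive = True
        else:
            whole_word = True
        raw = raw[:-1]
    body = re.escape(raw)
    if whole_word:
        body = r"\b" + body + r"\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(body, flags)

print(bool(build_pattern("dev=").search("development")))  # False
print(bool(build_pattern("dev=").search("a DEV box")))    # True
print(bool(build_pattern("Dev ").search("dev")))          # False
```

A side effect of this convention is that a search text which really ends in ' ' or '=' cannot be expressed, which may or may not matter for a personal script.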

Dick

