regex question

Paul McGuire ptmcg at austin.rr._bogus_.com
Mon Jan 8 08:20:38 CET 2007


"proctor" <12cc104 at gmail.com> wrote in message 
news:<1168232001.377605.236270 at 11g2000cwr.googlegroups.com>...
> hello,
>
> i hope this is the correct place...
>
> i have an issue with some regex code i wonder if you have any insight:
>
> ================

There's nothing actually *wrong* with your regex.  The problem is a 
misunderstanding of raw string notation.  When building up your regex, do 
not start the string with "r'" and end it with a "'".

def makeRE(w):
    print(w + " length = " + str(len(w)))
    # reString = "r'" + w[:1]   # wrong: puts a literal r and quote in the pattern
    reString = w[:1]
    w = w[1:]
    if len(w) > 0:
        for c in w:
            reString += "|" + c
        # reString += "'"       # wrong: puts a literal quote in the pattern
    print("reString = " + reString)
    return reString

Or even better:

def makeRE(w):
    print(w + " length = " + str(len(w)))
    # A string is already iterable, so join() can take it directly.
    return "|".join(w)
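
To see why the extra characters matter, here is a small sketch comparing the 
two ways of building the pattern.  The helper names make_re and make_re_wrong 
are mine, for illustration only; they are not from your code.

```python
import re

def make_re(w):
    # Join the characters of w with "|" to build an alternation.
    return "|".join(w)

def make_re_wrong(w):
    # Same alternation, but with a literal r' glued on the front and a
    # literal ' glued on the end, as in the original attempt.
    return "r'" + "|".join(w) + "'"

good = make_re("abc")         # the pattern text is: a|b|c
bad = make_re_wrong("abc")    # the pattern text is: r'a|b|c'

# With the good pattern, "a" matches as expected.
good_match = re.match(good, "a")

# With the bad pattern, the first alternative is the three characters r'a
# and the last is the two characters c', so a plain "a" no longer matches.
bad_match = re.match(bad, "a")

# The bad pattern does match a string that really starts with r'a.
odd_match = re.match(bad, "r'a")
```

Only the first and last alternatives are damaged, which is why a pattern 
built this way can appear to work for some inputs and fail for others.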

Raw string notation is intended for string literals that appear in your 
Python source code itself.  For example, this is a typical use for raw 
strings:

ipAddrRe = r'\d{1,3}(\.\d{1,3}){3}'

If I didn't have raw string notation to use, I'd have to double up all the 
backslashes, as:

ipAddrRe = '\\d{1,3}(\\.\\d{1,3}){3}'

But no matter which way I create the string, it does not actually start with 
"r'" and end with "'"; those are just notation for literals that are part 
of your Python source.
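
A quick way to convince yourself, as a minimal sketch (the variable names 
are just for illustration): both spellings of the literal produce the same 
string object contents, and neither contains an "r" prefix or quote marks.

```python
import re

# Two spellings of the same pattern: raw literal vs. doubled backslashes.
raw = r'\d{1,3}(\.\d{1,3}){3}'
doubled = '\\d{1,3}(\\.\\d{1,3}){3}'

# Both literals evaluate to identical strings.
same = (raw == doubled)

# The string starts with a single backslash, not with r or a quote.
first_char = raw[0]

# Either one can be handed to the re module directly.
match = re.match(raw, "192.168.0.1")
```

Here same is True and first_char is a single backslash; the r and the 
quotes only ever exist in the source text, never in the resulting string.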

Does this give you a better idea of what is happening?

-- Paul




