[Tutor] Issues with regex escaping on \{

Wed Jul 29 21:39:37 CEST 2009

On Wed, Jul 29, 2009, vince spicer wrote:
>On Wed, Jul 29, 2009 at 11:35 AM, gpo <goodpotatoes at yahoo.com> wrote:
>
>>
>> My regex is being run in both Python v2.6 and v3.1
>> For this example, I'll give one line.  These lines will be read out of log
>> files.  I'm trying to get the GUID for the User ID to query a database with
>> it, so I'd like a sub match.  Here is the code
>> -----------------
>> import re
>> line = ('>Checking Privilege for UserId: '
>>         '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
>>         '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')
>> #Sub match is one or more characters between the first set of
>> #squigglies immediately following 'UserID: '
>> pUserID=re.compile('UserID: \{(.+)\}',re.I)
>>
>> #the output is:
>> (re.search(pUserID,line)).group(1)
>> '88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: {71AD2527-8494-4654-968D-FE61E9A6A9DF'
>> -----------
>> Why isn't the match terminating after it finds the first \}  ?
...
>your grouping (.+) appears to be greedy, you can make it non-greedy with a
>question mark
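
To make the suggestion above concrete, here is a minimal sketch of the
non-greedy form, run against the sample log line from the original post.
The names greedy, lazy and negated are only for illustration, and the
negated character class [^}]+ is an extra alternative that nobody in the
thread proposed; treat it as an assumption, not as the recommended fix.

```python
import re

# Sample log line from the original post, joined back into one string.
line = ('>Checking Privilege for UserId: '
        '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
        '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')

# Greedy: (.+) grabs as much as it can, so it runs to the LAST '}'.
greedy = re.compile(r'UserID: \{(.+)\}', re.I)

# Non-greedy: (.+?) stops at the first '}' that lets the match succeed.
lazy = re.compile(r'UserID: \{(.+?)\}', re.I)

# Illustrative alternative: match anything except '}' inside the braces.
negated = re.compile(r'UserID: \{([^}]+)\}', re.I)

greedy_match = greedy.search(line).group(1)
lazy_match = lazy.search(line).group(1)
negated_match = negated.search(line).group(1)

print(lazy_match)
```

With the question mark in place, group(1) should hold only the first
GUID, which is what the database query needs.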

As a general rule, it's a Good Idea(tm) to write regular
expressions using the raw string syntax.  Instead of:

	re.compile('UserID: \{(.+)\}',...)

Use:

	re.compile(r'UserID: \{(.+)\}',...)

The alternative is to backwhack any special characters with an
appropriate number of ``\'' characters, whatever that may be.
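
A rough sketch of the difference, for what it's worth.  The \b example
is an illustration I picked, not something taken from the original
question, and the variable names are hypothetical.  It shows that the
doubled-backslash spelling and the raw spelling produce the same
pattern string, and that an escape Python itself understands (\b is a
backspace in an ordinary string) silently changes what the regex
engine sees.

```python
import re

# Doubling every backslash and using a raw string give the same pattern text.
escaped = 'UserID: \\{(.+)\\}'
raw = r'UserID: \{(.+)\}'
same_pattern = (escaped == raw)

# In an ordinary string '\b' is a single backspace character;
# in a raw string it stays backslash + 'b', which re treats as a word boundary.
plain_len = len('\b')
raw_len = len(r'\b')

plain_hit = re.search('\bcat', 'a cat')   # searches for backspace followed by 'cat'
raw_hit = re.search(r'\bcat', 'a cat')    # searches for 'cat' at a word boundary

print(same_pattern, plain_len, raw_len, plain_hit is None, raw_hit is not None)
```

Note that the r prefix only changes how Python reads the string
literal, not how (.+) matches, so it does not by itself change the
greedy behaviour described in the question.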

Bill
-- 
INTERNET:   bill at celestial.com  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice:          (206) 236-1676  Mercer Island, WA 98040-0820
Fax:            (206) 232-9186  Skype: jwccsllc (206) 855-5792
