[Tutor] Issues with regex escaping on \{
Bill Campbell
bill at celestial.net
Wed Jul 29 21:39:37 CEST 2009
On Wed, Jul 29, 2009, vince spicer wrote:
>On Wed, Jul 29, 2009 at 11:35 AM, gpo <goodpotatoes at yahoo.com> wrote:
>
>>
>> My regex is being run in both Python v2.6 and v3.1
>> For this example, I'll give one line. These lines will be read out of log
>> files. I'm trying to get the GUID for the User ID to query a database with
>> it, so I'd like a sub match. Here is the code
>> -----------------
>> import re
>> line = ('>Checking Privilege for UserId: '
>>         '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
>>         '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')
>> # Sub match is one or more characters between the first set of
>> # squigglies immediately following 'UserID: '
>> pUserID=re.compile('UserID: \{(.+)\}',re.I)
>>
>> #the output is:
>> (re.search(pUserID,line)).group(1)
>> '88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: {71AD2527-8494-4654-968D-FE61E9A6A9DF'
>> -----------
>> Why isn't the match terminating after it finds the first \} ?
...
>your grouping (.+) appears to be greedy, you can make it non-greedy with a
>question mark
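A minimal sketch of the greedy versus non-greedy behaviour described above,
using the sample log line from the original post. The variable names (greedy,
lazy, negated and the *_id results) are made up for illustration, and the
negated character class is an alternative that was not suggested in the thread.

```python
import re

line = ('>Checking Privilege for UserId: '
        '{88F96ED2-D471-DE11-95B6-0050569E7C88}, PrivilegeId: '
        '{71AD2527-8494-4654-968D-FE61E9A6A9DF}. Returned hr = 0')

# Greedy: (.+) grabs as much as it can, so it runs to the LAST '}' on the line.
greedy = re.compile(r'UserID: \{(.+)\}', re.I)

# Non-greedy: (.+?) grabs as little as it can, so it stops at the FIRST '}'.
lazy = re.compile(r'UserID: \{(.+?)\}', re.I)

# Alternative (not from the thread): match anything except a closing brace.
negated = re.compile(r'UserID: \{([^}]+)\}', re.I)

greedy_id = greedy.search(line).group(1)
lazy_id = lazy.search(line).group(1)
negated_id = negated.search(line).group(1)

print(lazy_id)
```

Either the non-greedy or the negated-class form should yield only the first
GUID; the greedy form reproduces the over-long match reported in the question.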
As a general rule, it's a Good Idea(tm) to write regular
expressions using the raw quote syntax. Instead of:
re.compile('UserID: \{(.+)\}',...)
Use:
re.compile(r'UserID: \{(.+)\}',...)
The alternative is to backwhack any special characters with an
appropriate number of ``\'' characters, whatever that may be.
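A small sketch of why the raw syntax helps, assuming nothing beyond the
standard re module. The sample text and the names raw_match, plain_match and
doubled_match are invented for illustration. Note that '\{' happens to survive
without the r prefix because Python does not recognize it as an escape (newer
Python versions warn about that), whereas a sequence such as '\b' is silently
turned into a different character.

```python
import re

# A raw string keeps every backslash exactly as typed, so these are equal.
assert r'\{' == '\\{'
assert len(r'\{') == 2

# Without the r prefix, Python interprets recognized escapes first:
# '\b' becomes a single backspace character, not the regex word boundary.
assert len('\b') == 1
assert len(r'\b') == 2

text = 'UserId: {abc}'

# Raw string: the regex engine sees \b and treats it as a word boundary.
raw_match = re.search(r'\bUserId\b', text)

# Plain string: the pattern contains literal backspaces, which are not in text.
plain_match = re.search('\bUserId\b', text)

# Plain string with every backslash doubled: works, but is harder to read.
doubled_match = re.search('\\bUserId\\b', text)

print(raw_match is not None, plain_match is None, doubled_match is not None)
```

The raw form and the doubled form give the regex engine the same pattern; the
raw form just avoids having to count backslashes.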
Bill
--
INTERNET: bill at celestial.com Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/ PO Box 820; 6641 E. Mercer Way
Voice: (206) 236-1676 Mercer Island, WA 98040-0820
Fax: (206) 232-9186 Skype: jwccsllc (206) 855-5792
Government is a broker in pillage, and every election is a sort of advance
auction in stolen goods. -- H.L. Mencken