[Tutor] Regex confusion
Michael P. Reilly
arcege@shore.net
Fri, 17 Dec 1999 09:41:17 -0500 (EST)
> I'm looking at some Python code (from the DT_HTML.py module of Zope,
> actually) that uses a regex expression that I can't figure out. Perhaps you
> could point me in the right direction?
>
> name_match=regex.compile('[\0- ]*[a-zA-Z]+[\0- ]*').match
> end_match=regex.compile('[\0- ]*\(/\|end\)',regex.casefold).match
> start_search=regex.compile('[<&]').search
>
> The phrase "[\0- ]" in the first two lines confuses me. Is this a group
> reference? And if so, what is group 0? I thought the numbering for match
> groups started at 1. Logically, it would seem that this phrase is a means
> for grabbing up leading and trailing dashes and whitespace. But it's beyond
> me to figure out how "\0" figures into this.
>
> Also, the ampersand (&) in the third line is a problem for me. I've looked
> through my copy of "Mastering Regular Expressions" and can't find any
> reference to a "&" metacharacter. What am I overlooking?
Hi Jeffrey,
The construct [...] is commonly called a character class. It matches
any single character that appears inside the brackets.
[<&] - one of the two characters "<" or "&"
[\0- ] - any character with ASCII value from 0 ('\0') through 32 (' ');
the "-" makes a range, and "\0" is the NUL character, not a group
reference
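
Here is a small sketch you can run to check this for yourself. It uses
the re module rather than the regex module from the quoted code, on the
assumption that character classes behave the same way in both. The name
control_or_space and the sample strings are made up for illustration.

```python
import re

# In a normal (non-raw) Python string, '\0' is the NUL character, chr(0),
# so this class is the range chr(0) through chr(32) (the space).
control_or_space = re.compile('[\0- ]')

# Collect every ASCII character that the class accepts.
matched = [c for c in map(chr, range(128)) if control_or_space.match(c)]
print(len(matched))                       # 33 characters: codes 0..32
print(ord(matched[0]), ord(matched[-1]))  # 0 32

# '-' has code 45, which is outside the range, so it does not match.
print(control_or_space.match('-'))        # None

# [<&] matches exactly one character, either '<' or '&'.
start_search = re.compile('[<&]').search
print(start_search('a &amp; b').start())  # 2

# The first quoted pattern: optional control/space characters, then
# letters, then optional control/space characters.
name_match = re.compile('[\0- ]*[a-zA-Z]+[\0- ]*').match
print(repr(name_match('  var  >').group()))  # '  var  '
```

So the class swallows whitespace and control characters, not dashes.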
If you include a caret (^) immediately after the left bracket ([), then
matching is against characters not in the class.
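
A quick sketch of the negated form, again with the re module and with
names (not_special, text_run) and sample text invented for the example:

```python
import re

# Negated class: any one character that is NOT '<' or '&'.
not_special = re.compile('[^<&]')

print(not_special.match('x') is not None)  # True
print(not_special.match('<'))              # None

# One possible use: grab the run of plain text before the next '<' or '&'.
text_run = re.compile('[^<&]*')
print(repr(text_run.match('hello <b>world</b>').group()))  # 'hello '
```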
-Arcege
--
------------------------------------------------------------------------
| Michael P. Reilly, Release Engineer | Email: arcege@shore.net |
| Salem, Mass. USA 01970 | |
------------------------------------------------------------------------