regex confusion

Luther Barnum Spam_Sucks at rr.com
Tue Dec 9 17:16:42 CET 2003


Maybe you meant:
     import re, urllib
     rgxPrev = re.compile('.*?a.*?')

     url = 'http://nitace.bsd.uchicago.edu:8080/files/share/showdown_example2.html'
     s = urllib.urlopen(url).read()
     m = re.match(rgxPrev, s)   # module-level function: pattern first, then string
     print m
     print s.find('a')

The module-level function re.match takes two arguments: the pattern and the string.
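
That said, calling match as a module-level function or as a method of the compiled pattern should give the same result, so the call style alone probably does not explain the None. A more likely cause, assuming the page has at least one newline before its first 'a' (plausible with the 'a' at index 48), is that match() only tries position 0 and '.' does not match a newline by default. The sketch below uses a made-up string s in place of the downloaded page; the names m_match, m_search and m_dotall are mine, not from the original post.

```python
import re

# Hypothetical stand-in for the downloaded page (an assumption, not the
# real file): the first line contains no 'a', so the first 'a' comes
# only after a newline.
s = "<html>\n<head>\n<title>showdown</title>\n</head>\n"

rgx_prev = re.compile('.*?a.*?')

# match() only tries position 0, and '.' will not cross '\n',
# so the pattern cannot reach the first 'a'.
m_match = rgx_prev.match(s)

# search() tries successive start positions until one matches.
m_search = rgx_prev.search(s)

# re.DOTALL lets '.' match '\n' too, so match() can succeed from position 0.
m_dotall = re.compile('.*?a.*?', re.DOTALL).match(s)

print(m_match)                 # None
print(repr(m_search.group()))  # '<hea'
print(repr(m_dotall.group()))  # '<html>\n<hea'
```

If that assumption holds for the real page, either rgxPrev.search(s) or compiling with re.DOTALL should find the 'a'; which one is right depends on whether the match really has to start at the beginning of the string.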

"John Hunter" <jdhunter at ace.bsd.uchicago.edu> wrote in message
news:mailman.266.1070985064.16879.python-list at python.org...
>
> In trying to debug why a certain regex wasn't working like I expected
> it to, I came across this strange (to me) behavior.  The file I am
> trying to match definitely contains many instances of the letter 'a',
> so I would expect the regex
>
>   rgxPrev = re.compile('.*?a.*?')
>
> to match the string contents of the file.  But it doesn't.  Here is
> a complete example
>
>     import re, urllib
>     rgxPrev = re.compile('.*?a.*?')
>
>     url = 'http://nitace.bsd.uchicago.edu:8080/files/share/showdown_example2.html'
>     s = urllib.urlopen(url).read()
>     m =  rgxPrev.match(s)
>     print m
>     print s.find('a')
>
> m is None (no match) and the s.find('a') reports an 'a' at index 48.
>
> I read the regex to mean non-greedy match of anything up to an a,
> followed by non-greedy match of anything following an a, which this
> file should match.
>
> Or am I insane?
>
> John Hunter
>
>
> hunter:~/python/projects/poker/data/pokerroom> uname -a
> Linux hunter.paradise.lost 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux
> hunter:~/python/projects/poker/data/pokerroom> python
> Python 2.3.2 (#1, Oct 13 2003, 11:33:15)
> [GCC 3.3.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> Welcome to rlcompleter2 0.95
> for nice experiences hit <tab> multiple times
>
>





