Something weird about re.finditer()

Jeremiah Dodds jeremiah.dodds at gmail.com
Wed Apr 15 11:01:56 CEST 2009


On Wed, Apr 15, 2009 at 9:46 AM, Gilles Ganault <nospam at nospam.com> wrote:

> Hello
>
>        I stumbled upon something funny while downloading web pages and
> trying to extract one or more blocks from a page: Even though Python
> seems to return at least one block, it doesn't actually enter the for
> loop:
>
> ======
> re_block = re.compile('before (.+?) after',re.I|re.S|re.M)
>
> #Here, get web page and put it into "response"
>
> blocks = None
> blocks = re_block.finditer(response)
> if blocks == None:
>        print "No block found"
> else:
>        print "Before blocks"
>        for block in blocks:
>                #Never displayed!
>                print "In blocks"
> ======
>
> Since "blocks" is no longer set to None after calling finditer()...
> but doesn't contain a single block... what does it contain then?
>
> Thank you for any tip.


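It contains an iterator. finditer() always returns an iterator, never None,
so the "blocks == None" test can't tell you whether anything matched. When
nothing matches, you just get an iterator with nothing in it: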

In [232]: import re

In [233]: re_block = re.compile('before (.+?) after',re.I|re.S|re.M)

In [234]: blocks = None

In [235]: blocks = re_block.finditer('Hi There Im not going to match anything')

In [236]: blocks
Out[236]: <callable-iterator object at 0xb75f440c>

In [237]: blocks == None
Out[237]: False

In [238]: for block in blocks:
   .....:     print block
   .....:

In [239]: type(blocks)
Out[239]: <type 'callable-iterator'>

In [241]: blocks.next()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)

/home/jeremiah/<ipython console> in <module>()

StopIteration:
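
If you want to keep finditer() but still know whether anything matched, one
option is to ask for the first match with a default instead of letting
StopIteration propagate. This is only a rough sketch: it assumes Python 2.6 or
later (for the next() builtin), and first_match_or_none is a helper name I
made up for illustration, not something in the re module.

```python
import re

re_block = re.compile('before (.+?) after', re.I | re.S | re.M)

def first_match_or_none(pattern, text):
    # Hypothetical helper: next() with a default returns None on an
    # empty iterator instead of raising StopIteration.
    return next(pattern.finditer(text), None)

empty = first_match_or_none(re_block, 'Hi There Im not going to match anything')
found = first_match_or_none(re_block, 'before something after')
```

Here "empty" ends up as None, while "found" is a match object whose group(1)
is the text between the two markers. Note that this only looks at the first
match; the rest of the iterator is thrown away.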


Maybe you should just use a different method, like findall, which returns a
list, so you can check its length. Or, if you don't need to do anything
special when there aren't any blocks, you can keep finditer and drop the None
check: the for loop simply won't run when nothing matches.
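
For the findall route, something along these lines should work. It is a
minimal sketch rather than a drop-in fix: extract_blocks is a name I invented,
and the sample strings stand in for the "response" text from the original
post.

```python
import re

re_block = re.compile('before (.+?) after', re.I | re.S | re.M)

def extract_blocks(text):
    # With a single capturing group, findall returns a list of the
    # captured strings; the list is empty when nothing matches.
    return re_block.findall(text)

blocks = extract_blocks('before one after ... BEFORE two AFTER')
if not blocks:
    print("No block found")
else:
    for block in blocks:
        print("In blocks: " + block)
```

Because findall gives back a plain list, "if not blocks" (or len(blocks) == 0)
is a reliable emptiness test. The trade-off is that you get strings rather
than match objects, so positions such as start() and end() are not available.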