A gnarly little python loop
rustompmody at gmail.com
Mon Nov 12 08:09:31 CET 2012
On Nov 11, 3:58 am, Roy Smith <r... at panix.com> wrote:
> I'm trying to pull down tweets with one of the many twitter APIs. The
> particular one I'm using (python-twitter), has a call:
> data = api.GetSearch(term="foo", page=page)
> The way it works, you start with page=1. It returns a list of tweets.
> If the list is empty, there are no more tweets. If the list is not
> empty, you can try to get more tweets by asking for page=2, page=3, etc.
> I've got:
> page = 1
> while 1:
>     r = api.GetSearch(term="foo", page=page)
>     if not r:
>         break
>     for tweet in r:
>         process(tweet)
>     page += 1
> It works, but it seems excessively fidgety. Is there some cleaner way
> to refactor this?
This is a classic problem -- the structure clash of parallel loops -- and
Steve Howell has given the classic solution, using the fact that
generators in Python simulate/implement lazy lists.
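For readers who have not seen that post, here is a minimal sketch of the generator idea. It is not Steve Howell's code, and fake_search is a hypothetical stand-in for api.GetSearch so that the example runs without network access.

```python
def fake_search(term, page):
    """Hypothetical stand-in for api.GetSearch: three pages, then empty."""
    pages = {1: ["t1", "t2"], 2: ["t3"], 3: ["t4", "t5"]}
    return pages.get(page, [])


def all_tweets(search, term):
    """Lazily yield every tweet, fetching the next page only when needed."""
    page = 1
    while True:
        results = search(term=term, page=page)
        if not results:
            return  # an empty page means there are no more tweets
        for tweet in results:
            yield tweet
        page += 1


# The consumer sees one flat stream and never mentions pages.
tweets = list(all_tweets(fake_search, "foo"))
print(tweets)
```

The paging logic lives in one place, and the caller's loop is a plain "for tweet in all_tweets(...)".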
As David Beazley explains (http://www.dabeaz.com/coroutines/),
coroutines are more general than generators, so they can be used here
as well.
The classic problem used to be stated like this:
There is an input in cards of 80 columns.
It needs to be copied onto a printer of 132 columns.
The structure clash arises because after reading 80 chars a new card
has to be read; after printing 132 chars a linefeed has to be given.
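To make the clash concrete, here is a rough sketch of the naive single-loop version, where the output bookkeeping is tangled into the input loops. The name copy_cards and its width parameter are made up for illustration.

```python
def copy_cards(cards, width=132):
    """Copy characters from fixed-size cards into lines of `width` columns."""
    lines = []
    current = []
    for card in cards:              # input structure: one card at a time
        for ch in card:
            current.append(ch)
            if len(current) == width:   # output structure intrudes here
                lines.append(''.join(current))
                current = []
    if current:                     # flush a final, partially filled line
        lines.append(''.join(current))
    return lines


# Small numbers so the result is easy to check by eye.
small = copy_cards(["abc", "def", "ghi"], width=4)
print(small)
```

It works, but neither the card loop nor the line loop appears as a clean loop of its own, which is the same fidgety feeling as in the tweet-paging code.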
To pythonize the problem, let's replace the 80 and 132 by 3 and 4, i.e.
take the input as 3-character strings and print it 4 characters per line.
The important difference (explained nicely by Beazley) is that with
generators the for-loop pulls values out of the generator, whereas with
coroutines the 'generator' pushes values into the consuming coroutines.
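from __future__ import print_function
s = ["abc", "def", "ghi"]
# Coroutine-infrastructure from pep 342
def consumer(func):
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)  # advance to the first yield so send() works
        return gen
    return wrapper
@consumer
def endStage():
    while True:
        for i in range(0, 4):
            print((yield), sep='', end='')
        print("\n", sep='', end='')
def genStage(s, target):
    for line in s:
        for i in range(0, 3):
            target.send(line[i])

if __name__ == '__main__':
    genStage(s, endStage())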
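For comparison, the same 3-to-4 regrouping can be written pull-style with plain generators. This is only a sketch; the names chars and regroup are made up here and are not from Beazley or PEP 342.

```python
def chars(lines):
    """Flatten the input: yield one character at a time (input structure)."""
    for line in lines:
        for ch in line:
            yield ch


def regroup(stream, width):
    """Pull characters and yield rows of `width` (output structure)."""
    buf = []
    for ch in stream:
        buf.append(ch)
        if len(buf) == width:
            yield ''.join(buf)
            buf = []
    if buf:  # last, possibly short, row
        yield ''.join(buf)


s = ["abc", "def", "ghi"]
rows = list(regroup(chars(s), 4))
for row in rows:
    print(row)
```

Here the final for-loop drives everything by pulling from regroup, which in turn pulls from chars. In the coroutine version it is the producer (genStage) that drives, by pushing each character into the consumer with send().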