A gnarly little python loop

Steve Howell showell30 at yahoo.com
Sun Nov 11 18:16:06 CET 2012


On Sunday, November 11, 2012 1:54:46 AM UTC-8, Peter Otten wrote:
> Paul Rubin wrote:
>
> > Cameron Simpson <cs at zip.com.au> writes:
> >> | I'd prefer the original code ten times over this inaccessible beast.
> >> Me too.
> >
> > Me, I like the itertools version better.  There's one chunk of data
> > that goes through a succession of transforms each of which
> > is very straightforward.
>
> [Steve Howell]
> >     def get_tweets(term, get_page):
> >         page_nums = itertools.count(1)
> >         pages = itertools.imap(api.getSearch, page_nums)
> >         valid_pages = itertools.takewhile(bool, pages)
> >         tweets = itertools.chain.from_iterable(valid_pages)
> >         return tweets
>
> But did you spot the bug(s)?
>

My first version was only a sketch of the technique, and I don't have handy access to the API to test against.

Here is an improved version, which actually passes the search term through to getSearch:

    import itertools

    def get_tweets(term):
        def get_page(page):
            # bind the search term so the pipeline only varies the page number
            return getSearch(term, page)
        page_nums = itertools.count(1)
        pages = itertools.imap(get_page, page_nums)
        # stop at the first falsy page (None or an empty list)
        valid_pages = itertools.takewhile(bool, pages)
        tweets = itertools.chain.from_iterable(valid_pages)
        return tweets

    for tweet in get_tweets("foo"):
        process(tweet)

This is what I used to test it:


    def getSearch(term="foo", page=1):
        # simulate api for testing
        if page < 5:
            return [
                'page %d, tweet A for term %s' % (page, term),
                'page %d, tweet B for term %s' % (page, term),
            ]
        else:
            return None

    def process(tweet):
        print tweet
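
For anyone reading this on Python 3, where itertools.imap no longer exists, here is a rough port of the same pipeline. It is only a sketch: the getSearch below is the same made-up stand-in for the real API (four pages of two tweets, then None), and the built-in map is assumed to take the place of imap since it is lazy in Python 3.

```python
import itertools

def getSearch(term="foo", page=1):
    # Hypothetical stand-in for the real API: four pages, then None.
    if page < 5:
        return [
            'page %d, tweet A for term %s' % (page, term),
            'page %d, tweet B for term %s' % (page, term),
        ]
    return None

def get_tweets(term):
    def get_page(page):
        return getSearch(term, page)
    page_nums = itertools.count(1)                   # 1, 2, 3, ...
    pages = map(get_page, page_nums)                 # lazy in Python 3
    valid_pages = itertools.takewhile(bool, pages)   # stop at first falsy page
    return itertools.chain.from_iterable(valid_pages)

tweets = list(get_tweets("foo"))
print(len(tweets))   # 8
print(tweets[0])     # page 1, tweet A for term foo
```

Each line of get_tweets is still one transform applied to the stream produced by the line before it, which is the property Paul was pointing at.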

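
One thing the pipeline buys you is laziness: pages are only requested as the consumer asks for tweets. The sketch below (Python 3 again) tries to make that visible. fake_search and the calls list are hypothetical helpers invented for the demonstration, and the stub ends with an empty list rather than None to show that takewhile(bool, ...) treats both the same way.

```python
import itertools

calls = []  # records which page numbers were actually requested

def fake_search(term, page):
    # Hypothetical stub: one tweet per page for pages 1-4, then an empty page.
    calls.append(page)
    return ['%s-%d' % (term, page)] if page < 5 else []

def get_tweets(term):
    pages = map(lambda page: fake_search(term, page), itertools.count(1))
    return itertools.chain.from_iterable(itertools.takewhile(bool, pages))

# Taking only two tweets should request only the pages needed to supply them.
first_two = list(itertools.islice(get_tweets("foo"), 2))
calls_after_slice = list(calls)

# Draining the whole stream requests one extra page: the empty one that stops it.
del calls[:]
everything = list(get_tweets("foo"))
calls_after_full = list(calls)

print(first_two, calls_after_slice)
print(everything, calls_after_full)
```

Note the cost of using a falsy page as the stop signal: a full traversal always makes one request past the last real page.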
