A gnarly little python loop
showell30 at yahoo.com
Sun Nov 11 20:16:06 CET 2012
On Nov 11, 10:34 am, Peter Otten <__pete... at web.de> wrote:
> Steve Howell wrote:
> > On Nov 11, 1:09 am, Paul Rubin <no.em... at nospam.invalid> wrote:
> >> Cameron Simpson <c... at zip.com.au> writes:
> >> > | I'd prefer the original code ten times over this inaccessible beast.
> >> > Me too.
> >> Me, I like the itertools version better. There's one chunk of data
> >> that goes through a succession of transforms each of which
> >> is very straightforward.
> > Thanks, Paul.
> > Even though I supplied the "inaccessible" itertools version, I can
> > understand why folks find it inaccessible. As I said to the OP, there
> > was nothing wrong with the original imperative approach; I was simply
> > providing an alternative.
> > It took me a while to appreciate itertools, but the metaphor that
> > resonates with me is a Unix pipeline. It's just a metaphor, so folks
> > shouldn't be too literal, but the idea here is this:
> > page_nums -> pages -> valid_pages -> tweets
> > The transforms are this:
> > page_nums -> pages: call API via imap
> > pages -> valid_pages: take while true
> > valid_pages -> tweets: use chain.from_iterable to flatten results
> > Here's the code again for context:
> > def get_tweets(term):
> >     def get_page(page):
> >         return getSearch(term, page)
> >     page_nums = itertools.count(1)
> >     pages = itertools.imap(get_page, page_nums)
> >     valid_pages = itertools.takewhile(bool, pages)
> >     tweets = itertools.chain.from_iterable(valid_pages)
> >     return tweets
> Actually you supplied the "accessible" itertools version. For reference,
> here's the inaccessible version:
> class api:
>     """Twitter search API mock-up"""
>     pages = [
>         ["a", "b", "c"],
>         ["d", "e"],
>     ]
>
>     @staticmethod
>     def GetSearch(term, page):
>         assert term == "foo"
>         assert page >= 1
>         if page > len(api.pages):
>             return []
>         return api.pages[page-1]
>
> from collections import deque
> from functools import partial
> from itertools import chain, count, imap, takewhile
>
> def process(tweet):
>     print tweet
>
> term = "foo"
>
> deque(
>     imap(
>         process,
>         chain.from_iterable(
>             takewhile(bool, imap(partial(api.GetSearch, term), count(1))))),
>     maxlen=0)
I know Peter's version is tongue in cheek, but I do think that it has
a certain expressive power, and it highlights three mind-expanding
Python modules (collections, functools, and itertools).
Here's a re-flattened take on Peter's version ("Flat is better than
nested." -- PEP 20):
term = "foo"
search = partial(api.GetSearch, term)
nums = count(1)
paged_tweets = imap(search, nums)
paged_tweets = takewhile(bool, paged_tweets)
tweets = chain.from_iterable(paged_tweets)
processed_tweets = imap(process, tweets)
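For anyone reading this on Python 3, where itertools.imap no longer exists and the built-in map is already lazy, here is a rough sketch of the same pipeline. The names get_search and PAGES are made up for illustration and stand in for the real search API; this is not the thread's original code, just one way it might look.

```python
from functools import partial
from itertools import chain, count, takewhile

# Hypothetical stand-in for the search API: two pages of results, then nothing.
PAGES = [["a", "b", "c"], ["d", "e"]]

def get_search(term, page):
    """Return the results for a 1-based page number, or [] past the last page."""
    return PAGES[page - 1] if page <= len(PAGES) else []

def get_tweets(term):
    search = partial(get_search, term)            # fix the search term
    paged_tweets = map(search, count(1))          # lazy: pages 1, 2, 3, ...
    paged_tweets = takewhile(bool, paged_tweets)  # stop at the first empty page
    return chain.from_iterable(paged_tweets)      # flatten pages into tweets

tweets = list(get_tweets("foo"))
print(tweets)  # ['a', 'b', 'c', 'd', 'e']
```

Nothing is fetched until something iterates over the result, so the infinite count(1) is harmless as long as the API eventually returns an empty page.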
The use of deque to exhaust an iterator is slightly overboard IMHO,
but all the other lines of code can be fairly easily understood once
you read the docs.
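To make the deque point concrete, here is a small Python 3 sketch. A lazy map does no work until something consumes it; deque(..., maxlen=0) is one way to consume it while storing nothing, and a plain for loop is the other. The process function and the seen list are hypothetical, just there to make the side effect visible.

```python
from collections import deque

seen = []

def process(tweet):
    """Hypothetical side-effecting step: record the tweet."""
    seen.append(tweet)

# Nothing runs yet: map only builds a lazy iterator.
processed = map(process, ["a", "b", "c"])
assert seen == []

# deque with maxlen=0 pulls every item and stores none of them.
deque(processed, maxlen=0)
print(seen)  # ['a', 'b', 'c']

# The plain loop does the same work and says so directly.
seen_by_loop = []
for tweet in ["a", "b", "c"]:
    seen_by_loop.append(tweet)
print(seen_by_loop)  # ['a', 'b', 'c']
```

When the only goal is the side effect, the loop is arguably the clearer choice; the deque trick mostly earns its keep when you are determined to stay in a single expression.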
count, imap, takewhile, chain.from_iterable: all documented under the
itertools module.
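For a quick feel of each piece in isolation before reading the docs, here is a minimal Python 3 sketch. It uses islice, which does not appear in the code above, only to take a finite peek at the endless count; the sample data is made up.

```python
from functools import partial
from itertools import chain, count, islice, takewhile

# count(1) is endless, so islice takes a finite peek at it.
first_three = list(islice(count(1), 3))

# takewhile stops at the first item failing the predicate; later items are dropped.
leading_truthy = list(takewhile(bool, [[1], [2, 3], [], [4]]))

# chain.from_iterable flattens one level of nesting.
flat = list(chain.from_iterable([[1], [2, 3]]))

# partial pre-fills the leading arguments of a callable.
power_of_two = partial(pow, 2)
eight = power_of_two(3)

print(first_three)     # [1, 2, 3]
print(leading_truthy)  # [[1], [2, 3]]
print(flat)            # [1, 2, 3]
print(eight)           # 8
```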