[Tutor] Why are these results different?

Kent Johnson kent37 at tds.net
Thu Nov 19 12:52:43 CET 2009


On Thu, Nov 19, 2009 at 3:24 AM, Stephen Nelson-Smith
<sanelson at gmail.com> wrote:
> I'm seeing different behaviour between code that looks to be the same.
>  It obviously isn't the same, so I've misunderstood something:
>
>
>>>> log_names
> ('access', 'varnish')
>>>> log_dates
> ('20091105', '20091106')
>>>> logs = itertools.chain.from_iterable(glob.glob('%sded*/%s*%s.gz' % (source_dir, log, date)) for log in log_names for date in log_dates)
>>>> for log in logs:
> ...   print log

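Here the argument to from_iterable() is a generator expression that
yields one list per (log, date) pair, because each glob.glob() call
returns a list of file names. from_iterable() iterates each of those
lists in turn, so you get the file names themselves.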
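
A minimal sketch of that case, with hard-coded lists standing in for the
glob.glob() results (the file names are invented for illustration, not
taken from your directories):

```python
import itertools

# Stand-ins for the lists that each glob.glob() call would return.
# These file names are made up for illustration only.
per_pattern_matches = [
    ['access-20091105.gz'],
    ['access-20091106.gz'],
    ['varnish-20091105.gz', 'varnish-20091106.gz'],
]

# from_iterable() walks the outer sequence and yields the items of each
# inner list in turn, so the result is a flat stream of file names.
flat_names = list(itertools.chain.from_iterable(per_pattern_matches))
```

Each item that comes out is a whole file name, because the flattening
removes exactly one level of nesting.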

> However:
>
> for date in log_dates:
>  for log in log_names:
>     logs = itertools.chain.from_iterable(glob.glob('%sded*/%s*%s.gz'
> % (source_dir, log, date)))

> Gives me one character at a time when I iterate over logs.
>
Here the argument to from_iterable() is the list returned by a single
glob.glob() call, i.e. a sequence of strings. from_iterable() iterates
each string in the sequence, and iterating a string yields its
characters one at a time.
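
The same effect can be reproduced without touching the file system. This
is only a sketch; the list below is a made-up stand-in for one
glob.glob() result:

```python
import itertools

# Stand-in for the list returned by a single glob.glob() call
# (file names invented for illustration).
one_glob_result = ['ab.gz', 'cd.gz']

# Each element of the list is itself iterable (a string), so
# from_iterable() flattens one level too far and yields characters.
chars = list(itertools.chain.from_iterable(one_glob_result))

# Iterating the list directly gives whole file names.
names = list(one_glob_result)
```

So when you already have a flat list of names, there is nothing left for
from_iterable() to flatten except the strings themselves.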

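By the way, did you notice that the second version loops in a different
order than the first? In the first, log names are the outer loop and
dates the inner loop; in the second it is the other way around.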
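
A small sketch of the ordering difference, using the log_names and
log_dates values from your session and collecting (log, date) pairs
instead of calling glob.glob():

```python
log_names = ('access', 'varnish')
log_dates = ('20091105', '20091106')

# First version: in a generator expression or list comprehension the
# leftmost "for" is the outer loop, so log names vary slowest.
first_order = [(log, date) for log in log_names for date in log_dates]

# Second version: dates are the outer loop, log names the inner loop.
second_order = []
for date in log_dates:
    for log in log_names:
        second_order.append((log, date))
```

The two lists hold the same pairs, but first_order groups them by log
name and second_order groups them by date.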

> Why is this?
>
> And how, then, can I make the first more readable?

Break out the argument to from_iterable() into a separate variable.
If you like spelling it out as separate loops, but you want a single
sequence, use the second form but put it in a generator function:

import glob

def log_file_names(log_names, log_dates):
    # source_dir is assumed to be defined at module level, as in your code
    for date in log_dates:
        for log in log_names:
            for file_name in glob.glob('%sded*/%s*%s.gz' % (source_dir, log, date)):
                yield file_name

Then your client code can say:
for file_name in log_file_names(log_names, log_dates):
    print file_name
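
For the other suggestion, breaking the argument to from_iterable() out
into a separate variable, here is a rough sketch. The temporary
directory, the "ded1" subdirectory and the empty files are assumptions
made only so the example can run on its own; matches_per_pattern is a
name I made up:

```python
import glob
import itertools
import os
import tempfile

log_names = ('access', 'varnish')
log_dates = ('20091105', '20091106')

# Hypothetical layout for the demo: a temporary directory with one
# "ded1" subdirectory holding empty files. Your real source_dir differs.
source_dir = tempfile.mkdtemp() + '/'
os.mkdir(source_dir + 'ded1')
for log in log_names:
    for date in log_dates:
        path = os.path.join(source_dir, 'ded1', '%s-%s.gz' % (log, date))
        open(path, 'w').close()

# Name the generator expression first: it yields one list of matches
# per (log, date) pair.
matches_per_pattern = (
    glob.glob('%sded*/%s*%s.gz' % (source_dir, log, date))
    for log in log_names
    for date in log_dates
)

# Then flatten those lists into a single sequence of file names.
logs = list(itertools.chain.from_iterable(matches_per_pattern))
```

Splitting the expression this way keeps the original loop order (log
names outermost) and makes it easier to see that from_iterable()
receives a sequence of lists.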

If log_names is a constant you can put it into log_file_names()
instead of passing it as a parameter.

Kent

