[Python-ideas] Integrate some itertools into the Python syntax
Stephen J. Turnbull
stephen at xemacs.org
Sat Mar 26 13:00:21 EDT 2016
Chris Barker writes:
> I understand the focus on iterables -- for performance reasons if
> nothing else, but I think it does, in fact, make the language more
> complex. And I think we should think about making the language more
> iterable-focused.
The *language* has always been iterable-focused. Python doesn't have
any counting loops! range() is a builtin (and a necessary one), but
not part of the language. for and while are part of the language, and
they are both "do until done" loops, not "do n times" loops.
I think the problem is more on the user (and teacher) side. We learn
that a square has four sides, but perhaps it's more Pythonic(?) to
think of a square as the process "do until here == start: forward 1
furlong; right 90". But *neither* (or *both*) is how the children
I've observed think of it. They draw four sides, the first three
quite straight, and the last one *warped as necessary* to join with
the first. (And adult practice is even more varied: drawing two
parallel sides first, then joining them at the ends. In the case of
the Japanese and Chinese, a square has *three* sides: the left
vertical is drawn, then the top and right are drawn as one stroke, and
finally the horizontal base.[2])
People don't think like the algorithms that are convenient for us to
teach computers.
> And it came up in another recent thread about a mechanism for doing
> something after a for loop that didn't loop -- in that case, for
> sequences, the idiom is obvious:
>
> if seq:
>     do_the_for_loop
> else:
>     do_something else since the loop won't have run.
>
> But when you plug in an arbitrary iterable into that, it doesn't
> work, and there is no easy, obvious, and robust idiom to replace
> that with. I don't know that that particular issue needs to be
> solved, but it makes my point
But does it? That thread never did present a real use case for
"empty:" with an iterator. Evidently the OP has one, but we didn't
get to see it. All of the realistic iterator cases I can think of are
*dynamic*: e.g. RSS or Twitter feeds. In those cases, it's not that the
iterator is empty, it's that it's in a wait state. Even an empty
database cursor can be interpreted that way. (If you didn't expect
updates, why are you using a database?)
I am inclined to think this is a general point, that is, the problem
is not *empty* iterators vs. *non-empty* ones. It's iterators that
have produced values recently vs. those that haven't. The empty
vs. non-empty distinction is a property of *sequences*, including
buffers (which are associated with iterators).
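For what it's worth, the "did the loop run at all?" question can be
answered for an arbitrary iterable by pulling the first item by hand.
The following is only a sketch, not an established idiom: the helper
name nonempty_or_none and the sentinel _MISSING are made up for
illustration. Note that next() will simply block on a feed-like
iterator that is in a wait state, which is the point made above.

```python
import itertools

# Hypothetical sentinel: distinguishes "no items" from a legitimate None item.
_MISSING = object()

def nonempty_or_none(iterable):
    """Return an iterator over all items, or None if there are no items."""
    it = iter(iterable)
    first = next(it, _MISSING)  # blocks if the source is merely waiting
    if first is _MISSING:
        return None
    # Put the item we peeked at back in front of the remaining ones.
    return itertools.chain([first], it)

seen = []
items = nonempty_or_none(x for x in [1, 2, 3])
if items is not None:
    for item in items:
        seen.append(item)
else:
    seen.append("empty")
```

The cost is one item of buffering, so this is really the
"sequence-like buffer" view applied to the head of the iterator.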
> but now that file objects are iterable, I can do:
>
> for line in the_file:
>     ....
>
> much nicer!
>
> but it breaks when I want to debug and try to do:
>
> for line in the_file[:10]:
>     ...
>
> arrgg! files are not indexable!.
No, they are *enumerable*:
    for i, line in enumerate(the_file):
        if i >= 10: break
        ...
I guess your request to make iterators more friendly is a good part of
why enumerate() got promoted to builtin.
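For the debugging case specifically, itertools.islice gives the same
"first ten lines" effect without the explicit counter. A minimal
sketch, assuming an in-memory io.StringIO as a stand-in for the_file
(the contents are invented for the example):

```python
import io
from itertools import islice

# Stand-in for a real file: 25 short lines held in memory.
the_file = io.StringIO("".join(f"line {n}\n" for n in range(25)))

# Rough equivalent of the_file[:10]: take at most ten lines.
first_ten = list(islice(the_file, 10))

# The file position has advanced; reading continues with the 11th line.
next_line = the_file.readline()
```

Unlike a slice of a list, this consumes the ten lines: the file
object is left positioned just after them.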
> So maybe there should be some ability to index / slice iterables?
There's no way to index an iterator, except to enumerate it. Then,
"to memoize or not to memoize, that is the question." Slicing makes
more sense to me, but again the fact that your discarded data may or
may not be valuable means that you need to make a choice between
memoizing and not doing so. Putting that in the API is complexity, or
perhaps even complication. If you just want head or tail, then
takewhile or dropwhile from itertools is your friend. (I have no
opinion -- not even -0 -- on whether promoting those functions to
builtin is a good idea.)
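To make the memoizing choice concrete, here is a small sketch using
itertools.islice and itertools.tee. The generator numbers() is a
made-up stand-in for any one-shot iterator. It is meant only to show
the two behaviours, not to suggest an API.

```python
from itertools import islice, tee

def numbers():
    """Hypothetical one-shot source yielding 0..5."""
    yield from range(6)

# Not memoizing: taking a tail slice throws the skipped items away.
it = numbers()
tail = list(islice(it, 4, None))
leftover = list(it)  # the iterator is exhausted, nothing can be recovered

# Memoizing: tee() buffers items so a second iterator can still see them.
a, b = tee(numbers())
tail_again = list(islice(a, 4, None))
everything = list(b)
```

With tee() the skipped items stay in memory until the second
iterator has consumed them, which is exactly the cost that has to be
chosen or declined.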
> But aside from that -- just the idea that looking at how to make
> iterable a more "natural" part of the language is a good thing.
I think it's from Zen and the Art of Motorcycle Maintenance (though
Pirsig may have been quoting), but I once read the advice: "If you
want to paint a perfect painting, make yourself perfect and then paint
naturally." I think iterators are just "unnatural" to a lot of
people, and will remain unnatural until people evolve. Which they may
not!
Real life "do until done" tasks are careers ("do ... until dead") or
have timeouts ("do n times: break if done else ..."). In computation
that would be analogous to scrolling a Twitter feed vs. grabbing a
pageful. In the context of this discussion, a feed is something you
wait for (and maybe time out and complain to the operator if it blocks
too long), while you can apply len() to pages. And you know which is
which in all applications I've dealt with -- except for design of
abstract programming language facilities like "for ... in". The point
being that "real life" examples don't seem to help people's intuition
on the Python versions.