[Python-ideas] Integrate some itertools into the Python syntax
Michael Selik
mike at selik.org
Wed Mar 23 15:24:29 EDT 2016
On Wed, Mar 23, 2016 at 1:39 PM Andrew Barnert <abarnert at yahoo.com> wrote:
> On Mar 23, 2016, at 10:13, Michael Selik <mike at selik.org> wrote:
> >
> >> On Mar 23, 2016, at 6:37 AM, Chris Angelico <rosuav at gmail.com> wrote:
> >>
> >> On Wed, Mar 23, 2016 at 9:04 PM, Michel Desmoulin
> >> <desmoulinmichel at gmail.com> wrote:
> >>>> (Whether or not to make slice notation usable outside subscript
> >>>> operations could then be tackled as an independent question)
> >>>>
> >>>> For itertools.chain, it may make sense to simply promote it to the
> >>>> builtins.
> >>>
> >>> Same problem as with new keywords: it can be a problem with people
> >>> using chain as a var name.
> >>
> >> Less of a problem though - it'd only be an issue for people who (a)
> >> use chain as a var name, and (b) want to use the new shorthand. Their
> >> code will continue to work identically with itertools.chain (or not
> >> using it at all). With a new keyword, their code would instantly fail.
> >
> > I enjoy using ``chain`` as a variable name. It has many meanings outside
> > of iteration tools. Three cheers for namespaces.
>
> As Chris just explained in the message you're replying to, this wouldn't
> affect you. I've used "vars" and "compile" and "dir" as local variables
> many times; as long as I don't need to call the builtin in the same scope,
> it's not a problem. The same would be true if you keep using "chain".
> Unless you want to chain iterables of your chains together, it'll never
> arise.
>
Depends what you mean by "affect". It'll affect how I read my colleagues'
code. I want to see ``from itertools import chain`` at the top of their
modules if they're using chain in the sense of chaining iterables.
> Also, putting things in builtins isn't _always_ bad. It's just a high
> standard that has to be met. I don't think anyone believes any, all, and
> enumerate fail to meet that standard. So the question is just whether chain
> meets it.
>
Part of why ``any`` and ``all`` succeed as builtins is that their
natural-language meanings are quite close to their Python meanings.
``enumerate`` succeeds because it is unusual in natural language. For many
people, Python might be the most frequent context for using that word.

Some evidence that ``chain`` should be promoted to a builtin would be that
``chain`` is used more often than the rest of the itertools library and
that the word "chain" is rarely used except as the itertools chain. Luckily,
one could search some public projects on GitHub to provide that evidence.
If no one bothers to do so, I'm guessing the desire isn't so great.
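
A rough sketch of how such a survey could start, assuming the project sources are already available as strings. The helper name ``count_itertools_names`` and the sample module are made up for illustration, and the count is crude: it sees ``from itertools import ...`` and ``itertools.<name>`` references only, not aliases such as ``import itertools as it``.

```python
import ast
from collections import Counter


def count_itertools_names(source):
    """Count itertools names referenced in one module's source (crude)."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        # from itertools import chain, islice
        if isinstance(node, ast.ImportFrom) and node.module == "itertools":
            counts.update(alias.name for alias in node.names)
        # itertools.chain(...), itertools.product(...)
        elif (isinstance(node, ast.Attribute)
              and isinstance(node.value, ast.Name)
              and node.value.id == "itertools"):
            counts[node.attr] += 1
    return counts


# Hypothetical sample module, standing in for real project files.
sample = '''
import itertools
from itertools import chain, islice

pairs = itertools.product("ab", repeat=2)
flat = itertools.chain.from_iterable([[1], [2]])
'''

counts = count_itertools_names(sample)
print(counts.most_common())
```

Summing these counters over many projects would speak to the first half of the evidence. The second half (how often "chain" is used as an ordinary name) would need a separate count.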
> My problem is that I'm not sure chain really does meet it. It's
> chain.from_iterable that I often see people reaching for and not finding,
> and moving chain to builtins won't help those people find it. (This is
> compounded by the fact that even a novice can figure out how to do chain
> given chain.from_iterable, but not the other way around.) Also, for
> something we expect novices to start using as soon as they discover
> iterators, it seems worrisome that we'd be expecting them to understand the
> idea of a static method on something that looks like a function but is
> actually a type before they can have a clue of what it means.
>
>
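
For readers following along, a small sketch of the asymmetry described above. ``my_chain`` is a made-up name used only for illustration.

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5]]

# chain takes each iterable as a separate positional argument.
print(list(chain([1, 2], [3], [4, 5])))    # [1, 2, 3, 4, 5]

# chain.from_iterable takes one iterable of iterables and reads it lazily.
print(list(chain.from_iterable(nested)))   # [1, 2, 3, 4, 5]


# Writing chain in terms of from_iterable is a one-liner ...
def my_chain(*iterables):
    return chain.from_iterable(iterables)


# ... while going the other way needs argument unpacking, which
# consumes the outer iterable eagerly before chaining starts.
flat = list(chain(*nested))
print(flat)                                # [1, 2, 3, 4, 5]
```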
> In another thread a year or two ago, someone suggested making
> chain.from_iterable into a builtin with a different name, maybe "flatten".
> But that now means we're adding _two_ builtins rather than one, and they're
> very closely related but don't appear so in the docs, which obviously
> increases the negatives...
>
> Still, I like adding chain (and/or flatten) to builtins a lot more than I
> like adding sequence behavior to some iterators, or adding a whole new kind
> of function-like slicing syntax to all iterables, or any of the other
> proposals here.
>
I like the LazyList you wrote. If the need for operator-style slicing on
iterators were great, I think we'd find a decent amount of usage of such a
wrapper. The absence of that usage is evidence that ``from itertools import
islice`` is more pleasant.

As has been discussed many times, lazy sequences like range cause some
confusion by providing both lazy, generator-like evaluation and
getitem/slicing. Perhaps it's better to keep the two varieties of
iterables, lazy and non-lazy, more firmly separated by having ``lst[a:b]``
used exclusively for true sequences and ``islice(g, a, b)`` for generators.
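
To make the distinction concrete, here is a minimal sketch. The generator ``squares`` is a made-up example.

```python
from itertools import islice

# A true sequence: slicing returns a new list and can be repeated.
lst = list(range(10))
print(lst[2:5])          # [2, 3, 4]

# range is lazy yet still a sequence, so it supports slicing too.
print(range(10)[2:5])    # range(2, 5)


def squares():
    """Infinite generator of square numbers (illustrative only)."""
    n = 0
    while True:
        yield n * n
        n += 1


g = squares()
try:
    g[2:5]
except TypeError as exc:
    print("generators are not subscriptable:", exc)

# islice works on any iterator, but it consumes what it skips over.
first = list(islice(g, 2, 5))
again = list(islice(g, 2, 5))
print(first)             # [4, 9, 16]
print(again)             # [49, 64, 81]
```

The second ``islice`` call picks up where the first one left off, which is exactly the behavior that subscript syntax would hide.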

Just yesterday I got frustrated by the difference in return types when
subscripting bytes versus str while trying to keep my code 2/3 compatible:
in Python 3, ``bytestring[0]`` gives an integer while ``textstring[0]``
gives a length-1 str. I had to resort to the awkwardness of ``s[0:1]``, not
knowing whether I'll have a Python 3 bytes or a Python 2 str. I prefer
imports to inconsistency.
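
A short sketch of that inconsistency and the slicing workaround. The helper name ``first_unit`` is hypothetical.

```python
data = b"abc"
text = "abc"

# Indexing: Python 3 bytes yields an int; str yields a length-1 str.
# (On Python 2, b"abc" is a str, so data[0] would be 'a'.)
print(repr(data[0]))     # 97 on Python 3
print(repr(text[0]))     # 'a'

# Slicing always returns the same type as its operand.
print(repr(data[0:1]))   # b'a' on Python 3
print(repr(text[0:1]))   # 'a'


def first_unit(s):
    """Return the first element as a length-1 object of the same type as s."""
    return s[0:1]
```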

A couple of questions to help clarify the situation:

1. Do you prefer ``a.union(b)`` or ``a | b``?
2. Do you prefer ``datetime.now() + timedelta(days=5)`` or ``5.days.from_now``?

I think the answer to #2 is clear: we prefer Python to Ruby (more
specifically, the Rails sub-community). The answer to #1 is more difficult.
I'm often tempted to say ``a | b`` for its elegance, but I keep coming back
to ``a.union(b)`` as clunky but readable, easy to explain, etc.
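
For what it's worth, the two spellings in #1 are not fully interchangeable, as this small sketch shows:

```python
a = {1, 2}
b = {2, 3}

# For two sets the spellings agree.
print(a | b == a.union(b))     # True

# The method accepts any iterable, and several at once.
print(a.union([3, 4]))         # {1, 2, 3, 4}
print(a.union(b, (5,)))        # {1, 2, 3, 5}

# The operator insists that both operands are sets.
try:
    a | [3, 4]
except TypeError as exc:
    print("operator form rejected:", exc)
```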