I like your idea with AtomicIterable, but I still don't think it fully solves the problem.
How can one categorize array.array?

Consider an array of bytes.
Sometimes you'll want to iterate over them as ints, and sometimes as 1-byte objects.
In the first case the array is a space-saving sequence of ints, so you'll want the items one by one; in the second case it is atomic, just like bytes.
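
To make the two readings concrete, here is a small sketch using only the standard array module. The variable names are just for illustration.

```python
from array import array

# Typecode "B" stores unsigned 1-byte integers.
raw = array("B", [72, 105])

# Reading 1: a compact sequence of ints, consumed one by one.
as_ints = list(raw)

# Reading 2: the same buffer treated as a single atomic value, like bytes.
as_bytes = raw.tobytes()

print(as_ints)   # [72, 105]
print(as_bytes)  # b'Hi'
```

Nothing about the type itself says which of the two the caller meant.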

Giving the end user the option to register, on the other hand, would pose a different problem: ABC.register is global. If you're developing a module and register array.array as an AtomicIterable, you might affect other parts of the program in unexpected ways.
I think it's a tough one to solve :-/
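
A rough sketch of the global-registration concern. AtomicIterable here is a hypothetical stand-in class (the proposed ABC does not exist in collections.abc), and count_leaves is an invented helper that plays the role of unrelated code elsewhere in the program.

```python
from abc import ABC
from array import array


class AtomicIterable(ABC):
    """Hypothetical stand-in for the proposed ABC."""


def count_leaves(value):
    # Imagine this lives in some unrelated library that treats
    # AtomicIterable instances as a single item.
    if isinstance(value, AtomicIterable):
        return 1
    return len(value)


data = array("B", [1, 2, 3])
before = count_leaves(data)

# One module decides byte arrays are atomic for its own purposes...
AtomicIterable.register(array)

# ...and every other isinstance check in the process now agrees.
after = count_leaves(data)

print(before, after)  # 3 1
```

The registration is process-wide, so the second call changes even though count_leaves and data were never touched.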


On Tue, Jul 26, 2016, 7:52 AM Nick Coghlan <ncoghlan@gmail.com> wrote:
On 26 July 2016 at 03:58, Guido van Rossum <guido@python.org> wrote:
> On Mon, Jul 25, 2016 at 10:19 AM, Gregory P. Smith <greg@krypto.org> wrote:
>> Given how often the str problem comes up in this context, I'm thinking of a
>> common special case in type annotation checkers that explicitly excludes str
>> from consideration for iteration.  Its behavior is rarely what the
>> programmer intended and more likely to be a bug.
>
> Should we have IterableButNotString, CollectionExceptString,
> NonStringSequence, or all three? I suppose the runtime check would be
> e.g. isinstance(x, Iterable) and not isinstance(x, (str, bytes,
> bytearray, memoryview))? And the type checker should special-case the
> snot out of it?

For cases where str instances are handled differently from other
iterables, but still within the same function, a convention of writing
"Union[str, Iterable]" as the input parameter type may suffice - while
technically the union is redundant, the implication would be that str
instances are treated as str objects, rather than as an iterable of
length-1 str objects.
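
As an illustration of that convention, a minimal sketch might look like the following. The function name normalize_names is invented for the example, and the isinstance check is just one plausible way to honour the annotation at runtime.

```python
from typing import Iterable, List, Union


def normalize_names(names: Union[str, Iterable[str]]) -> List[str]:
    # The redundant "str" in the Union signals that a lone string is
    # one name, not an iterable of single-character names.
    if isinstance(names, str):
        return [names]
    return list(names)


print(normalize_names("alice"))           # ['alice']
print(normalize_names(["alice", "bob"]))  # ['alice', 'bob']
```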

The other case where this would come up is in signature overloading,
where one overload may be typed as "str" and the other as "Iterable".
Presumably in those cases the more specific type already wins.
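
A sketch of the overloading case, using typing.overload. The function name lengths is hypothetical, and listing the str overload first is an assumption about how a checker would resolve the two signatures, not something the text above specifies.

```python
from typing import Iterable, List, overload


@overload
def lengths(value: str) -> int: ...
@overload
def lengths(value: Iterable[str]) -> List[int]: ...


def lengths(value):
    # Runtime dispatch mirrors the overloads: str is checked first
    # because every str is also an Iterable[str].
    if isinstance(value, str):
        return len(value)
    return [len(item) for item in value]


print(lengths("abc"))          # 3
print(lengths(["abc", "de"]))  # [3, 2]
```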

Anything more than that would need to be defined in the context of a
particular algorithm, such as the oft-requested generic "flatten"
operation, where you typically want to consider types like str, bytes,
bytearray and memoryview as atomic objects, rather than as containers,
but then things get blurry around iterable non-builtin types like
array.array, numpy.ndarray, Pandas data frames, image formats with
pixel addressing, etc, as well as builtins with multiple iteration
behaviours like dict.

One possible way to tackle that would be to declare a new
collections.abc.AtomicIterable ABC, as well as a collections.flatten
operation defined as something like:

    # AtomicIterable is the proposed ABC; it does not exist yet.
    from collections.abc import Mapping

    def flatten(iterable):
        for item in iterable:
            if isinstance(item, AtomicIterable):
                yield item
                continue
            if isinstance(item, Mapping):
                yield from item.items()
                continue
            try:
                subiter = iter(item)
            except TypeError:
                yield item
            else:
                yield from flatten(subiter)

Over time, different types would get explicitly registered with
AtomicIterable based on what their developers considered the most
appropriate behaviour to be when asked to flatten them.
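
To show how registration would interact with such a flatten operation, here is a self-contained sketch. AtomicIterable is defined locally as a hypothetical stand-in (no such ABC exists in collections.abc), and the set of registered types is an assumption taken from the builtins named earlier in this message.

```python
from abc import ABC
from collections.abc import Mapping


class AtomicIterable(ABC):
    """Hypothetical stand-in for the proposed ABC."""


# Assumed registrations: the builtins usually wanted as atomic values.
for atomic_type in (str, bytes, bytearray, memoryview):
    AtomicIterable.register(atomic_type)


def flatten(iterable):
    for item in iterable:
        if isinstance(item, AtomicIterable):
            yield item
        elif isinstance(item, Mapping):
            yield from item.items()
        else:
            try:
                subiter = iter(item)
            except TypeError:
                yield item  # not iterable at all
            else:
                yield from flatten(subiter)


result = list(flatten([1, "ab", [2, [b"cd", 3]], {"k": 4}]))
print(result)  # [1, 'ab', 2, b'cd', 3, ('k', 4)]
```

Note that without the str registration this sketch would recurse without end, since iterating a one-character str yields that same str again.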

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/